BIG DATA ANALYTICS (BIS701)

Course Code: BIS701
Teaching Hours/Week (L:T:P:S): 3:0:2:0
Total Hours of Pedagogy: 40 hours Theory + 8-10 Lab slots
Credits: 04
CIE Marks: 50
SEE Marks: 50
Total Marks: 100
Exam Hours: 3
Examination nature (SEE): Theory/practical




MODULE-1

Classification of data, Characteristics, Evolution and definition of Big Data, What is Big Data, Why Big Data, Traditional Business Intelligence Vs Big Data, Typical data warehouse and Hadoop environment.

Big Data Analytics: What is Big Data Analytics, Classification of Analytics, Importance of Big Data Analytics, Technologies used in Big Data Environments, Few Top Analytical Tools, NoSQL, Hadoop.

TB1: Ch 1: 1.1, Ch 2: 2.1-2.5, 2.7, 2.9-2.11, Ch 3: 3.2, 3.5, 3.8, 3.12, Ch 4: 4.1, 4.2




MODULE-2

Introduction to Hadoop: Introducing Hadoop, Why Hadoop, Why not RDBMS, RDBMS Vs Hadoop, History of Hadoop, Hadoop overview, Use case of Hadoop, HDFS (Hadoop Distributed File System), Processing data with Hadoop, Managing resources and applications with Hadoop YARN (Yet Another Resource Negotiator).

Introduction to MapReduce Programming: Introduction, Mapper, Reducer, Combiner, Partitioner, Searching, Sorting, Compression.

TB1: Ch 5: 5.1-5.8, 5.10-5.12, Ch 8: 8.1-8.8




MODULE-3

Introduction to MongoDB: What is MongoDB, Why MongoDB, Terms used in RDBMS and MongoDB, Data Types in MongoDB, MongoDB Query Language.

TB1: Ch 6: 6.1-6.5




MODULE-4

Introduction to Hive: What is Hive, Hive Architecture, Hive data types, Hive file formats, Hive Query Language (HQL), RC File implementation, User Defined Function (UDF).

Introduction to Pig: What is Pig, Anatomy of Pig, Pig on Hadoop, Pig Philosophy, Use case for Pig, Pig Latin Overview, Data types in Pig, Running Pig, Execution Modes of Pig, HDFS Commands, Relational Operators, Eval Function, Complex Data Types, Piggy Bank, User Defined Function, Pig Vs Hive.

TB1: Ch 9: 9.1-9.6, 9.8, Ch 10: 10.1-10.15, 10.22




MODULE-5

Spark and Big Data Analytics: Spark, Introduction to Data Analysis with Spark.

Text, Web Content and Link Analytics: Introduction, Text Mining, Web Mining, Web Content and Web Usage Analytics, PageRank, Structure of Web and Analyzing a Web Graph.

TB2: Ch 5: 5.2, 5.3, Ch 9: 9.1-9.4




PRACTICAL COMPONENT OF IPCC


Experiments 

1. Install Hadoop and implement the following file management tasks in Hadoop:
   a) Adding files and directories
   b) Retrieving files
   c) Deleting files and directories
   Hint: A typical Hadoop workflow creates data files (such as log files) elsewhere and copies them into HDFS using one of the above command-line utilities.

2. Develop a MapReduce program to implement matrix multiplication.

3. Develop a MapReduce program that mines weather data and displays appropriate messages indicating the weather conditions of the day.

4. Develop a MapReduce program to find the tags associated with each movie by analyzing MovieLens data.

5. Implement the functions Count, Sort, Limit, Skip, and Aggregate using MongoDB.

6. Write Pig Latin scripts to sort, group, join, project, and filter the data.

7. Use Hive to create, alter, and drop databases, tables, views, functions, and indexes.

8. Implement a word count program in Hadoop and Spark.

9. Use CDH (Cloudera Distribution for Hadoop) and HUE (Hadoop User Interface) to analyze data and generate reports for sample datasets.
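
As a starting point for Experiment 2, the map, shuffle, and reduce steps of matrix multiplication can be tried out in plain Python before porting them to Hadoop. The following is a minimal single-process sketch, not a Hadoop job. The function names (map_phase, shuffle, reduce_phase, multiply) are hypothetical and chosen for illustration, and the sketch assumes small dense matrices given as lists of lists, with the column count of A equal to the row count of B.

```python
from collections import defaultdict


def map_phase(A, B):
    """Emit ((i, k), (tag, j, value)) for every cell that contributes to C[i][k]."""
    n, m, p = len(A), len(B), len(B[0])
    # A[i][j] is needed by every output cell in row i.
    for i in range(n):
        for j in range(m):
            for k in range(p):
                yield (i, k), ("A", j, A[i][j])
    # B[j][k] is needed by every output cell in column k.
    for j in range(m):
        for k in range(p):
            for i in range(n):
                yield (i, k), ("B", j, B[j][k])


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Pair A and B entries that share the inner index j and sum their products."""
    a = {j: v for tag, j, v in values if tag == "A"}
    b = {j: v for tag, j, v in values if tag == "B"}
    return key, sum(a[j] * b[j] for j in a)


def multiply(A, B):
    """Run map, shuffle, and reduce in one process and assemble the result matrix."""
    n, p = len(A), len(B[0])
    C = [[0] * p for _ in range(n)]
    for key, values in shuffle(map_phase(A, B)).items():
        (i, k), total = reduce_phase(key, values)
        C[i][k] = total
    return C


if __name__ == "__main__":
    print(multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```

Keying every intermediate pair by the output cell (i, k) means each reducer call has everything it needs for one entry of the product, which is the same grouping a Hadoop job would rely on.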




Suggested Learning Resources:

Books:

1. Seema Acharya and Subhashini Chellappan, “Big Data and Analytics”, Wiley India Publishers, 2nd Edition, 2019.

2. Raj Kamal and Preeti Saxena, “Big Data Analytics, Introduction to Hadoop, Spark and Machine Learning”, McGraw Hill Publication, 2019.



Reference Books:

1. Adam Shook and Donald Miner, “MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems”, O'Reilly, 2012.

2. Tom White, “Hadoop: The Definitive Guide”, 4th Edition, O'Reilly Media, 2015.

3. Thomas Erl, Wajid Khattak, and Paul Buhler, “Big Data Fundamentals: Concepts, Drivers & Techniques”, Pearson India Education Service Pvt. Ltd., 1st Edition, 2016.

4. John D. Kelleher, Brian Mac Namee, and Aoife D'Arcy, “Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies”, MIT Press, 2nd Edition, 2020.
