DATA MINING AND DATA WAREHOUSING
SEMESTER – VI
Course Code-18CS641
CIE Marks-40
Number of Contact Hours/Week-3:0:0
SEE Marks-60
Total Number of Contact Hours-40
Exam Hours-03
CREDITS –3
Course Learning Objectives: This course (18CS641) will enable students to:
• Define multi-dimensional data models.
• Explain rules related to association, classification and clustering analysis.
• Compare and contrast different classification and clustering algorithms.
Module 1
Data Warehousing & Modeling: Basic Concepts: Data Warehousing: A multitier architecture, Data warehouse models: Enterprise warehouse, Data mart and Virtual warehouse, Extraction, Transformation and Loading, Data Cube: A multidimensional data model, Stars, Snowflakes and Fact constellations: Schemas for multidimensional data models, Dimensions: The role of concept hierarchies, Measures: Their categorization and computation, Typical OLAP operations.
Textbook 2: Ch.4.1,4.2
RBT: L1, L2, L3
Module 2
Data Warehouse Implementation & Data Mining: Efficient Data Cube computation: An overview, Indexing OLAP Data: Bitmap index and join index, Efficient processing of OLAP queries, OLAP server architecture: ROLAP versus MOLAP versus HOLAP. Introduction: What is data mining, Challenges, Data Mining Tasks, Data: Types of Data, Data Quality, Data Preprocessing, Measures of Similarity and Dissimilarity.
Textbook 2: Ch.4.4
Textbook 1: Ch.1.1,1.2,1.4, 2.1 to 2.4
RBT: L1, L2, L3
Module 3
Association Analysis: Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation, Alternative Methods for Generating Frequent Itemsets, FP-Growth Algorithm, Evaluation of Association Patterns.
Textbook 1: Ch 6.1 to 6.7 (Excluding 6.4)
RBT: L1, L2, L3
Module 4
Classification: Decision Tree Induction, Methods for Comparing Classifiers, Rule-Based Classifiers, Nearest Neighbor Classifiers, Bayesian Classifiers.
Textbook 1: Ch 4.3,4.6,5.1,5.2,5.3
RBT: L1, L2, L3
Module 5
Clustering Analysis: Overview, K-Means, Agglomerative Hierarchical Clustering, DBSCAN, Cluster Evaluation, Density-Based Clustering, Graph-Based Clustering, Scalable Clustering Algorithms.
Textbook 1: Ch 8.1 to 8.5, 9.3 to 9.5
RBT: L1, L2, L3
Course Outcomes: The student will be able to:
• Identify data mining problems and implement the data warehouse.
• Write association rules for a given data pattern.
• Choose between classification and clustering solutions.
Question Paper Pattern:
• The question paper will have ten questions.
• Each full question carries 20 marks.
• There will be 2 full questions (with a maximum of four sub-questions) from each module.
• Each full question will have sub-questions covering all the topics under a module.
• The students will have to answer 5 full questions, selecting one full question from each module.
Textbooks:
1. Pang-Ning Tan, Michael Steinbach, Vipin Kumar: Introduction to Data Mining, Pearson, First impression, 2014.
2. Jiawei Han, Micheline Kamber, Jian Pei: Data Mining - Concepts and Techniques, 3rd Edition, Morgan Kaufmann Publisher, 2012.