Overview and Concepts of Data Warehousing and Business Intelligence:
Why reporting and analysing data - Raw data to valuable information: Lifecycle of Data - What is Business Intelligence - BI and DW in today's perspective - What is data warehousing - The building blocks: Defining features - Data warehouses and data marts - Overview of the components - Metadata in the data warehouse - Need for data warehousing - Basic elements of data warehousing - Trends in data warehousing. The Architecture of BI and DW: BI and DW architectures and their types - Relation between BI and DW - OLAP (Online Analytical Processing) definitions - Difference between OLAP and OLTP - Dimensional analysis - What are cubes? - Drill-down and roll-up - Slice and dice or rotation - OLAP models - ROLAP versus MOLAP - Defining schemas: Stars, snowflakes and fact constellations.
Introduction to Data Mining (DM):
Motivation for Data Mining - Data Mining: Definition and Functionalities - Classification of DM Systems - DM task primitives - Integration of a Data Mining system with a Database or a Data Warehouse - Issues in DM - KDD Process. Data Pre-processing: Why pre-process data? - Data cleaning: Missing Values, Noisy Data - Data Integration and transformation - Data Reduction: Data cube aggregation, Dimensionality reduction - Data Compression - Numerosity Reduction - Data Mining Primitives - Languages and System Architectures: Task-relevant data - Kind of Knowledge to be mined - Discretization and Concept Hierarchy.
Concept Description and Association Rule Mining:
What is concept description? - Data Generalization and summarization-based characterization - Attribute relevance - Class comparisons. Association Rule Mining: Market basket analysis - Basic concepts - Finding frequent item sets: Apriori algorithm - Generating rules - Improved Apriori algorithm - Incremental ARM - Associative Classification - Rule Mining.
Classification and Prediction:
What is classification and prediction? - Issues regarding classification and prediction - Classification methods: Decision tree, Bayesian Classification, Rule-based, CART, Neural Network. Prediction methods: Linear and nonlinear regression, Logistic Regression. Introduction to DM tools such as DBMiner / WEKA / DTREG.
Data Mining for Business Intelligence Applications:
Data mining for business applications such as Balanced Scorecard, Fraud Detection, Clickstream Mining, Market Segmentation, retail industry, telecommunications industry, banking & finance, and CRM. Data Analytics Life Cycle: Introduction to Big Data Business Analytics - State of the practice in analytics - Role of data scientists - Key roles for a successful analytics project - Main phases of the life cycle - Developing core deliverables for stakeholders.
Teaching-Learning Process: Chalk and talk method / PowerPoint Presentation
Assessment Details (both CIE and SEE)
Continuous Internal Evaluation:
1. Three Unit Tests each of 20 Marks
2. Two assignments each of 20 Marks or one Skill Development Activity of 40 marks to attain the COs and POs
The sum of the three tests and the two assignments (or the Skill Development Activity) will be scaled down to 50 marks.
The CIE methods and question papers are designed to attain the different levels of Bloom’s taxonomy as per the outcomes defined for the course.
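As a worked illustration of the CIE arithmetic above, the following is a minimal sketch only. It assumes plain linear scaling of the 100-mark total (3 x 20 for tests plus 40 for assignments or the Skill Development Activity) down to 50, with no rounding; the function name `cie_marks` and its parameters are hypothetical and not part of the scheme.

```python
def cie_marks(tests, assignments=None, skill_activity=None):
    """Return CIE marks out of 50 (assumed linear scaling, no rounding).

    tests          -- three unit-test scores, each out of 20
    assignments    -- two assignment scores, each out of 20, OR
    skill_activity -- one Skill Development Activity score out of 40
    """
    if len(tests) != 3 or any(not 0 <= t <= 20 for t in tests):
        raise ValueError("expected three test scores, each between 0 and 20")

    # Exactly one of the two coursework options must be supplied.
    if (assignments is None) == (skill_activity is None):
        raise ValueError("give either two assignments or one skill activity")

    if assignments is not None:
        if len(assignments) != 2 or any(not 0 <= a <= 20 for a in assignments):
            raise ValueError("expected two assignment scores, each between 0 and 20")
        coursework = sum(assignments)
    else:
        if not 0 <= skill_activity <= 40:
            raise ValueError("skill activity score must be between 0 and 40")
        coursework = skill_activity

    raw_total = sum(tests) + coursework  # out of 100
    return raw_total * 50 / 100          # scaled down to 50


# Tests 15 + 18 + 12 = 45, assignments 16 + 14 = 30, raw total 75
print(cie_marks([15, 18, 12], assignments=[16, 14]))  # 37.5
```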
Semester End Examination:
1. The SEE question paper will be set for 100 marks and the marks scored will be proportionately reduced to 50.
2. The question paper will have ten full questions carrying equal marks.
3. Each full question is for 20 marks. There will be two full questions (with a maximum of four sub-questions) from each module.
4. Each full question will have a sub-question covering all the topics under a module.
5. The students will have to answer five full questions, selecting one full question from each module.
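The SEE rules above can be sketched in the same way. This is an illustrative sketch, not an official marking procedure: it infers five modules from "ten full questions, two from each module", assumes proportional reduction means halving the 100-mark total without rounding, and uses the hypothetical name `see_marks`.

```python
MODULES = 5              # inferred: ten full questions, two per module
MARKS_PER_QUESTION = 20  # each full question carries 20 marks


def see_marks(marks_by_module):
    """Return SEE marks out of 50.

    marks_by_module maps each module number (1..5) to the marks scored
    on the one full question answered from that module.
    """
    # One full question must be answered from every module.
    if set(marks_by_module) != set(range(1, MODULES + 1)):
        raise ValueError("answer exactly one full question from each module")
    if any(not 0 <= m <= MARKS_PER_QUESTION for m in marks_by_module.values()):
        raise ValueError("each full question is marked out of 20")

    raw_total = sum(marks_by_module.values())  # out of 100
    return raw_total * 50 / 100                # proportionately reduced to 50


# 15 + 12 + 18 + 10 + 17 = 72 out of 100
print(see_marks({1: 15, 2: 12, 3: 18, 4: 10, 5: 17}))  # 36.0
```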
Suggested Learning Resources:
Text Books:
1. J. Han, M. Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann.
2. M. Kantardzic, “Data Mining: Concepts, Models, Methods, and Algorithms”, John Wiley & Sons Inc.
3. Paulraj Ponniah, “Data Warehousing Fundamentals”, John Wiley.
4. M. Dunham, “Data Mining: Introductory and Advanced Topics”, Pearson Education.
5. G. Shmueli, N.R. Patel, P.C. Bruce, “Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner”, Wiley India.
Skill Development Activities Suggested
Course outcome (Course Skill Set)
At the end of the course the student will be able to:
CO1 Analyse the concept of data warehouse, Business Intelligence and OLAP. L2
CO2 Demonstrate data pre-processing techniques and the application of association rule mining algorithms. L2
CO3 Apply various classification algorithms and evaluate classifiers for the given problem. L2
CO4 Analyse data mining for various business intelligence applications for the given problem. L2
CO5 Apply classification and regression techniques for the given problem. L2
Program Outcome of this course
1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and computer science and business systems to the solution of complex engineering and societal problems. PO1
2 Problem analysis: Identify, formulate, review research literature, and analyze complex engineering and business problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences. PO2
3 Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations. PO3
4 Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions. PO4
5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations. PO5
6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering and business practices. PO6
7 Environment and sustainability: Understand the impact of professional engineering solutions in business, societal, and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development. PO7
8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering and business practices. PO8
9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings. PO9
10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions. PO10
11 Project management and finance: Demonstrate knowledge and understanding of the engineering, business and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments. PO11
12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change. PO12