Introduction to Big Data Analytics:
Big Data, Scalability and Parallel Processing, Designing Data Architecture, Data Sources, Quality, Pre-Processing and Storing, Data Storage and Analysis, Big Data Analytics Applications and Case Studies.
Text book 1: Chapter 1: 1.2-1.7
RBT: L1, L2, L3
Introduction to Hadoop (T1):
Introduction, Hadoop and its Ecosystem, Hadoop Distributed File System, MapReduce Framework and Programming Model, Hadoop YARN, Hadoop Ecosystem Tools.
Hadoop Distributed File System Basics (T2):
HDFS Design Features, Components, HDFS User Commands.
Essential Hadoop Tools (T2):
Using Apache Pig, Hive, Sqoop, Flume, Oozie, HBase.
Text book 1: Chapter 2: 2.1-2.6
Text book 2: Chapter 3
Text book 2: Chapter 7 (except walkthroughs)
RBT: L1, L2, L3
NoSQL Big Data Management, MongoDB and Cassandra:
Introduction, NoSQL Data Store, NoSQL Data Architecture Patterns, NoSQL to Manage Big Data, Shared-Nothing Architecture for Big Data Tasks, MongoDB Databases, Cassandra Databases.
Text book 1: Chapter 3: 3.1-3.7
RBT: L1, L2, L3
MapReduce, Hive and Pig:
Introduction, MapReduce Map Tasks, Reduce Tasks and MapReduce Execution, Composing MapReduce for Calculations and Algorithms, Hive, HiveQL, Pig.
Text book 1: Chapter 4: 4.1-4.6
RBT: L1, L2, L3
Machine Learning Algorithms for Big Data Analytics:
Introduction, Estimating the Relationships, Outliers, Variances, Probability Distributions, and Correlations, Regression Analysis, Finding Similar Items, Similarity of Sets and Collaborative Filtering, Frequent Itemsets and Association Rule Mining.
Text, Web Content, Link, and Social Network Analytics:
Introduction, Text Mining, Web Mining, Web Content and Web Usage Analytics, Page Rank, Structure of Web and Analyzing a Web Graph, Social Network as Graphs and Social Network Analytics.
Text book 1: Chapter 6: 6.1-6.5
Text book 1: Chapter 9: 9.1-9.5
Course Outcomes:
The student will be able to:
Question Paper Pattern:
Textbooks:
1. Raj Kamal and Preeti Saxena, "Big Data Analytics: Introduction to Hadoop, Spark, and Machine-Learning", McGraw Hill Education, 2018. ISBN: 9789353164966, 9353164966
2. Douglas Eadline, "Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem", 1st Edition, Pearson Education, 2016. ISBN-13: 978-9332570351
Reference Books:
1. Tom White, "Hadoop: The Definitive Guide", 4th Edition, O'Reilly Media, 2015. ISBN-13: 978-9352130672
2. Boris Lublinsky, Kevin T Smith, Alexey Yakubovich, "Professional Hadoop Solutions", 1st Edition, Wrox Press, 2014. ISBN-13: 978-8126551071
3. Eric Sammer, "Hadoop Operations: A Guide for Developers and Administrators", 1st Edition, O'Reilly Media, 2012. ISBN-13: 978-9350239261
4. Arshdeep Bahga, Vijay Madisetti, "Big Data Analytics: A Hands-On Approach", 1st Edition, VPT Publications, 2018. ISBN-13: 978-0996025577