Data Science (IT-8003)
RGPV notes: CBGS, Bachelor of Engineering
Syllabus
UNIT 1:
Introduction, Grasping the Fundamentals of Big Data, The Evolution of Data
Management, Defining Big Data, Building a Successful Big Data Management Architecture,
Beginning with capture, organize, integrate, analyze, and act, Setting the architectural
foundation, Performance matters, Big Data Types, Defining Structured Data, sources of big
structured data, role of relational databases in big data, Defining Unstructured Data, sources of
unstructured data, Integrating data types into a big data environment
UNIT 2:
Statistics: Population, Sample, Sampled data, Sample space, Random sample,
Sampling distribution, Variable, Variation, Frequency, Random variable, Uniform random
variable, Exponential random variable, Mean, Median, Range, Mode, Variance, Standard
deviation, Correlation, Linear Correlation, Correlation and Causality, Regression, Linear
Regression, Linear Regression with Nonlinear Substitution, Classification, Classification
Criteria, Naive Bayes Classifier, Support Vector Machine
UNIT 3:
Introduction to Data Analytics, Drivers for analytics, Core components of analytical data
architecture, Data warehouse architecture, Column-oriented database, Parallel vs. distributed
processing, Shared-nothing data architecture and massively parallel processing, Elastic
scalability, Data loading patterns, Data Analytics lifecycle: Discovery, Data Preparation, Model
Planning, Model Building, Communicating results and findings, Methods: K-means clustering,
Association rules.
UNIT 4:
Data Science Tools: Cluster Architecture vs. Traditional Architecture, Hadoop, Hadoop
vs. Distributed databases, The building blocks of Hadoop, Hadoop data types, Hadoop software
stack, Deployment of Hadoop in data center, Hadoop infrastructure, HDFS concepts, Blocks,
Name nodes and Data nodes, Overview of HBase, Hive, Cassandra and Hypertable, Sqoop.
UNIT 5:
Introduction to R, Data Manipulation and Statistical Analysis with R, Basics, Simple
manipulations, Numbers and vectors, Input/Output, Arrays and Matrices, Loops and
conditional execution, functions, Data Structures, Data transformations, Strings and dates,
Graphics.
Books Recommended
1. Judith Hurwitz, Alan Nugent, Fern Halper, Marcia Kaufman, Big Data For Dummies, Wiley.
2. Thomas A. Runkler, Data Analytics: Models and Algorithms for Intelligent Data Analysis, Springer Vieweg.
3. Vignesh Prajapati, Big Data Analytics with R and Hadoop, Packt Publishing.