CS/IT TECHNOLOGIES:

The Big Data Hadoop Advanced Program

Big Data refers to technologies and initiatives that involve data too diverse, fast-changing, or massive for traditional technologies, skills, and infrastructure to handle efficiently. Put differently, the volume, velocity, or variety of the data is too great.
The course covers what Big Data is, how Hadoop supports the concepts of Big Data, and how the different Hadoop components, such as Pig, Hive, Sqoop, HBase, Flume, and MapReduce, handle the analysis of large data sets.
Big Data is the latest buzzword surrounding the new culture of data analysis. Data management has shifted from an important capability to a decisive differentiator that can determine market winners. To keep pace with this trend, take this program to understand the basics of Big Data. The course is intended for anyone who wants to understand what Big Data is.

After completion students will be able to:

  • Understand what Big Data is
  • Explain its core concepts
  • Identify the requirements it addresses
  • Work with HDFS, MapReduce, Hive, Pig, Sqoop, Flume, and HBase
  • Use each component in combination with Hadoop
  • Understand how the different components support the concept of Big Data

Recommended for:

  • Any analyst seeking to learn this promising technology

Course Syllabus

  • Chapter-1: Introduction to Big Data and Hadoop
  • Big Data Introduction
  • Hadoop Introduction
  • What is Hadoop? Why Hadoop?
  • History of Hadoop
  • Components of the Hadoop ecosystem: HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper, and so on
  • What is the scope of Hadoop?
  • Chapter-2: Deep Dive into HDFS (Storing the Data)
  • Introduction to HDFS
  • HDFS Design
  • HDFS role in Hadoop
  • Features of HDFS
  • Daemons of Hadoop and their functionality
    • Name Node
    • Secondary Name Node
    • Job Tracker
    • Data Node
    • Task Tracker
  • Anatomy of a File Write
  • Anatomy of a File Read
  • Network Topology
    • Nodes
    • Racks
    • Data Center
  • Parallel Copying using DistCp
  • Basic Configuration for HDFS
  • Data Organization
    • Blocks
    • Replication
  • Rack Awareness
  • Heartbeat Signal
  • How to Store the Data into HDFS
  • How to Read the Data from HDFS
  • Accessing HDFS (Introduction to Basic UNIX Commands)
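The storage topics above (blocks, replication, rack awareness) come down to simple arithmetic. Below is a minimal sketch in Python, assuming the common defaults of a 128 MB block size and a replication factor of 3 (both are configurable in hdfs-site.xml):

```python
import math

def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (number of blocks, total block replicas stored on the cluster)."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return blocks, blocks * replication

# A 300 MB file splits into 3 blocks (128 + 128 + 44 MB);
# with a replication factor of 3, the cluster stores 9 block replicas.
print(hdfs_storage(300))  # (3, 9)
```

Rack awareness changes only where those replicas are placed (by default, spread across racks so a single rack failure cannot lose all copies), not how many exist.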
  • Chapter-3: MapReduce using Java (Processing the Data)
  • CLI commands
  • Introduction to MapReduce
  • MapReduce Architecture
  • Dataflow in MapReduce
    • Splits
    • Mapper
    • Partitioning
    • Sort and Shuffle
    • Combiner
    • Reducer
  • Understand Difference Between Block and InputSplit
  • Role of RecordReader
  • Basic Configuration of MapReduce
  • MapReduce life cycle
    • Driver Code
    • Mapper
    • Reducer
  • How MapReduce Works
  • Writing and Executing the Basic MapReduce Program using Java
  • Submission & Initialization of MapReduce Job.
  • File Input/Output Formats in MapReduce Jobs
    • Text Input Format
    • Key Value Input Format
    • Sequence File Input Format
    • NLine Input Format
  • Joins
    • Map-side Joins
    • Reduce-side Joins
  • Word Count Example
  • Side Data Distribution
    • Distributed Cache (with Program)
  • Counters (with Program)
    • Types of Counters
    • Task Counters
    • Job Counters
    • User Defined Counters
    • Propagation of Counters
  • Job Scheduling
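The MapReduce dataflow listed above (split → map → partition → sort/shuffle → reduce) can be simulated in a few lines of ordinary Python. This is only a single-process sketch of the Word Count example, not Hadoop API code; the real program taught in this chapter would implement Mapper and Reducer classes in Java:

```python
from collections import defaultdict

def map_phase(lines):
    # Mapper: emit a (word, 1) pair for every word in every input split.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Sort and shuffle: group all values by key, as the framework does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hello hadoop", "hello big data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'hello': 2, 'hadoop': 1, 'big': 1, 'data': 1}
```

A Combiner would simply run the same summing logic on each mapper's local output before the shuffle, reducing network traffic.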
  • Chapter-4: PIG
  • Introduction to Apache PIG
  • Introduction to PIG Data Flow Engine
  • MapReduce vs PIG in detail
  • When should PIG be used?
  • Data Types in PIG
  • Basic PIG programming
  • Modes of Execution in PIG
    • Local Mode
    • MapReduce Mode
  • Execution Mechanisms
    • Grunt Shell
    • Script
    • Embedded
  • Operators/Transformations in PIG
  • PIG UDFs with Programs
  • Word Count Example in PIG
  • Differences between MapReduce and PIG
  • Chapter-5: SQOOP
  • Introduction to SQOOP
  • Use of SQOOP
  • Connecting to a MySQL database
  • SQOOP commands
    • Import
    • Export
    • Eval
    • Codegen, etc.
  • Joins in SQOOP
  • Export to MySQL
  • Export to HBase
  • Chapter-6: HIVE
  • Introduction to HIVE
  • HIVE Meta Store
  • HIVE Architecture
  • Tables in HIVE
    • Managed Tables
    • External Tables
  • Hive Data Types
    • Primitive Types
    • Complex Types
  • Joins in HIVE
  • Partition
  • HIVE UDFs and UDAFs with Programs
  • Word Count Example
  • Chapter-7: HBASE
  • Introduction to HBASE
  • Basic Configurations of HBASE
  • Fundamentals of HBase
  • What is NoSQL?
  • HBase Data Model
    • Table and Row
    • Column Family and Column Qualifier
    • Cell and its Versioning
  • Categories of NoSQL Databases
    • Key-Value Database
    • Document Database
    • Column Family Database
  • HBASE Architecture
    • HMaster
    • Region Servers
    • Regions
    • MemStore
    • Store
  • SQL vs NoSQL
  • How HBase differs from an RDBMS
  • HDFS vs HBase
  • Client-side buffering and bulk uploads
  • HBase Designing Tables
  • HBase Operations
    • Get
    • Scan
    • Put
    • Delete
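The data model and operations above can be sketched with plain dictionaries. Below is a toy in-memory model of a versioned cell store, for illustration only; a real client would go through the HBase Java API or the shell:

```python
import itertools

class ToyHBaseTable:
    """Toy model: row key -> {'family:qualifier' -> [(version, value), ...]}."""
    def __init__(self):
        self.rows = {}
        self._version = itertools.count(1)

    def put(self, row, column, value):
        # Each Put appends a new version to the cell rather than overwriting.
        cell = self.rows.setdefault(row, {}).setdefault(column, [])
        cell.append((next(self._version), value))

    def get(self, row, column):
        # Like HBase, a Get returns the latest cell version by default.
        cell = self.rows.get(row, {}).get(column)
        return cell[-1][1] if cell else None

    def scan(self):
        # A Scan iterates rows in sorted row-key order.
        for row in sorted(self.rows):
            yield row, {c: vs[-1][1] for c, vs in self.rows[row].items()}

    def delete(self, row):
        self.rows.pop(row, None)

t = ToyHBaseTable()
t.put("row1", "info:name", "Asha")
t.put("row1", "info:name", "Asha B.")  # second version of the same cell
print(t.get("row1", "info:name"))      # Asha B.
```

Keeping older versions alongside the latest one is what the "Cell and its Versioning" topic above refers to.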
  • Chapter-8: MongoDB
  • What is MongoDB?
  • Where to Use?
  • Configuration On Windows
  • Inserting data into MongoDB
  • Reading data from MongoDB
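MongoDB stores schema-free, JSON-like documents rather than fixed rows. The insert/read pattern above can be mimicked with plain Python structures; this is only a conceptual sketch of equality queries, not the real driver API (in practice one would use a client library such as pymongo):

```python
# A list of dicts stands in for a MongoDB collection.
students = []

def insert_one(collection, document):
    collection.append(document)

def find(collection, query):
    # Match documents whose fields equal every key/value in the query,
    # mirroring MongoDB's basic equality queries.
    return [d for d in collection if all(d.get(k) == v for k, v in query.items())]

insert_one(students, {"name": "Ravi", "course": "Hadoop", "score": 88})
insert_one(students, {"name": "Meena", "course": "Spark"})  # a different shape is fine
print(find(students, {"course": "Hadoop"}))
```

Note that the second document omits `score` entirely; that flexibility is the main contrast with relational tables.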
  • Chapter-9: Cluster Setup
  • Downloading and installing Ubuntu 12.x
  • Installing Java
  • Installing Hadoop
  • Creating Cluster
  • Increasing/Decreasing the Cluster Size
  • Monitoring the Cluster Health
  • Starting and Stopping the Nodes
  • Chapter-10: Zookeeper
  • Introduction to Zookeeper
  • Data Model
  • Operations
  • Chapter-11: OOZIE
  • Introduction to OOZIE
  • Use of OOZIE
  • Where to use?
  • Chapter-12: Flume
  • Introduction to Flume
  • Uses of Flume
  • Flume Architecture
    • Flume Master
    • Flume Collectors
    • Flume Agents
  • Chapter-13: Impala
  • Overview
  • Data Load
  • Architecture
  • Hands-on
  • Hive vs Impala
  • Chapter-14: Apache Spark
  • Spark Architecture
    • RDD
  • Integration with Hadoop
    • Text File
  • Introduction to Spark SQL
    • CSV data
  • Spark Streaming Architecture
    • DStreams
  • Project: Project Explanations with Architecture
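A key idea in the Spark chapter is that RDD transformations are lazy: nothing is computed until an action runs. Below is a rough single-process analogy using Python generators; this is not the Spark API (in PySpark the same word-count pipeline would chain flatMap, map, and reduceByKey on an RDD):

```python
from collections import Counter

data = ["spark makes big data fast", "spark streams data"]

# Transformations: building these generators computes nothing yet,
# just as chaining RDD transformations only records a lineage.
words = (w for line in data for w in line.split())
pairs = ((w, 1) for w in words)

# Action: consuming the pipeline forces evaluation, like collect().
counts = Counter(w for w, _ in pairs)
print(counts["spark"])  # 2
```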

Course Information

  • Classes Start: Every Monday, Wednesday & Friday
  • Course Duration: 60 hours (40 hours of software training and 20 hours of project handling)
  • Batch Size: 8-12 students per batch
  • Certification: one for Software Training and one for Project Handling
  • Course Benefits Include:
    • Industrial Visit
    • Tool Kit
    • Lifelong Support
    • Placement Guaranteed
    • Project Handling
    • Resume Writing
    • Money-back Guarantee

Course Reviews

Average Rating: 5.0

5 Stars: 210
4 Stars: 90
3 Stars: 40
2 Stars: 2
1 Star: 0