Digital skills in Data and AI

COURSE INTRODUCTION

This course package is ideal for individuals who want to understand the basic concepts of Big Data and Hadoop. On completing this course, learners will be able to explain what goes on behind the processing of huge volumes of data as the industry moves from Excel-based analytics to real-time analytics.

COURSE OBJECTIVES

After finishing the course, students will have the knowledge and skills to:

  • Understand the characteristics of Big Data
  • Describe the basics of Hadoop and HDFS architecture
  • List the features and processes of MapReduce
  • Learn the basics of Pig, Hive, and HBase
  • Explore the commercial distributions of Hadoop
  • Understand the key components of the Hadoop ecosystem
  • Get introduced to Sqoop & ZooKeeper

AUDIENCE

This course is meant for professionals who intend to gain a basic understanding of Big Data and Hadoop. It is ideal for professionals in senior management who require a theoretical understanding of how Hadoop can solve their Big Data problems.

COURSE CONTENTS

 

Lesson 1.0 - Introduction to Big Data and Hadoop

  • Introduction to Big Data and Hadoop
  • Objectives
  • Need for Big Data
  • Three Characteristics of Big Data
  • Characteristics of Big Data Technology
  • Appeal of Big Data Technology
  • Handling Limitations of Big Data
  • Introduction to Hadoop
  • Hadoop Configuration
  • Apache Hadoop Core Components
  • Hadoop Core Components—HDFS
  • Hadoop Core Components—MapReduce
  • HDFS Architecture
  • Ubuntu Server—Introduction
  • Hadoop Installation—Prerequisites
  • Hadoop Multi-Node Installation—Prerequisites
  • Single-Node Cluster vs. Multi-Node Cluster
  • MapReduce
  • Characteristics of MapReduce
  • Real-Time Uses of MapReduce
  • Prerequisites for Hadoop Installation in Ubuntu Desktop 12.04
  • Hadoop MapReduce—Features
  • Hadoop MapReduce—Processes
  • Advanced HDFS–Introduction
  • Advanced MapReduce
  • Data Types in Hadoop
  • Distributed Cache
  • Joins in MapReduce
  • Introduction to Pig
  • Components of Pig
  • Data Model
  • Pig vs. SQL
  • Prerequisites to Set the Environment for Pig Latin
  • Summary

 

Lesson 1.1 - Hive, HBase and Hadoop Ecosystem Components

  • Hive, HBase and Hadoop Ecosystem Components
  • Objectives
  • Hive—Introduction
  • Hive—Characteristics
  • System Architecture and Components of Hive
  • Basics of Hive Query Language
  • Data Model—Tables
  • Data Types in Hive
  • Serialization and Deserialization
  • UDF/UDAF vs. MapReduce Scripts
  • HBase—Introduction
  • Characteristics of HBase
  • HBase Architecture
  • HBase vs. RDBMS
  • Cloudera—Introduction
  • Cloudera Distribution
  • Cloudera Manager
  • Hortonworks Data Platform
  • MapR Data Platform
  • Pivotal HD
  • Introduction to ZooKeeper
  • Features of ZooKeeper
  • Goals of ZooKeeper
  • Uses of ZooKeeper
  • Sqoop—Reasons to Use It
  • Benefits of Sqoop
  • Apache Hadoop Ecosystem
  • Apache Oozie
  • Introduction to Mahout
  • Usage of Mahout
  • Apache Cassandra
  • Apache Spark
  • Apache Ambari
  • Key Features of Apache Ambari
  • Hadoop Security—Kerberos
