Apache Hadoop is an open-source, distributed, fault-tolerant system that uses commodity hardware for large-scale data storage and processing. Hadoop lets applications work with thousands of nodes and petabytes of data without exposing the complexity of clustering to the end user.
This course discusses the design principles behind Apache Hadoop and explains the architecture of its core subsystems: HDFS and MapReduce.
Understand the main Hadoop components and other open-source software related to Hadoop. Understand how HDFS works and the concepts of map and reduce operations.
This course is intended for developers, architects and technical managers who wish to understand Hadoop’s architecture.
Module 1: A Brief Overview of Big Data
Module 2: Introduction to Hadoop
Module 3: MapReduce
Module 4: The Hadoop Distributed File System (HDFS)
Module 5: Hadoop-Related Projects
This course assumes no prior knowledge of Hadoop. Participants should be comfortable reading Java code and familiar with data warehouse (DWH) concepts.