Apache Hadoop is an open-source distributed fault-tolerant system that leverages commodity hardware to achieve large-scale agile data storage and processing. Hadoop empowers applications to work with thousands of nodes and petabytes of data without exposing the complexity of clustering to the end user.
This course discusses the design principles behind Apache Hadoop and explains the architecture of its core sub-systems: HDFS and MapReduce