Over the last few years, many organizations have made a strategic decision to invest in big data. At the heart of this effort is the process of extracting data from many sources, transforming it, and loading it into a data warehouse for subsequent analysis, a process known as “Extract, Transform & Load” (ETL).
Apache Hadoop is one of the most common platforms for managing big data, and in this course we’ll introduce you to three common methods of transporting and streaming your data into the Hadoop Distributed File System (HDFS):
- Data transfer between Hadoop and relational databases using Apache Sqoop
- Collecting, aggregating, and moving large amounts of streaming data into Hadoop using Apache Flume and Apache Kafka
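To give a flavor of the streaming side of the course, the sketch below shows a minimal Kafka producer written in Java. The broker address, topic name, and sample event are placeholder assumptions; in practice, a Flume agent or an HDFS sink would typically drain such a topic into HDFS.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        // Minimal producer configuration; "localhost:9092" is an assumed broker address.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish one sample event to a hypothetical "web-logs" topic,
            // which downstream tooling (e.g. a Flume agent) could land in HDFS.
            producer.send(new ProducerRecord<>("web-logs", "user-42", "clicked /checkout"));
            producer.flush();
        }
    }
}
```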
Hands-on exercises are included.