This course covers the architecture of the Hadoop Distributed File System (HDFS) and its main building blocks. It also introduces data replication, communication protocols, and accessibility.
Learning Objectives
Apache Hadoop HDFS
- provide an overview of the HDFS architecture and its main building blocks
- list considerations for the HDFS architecture, such as hardware failure, large data sets, and the coherency model
- describe NameNode and DataNodes in HDFS
- describe the file system namespace
- provide an overview of data replication
- list considerations relating to robustness
- describe the various HDFS communication protocols
- describe data organization considerations such as data blocks and replication pipelining
- list accessibility features such as FS Shell, DFSAdmin, and the browser interface
- describe space reclamation considerations such as file deletion and decreasing the replication factor
Practice: HDFS Architecture and Components
- work with the HDFS architecture