• Online, Self-Paced
Course Description

This course covers the HDFS architecture and its main building blocks. It also introduces data replication, communication protocols, and accessibility.

Learning Objectives

Apache Hadoop HDFS

  • provide an overview of the HDFS architecture and its main building blocks
  • list considerations for the HDFS architecture, such as hardware failure, large data sets, and the coherency model
  • describe the NameNode and DataNodes in HDFS
  • describe the file system namespace
  • provide an overview of data replication
  • list considerations relating to robustness
  • describe the various HDFS communication protocols
  • describe data organization considerations such as data blocks and replication pipelining
  • list accessibility features such as the FS Shell, DFSAdmin, and the browser interface
  • describe space reclamation considerations such as file deletes and reductions in the replication factor

Practice: HDFS Architecture and Components

  • work with the HDFS architecture

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework listed below.