• Online, Self-Paced
Course Description

Hadoop is a framework written in Java for running applications on large clusters of commodity hardware. In this course, we will examine many of the HDFS administration and operational processes needed to run and maintain a Hadoop cluster. We will look at how to balance a Hadoop cluster, manage jobs, and perform backup and recovery for HDFS. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Learning Objectives

Hadoop Operations

  • monitor and improve service levels
  • deploy a Hadoop release
  • describe the purpose of change management

Rack Awareness for Hadoop

  • describe rack awareness
  • write configuration files for rack awareness

File System Management for HDFS

  • start and stop a Hadoop cluster
  • write init scripts for Hadoop
  • describe the tools fsck and dfsadmin
  • use fsck to check the HDFS file system
  • set quotas for the HDFS file system
  • install and configure trash

DataNode Management for HDFS

  • manage an HDFS DataNode
  • use include and exclude files to replace a DataNode
  • describe the operations for scaling a Hadoop cluster
  • add a DataNode to a Hadoop cluster

Balancing a Hadoop Cluster

  • describe the process for balancing a Hadoop cluster
  • balance a Hadoop cluster

Backup and Recovery for HDFS

  • describe the operations involved in backing up data
  • use distcp to copy data from one cluster to another

Managing Jobs

  • describe MapReduce job management on a Hadoop cluster
  • perform MapReduce job management on a Hadoop cluster

Upgrades for a Hadoop Cluster

  • plan an upgrade of a Hadoop cluster

Practice: High Availability

  • write and complete a plan to install HBase with high availability

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Cyber Operational Planning
  • Cyber Operations
  • Data Administration

Feedback

If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.