• Online, Self-Paced
Course Description

Apache Hadoop is an open-source software framework for the storage and large-scale processing of data sets on clusters of commodity hardware. This course focuses on capacity management for Hadoop clusters. You will be introduced to the concepts of resource management through scheduling. You will learn how to use the Fair Scheduler and how to plan for scaling. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Learning Objectives

Capacity Management

  • start the course
  • compare availability and performance
  • describe different strategies of resource capacity management

HDFS Capacity

  • describe how schedulers manage resources
  • set quotas for the HDFS file system

YARN Capacity

  • recall how to set the maximum and minimum memory allocations per container
  • describe how fair scheduling gives all applications, on average, an equal share of resources over time
  • describe the primary algorithm and the configuration files for the Fair Scheduler
  • describe the default behavior of the Fair Scheduler methods
  • monitor the behavior of Fair Share
  • describe the policy for single-resource fairness
  • describe how resources are distributed over the total capacity
  • identify different configuration options for single-resource fairness
  • configure single-resource fairness
  • describe the minimum share function of the Fair Scheduler
  • configure minimum share on the Fair Scheduler
  • describe the preemption functions of the Fair Scheduler
  • configure preemption for the Fair Scheduler

Service Performance

  • describe dominant resource fairness
  • write service levels for performance

Practice: Fair Scheduler

  • use the Fair Scheduler with multiple users

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Cyber Operational Planning
  • Cyber Operations
  • Data Administration