Hadoop distributions provide performance and functionality enhancements over the base open-source code that Apache provides. In this course, you'll learn about the various distributions available and the common maintenance tasks in a Hadoop environment.
Learning Objectives
Maintenance
- demonstrate how to perform metadata and data backups
- create and delete snapshots
- list common problems for Hadoop administrators
- use the filesystem balancer tool to keep data evenly distributed across datanodes
- remove a node from a Hadoop cluster
Distributions
- describe the benefits of distributions
- list the components of a Cloudera distribution, including Impala, Crunch, Kite, and Cloudera Manager
- name the components of a Hortonworks distribution, including Tez, Falcon, and Ambari
- recall the benefits of the MapR distribution
Practice: Maintaining Hadoop
- perform Hadoop snapshot operations