There are important decisions you must make to ensure network, disks, and hosts are configured correctly when deploying a Hadoop Cluster. This course will walk you through all of the steps to install Hadoop in a pseudo-distributed mode and the set up of some of the common open source software used to create a Hadoop Ecosystem. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.
Learning Objectives
Configuration Management Tools
- start the course
- describe the configurations management tools
- simulate a configuration management tool
Create Configuration Items
- build an image for a baseline server
- build an image for a DataServer
- build an image for a master server
Setup a CM Environment
- provision an admin server
Deploy a Hadoop Cluster
- describe the layout and structure of the Hadoop cluster
- provision a Hadoop cluster
- distribute configuration files and admin scripts
- use init scripts to start and stop a Hadoop cluster
- configure a Hadoop cluster
- configure logging for the Hadoop cluster
- build images for required servers in the Hadoop cluster
- configure a MySQL database
- build the Hadoop clients
- configure Hive daemons
- test the functionality of Flume, Sqoop, HDFS, and MapReduce
- test the functionality of Hive and Pig
- configure Hcatalog daemons
- configure Oozie
- configure Hue and Hue users
Practice: Configure the Admin Server
- install Hadoop on to the admin server