Amazon Web Services (AWS) is a secure cloud-computing platform offered by Amazon.com. This course introduces AWS and its most prominent tools, such as IAM, S3, and EC2. Additionally, we will cover how to install, configure, and use a Hadoop cluster on AWS. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.
Learning Objectives
Amazon Web Services
- start the course
- describe how cloud computing can be used as a solution for Hadoop
- recall some of the most common services of the EC2 service bundle
- recall some of the most common services that Amazon offers
Setup of AWS
- describe how the AWS credentials are used for authentication
- create an AWS account
- describe the use of AWS access keys
- describe AWS Identity and Access Management (IAM)
- set up AWS IAM
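As a rough illustration of how access keys are typically stored and read, the sketch below parses text in the INI-style layout of the AWS shared credentials file (a `[default]` profile holding `aws_access_key_id` and `aws_secret_access_key`). The key values are the placeholder examples used in AWS documentation, not real credentials, and the helper names `read_access_keys` and `mask` are hypothetical names chosen for this example; the course itself may present credentials differently.

```python
import configparser

# Placeholder keys from AWS documentation; never commit real keys to source control.
SAMPLE_CREDENTIALS = """\
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
"""


def read_access_keys(text, profile="default"):
    """Return (access_key_id, secret_access_key) for one profile (hypothetical helper)."""
    # Interpolation is disabled so that a '%' in a secret is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]


def mask(secret, visible=4):
    """Hide all but the last few characters so the secret can be logged safely."""
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    key_id, secret = read_access_keys(SAMPLE_CREDENTIALS)
    print("access key id:", key_id)
    print("secret key:   ", mask(secret))
```

The access key ID identifies the caller and the secret key signs requests, which is why only the ID is printed in full here.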
AWS System Security
- describe the use of SSH key pairs for remote access
AWS S3 and EC2
- set up S3 and import data
- provision a micro instance of EC2
Setup of AWS Cluster
- prepare to install and configure a Hadoop cluster on AWS
- create an EC2 baseline server
- create an Amazon Machine Image (AMI)
- create an Amazon cluster
- describe what the command line interface is used for
Moving Data
- use the command line interface
- describe the various ways to move data into AWS
Elastic MapReduce
- recall the advantages and limitations of using Hadoop in the cloud
- recall the advantages and limitations of using AWS EMR
- describe EMR end-user connections and EMR security levels
- set up an EMR cluster
- run an EMR job from the web console
- run an EMR job with Hue
- run an EMR job with the command line interface
Practice: Cloud Computing
- write an Elastic MapReduce script for AWS
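To give a feel for what an Elastic MapReduce script might contain, here is a minimal word-count sketch of the map and reduce logic, runnable locally. It is an illustration under stated assumptions, not the course's actual exercise: the function names are hypothetical, and in a real Hadoop streaming job the mapper and reducer would usually be separate scripts reading standard input, with Hadoop performing the sort between the two phases.

```python
from itertools import groupby


def map_words(lines):
    """Map phase: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reduce_counts(pairs):
    """Reduce phase: sum the counts for each word.

    sorted() stands in for the shuffle-and-sort step that Hadoop
    performs between map and reduce in a real cluster.
    """
    for word, group in groupby(sorted(pairs), key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run both phases in-process and return a {word: count} dictionary."""
    return dict(reduce_counts(map_words(lines)))


if __name__ == "__main__":
    sample = ["Hadoop on AWS", "hadoop on EMR"]
    for word, total in sorted(word_count(sample).items()):
        print(f"{word}\t{total}")
```

Testing the logic on a small in-memory sample like this before submitting a job to a cluster keeps the feedback loop short.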