Google Cloud Platform (GCP) provides Cloud Dataproc, a fully managed cloud service for running Apache Spark and Hadoop clusters. This course introduces the concepts of cluster management with Dataproc, including machine types and worker nodes.
Learning Objectives
Cloud Dataproc
- recognize big data concepts and solutions using GCP
- define Cloud Dataproc and its benefits
- recall the various ways to access Dataproc
- describe the various areas of the dashboard and create a project
Cluster Management
- recognize the process for creating a cluster in Dataproc
- recall the process for deleting a cluster using Dataproc
- define master and worker nodes in Dataproc
- describe custom machine types and preemptible worker nodes
- define how identity and access management (IAM) permissions and roles control access in Dataproc
Practice: Cloud Data and Cluster Management
- recognize the basic concepts of cluster management in Dataproc