• Online, Self-Paced
Course Description

Google Cloud Platform (GCP) provides Dataproc, a fully managed cloud service for running Apache Spark and Apache Hadoop workloads. This course introduces the concepts of cluster management with Dataproc, including machine types and worker nodes.

Learning Objectives

Cloud Dataproc

  • start the course
  • recognize big data concepts and solutions using GCP
  • define Cloud Dataproc and its benefits
  • recall the various ways to access Dataproc (Cloud Console, gcloud command-line tool, REST API, and client libraries; see the sketch after this list)
  • describe the various areas of the dashboard and create a project
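
Dataproc can be reached through the Cloud Console, the gcloud command-line tool, the REST API, and the Cloud Client Libraries. As a minimal sketch of the client-library route, the Python snippet below lists the clusters in one region; the project ID and region shown are placeholders, and the google-cloud-dataproc package is assumed to be installed and authenticated.

    from google.cloud import dataproc_v1

    project_id = "my-project"   # placeholder project ID
    region = "us-central1"      # placeholder region

    # Dataproc uses regional endpoints, so point the client at the chosen region.
    client = dataproc_v1.ClusterControllerClient(
        client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
    )

    # List clusters in the project and region, printing each name and state.
    for cluster in client.list_clusters(
        request={"project_id": project_id, "region": region}
    ):
        print(cluster.cluster_name, cluster.status.state.name)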

Cluster Management

  • recognize the process for creating a cluster in Dataproc (sketched after this list)
  • recall the process for deleting a cluster using Dataproc (also sketched after this list)
  • define master and worker nodes in Dataproc
  • describe custom machine types and preemptible worker nodes
  • define the processes for identity and access management using IAM roles and permissions
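
As a hedged illustration of the objectives above, the sketch below creates a cluster with the same Python client: one master node, two primary workers, and two secondary workers (preemptible by default). The cluster name and machine types are placeholders, and client, project_id, and region are the values defined in the earlier access example.

    cluster = {
        "project_id": project_id,
        "cluster_name": "example-cluster",   # placeholder name
        "config": {
            # The master node coordinates the cluster (YARN ResourceManager, HDFS NameNode).
            "master_config": {"num_instances": 1, "machine_type_uri": "n1-standard-4"},
            # Primary workers run YARN NodeManagers and HDFS DataNodes. A custom
            # machine type (for example "custom-4-8192" for 4 vCPUs and 8 GB of
            # memory) could be supplied instead of a predefined one.
            "worker_config": {"num_instances": 2, "machine_type_uri": "n1-standard-4"},
            # Secondary workers are preemptible by default: cheaper, reclaimable
            # by Compute Engine, and they do not store HDFS data.
            "secondary_worker_config": {"num_instances": 2},
        },
    }

    # create_cluster returns a long-running operation; result() blocks until done.
    operation = client.create_cluster(
        request={"project_id": project_id, "region": region, "cluster": cluster}
    )
    print("Created:", operation.result().cluster_name)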

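Deleting a cluster follows the same pattern; this sketch again assumes the client and placeholder values defined above.

    # delete_cluster is also a long-running operation; wait for it to finish.
    operation = client.delete_cluster(
        request={
            "project_id": project_id,
            "region": region,
            "cluster_name": "example-cluster",
        }
    )
    operation.result()
    print("Cluster deleted")
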
Practice: Cloud Data and Cluster Management

  • recognize the basic concepts of cluster management in Dataproc

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework listed below.