• Online, Self-Paced
Course Description

Analyze an Apache Spark DataFrame as though it were a relational database table. During this Aspire course, you will discover the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame. Discover how to create views out of a Spark DataFrame's contents and run queries against them; and how to trim and clean a DataFrame. Next, learn how to perform an analysis of data by running different SQL queries; how to configure a DataFrame with an explicitly defined schema; and define what a window is in the context of Spark. Finally, observe how to create and analyze categories of data in a data set by using Windows.

Learning Objectives

Analyze an Apache Spark DataFrame as though it were a relational database table. During this Aspire course, you will discover the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame. Discover how to create views out of a Spark DataFrame's contents and run queries against them; and how to trim and clean a DataFrame. Next, learn how to perform an analysis of data by running different SQL queries; how to configure a DataFrame with an explicitly defined schema; and define what a window is in the context of Spark. Finally, observe how to create and analyze categories of data in a data set by using Windows.

Framework Connections

The materials within this course focus on the Knowledge Skills and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.