Course Description
This course introduces Apache Spark SQL, Datasets, and DataFrames.
Learning Objectives
Apache Spark SQL Introduction
- start the course
- describe Apache Spark SQL
- create a SparkSession
- create DataFrames with Spark SQL
- use aggregations with the built-in DataFrame functions
- run SQL queries programmatically
- create a global temporary view
- create Datasets with Spark SQL
- use JSON Datasets with Spark SQL
- use Load/Save functions
- manually specify a data source
- run SQL directly on files
- use SaveMode to handle save operations
- write Parquet files with Spark SQL
- use Spark SQL to save a DataFrame as a persistent table
- use partitioning when saving persistent tables
Practice: Using Spark SQL
- use Spark SQL to create Datasets and DataFrames