• Online, Self-Paced
Course Description

There are many considerations when designing and implementing big data analytics solutions with Microsoft Azure. This course covers data ingestion and storage, as well as designing and provisioning compute clusters, and it aligns with Microsoft exam 70-475.

Learning Objectives

Ingesting Data for Batch Processing

  • start the course
  • identify basic features of Microsoft big data solutions
  • recognize storage options for big data and identify methods to load data into Azure Blob storage
  • list key features of the Azure Data Factory and the Azure Data Lake Store
  • use Azure PowerShell with Azure Storage
  • recognize best practices and considerations for data collection and loading in HDInsight
  • recognize key features of Apache Storm and Apache Flume
  • recognize key features of Azure Cosmos DB and its DocumentDB API
  • store and access .NET web application data with Azure Cosmos DB
  • install and use the Microsoft Azure Storage Explorer
  • load data into an Azure SQL Data Warehouse
  • install and use PolyBase to query data in an Azure Storage account
  • recognize common methods for moving data from an on-premises SQL Server to an Azure Virtual Machine SQL Server
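Several of the objectives above involve loading files into Azure Blob storage. As a minimal sketch of that task, the following uses the `azure-storage-blob` Python SDK; the account name, container, and file path are illustrative placeholders, not values from the course, and the SDK must be installed separately (`pip install azure-storage-blob`).

```python
def blob_url(account: str, container: str, blob: str) -> str:
    """Build the public HTTPS endpoint for a blob.

    Azure block blobs are addressed as
    https://<account>.blob.core.windows.net/<container>/<blob>.
    """
    return f"https://{account}.blob.core.windows.net/{container}/{blob}"


def upload_file(conn_str: str, container: str, path: str) -> None:
    """Upload a local file to Azure Blob storage (sketch).

    Requires the azure-storage-blob package; imported lazily so the
    pure helper above works without it.
    """
    from azure.storage.blob import BlobServiceClient  # third-party SDK

    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=path)
    with open(path, "rb") as data:
        # overwrite=True replaces an existing blob of the same name
        blob.upload_blob(data, overwrite=True)
```

Tools covered later in the course, such as Azure Storage Explorer and PolyBase, operate against the same blob endpoints that `blob_url` constructs.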

Compute Clusters for Batch Processing

  • recognize features of Hadoop and HDInsight clusters
  • identify how Apache Spark is used with HDInsight
  • recognize the capabilities of HBase in HDInsight
  • identify how Apache Kafka is used with HDInsight
  • recognize the capabilities of Interactive Hive in HDInsight
  • identify how R is used with HDInsight
  • determine which tools to use and identify important security features
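The cluster technologies above (Hadoop MapReduce, Spark) all parallelize the same map/reduce batch pattern. As a single-machine illustration only, here is the classic word count expressed as separate map and reduce phases; a cluster distributes these phases across nodes, but the logic is the same.

```python
from collections import Counter
from functools import reduce


def map_phase(line: str) -> Counter:
    # Map: emit per-line (word, count) pairs.
    return Counter(line.lower().split())


def reduce_phase(acc: Counter, part: Counter) -> Counter:
    # Reduce: merge partial counts, as a shuffle/reduce stage would
    # merge results arriving from mapper nodes.
    acc.update(part)
    return acc


def word_count(lines: list[str]) -> Counter:
    return reduce(reduce_phase, map(map_phase, lines), Counter())
```

For example, `word_count(["big data big", "data lake"])` yields counts of 2 for "big" and "data" and 1 for "lake".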

Practice: Provisioning Compute Clusters

  • recognize key features and capabilities of various tools used with HDInsight

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.