• Online, Self-Paced
Course Description

Data engineering is the area of data science that focuses on the practical applications of data collection and analysis. In this course, you will explore distributed systems, batch versus in-memory processing, NoSQL use cases, and the tools available for data management, big data, and the extract, transform, load (ETL) process.

Learning Objectives

Data Engineering Fundamentals

  • describe distributed systems from a data perspective
  • identify the differences between batch and in-memory processing
  • describe NoSQL stores and how they are used
  • identify different tools available for data management
  • describe the ETL process and different tools available
  • use Talend Open Studio to showcase the ETL concept
  • describe and create a data model
  • describe the hierarchy of needs when working with AI and machine learning
  • describe and create a data partition
  • identify data engineering best practices
  • describe data reporting tools
  • create a data model

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas of the National Cybersecurity Workforce Framework listed below.