Data engineering is the area of data science that focuses on the practical application of data collection and analysis. In this course, you will explore distributed systems, batch versus in-memory processing, uses of NoSQL stores, and the tools available for data management, big data, and the ETL (extract, transform, load) process.
Learning Objectives
Data Engineering Fundamentals
- describe distributed systems from a data perspective
- identify the differences between batch and in-memory processing
- describe NoSQL stores and how they are used
- identify different tools available for data management
- describe the ETL process and the tools available for it
- use Talend Open Studio to showcase the ETL concept
- describe and create a data model
- describe the hierarchy of needs when working with AI and machine learning
- describe and create a data partition
- identify data engineering best practices
- describe data reporting tools