• Online, Self-Paced
Course Description

No one wants to wait when there are issues that require a sysadmin to get back to you. Site reliability engineering (SRE) is a software engineering approach that tackles IT and operations problems through code, which helps when scaling to very large systems.

In this course, learn how to gather the information needed to trigger many self-healing types of operations. Next, explore how to implement alerts based on metrics, logs, or health checks. Finally, discover how to notify users of degraded systems and implement alerts that can trigger scaling or failover operations.

This course is one of a collection that prepares learners for the Designing and Implementing Microsoft DevOps Solutions (AZ-400) exam.

Learning Objectives

{"discover the key concepts covered in this course"}

Framework Connections

The materials within this course focus on the Knowledge Skills and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.