• Online, Self-Paced
Course Description

No one wants to wait when there are issues that require a sysadmin to get back to you. Site reliability engineering (SRE) is a software engineering approach that tackles IT and operations problems through code, which helps when scaling to very large systems.

In this course, learn how to gather the information needed to trigger many self-healing types of operations. Next, explore how to implement alerts based on metrics, logs, or health checks. Finally, discover how to notify users of degraded systems and implement alerts that can trigger scaling or failover operations.

This course is one of a collection that prepares learners for the Designing and Implementing Microsoft DevOps Solutions (AZ-400) exam.

Learning Objectives

{"discover the key concepts covered in this course"}

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Network Services