• Online, Instructor-Led
• Classroom
Course Description

This interactive course teaches security professionals how to use data science techniques to quickly manipulate and analyze network and security data and, ultimately, uncover valuable insights. You will learn how to read data in common formats and write scripts to analyze and visualize that data. Topics range from data preparation and machine learning to model evaluation, optimization, and implementation at scale.

Learning Objectives

Write scripts to efficiently read and manipulate CSV, XML, and JSON files
Quickly and efficiently parse executables, log files, and pcap files, and extract artifacts from them
Make API calls to merge datasets
Use the Pandas library to quickly manipulate tabular data
Effectively visualize data using Python
Pre-process raw security data for machine learning and feature engineering
Build, apply and evaluate machine learning algorithms to identify potential threats
Automate the process of tuning and optimizing machine learning models
Hunt anomalous indicators of compromise and reduce false positives
Use supervised learning algorithms such as Random Forests, Naive Bayes, K-Nearest Neighbors (K-NN) and Support Vector Machines (SVM) to classify malicious URLs and identify SQL Injection
Apply unsupervised learning algorithms such as K-Means Clustering to detect anomalous behavior
Finally, you will be introduced to cutting-edge Big Data tools, including Apache Spark (PySpark), Apache Drill, and GPU-accelerated parallel computing frameworks, and learn how to apply these techniques to extremely large datasets.
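
To give a flavor of the first few objectives (reading CSV and JSON data and merging datasets), here is a minimal sketch using only the Python standard library. It is not taken from the course materials: the sample log lines, the asset inventory, the field names, and the helper functions `failed_logins_by_ip` and `enrich` are all hypothetical and chosen purely for illustration.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample data; a real script would read these from files.
CSV_LOG = """timestamp,src_ip,event
2024-01-01T00:00:01,10.0.0.5,login_failed
2024-01-01T00:00:02,10.0.0.5,login_failed
2024-01-01T00:00:03,10.0.0.7,login_ok
2024-01-01T00:00:04,10.0.0.5,login_failed
"""

JSON_ASSETS = (
    '[{"ip": "10.0.0.5", "owner": "lab-vm"},'
    ' {"ip": "10.0.0.7", "owner": "build-server"}]'
)


def failed_logins_by_ip(csv_text):
    """Count 'login_failed' events per source IP in a CSV log."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(
        row["src_ip"] for row in reader if row["event"] == "login_failed"
    )


def enrich(counts, assets_json):
    """Merge the per-IP counts with a JSON asset inventory keyed by IP."""
    owners = {asset["ip"]: asset["owner"] for asset in json.loads(assets_json)}
    return [
        {"ip": ip, "failed": n, "owner": owners.get(ip, "unknown")}
        for ip, n in counts.most_common()
    ]


report = enrich(failed_logins_by_ip(CSV_LOG), JSON_ASSETS)
for row in report:
    print(row)
```

The same join could be expressed more compactly with a Pandas merge, which is one of the tools named in the objectives; the standard-library version is used here only to keep the sketch dependency-free.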

Framework Connections

The materials within this course focus on the Knowledge, Skills, and Abilities (KSAs) identified within the Specialty Areas listed below. Click to view Specialty Area details within the interactive National Cybersecurity Workforce Framework.