• Classroom
  • Online, Instructor-Led
Course Description

This interactive course teaches security professionals how to use data science techniques to quickly manipulate and analyze network and security data and ultimately uncover valuable insights from it. The course covers the entire data science process: data preparation, feature engineering and selection, exploratory data analysis, data visualization, machine learning, model evaluation and optimization, and finally implementation at scale, all with a focus on security-related problems.

Participants will learn how to read in data in a variety of common formats and then write scripts to analyze and visualize that data. A non-exhaustive list of topics covered includes:

  • Writing scripts to efficiently read and manipulate CSV, XML, and JSON files
  • Quickly and efficiently parsing executables, log files, and pcap files and extracting artifacts from them
  • Making API calls to merge datasets
  • Using the Pandas library to quickly manipulate tabular data
  • Effectively visualizing data using Python
  • Preprocessing raw security data for machine learning and feature engineering
  • Building, applying and evaluating machine learning algorithms to identify potential threats
  • Automating the process of tuning and optimizing machine learning models
  • Hunting anomalous indicators of compromise and reducing false positives
  • Using supervised learning algorithms such as Random Forests, Naive Bayes, K-Nearest Neighbors (K-NN), and Support Vector Machines (SVM) to classify malicious URLs and identify SQL injection
  • Applying unsupervised learning algorithms such as K-Means Clustering to detect anomalous behavior
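
As a rough illustration of the first few topics above (reading CSV and JSON, then merging datasets), the following is a minimal sketch and not taken from the course materials. The sample log lines, the asset inventory, and the function names `load_events`, `denied_counts`, and `enrich` are all hypothetical, and it uses only the Python standard library where the course itself would likely use Pandas.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample data, invented for illustration only.
CSV_LOG = """timestamp,src_ip,dst_port,action
2024-01-01T00:00:01,10.0.0.5,22,deny
2024-01-01T00:00:02,10.0.0.5,22,deny
2024-01-01T00:00:03,10.0.0.9,443,allow
2024-01-01T00:00:04,10.0.0.5,3389,deny
"""

JSON_ASSETS = '[{"ip": "10.0.0.5", "owner": "lab"}, {"ip": "10.0.0.9", "owner": "finance"}]'


def load_events(csv_text):
    """Parse CSV text into a list of dicts, converting the port to an int."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["dst_port"] = int(row["dst_port"])
    return rows


def denied_counts(events):
    """Count denied connections per source IP."""
    return Counter(e["src_ip"] for e in events if e["action"] == "deny")


def enrich(counts, assets_json):
    """Join the per-IP counts with a JSON asset inventory keyed by IP."""
    owners = {a["ip"]: a["owner"] for a in json.loads(assets_json)}
    return [
        {"src_ip": ip, "denied": n, "owner": owners.get(ip, "unknown")}
        for ip, n in counts.most_common()
    ]


events = load_events(CSV_LOG)
report = enrich(denied_counts(events), JSON_ASSETS)
for row in report:
    print(row)
```

The same join could be expressed in Pandas as a merge of two DataFrames; the plain-dictionary version is shown only to keep the sketch dependency-free.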

Finally, we will introduce students to cutting-edge Big Data tools, including Apache Spark (PySpark), Apache Drill, and GPU-accelerated parallel computing frameworks, and demonstrate how to apply these techniques to extremely large datasets.

Learning Objectives

By the end of the course, students will be able to:

  • Prepare security data for machine learning using the latest techniques
  • Understand the machine learning process
  • Extract features from security data sets
  • Apply a machine learning technique to solve a security problem
  • Construct a classifier using Python
  • Evaluate and assess the performance of a model
  • Create effective visualizations of security data
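
To make the objectives above concrete, here is a minimal sketch of extracting features, constructing a classifier, and evaluating it. It is not the course's own method: the URL features, the example URLs and labels, and the function names `url_features`, `knn_predict`, and `precision_recall` are assumptions made for illustration. It implements a small K-Nearest Neighbors classifier in plain Python, whereas a real workflow would more likely rely on a machine learning library.

```python
import math


def url_features(url):
    """Hypothetical hand-picked features: length, digits, dots, special characters."""
    return [
        len(url),
        sum(c.isdigit() for c in url),
        url.count("."),
        sum(url.count(c) for c in "-@?=%"),
    ]


def knn_predict(train_X, train_y, x, k=3):
    """Predict the majority label among the k nearest training points.

    Uses unscaled Euclidean distance, so features with large ranges dominate.
    """
    dists = sorted((math.dist(x, t), label) for t, label in zip(train_X, train_y))
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented examples: label 1 = malicious, 0 = benign. Not real data.
train = [
    ("http://example.com/index.html", 0),
    ("https://example.org/docs/intro", 0),
    ("https://example.net/about", 0),
    ("http://192.0.2.10/login.php?id=1%27%20OR%201=1", 1),
    ("http://198.51.100.7/a-b-c?redir=%2F%2Fevil@203.0.113.5", 1),
    ("http://203.0.113.9/update-2024.exe?x=1&y=2%00", 1),
]
test = [
    ("https://example.com/contact", 0),
    ("http://192.0.2.44/view.php?item=5%20UNION%20SELECT%201", 1),
]

train_X = [url_features(u) for u, _ in train]
train_y = [label for _, label in train]
predictions = [knn_predict(train_X, train_y, url_features(u)) for u, _ in test]
precision, recall = precision_recall([label for _, label in test], predictions)
print("predictions:", predictions)
print("precision:", precision, "recall:", recall)
```

With so few examples the reported numbers mean little; the point is only the shape of the workflow: features, then a classifier, then metrics computed on held-out data.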

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Cyber Defense Analysis
  • Cyber Defense Infrastructure Support
  • Cyber Investigation
  • Cyber Operations
  • Incident Response

Specialty Areas have been removed from the NICE Framework. With the recent release of the new NICE Framework data, updates to courses are underway. Until this course can be updated, this historical information is provided to give better context as to how it can help you with your cybersecurity goals.

Feedback

If you would like to provide feedback on this course, please e-mail the NICCS team at NICCS@hq.dhs.gov. Please keep in mind that NICCS does not own this course or accept payment for course entry. If you have questions related to the details of this course, such as cost, prerequisites, how to register, etc., please contact the course training provider directly. You can find course training provider contact information by following the link that says “Visit course page for more information...” on this page.