• Online, Self-Paced
Course Description

Clustering is an unsupervised learning technique that finds logical groupings, or clusters, in your data. One example is identifying which social network users share the same interests and background.

In this course, explore how clustering models seek to find logical groupings in your data. Next, construct a KNIME workflow to load and explore data for a clustering model. Then, fill in missing values using different imputation techniques, identify highly correlated variables, and deal with outliers. Fit a k-means clustering model on your data, identify clusters, and use scatter plots to visualize the clusters in your data. Finally, perform dimensionality reduction using principal component analysis (PCA) and use the silhouette score to evaluate the number of clusters that gives you the best clustering for your data.
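As a rough illustration of the last two steps described above (fitting k-means and using the silhouette score to choose the number of clusters), here is a minimal Python sketch. It is not the KNIME workflow built in the course: the toy data, the function names, and the deterministic farthest-point initialization are all assumptions made for the example, and the imputation, outlier, and PCA steps are left out.

```python
import math

def farthest_point_init(points, k):
    """Pick k starting centroids deterministically (an assumption for this
    sketch; real tools usually use random or k-means++ initialization)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels

def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean
    distance to the point's own cluster and b is the mean distance to the
    nearest other cluster."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for m, other in clusters.items() if m != l
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical toy data: three small, well-separated 2-D groups.
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = (-0.5, 0.0, 0.5)
points = [(cx + dx, cy + dy) for cx, cy in centers for dx in offsets for dy in offsets]

# Try several cluster counts and keep the one with the highest silhouette score.
scores = {}
for k in range(2, 6):
    _, labels = kmeans(points, k)
    scores[k] = silhouette_score(points, labels)
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

On data this cleanly separated, the three-cluster split should score highest. Real data is rarely this tidy, so the silhouette score is best read as one guide among several.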

Upon course completion, you will be able to fit and evaluate clustering models on your data and visualize clusters using 2-D and 3-D visualizations.

Learning Objectives

  • Discover the key concepts covered in this course

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Systems Architecture

Feedback

If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.