• Online, Self-Paced
Course Description

A number of tools are available for working with Big Data. Many of these tools are open source and based on Linux distributions. This course covers the fundamentals of Big Data: its place in a historical IT context, the tools available for working with it, the Big Data stack, and finally an in-depth look at Apache Hadoop.

Learning Objectives

Big Data in Perspective

  • start the course
  • put Big Data into the perspective of supercomputing
  • describe Big Data in the context of technology waves and put it into perspective by comparing it to previous technology waves
  • list the six emerging technologies and relate them to Big Data

Global Data

  • define Big Data and describe Gartner's Vectors
  • define structured and unstructured data in terms of Gartner's model
  • list the standard size units used in Big Data to measure data sets

The Key Contributors

  • list the three key contributors to the origins of Big Data
  • list the primary Big Data distro companies

The Apache Software Foundation

  • describe the Apache Software Foundation
  • list projects attributable to the Apache Software Foundation
  • describe Cascading and MongoDB

Big Data Stack

  • list the layers of the Big Data Stack
  • list the common Big Data components
  • describe columnar databases and HBase

Hadoop in Detail

  • describe solutions for scaling computing
  • describe the design principles of Hadoop
  • map out the functional view of Hadoop
  • describe the architecture of HDFS
  • describe the architecture of YARN
  • describe the attributes and processes of MapReduce
  • describe the architecture of Spark

Practice: Big Data elements and functions

  • describe Big Data in a historical context and the tools available for working with Big Data

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Data Administration


If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.