• Online, Self-Paced
Course Description

Understanding how to administer Apache HBase is fundamental to working with it. HBase can be managed through the Java client API and can be integrated with MapReduce to perform additional tasks that help maximize performance. This course shows how to implement filters that limit the results returned from a scan operation. It also demonstrates how to administer the HBase cluster and instance, how to perform backup and restore operations, and how to use MapReduce with HBase.

Learning Objectives

Filters

  • use utility filters that extend the FilterBase class to filter scan results
  • use comparison filters to limit the scan results using comparison operators and a comparator instance
  • use custom filters to extend or change the behavior of an existing filter to achieve finer-grained control over the scan results

Cluster Administration

  • use the HBaseAdmin API to check the status of the master server, connection instance, and the configuration used by the instance
  • view a list of all the user-space tables in HBase and the instance for each table
  • disable and delete tables from HBase
  • complete a major compaction using the HBase shell
  • merge regions in the same table using the Merge utility
  • stop and decommission a RegionServer
  • perform a rolling restart on the entire cluster
  • add a new node to HBase
  • view metrics to monitor HBase

Snapshots and Backups

  • take a snapshot
  • use a snapshot to clone a table and move it to another cluster
  • export and restore a snapshot to another cluster
  • perform a full shutdown backup of HBase
  • perform a backup of HBase on a live cluster
  • restore HBase

MapReduce

  • use the TableOutputFormat class to set up a table as the output of a MapReduce process, using HBase as the data sink
  • set up a table as an input to a MapReduce process using HBase as the data source
  • use MapReduce to bulk load data directly into the HBase file system, bypassing the HBase API
  • use the getSplits method of the TableInputFormatBase class to create custom splitters when using an HBase table as a data source
  • access other HBase tables from within a MapReduce job by creating a Table instance in the setup method of Mapper

Practice: Manage HBase

  • perform HBase cluster and node maintenance

Framework Connections

The materials within this course focus on the NICE Framework Task, Knowledge, and Skill statements identified within the indicated NICE Framework component(s):

Specialty Areas

  • Data Administration
  • Systems Administration

Specialty Areas have been removed from the NICE Framework. With the recent release of the new NICE Framework data, updates to courses are underway. Until this course can be updated, this historical information is provided to give better context as to how it can help you with your cybersecurity goals.

Feedback

If you would like to provide feedback for this course, please e-mail the NICCS SO at NICCS@hq.dhs.gov.