• Online, Self-Paced
Course Description

Apache HBase is a NoSQL, column-oriented database that provides big data storage for semi-structured data. It runs on top of HDFS, uses ZooKeeper for coordination, and can be integrated with MapReduce. In a column-oriented database, columns are grouped into column families, and the data in each family is stored together rather than row by row. The physical architecture uses a master-slave relationship and distributes the data across a cluster. This course shows how to install HBase and discusses the HBase architecture and data modeling design.

Learning Objectives

Installation

  • start the course
  • describe HBase and its features
  • identify the hardware requirements for HBase
  • identify the software requirements for HBase
  • describe the filesystems used for HBase
  • describe the different HBase installation modes
  • install HBase in local mode
  • install HBase in fully distributed mode
  • access and navigate the web-based management console for HBase
  • get started with using the HBase shell

Architecture

  • describe the HBase components and their functionalities
  • describe the HFile and Region components and their functionalities in the HBase architecture
  • describe the functionality of the WAL and MemStore in the HBase architecture
  • describe minor and major compaction and region splitting
  • describe how data replication is used in HBase
  • identify the various methods to access HBase through clients
  • secure HBase using authentication and authorization methods
  • describe MapReduce and how it is integrated with HBase

Data Modeling

  • describe the HBase schema
  • identify the considerations and practices that go into designing an HBase table
  • design rowkeys for HBase tables
  • design the schema to support versions, different data types, and joins
  • determine which rows and cells to keep after deletion from a table

Practice: Install HBase

  • install, configure, and secure HBase

Framework Connections

  • Data Administration
  • Systems Administration