HBase Cheatsheet
A comprehensive cheat sheet for HBase, covering architecture, data model, key operations, and administration.
HBase Architecture & Data Model
Core Components
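HRegionServer | Hosts and manages regions (HRegions) and serves the read and write requests for them.
HMaster | Assigns regions to RegionServers, handles schema changes (creating, altering, and deleting tables), and performs other administrative tasks such as load balancing.
ZooKeeper | Coordinates the cluster: tracks which servers are alive, supports HMaster election, and provides configuration information, naming, and distributed synchronization.
HDFS | Hadoop Distributed File System; stores HBase data persistently.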
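To make the division of labor concrete, here is a minimal sketch of how a row key can be routed to the RegionServer that hosts it. It assumes each region covers a contiguous range of row keys and is identified by its start key, with the first region starting at the empty key. `RegionLocatorSketch`, `assign`, `locate`, and the server names are hypothetical and use only the Java standard library; this is not the HBase client API, and it compares `String` keys where HBase compares raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: not part of the HBase client library.
final class RegionLocatorSketch {
    // Region start key -> name of the RegionServer currently hosting that region.
    private final TreeMap<String, String> regionStartToServer = new TreeMap<>();

    // Plays the role of the HMaster assigning a region to a RegionServer.
    void assign(String regionStartKey, String server) {
        regionStartToServer.put(regionStartKey, server);
    }

    // A row belongs to the region with the greatest start key <= the row key.
    String locate(String rowKey) {
        Map.Entry<String, String> region = regionStartToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("No region covers row " + rowKey);
        }
        return region.getValue();
    }

    public static void main(String[] args) {
        RegionLocatorSketch locator = new RegionLocatorSketch();
        locator.assign("", "regionserver-1");   // rows before "m"
        locator.assign("m", "regionserver-2");  // rows from "m" onward
        System.out.println(locator.locate("apple"));
        System.out.println(locator.locate("zebra"));
    }
}
```

Because the lookup only needs the sorted start keys, moving a region to another server changes one map entry and leaves every other row's routing untouched.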
Data Model
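HBase is a distributed, scalable big data store designed for random, real-time reads and writes against very large tables. Each value is stored as a key-value pair (a cell) addressed by row key, column family, column qualifier, and timestamp. Rows are kept sorted by row key.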
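One way to picture this data model is as nested sorted maps: row key, then column, then timestamp, then value. The sketch below is a simplification that uses only the Java standard library. `DataModelSketch` and its methods are hypothetical names, not HBase classes, and it uses `String` keys where HBase stores and compares raw bytes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only: not part of the HBase client library.
final class DataModelSketch {
    // row key -> "family:qualifier" -> timestamp -> value, each level sorted by key.
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    void put(String row, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier, c -> new TreeMap<>())
            .put(timestamp, value);
    }

    // Returns the newest version of a cell, or null if the cell does not exist.
    String get(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) return null;
        NavigableMap<Long, String> versions = columns.get(family + ":" + qualifier);
        if (versions == null || versions.isEmpty()) return null;
        return versions.lastEntry().getValue(); // highest timestamp wins
    }

    // Row keys in [startRow, stopRow), in sorted order, like a range scan.
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        DataModelSketch table = new DataModelSketch();
        table.put("user#001", "info", "name", 1L, "Ada");
        table.put("user#001", "info", "name", 2L, "Ada L.");
        table.put("user#002", "info", "name", 1L, "Grace");
        System.out.println(table.get("user#001", "info", "name")); // newest version
        System.out.println(table.scan("user#001", "user#002"));    // stop row is exclusive
    }
}
```

Keeping rows sorted is what makes range scans cheap, which is why row key design matters so much in practice.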
Key Concepts
Region | Tables are split horizontally into regions. Each region holds a contiguous range of rows. Regions are the unit of distribution and scalability.
Store | Each region contains one store per column family. A store consists of one MemStore and zero or more StoreFiles (HFiles).
MemStore | In-memory buffer that holds recent writes in sorted order until they are flushed to disk as a new HFile.
HFile | Immutable file of sorted key-value pairs stored on HDFS.
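The relationship between a store, its MemStore, and its HFiles can be sketched as a small write buffer that is periodically frozen into immutable sorted files. This is a simplified model using only the Java standard library. `StoreSketch` and `flushThreshold` are hypothetical, the flush trigger here is an entry count rather than the memory-size limit HBase uses, and the write-ahead log, timestamps, and compactions are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only: not part of the HBase code base.
final class StoreSketch {
    private final int flushThreshold; // hypothetical: flush after this many buffered entries
    private NavigableMap<String, String> memStore = new TreeMap<>();
    // Flushed files, oldest first; each one is sorted and never modified again.
    private final List<NavigableMap<String, String>> storeFiles = new ArrayList<>();

    StoreSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    // Writes go to the in-memory buffer first.
    void put(String key, String value) {
        memStore.put(key, value);
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    // Freeze the buffer into a new immutable "file" and start an empty buffer.
    void flush() {
        if (memStore.isEmpty()) return;
        storeFiles.add(Collections.unmodifiableNavigableMap(memStore));
        memStore = new TreeMap<>();
    }

    // Reads check the buffer first, then the files from newest to oldest.
    String get(String key) {
        String value = memStore.get(key);
        if (value != null) return value;
        for (int i = storeFiles.size() - 1; i >= 0; i--) {
            value = storeFiles.get(i).get(key);
            if (value != null) return value;
        }
        return null;
    }

    int storeFileCount() {
        return storeFiles.size();
    }

    public static void main(String[] args) {
        StoreSketch store = new StoreSketch(2);
        store.put("row1", "a");
        store.put("row2", "b"); // second entry reaches the threshold and triggers a flush
        store.put("row1", "c"); // newer value stays in the buffer and shadows the flushed one
        System.out.println(store.get("row1"));
        System.out.println(store.storeFileCount());
    }
}
```

Since flushed files are never rewritten in place, a read may have to consult several of them, which is the cost that compactions exist to reduce.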
Basic HBase Operations (CLI)
Table Management
Data Manipulation
HBase Shell Commands
Namespace Management
Advanced Scan Operations
Configuration
HBase configuration is managed through
HBase Java API
Connecting to HBase
Basic Operations (Java)