- The MemStore is a write buffer where HBase accumulates data in memory before a permanent write.
- Its contents are flushed to disk to form an HFile when the MemStore fills up.
- It doesn't write to an existing HFile but instead forms a new file on every flush.
- The HFile is the underlying storage format for HBase.
- Each HFile belongs to exactly one column family, and each column family has one MemStore (per region). A column family can have multiple HFiles, but an HFile never holds data from more than one column family.
- The size at which the MemStore is flushed is set by the hbase.hregion.memstore.flush.size property in hbase-site.xml.
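
The flush behaviour described above can be sketched with a toy model. This is only an illustration, not HBase code: the class `ToyColumnFamilyStore`, its byte accounting, and the tiny threshold of 20 bytes are all assumptions made for the example. It shows that every flush creates a new file rather than appending to an existing one.

```python
class ToyColumnFamilyStore:
    """Toy model of one column family: one MemStore, many HFiles (hypothetical class)."""

    def __init__(self, flush_size_bytes):
        # Stand-in for hbase.hregion.memstore.flush.size (tiny here for illustration).
        self.flush_size_bytes = flush_size_bytes
        self.memstore = {}        # in-memory write buffer
        self.memstore_bytes = 0   # rough size estimate: key length + value length
        self.hfiles = []          # one new immutable entry per flush

    def put(self, row_key, value):
        self.memstore[row_key] = value
        self.memstore_bytes += len(row_key) + len(value)
        if self.memstore_bytes >= self.flush_size_bytes:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # A flush never modifies an existing file; it writes a new sorted one.
        self.hfiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}
        self.memstore_bytes = 0


store = ToyColumnFamilyStore(flush_size_bytes=20)
for i in range(6):
    store.put(f"row{i}", "value")

print(len(store.hfiles))  # number of files created by flushes
```

Each put here adds 9 bytes, so the third and sixth puts cross the 20-byte threshold and each produces a separate file.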
What happens when the server hosting a MemStore crashes before the MemStore has been flushed?
Every RegionServer in an HBase cluster keeps a write-ahead log (WAL) to record changes as they happen. The WAL is a file on the underlying file system (typically HDFS). A write isn't considered successful until the new WAL entry has been written; this is what guarantees durability.
If a server goes down, the data that was not yet flushed from the MemStore to an HFile can be recovered by replaying the WAL. HBase takes care of this replay itself.
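
The write path and recovery can be sketched in the same toy style. Again, this is an assumption-laden illustration rather than HBase's implementation: `ToyRegionServer` is a hypothetical class, a Python list stands in for the durable WAL file, and clearing the WAL on flush is a simplification of how a real system tracks which edits are already persisted.

```python
class ToyRegionServer:
    """Toy model of WAL-first writes and crash recovery (hypothetical class)."""

    def __init__(self, wal, hfiles):
        self.wal = wal          # durable: survives a crash (stand-in for the WAL file)
        self.hfiles = hfiles    # durable: flushed files
        self.memstore = {}      # volatile: lost when the server crashes

    def put(self, row_key, value):
        self.wal.append((row_key, value))  # 1. record the change in the WAL first
        self.memstore[row_key] = value     # 2. only then buffer it in memory

    def flush(self):
        if self.memstore:
            self.hfiles.append(tuple(sorted(self.memstore.items())))
            self.memstore = {}
            self.wal.clear()  # simplification: flushed edits no longer need replay

    def replay_wal(self):
        # Rebuild the lost MemStore from edits that were logged but never flushed.
        for row_key, value in self.wal:
            self.memstore[row_key] = value


wal, hfiles = [], []
server = ToyRegionServer(wal, hfiles)
server.put("row1", "a")
server.flush()            # row1 is now in a file
server.put("row2", "b")   # row2 and row3 exist only in the WAL and the MemStore
server.put("row3", "c")

# Simulated crash: the old server object (and its MemStore) is discarded.
recovered = ToyRegionServer(wal, hfiles)
recovered.replay_wal()
print(recovered.memstore)
```

After the replay, the recovered server's MemStore again holds row2 and row3, while row1 is still available from the flushed file.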