

HBASE-4377 [hbck] Offline rebuild .META. from fs data only

Review Request #2126 - Created Sept. 29, 2011 and submitted

Jonathan Hsieh
trunk
HBASE-4377
Reviewers
hbase
apurtell, stack
hbase-git
commit fbf82c17be6b3ecca5a981f5270cf93aac26e479
Author: Jonathan Hsieh <jon@cloudera.com>
Date:   Wed Sep 28 10:18:11 2011 -0700

    HBASE-4377 [hbck] Offline rebuild .META. from fs data only
    

This patch rebuilds a new .META. table by reading all the .regioninfo files in the hbase main directory. It depends on the yet-to-be-committed HBASE-4515 (either my version or Gary's version), HBASE-4509, and HBASE-4506.
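As a rough illustration of the discovery step only (this is not the patch's code), the sketch below walks a directory tree and collects every file named .regioninfo. It assumes a local copy of the hbase directory rather than HDFS, and the function name find_regioninfo_files is hypothetical. Parsing the serialized region info is left out.

```python
import os


def find_regioninfo_files(root_dir):
    """Return the sorted paths of every file named .regioninfo under root_dir.

    Hypothetical helper: it scans a local filesystem tree, not HDFS, and
    does not parse the files it finds.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        if ".regioninfo" in filenames:
            found.append(os.path.join(dirpath, ".regioninfo"))
    return sorted(found)
```

Region directories without a .regioninfo file are simply skipped by this sketch.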

Some follow-on work includes backporting to 0.90, auto-patching true holes, and adding documentation.
An earlier version of this code (backported to 0.90) was used to diagnose and repair a cluster that had 2700 inconsistencies due to failed splits (the cluster was underprovisioned memory-wise, and on restart, some regions would start splitting and then die due to OOMEs). This was not actually used on a live cluster -- it was used to reconstruct a .META. from .regioninfo files laid out in hbase's directory structure.

Note also that this is not an automatic fix -- whenever any problems are found, this bails out but dumps info on holes, suggests some fixes, and displays sets of overlapping regions. It is up to the user to merge regions, to create .regioninfo files to plug holes, and to do any potentially data-losing operations.

The tests demonstrate current expected behavior -- rebuild meta if things line up, and fail without making modifications if holes or overlaps exist.
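
The "rebuild only if things line up" rule can be sketched as a consistency check over region key ranges. This is a minimal illustration, not the patch's implementation: it assumes each region is a half-open (start_key, end_key) pair of byte strings, with an empty key marking the table's first start or last end, and the names find_holes_and_overlaps and safe_to_rebuild are hypothetical.

```python
def find_holes_and_overlaps(regions):
    """Scan (start_key, end_key) pairs in key order.

    Returns (holes, overlaps). A hole is a (from_key, to_key) gap that no
    region covers; an overlap is a pair of regions whose ranges intersect.
    An empty key (b"") means "start of table" as a start key and
    "end of table" as an end key.
    """
    holes, overlaps = [], []
    covered = b""     # every key below this is covered so far
    at_end = False    # True once a region with an open end key was seen
    prev = None       # region that currently defines the coverage boundary
    for start, end in sorted(regions):
        if at_end or start < covered:
            overlaps.append((prev, (start, end)))
        elif start > covered:
            holes.append((covered, start))
        # Extend the covered range if this region reaches further.
        if not at_end and (end == b"" or end > covered):
            if end == b"":
                at_end = True
            else:
                covered = end
            prev = (start, end)
    if not at_end:
        holes.append((covered, b""))  # keyspace tail is uncovered
    return holes, overlaps


def safe_to_rebuild(regions):
    """Mirror the bail-out rule: rebuild only with no holes and no overlaps."""
    holes, overlaps = find_holes_and_overlaps(regions)
    return not holes and not overlaps
```

For example, the regions (b"", b"b") and (b"m", b"") leave the hole (b"b", b"m"), so safe_to_rebuild returns False and nothing would be modified.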
Review request changed
Updated (Oct. 31, 2011, 8:55 p.m.)
Addressed Stack's comments