Review Board 1.7.22


HBASE-4241: Optimize flushing of the Memstore.

Review Request #1650 - Created Aug. 25, 2011 and submitted

Lars Hofhansl
trunk
HBASE-4241
Reviewers
hbase
jgray, stack, tedyu
hbase
This avoids flushing row versions to disk that the next compaction is known to garbage-collect anyway.
This covers two scenarios:
1. maxVersions=N and we find at least N versions in the memstore. We can safely avoid flushing any further versions to disk.
2. Similarly, minVersions=N and we find at least N versions in the memstore. Now we can safely avoid flushing any further *expired* versions to disk.
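
To make the two scenarios concrete, here is a minimal sketch of the per-column decision, not the actual patch. The class and method names are hypothetical, versions are reduced to bare timestamps, and "expired" is assumed to mean older than now minus the TTL.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not HBase code: which versions of one column a flush would keep. */
final class FlushVersionFilterSketch {

    /**
     * @param newestFirst timestamps of the versions of one row/column, newest first
     * @param maxVersions keep at most this many versions
     * @param minVersions keep at least this many versions even if they are expired
     * @param ttlMillis   versions older than (now - ttlMillis) count as expired
     * @param now         current time in milliseconds
     */
    static List<Long> versionsToFlush(List<Long> newestFirst, int maxVersions,
                                      int minVersions, long ttlMillis, long now) {
        List<Long> kept = new ArrayList<>();
        long cutoff = now - ttlMillis;
        for (long ts : newestFirst) {
            // Scenario 1: maxVersions already satisfied, nothing older can survive.
            if (kept.size() >= maxVersions) {
                break;
            }
            // Scenario 2: minVersions already satisfied, so expired versions can be skipped.
            boolean expired = ts < cutoff;
            if (expired && kept.size() >= minVersions) {
                continue;
            }
            kept.add(ts);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Long> versions = List.of(50L, 40L, 30L, 20L, 10L);
        // maxVersions=3, nothing expired: only the three newest are flushed.
        System.out.println(versionsToFlush(versions, 3, 0, 1000L, 100L));
        // minVersions=2, everything expired: only two versions are flushed.
        System.out.println(versionsToFlush(versions, 10, 2, 25L, 100L));
    }
}
```

In both cases the skipped versions are ones a later compaction would have discarded under the same rules, which is why skipping them at flush time should not change what readers see.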

This changes the Store flush to use the same mechanism that is used for compactions.
I borrowed some code from the tests and refactored the test code to use a new utility class that wraps a sorted collection and then behaves like a KeyValueScanner. The same class is used to create a scanner over the memstore's snapshot.
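
The idea of a scanner backed by a sorted collection can be sketched roughly as follows. This is an illustration only: `SimpleScanner` is a hypothetical, much smaller stand-in for the real KeyValueScanner interface, and the element type is generic rather than KeyValue.

```java
import java.util.Iterator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical, simplified scanner interface; the real KeyValueScanner has more methods. */
interface SimpleScanner<T> {
    T peek();            // next element without advancing, or null when exhausted
    T next();            // next element, then advance; null when exhausted
    boolean seek(T key); // position at the first element >= key; false if there is none
    void close();
}

/** Wraps an already sorted set and exposes it through the scanner interface. */
final class SortedCollectionScanner<T> implements SimpleScanner<T> {
    private final SortedSet<T> data;
    private Iterator<T> iter;
    private T current;

    SortedCollectionScanner(SortedSet<T> data) {
        this.data = data;
        this.iter = data.iterator();
        advance();
    }

    private void advance() {
        current = iter.hasNext() ? iter.next() : null;
    }

    @Override public T peek() { return current; }

    @Override public T next() {
        T result = current;
        advance();
        return result;
    }

    @Override public boolean seek(T key) {
        // tailSet returns the view of all elements greater than or equal to key.
        iter = data.tailSet(key).iterator();
        advance();
        return current != null;
    }

    @Override public void close() {
        // Nothing to release for an in-memory collection.
    }

    public static void main(String[] args) {
        SimpleScanner<String> s =
            new SortedCollectionScanner<>(new TreeSet<>(List.of("a", "c", "e")));
        s.seek("b");
        System.out.println(s.peek()); // first element at or after "b"
        s.close();
    }
}
```

Because the wrapper only needs a sorted collection, the same class can serve test fixtures and an in-memory snapshot alike.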
Ran all tests. TestHTablePool and TestDistributedLogSplitting error out (with or without my change).
I had to change three tests that incorrectly relied on old rows hanging around after a flush (or were otherwise incorrect).

No new test, as this should cause no functional change.
Review request changed
Updated (Aug. 25, 2011, 3:43 p.m.)
Make sure the StoreScanner is always closed (analogous to what Store.compactStore(...) does).
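
The usual way to guarantee this is a try/finally around the scan loop. The sketch below shows that general pattern under stated assumptions; `TrackingScanner` and `flush` are hypothetical names, and this is not the code from the patch or from Store.compactStore(...).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class ScannerCloseSketch {

    /** Hypothetical scanner that records whether close() was called. */
    static final class TrackingScanner {
        private final Iterator<String> cells;
        boolean closed = false;

        TrackingScanner(List<String> cells) {
            this.cells = cells.iterator();
        }

        /** Returns the next cell, or null when exhausted; an empty cell simulates a failure. */
        String next() {
            if (!cells.hasNext()) {
                return null;
            }
            String cell = cells.next();
            if (cell.isEmpty()) {
                throw new IllegalStateException("simulated read failure");
            }
            return cell;
        }

        void close() {
            closed = true;
        }
    }

    /** Drains the scanner; the finally block closes it even if next() throws. */
    static List<String> flush(TrackingScanner scanner) {
        List<String> written = new ArrayList<>();
        try {
            for (String cell = scanner.next(); cell != null; cell = scanner.next()) {
                written.add(cell);
            }
        } finally {
            scanner.close();
        }
        return written;
    }

    public static void main(String[] args) {
        TrackingScanner scanner = new TrackingScanner(List.of("a", "b"));
        System.out.println(flush(scanner) + " closed=" + scanner.closed);
    }
}
```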
Ship it!
Posted (Aug. 25, 2011, 4:08 p.m.)

Whitespace.
  1. I got this on the commit