HBASE-4241: Optimize flushing of the Memstore.

Review Request #1650 - Created Aug. 25, 2011 and submitted

Submitter: Lars Hofhansl
Branch: trunk
Bugs: HBASE-4241
Reviewers
Groups: hbase
People: jgray, stack, tedyu
Repository: hbase
This avoids flushing row versions to disk that the next compaction would garbage-collect anyway.
This covers two scenarios:
1. maxVersions=N and we find at least N versions in the memstore. We can safely avoid flushing any further versions to disk.
2. Similarly, minVersions=N and we find at least N versions in the memstore. We can then safely avoid flushing any further *expired* versions to disk.
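
As an illustration of the two scenarios above, here is a minimal sketch of the pruning decision. It is not the code in this patch: Cell, selectForFlush, and oldestUnexpiredTs are hypothetical names, deletes are ignored, and the snapshot is assumed to be sorted by column with the newest version first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FlushPruningSketch {

    /** Hypothetical stand-in for a KeyValue: one version of one column. */
    static final class Cell {
        final String column;
        final long timestamp;

        Cell(String column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return column + "@" + timestamp;
        }
    }

    /**
     * Returns the cells worth writing to disk. Assumes the snapshot is sorted
     * by column, newest version first. oldestUnexpiredTs is an assumed
     * parameter: any timestamp below it counts as expired.
     */
    static List<Cell> selectForFlush(List<Cell> snapshot, int maxVersions,
                                     int minVersions, long oldestUnexpiredTs) {
        List<Cell> toFlush = new ArrayList<>();
        String currentColumn = null;
        int kept = 0;
        for (Cell cell : snapshot) {
            if (!cell.column.equals(currentColumn)) {
                // New column: restart the version count.
                currentColumn = cell.column;
                kept = 0;
            }
            // Scenario 1: maxVersions newer versions are already kept.
            if (kept >= maxVersions) {
                continue;
            }
            // Scenario 2: the version is expired and minVersions is satisfied.
            boolean expired = cell.timestamp < oldestUnexpiredTs;
            if (expired && kept >= minVersions) {
                continue;
            }
            toFlush.add(cell);
            kept++;
        }
        return toFlush;
    }

    public static void main(String[] args) {
        List<Cell> snapshot = Arrays.asList(
            new Cell("a", 50), new Cell("a", 40), new Cell("a", 10),
            new Cell("a", 5), new Cell("b", 7));
        // maxVersions=3, minVersions=1, everything older than ts 20 is expired.
        System.out.println(selectForFlush(snapshot, 3, 1, 20L));
        // prints [a@50, a@40, b@7]
    }
}
```

In this run the expired versions a@10 and a@5 are skipped because a newer version of "a" is already kept, while b@7 is expired but still flushed because it is the only version of "b".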

This changes the Store flush to use the same mechanism that is used for compactions.
I borrowed some code from the tests and refactored the test code to use a new utility class that wraps a sorted collection and then behaves like a KeyValueScanner. The same class is used to create a scanner over the memstore's snapshot.
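
To make the idea of a scanner over a sorted collection concrete, here is a minimal sketch. It is not the CollectionBackedScanner from this patch and does not implement the real KeyValueScanner interface: SimpleScanner and SortedSetScanner are hypothetical, simplified names with only peek, next, seek, and close.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.SortedSet;
import java.util.TreeSet;

final class CollectionScannerSketch {

    /** Hypothetical, reduced scanner contract (not HBase's KeyValueScanner). */
    interface SimpleScanner<T> {
        T peek();             // current element without advancing, or null
        T next();             // current element, then advance; null when done
        boolean seek(T key);  // position at the first element >= key
        void close();
    }

    /** Scanner view over an already sorted set. */
    static final class SortedSetScanner<T> implements SimpleScanner<T> {
        private final SortedSet<T> data;
        private Iterator<T> iter;
        private T current;

        SortedSetScanner(SortedSet<T> data) {
            this.data = data;
            this.iter = data.iterator();
            advance();
        }

        private void advance() {
            current = iter.hasNext() ? iter.next() : null;
        }

        @Override
        public T peek() {
            return current;
        }

        @Override
        public T next() {
            T result = current;
            if (result != null) {
                advance();
            }
            return result;
        }

        @Override
        public boolean seek(T key) {
            // tailSet returns the elements greater than or equal to key.
            iter = data.tailSet(key).iterator();
            advance();
            return current != null;
        }

        @Override
        public void close() {
            iter = Collections.emptyIterator();
            current = null;
        }
    }

    public static void main(String[] args) {
        SortedSet<Integer> snapshot = new TreeSet<>(Arrays.asList(1, 3, 5));
        SimpleScanner<Integer> scanner = new SortedSetScanner<>(snapshot);
        System.out.println(scanner.next());  // prints 1
        System.out.println(scanner.seek(4)); // prints true
        System.out.println(scanner.peek());  // prints 5
        scanner.close();
    }
}
```

Because the wrapper only needs a sorted collection, the same class can serve a test fixture and a memstore snapshot alike, which is the reuse the description refers to.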
Ran all tests. TestHTablePool and TestDistributedLogSplitting error out (with or without my change).
I had to change three tests that incorrectly relied on old rows hanging around after a flush (or were otherwise incorrect).

No new test, as this should cause no functional change.
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
--- Diff Revision 3
+++ Diff Revision 4
@@ -509,18 +509,20 @@ private StoreFile internalFlushCache(final SortedSet<KeyValue> set,
               flushed += this.memstore.heapSizeChange(kv, true);
             }
             kvs.clear();
           }
         }
-        scanner.close();
       } finally {
         // Write out the log sequence number that corresponds to this output
         // hfile.  The hfile is current up to and including logCacheFlushId.
         status.setStatus("Flushing " + this + ": appending metadata");
         writer.appendMetadata(logCacheFlushId, false);
         status.setStatus("Flushing " + this + ": closing flushed file");
         writer.close();
+
+        // do this last in case this fails for some reason.
+        scanner.close();
       }
     }
 
     // Write-out finished successfully, move into the right spot
     Path dstPath = StoreFile.getUniqueFile(fs, homedir);
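
The only change between revisions 3 and 4 in Store.java is that scanner.close() moves from the end of the try body into the finally block, after writer.close(). The sketch below mimics that ordering with a recorded event list. It is an illustration only: FlushCloseOrderSketch, flush, and the scannerCloseFails flag are hypothetical and stand in for the real writer and scanner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FlushCloseOrderSketch {

    /**
     * Records the order of operations of a flush. scannerCloseFails is an
     * assumed switch that simulates a failure while closing the scanner.
     */
    static List<String> flush(List<String> cells, boolean scannerCloseFails) {
        List<String> events = new ArrayList<>();
        try {
            try {
                for (String cell : cells) {
                    events.add("append " + cell);
                }
            } finally {
                // The file is finished first, so its metadata and close are
                // not skipped if closing the scanner throws.
                events.add("writer.appendMetadata");
                events.add("writer.close");
                // Closed last, as in revision 4 of the diff.
                events.add("scanner.close");
                if (scannerCloseFails) {
                    throw new IllegalStateException("scanner close failed");
                }
            }
        } catch (IllegalStateException e) {
            events.add("caught: " + e.getMessage());
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : flush(Arrays.asList("kv1", "kv2"), true)) {
            System.out.println(event);
        }
    }
}
```

Even when the simulated scanner close fails, "writer.close" is already in the event list, which is the property the "do this last" comment in the diff is after.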
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java
Diff Revision 3 Diff Revision 4
 
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/KeyValueScanFixture.java
Diff Revision 3 Diff Revision 4
 