Review Board 1.7.22

The increment operation can release the rowlock before sync-ing the Hlog

Review Request #2116 - Created Sept. 29, 2011 and updated

Dhruba Borthakur
The increment operation releases the rowlock before doing the sync to the HLog. This improves performance of increments on hot rows. Introduced method HLog.appendNoSync() that returns a txid. The increment method then release the rowlock and invokes HLog.sync(txid). The HLog.sync(txid) returns only if all the transactions upto the one identified by that txid has been successfully sycned to HDFS.

There is a single test TestIncrement that creates a 100 threads and ensures that all the concurrent increments match the final expected result. There is a benchmark TestHLogBench that measures the performance of the appendNoSync call.
All unit tests pass now (expect TestDistributedLogSplitting, TestRollingRestart, TestHTablePool), but I am seeing the same test to fail on trunk, so these failures do not seem to be related to this patch.
Ship it!
Posted (Sept. 29, 2011, 9:59 p.m.)
Patch v6 passed all unit tests.
Nice job Dhruba.
Posted (Sept. 29, 2011, 10:31 p.m.)
Looks good. I like the bench marking tool.

This is like a group commit in HLog.  We have a group committing going on inside in sync down in dfsclient.  Does this group commit not 'group' enough.

With this change, its no longer possible to sync each increment.  We ok w/ that?  We should at least release note this difference in increment behaviors and erring on the conservative side, I'd mark this an 'incompatible change' rather than an 'improvement' just so its more likely folks will know of this changed increment behavior.
  1. HRegion has:
      final Configuration conf;
    Can we pass new config parameter through conf which enables this new feature ?
Posted (Sept. 30, 2011, 5:01 a.m.)


Am not sure.  Just to clarify
Is this check not needed here then?