HBASE-3759: Eliminate use of ThreadLocals for CoprocessorEnvironment bypass() and complete()
Review Request #588 - Created April 13, 2011 and submitted
Profiling the HRegionServer process with a RegionObserver coprocessor loaded shows a fair amount of runnable thread CPU time spent getting the bypass and complete flag ThreadLocal values by RegionCoprocessorHost. See the HBASE-3759 JIRA for some attached graphs. With the caveat that this is runnable CPU time and not threads in all states, this still seems like a significant processing bottleneck on a hot call path. The workload profiled was a put-based bulk load, so for each multi-put request, RegionCoprocessorHost.prePut() could be called many times. Instead of using ThreadLocal variable for bypass/complete, which will incur contention on the underlying map of values, I think we can eliminate the bottleneck by using locally scoped variables for each preXXX/putXXX method called in the RegionCoprocessorHost, MasterCoprocessorHost and WALCoprocessorHost classes. The attached patch refactors the current RegionObserver, MasterObserver and WALObserver APIs to provide a locally scoped ObserverContext object for storing and checking the bypass and complete values. Summary of changes: * adds a new ObserverContext<T extends CoprocessorEnvironment> class, containing references for bypass, complete and the environment instance * in each pre/post method in RegionObserver, the RegionCoprocessorEnvironment parameter is replaced by ObserverContext<RegionCoprocessorEnvironment> * in each pre/post method in MasterObserver, the MasterCoprocessorEnvironment parameter is replaced by ObserverContext<MasterCoprocessorEnvironment> * in each pre/post method in WALObserver, the WALCoprocessorEnvironment parameter is replace by ObserverContext<WALCoprocesorEnvironment> This is obviously a large bulk change to the existing API. I could avoid the API change with hacky modification underneath the *CoprocessorEnvironment interfaces. But since we do not yet have a public release with coprocessors, I would prefer to take the time to make the initial API the best we can before we push it out. Please let me know your thoughts on this approach.
Posted (April 14, 2011, 3:12 a.m.)
First, please let me know if i am thinking in the right direction: In the threadlocal version, we are setting it to false because this variable is shared by the registered CPs in all their pre/postXXX hooks, and it was used to decide whether to continue with the CP chain or return from the currently executing CP. So, to reuse this variable, it was set to false again. If that is the case, in this version, we are having a separate instance of ObserverContext for one hook, and i don't see that we need to reset these variables. The same goes with the "current" variable. Am i getting it right? (I want to come up with a CP observer for 3607, therefore want to grok it a bit, hope you don't mind) Thanks.