Review Board 1.7.22

HBASE-5162 - Basic client pushback mechanism

Review Request #3930 - Created Feb. 16, 2012 and updated

Jesse Yates
jdcryans, lhofhansl, stack
Under heavy write load, HBase will create a saw-tooth pattern in accepting writes. This is due to the I/O in minor compactions not being able to keep up with the write load. Specifically, the memstore is attempting to flush while we are attempting to do a minor compaction, leading to blocking _all_ writes. Instead, we need to have the option of graceful degradation mechanism.

This patch supports both a short-term,adjustable server-side write blocking as well as client-side back-off to help alleviate temporary memstore pressure.

Posted (Feb. 18, 2012, 4:58 a.m.)


I find this a bit dubious.
This won't actually slow the client thread down, but just accumulate more data and reduce the number of RPCs. In the end it might lead to more load on the server, because we can deliver more puts as with fewer but larger batches.

I'd rather just rely on the server sleeping the thread for a bit (as you do later).
  1. The question here is what we consider the 'client'. To me, the client was the client-side HTable, not the using application. So yes, this will not slow the using application, but that was the intention...keeping HBase seeming as responsive, but just letting the messaging layer handle the queuing and buffering.
    Technically, this will actually slow the rate of writes _to the server_, which is what we really are worrying about. Essentially, this implementation is such that we temporarily increase batch size so the server has time to finish its compactions and flushing. 
    The server can handle the larger batch sizes (except for extreme cases) since the assumption has to be clientMemory << serverMemory. 
    This is just experimental :) I'll need to do some stress testing to actually see the effects
What if the flusher is not null? Should we re-calculate the wait time?
  1. We are temporarily growing the write buffer, so if we get another write, I was thinking we would just let it accumulate too - it will get flushed soon enough. Only corner case here is if the size grows beyond the allowed, in which case there will be a forced flush.
If sleepTime is 0 (for example from NoServerBackoffPolicy), we should probably not create the thread and flush right here.

(But as I said in the comment above, I'd probably not bother with this extra thread to begin with :) )
  1. +1 fixing in next version, assuming the client stuff is kept
Was this a problem before? Or only now becaue of the background thread?
  1. Just b/c of the background thread.
This will break backwards compatibility, right? (Not saying that's not ok, just calling it out)

I'd almost rather have the client not know about this, until we reach a bad spot (in which case we can throw back retryable exceptions).
  1. I don't think so. If the server isn't upgraded to the send monitoredResults, then this bit of code won't matter. If the client isn't upgraded, then it won't even consider if the MonitoredResult isn't just a Result.
Ah ok, this is where we gracefully delay the server thread a bit.
Seems this would need to be tweaked carefully to make it effective while not slowing normal operations.

Should the serverPauseTime be somehow related to the amount of pressure.
I.e. wait a bit more if the pressure is higher?
Maybe the pausetime calculation should be part of the pluggable policy?

Also in the jira there was some discussion about throwing (presumably retryable) exceptions back to the client is the pressure gets too high. That would slow the client, without consuming server resources (beyond multiple requests).
  1. Definitely needs some knobs with the policy here and tuning.
    It would be possible to tie the pause time to the amount of pressure (some sort of multiplier effect). Would have to think about the implications of that
    Throwing the exception back the client here means you get a lot of bandwidth overhead as the client passes back a bunch of puts _again_, which is rough. Otherwise, we are *definitely* going to be messing with wire-compatibility in a functional (if not strictly RPC) sense.
General comment: Where are we on putting/documenting these things in hbase-defaults.xml?
  1. That is already in there, but having the constants just reinforces the 'correct' behavior. Though I think there are some situations where the code is different from defaults.xml