Review Board 1.7.22

FLUME-1121: Recoverable Memory Channel cannot recover data

Review Request #4713 - Created April 13, 2012 and submitted

Brock Noland
Channel.start() is not being called. This is fixed in DefaultLogicalNodeManager.

Additionally RecoverableMemoryChannel now tracks it's own capacity due to the MemoryChannel semantics being completely different. Basically, if we rely on MemoryChannel capacity, then an error will be thrown when we commit the MemoryChannelTransaction. However, we will have already committed this data to disk. If we commit to MemoryChannel first (there by checking capacity) we could fail to write to disk resulting in data which is only in memory.

Also, I ran cleanup on modules touched. This removes whitespace, unused imports, and adds @Override tags where needed. This is one time cleanup which allows automated cleanup in the future.
All unit tests pass and manual testing passes as well.
Review request changed
Updated (April 13, 2012, 8:12 p.m.)
Added tests for restart and reconfiguration. Thrown an exception if the capacity is not sufficient on restart.
Ship it!
Posted (April 30, 2012, 7:22 p.m.)
Thanks for the patch Brock, changes look good. Some minor feedback follows.

Please rebase the patch when convenient and attach to the Jira.
  1. Thank you for your review! Attached to the JIRA!
nit: this seems to be duplicating part of the functionality offered by the LifecycleState flag maintained by the abstract base class. It will be good to see if that can be reused here.
  1. I agree, I have created FLUME-1169 to implement this checking in the superclass.
  1. done
This call needs to follow the transaction idiom with the appropriate try/catch. Also, since this is initialization time code, it may be easier to do a bulk put in a single transaction.
  1. Done
Since these are getting accessed by worker thread and no longer propagated, they should be marked volatile? (or propagated).
  1. I figured eventual consistency for this parameters was fine, however I have marked them volatile.
The source runner interface is no longer used anywhere and should have been removed altogether (along with its subtypes). Since there is other cleanup in your patch, it would be great if you could remove that as well. 
  1. Who is executing sources now if PollableSourceRunning is not being used?
  2. Looks to me like it's still being used in PropertiesFileConfigurationProvider.loadSources?
  3. Brock, I stand corrected. It is still being used and not removed yet. Please leave that in as is. I will file a separate Jira to address that if it becomes critical.
Posted (May 1, 2012, 1:02 a.m.)
Hi Brock,

Thanks for the patch! This is not really a review: I just have one feedback. Could you please not include the cleanup in this patch and submit a separate patch for that. The revision history will show this patch affecting every single file even though this is supposed to only fix the recoverable memory channel, and we will need to look at detailed revision history to look at what changes were caused by this patch. I'd rather see only the changes related to this bug.
  1. I was including these changes under the `Please fix any "broken windows" in your neighborhood' mantra from the `How to Contribute' section on the wiki. If we feel that changes like this should be done in separate JIRA's, lets update that item.