1073: Move all journal stream management code into one place

Review Request #1688 - Created Aug. 31, 2011 and updated

Reviewers: Todd Lipcon (tlipcon)
Bugs: HDFS-2018
Repository: hadoop-common-git
Description: Posting patch on review-board for easy review

 
Posted (Aug. 31, 2011, 7:53 p.m.)
In doing this review, I remembered some reasoning about why I wanted to introduce the EditLogReference interface:

1) The lifecycle of the open edit streams is now unbalanced: the streams are all opened in one place and then closed far away, later, in loadEdits(). If there's an error in one of the middle streams, the open files of the streams that come after it will leak. Introducing EditLogReference (a reference to a stream that hasn't been opened yet) means that loadEdits(EditLogReference) owns the whole lifecycle of an open stream; see the sketch after this list.
2) The old implementation had true unit tests for the various gap scenarios, which I found easy to read. The new tests are much harder to read, with the "AbortSpec" stuff, and they rely on actually creating real files on disk rather than on mock objects. Having the FileJournalManager deal only with references in the "planning" state, and moving the validation onto the reference, means we can test all of this code with mocks alone.
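
To make this concrete, here's a minimal sketch of the kind of interface I mean (names, signatures, and the applyTransactions helper are hypothetical, not the patch's actual code; EditLogInputStream is the existing class):

    import java.io.IOException;
    import java.util.List;

    // Hypothetical sketch: a reference describes a log segment in the
    // "planning" state; nothing is opened until loadEdits() asks for it.
    interface EditLogReference {
      long getFirstTxId();
      long getLastTxId();
      void validate() throws IOException;            // cheap check, no stream opened
      EditLogInputStream open() throws IOException;  // caller owns the returned stream
    }

    class EditLogLoader {
      void loadEdits(List<EditLogReference> refs) throws IOException {
        for (EditLogReference ref : refs) {
          EditLogInputStream in = ref.open();
          try {
            applyTransactions(in);  // hypothetical helper that replays the ops
          } finally {
            in.close();  // opened and closed in one place, so a failure mid-list can't leak
          }
        }
      }

      private void applyTransactions(EditLogInputStream in) throws IOException {
        // replay ops from the stream (details elided in this sketch)
      }
    }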

Specific comments on the implementation below.
  1. 1) This uses a factory pattern, where it's quite common for allocation and closing to be in different parts of the code. I've addressed the leaked streams. The problem here is that we're selecting all streams before we use them; in the original patch, #loadEdits() would select the logs as it went along. Another solution would be to add an #open() method to EditLogInputStream, which would be similar to the reference approach, though it seems strange, as streams are generally opened on instantiation in Java.
    
    2) Which scenarios in particular are you referring to? 
If you pass "target - 1" above, then why do we have to explicitly remove it again? It should already be excluded here.
  1. The "target - 1" passed to selectInputStreams means that it should throw an exception if it doesn't find enough transactions to get to at least target - 1. Its not a limit though, selectInputStreams will try to return all streams which can serve transactions from lastAppliedTxId onwards.
    
Shouldn't there be exactly one stream here?

What happens if you don't find any? There should be a check for stream == null.
  1. Yes, exactly 1. 
    
    I've added a check now. I was thinking of adding an assert (the synchronization should ensure a stream exists) but added a LOG.warn instead, as a retry would likely recover the situation anyhow.
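    Roughly the shape of the added check (a sketch; the getInputStream lookup and the variable and logger names are assumed):
    
        EditLogInputStream stream = journalSet.getInputStream(fromTxId); // assumed lookup
        if (stream == null) {
          // Synchronization should guarantee a stream exists, so this is
          // unexpected, but warn rather than assert: a retry will likely recover.
          LOG.warn("Unable to find an input stream starting at txid " + fromTxId
              + "; a later retry should succeed");
          return;
        }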
    
Add a comment that this JM is never used for input, and hence can never provide transactions?
  1. done
Adjust the javadoc to explain what "best" means. In this case, it's the journal manager that has the most transactions readable after that txid (see the sketch below).
  1. done
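Suggested wording, roughly (not the committed javadoc):

    /**
     * Pick the "best" journal manager to read from: the one that can
     * supply the most transactions readable after the given txid.
     */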
Should log a warning.
  1. done
This error path leaks streams.
  1. done
While we're at it, we can add a good sanity check here that lastTxId == INVALID_TXID || op.txid == lastTxId + 1; log an ERROR and break the loop if that's violated, I think? (See the sketch below.)
  1. done
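For reference, the check being discussed would look roughly like this (a sketch; the surrounding read loop, the reader, the LOG, and the INVALID_TXID constant are assumed from context):

    long lastTxId = INVALID_TXID;
    FSEditLogOp op;
    while ((op = reader.readOp()) != null) {
      // Transactions must be consecutive once the first one is seen.
      if (lastTxId != INVALID_TXID && op.txid != lastTxId + 1) {
        LOG.error("Out-of-order transaction: expected txid " + (lastTxId + 1)
            + " but read " + op.txid + "; aborting");
        break;
      }
      lastTxId = op.txid;
      // ... apply op ...
    }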
I don't think this logic makes sense - in the case of a journal manager going away and coming back inside the BN or SBN, you may want to finalize old segments while writing to a new one. That is to say, this should be independent of currentInProgress.
  1. done
The warning message should include the storage dir.
  1. done
Don't you also need to break here? Otherwise you will incorrectly report all later logs as gaps (see the sketch below).
  1. done
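The shape of the fix, as a sketch (names assumed): the logs are scanned in order of start txid, and the scan has to stop at the first gap:

    long expectedTxId = firstTxId; // assumed starting point
    for (EditLogFile log : sortedLogs) { // assumed: sorted by first txid
      if (log.getFirstTxId() != expectedTxId) {
        LOG.warn("Gap in logs: expected a log starting at txid " + expectedTxId
            + " but found one starting at " + log.getFirstTxId());
        break; // without this, every later log would also be reported as a gap
      }
      expectedTxId = log.getLastTxId() + 1;
    }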
What's the purpose of this helper function?
  1. it was useful when there was caching going on. I've removed it now and made the validateLog call direct.
    
is this ever called?
  1. nope, removed.
Add javadoc noting that this mutates the structure; it's not clear from the name of the function.
  1. done. Also removed the EditLogValidation being returned, as that's never used.
Please use American spelling to match the rest of Hadoop.
  1. done
What would happen if the txid file hadn't been updated? Can we add a test for the case where we have an out-of-date seen_txid file but do have new logs? That is:

fsimage_0
edits_1-10
edits_inprogress_11 (with 5 txns)
seen_txid = 0

Do we properly read all 15 txns? Do we have enough safeguards that we won't overwrite edits or do anything overlapping?

In the current implementation (i.e. pre-patch), it would see the newer logs and know that it should attempt to read them. In this implementation, I think it might overwrite one of the segments.

Please add a test case for this.


  1. I've added a test for this. If the seen_txid is out of date (I guess this happens if an NN crashes after creating a segment, but before the update), the NN still starts up. I've reinstated CorruptionException to take care of this, and added new variants of doTestCrashRecoverEmptyLog to check it.
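    For the record, the new test is shaped roughly like this (a sketch only; the helper and getter names are hypothetical, not the committed test code):
    
        import static org.junit.Assert.assertEquals;
        import org.junit.Test;
    
        // Lay out the scenario from the review: seen_txid lags behind the
        // logs that actually exist on disk, and the NN must still recover
        // all 15 transactions without overwriting any segment.
        @Test
        public void testStartupWithStaleSeenTxId() throws Exception {
          setupStorage("fsimage_0", "edits_1-10", "edits_inprogress_11"); // 5 txns in progress
          writeSeenTxId(0); // simulate a crash before seen_txid was updated
          NameNode nn = restartNameNode();
          assertEquals(15, nn.getLastAppliedTxId()); // all transactions read
          assertNoSegmentOverwritten();              // nothing clobbered or overlapping
        }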