
ByteBuffer-based read API for DFSInputStream

Review Request #4184 - Created March 5, 2012 and updated

Todd Lipcon
Posting patch to RB on behalf of Henry.

Ship it!
Posted (March 5, 2012, 9:57 p.m.)
First pass looks good. A few questions:
- Do you have before/after benchmarks for the old TestParallelRead with this patch? i.e., make sure this patch doesn't regress performance of the "normal" array-based read path.
- Have you run the various randomized tests overnight or so to get good coverage of all the cases?

I didn't look carefully at the libhdfs code, since you said offline there's another rev of that coming.
  1. Thanks for the review! I'll get a new patch on the ticket and here momentarily. 
can you add javadoc explaining this var (even though it's not new)?
  1. Done.
I'd move this inside the if statement below. Then in the second "phase", re-declare it under that scope, since the "len" variable doesn't ever carry over between phases (and in fact becomes invalid).
  1. Good catch, done. 
Seems like this area could be optimized a bit to avoid the object creation here.

In one case, slowReadBuff.remaining() <= len, in which case you can directly do buf.put(slowReadBuff).

In the other case, you could probably save off the limit of slowReadBuff, lower the limit to len, do the read, then restore the original limit.
  1. Done - I didn't bother making the special case, but have avoided the slice. I doubt the limit manipulation is costly.
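The limit-manipulation trick above can be sketched as a small standalone helper. This is an illustration of the technique, not the actual patch code; the method name copyAtMost and the variable names are hypothetical.

```java
import java.nio.ByteBuffer;

public class LimitCopy {
  /**
   * Copies up to len bytes from src into dst without allocating a
   * temporary slice. If src has more than len bytes remaining, its
   * limit is temporarily lowered so put() transfers exactly len bytes,
   * then the original limit is restored.
   */
  static void copyAtMost(ByteBuffer src, ByteBuffer dst, int len) {
    if (src.remaining() <= len) {
      // Fast path: the whole remainder fits, copy it directly.
      dst.put(src);
    } else {
      // Shrink the limit so only len bytes are visible to put(),
      // copy, then restore the saved limit.
      int oldLimit = src.limit();
      src.limit(src.position() + len);
      dst.put(src);
      src.limit(oldLimit);
    }
  }

  public static void main(String[] args) {
    ByteBuffer src = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5});
    ByteBuffer dst = ByteBuffer.allocate(8);
    copyAtMost(src, dst, 3);
    System.out.println(dst.position());  // 3 bytes copied
    System.out.println(src.remaining()); // 2 bytes left in src
  }
}
```

As the reply notes, the limit manipulation itself is cheap (two field writes on the buffer), so skipping the slice() allocation is the part that matters on the hot path.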
This code's very similar to lines 358-362. Can we extract a utility function, which we could optimize as mentioned above, to avoid object allocation?
  1. Done.
I think it might simplify things if you used this function only for the case where verifyChecksum == true. Then above, the non-checksummed read case would just call fillBuffer directly. What do you think?

My reasoning is that this function is called 'readChunksIntoByteBuffer' implying that it always reads a multiple of checksum chunk size, which isn't really a requirement when not verifying checksums
  1. I take your point. My only reservation is that there's some logic that we'd have to duplicate around the handling of offsetFromChunkBoundary and updating the buffer position on a successful read. Maybe renaming the method is the easiest way to avoid confusion? Maybe doByteBufferRead or something equally vague...
Maybe TRACE is more appropriate here. Also use LOG.trace() inside.
  1. Done on both counts.
Remove commented-out code.
  1. Done
I think it's worth making these actual static inner classes with names instead of anonymous, for prettier stack traces.

I remember we discussed a few weeks back whether the creation of these lambdas would harm performance for short reads. Do you have any results for TestParallelRead performance for the lambda-based approach here vs the uglier approach?
  1. I've made them static. I'll measure the performance with and without, and report on the JIRA.
Goofy empty clause. Other goofy indentation here.
  1. Fixed. Whitespace is all over the shop in hdfs.c. I've made this method a uniform 2-space indent.