
HCatalog input format changes with integration of output format changes and record reader changes

Review Request #3901 - Created Feb. 14, 2012 and updated

Vikram Dixit Kumaraswamy
francisliu, gates, sushanth
This is a patch with input format specific changes integrated with record reader and output format changes.

Posted (Feb. 16, 2012, 4:59 p.m.)


This looks like it came from my patch. I've updated my patch and moved this into InternalUtil. Please update.
comments don't match the method.
This should probably be moved to internal util?
You don't really need the user to specify a class, do you? I think you can arbitrarily pick one for HiveConf, preferably one from the hive client library.
Include the original exception as the cause of the newly thrown exception, and provide a better message.
Sushanth has this in his patch as well...a more updated version I think.
We might as well fix this...better to throw the specific exception, IOException. This shouldn't break backwards compatibility.
Might as well fix this too.
maybe call it getMapRedInputFormat(), to make things clearer.
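
The two exception comments above (keep the original exception as the cause, and throw the specific IOException) can be sketched roughly as follows. This is only an illustration: JobInfoLoader, readSerializedJobInfo and the message text are hypothetical names, not code from the patch.

```java
import java.io.IOException;

// Hypothetical names throughout; none of this is taken from the patch under review.
final class JobInfoLoader {

    // Stand-in for a lookup that fails with a low-level unchecked exception.
    static String readSerializedJobInfo(String key) {
        if (key == null || key.isEmpty()) {
            throw new IllegalStateException("no value stored for key");
        }
        return "jobinfo:" + key;
    }

    // Declares the specific checked type (IOException) instead of Exception,
    // and passes the original exception as the cause so its stack trace survives.
    static String getJobInfo(String key) throws IOException {
        try {
            return readSerializedJobInfo(key);
        } catch (IllegalStateException e) {
            throw new IOException("Could not load job info for key '" + key + "'", e);
        }
    }
}
```

Callers that already catch Exception keep compiling against the narrower throws clause, and the cause remains reachable through getCause().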

Is this used beyond this class? Consider making it private or package private.
Are you sure tableDesc.getJobProperties() hasn't been set prior to this?
Add a message.
change to private?
this class was removed in my patch. please update.
shouldn't the contents of this be moved into mapreduce.initialize()?
you should probably throw an exception if it's not an hcatsplit.
add a message
i'm wondering whether you should actually throw an exception as well, since the underlying inputformat did so itself?
shouldn't this be a mapreduce.InputSplit?
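
For the comment above about throwing when the split is not an hcatsplit, a minimal sketch of the check might look like this. DemoInputSplit, DemoHCatSplit, DemoOtherSplit and SplitCaster are stand-in types invented so the example compiles without Hadoop; the real classes and the exact exception type are up to the patch.

```java
import java.io.IOException;

// Stand-ins for the real split types, so the sketch compiles without Hadoop.
abstract class DemoInputSplit {}

final class DemoHCatSplit extends DemoInputSplit {
    final String partition;

    DemoHCatSplit(String partition) {
        this.partition = partition;
    }
}

final class DemoOtherSplit extends DemoInputSplit {}

final class SplitCaster {

    // Fails fast with a descriptive message instead of a bare ClassCastException.
    static DemoHCatSplit castToHCatSplit(DemoInputSplit split) throws IOException {
        if (split instanceof DemoHCatSplit) {
            return (DemoHCatSplit) split;
        }
        String actual = (split == null) ? "null" : split.getClass().getName();
        throw new IOException("Expected a DemoHCatSplit but got " + actual);
    }
}
```

The message names the type that was actually received, which also covers the "add a message" comments in this review.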
Posted (Feb. 17, 2012, 1:05 a.m.)
I realized this is missing FosterStorageHandler changes for the input side. You might've forgotten to include it?
Posted (Feb. 17, 2012, 2:12 a.m.)


Still seems to be in HCatUtil. Do I have to use another patch to see this change?
Discussed this with Alan. If we leave it as an empty constructor, this loads the HiveClient class. Alan was of the opinion that the class does not really matter, so I have changed this to the empty constructor. We can use another one if the need arises.
Fixed as part of Sushanth's changes?
Done. Both.
Fixed as part of the exception being thrown from getJobInfo.
By definition, the input split passed to the mapreduce initialize has to be the mapreduce InputSplit. I added this new initialize function to mimic it. I can rename the function, but it does the work of initialize.