BigDecimals converted to String in inconsistent format
Review Request #9081 - Created Jan. 24, 2013 and submitted
Currently when BigDecimal fields are saved as Strings Sqoop uses the ToString method. This leads to values like "0.0000001" being stored as "1E-7" which doesn't seem ideal. This patch changes Sqoop to use ToPlainString for BigDecimals so they will always be stored in the same format. This should have minimal effect as they can still be converted back to BigDecimals no matter which way they are stored - and the scale doesn't seem relevant - it seems to always be zero anyway so there shouldn't be any change there. I added a new parameter "sqoop.bigdecimal.format.string" which can be set to false to revert to the old behaviour. I didn't add this as a command line parameter as it seems like something most users would not change so didn't want to confuse the user with another option - they can override it in sqoop-site.xml or on the command line using -D.
Have manually tested text file, avro file and hbase imports using both values of the new parameter. Also checked that if the parameter is not set it will use the toPlainString. I tested sequence files but there is no change as they don't use the toString methods.
|I believe that we should use method getNullNonStringValue() for big decimal column type as it's not a string based column, ...||Jarek Cecho||Jan. 25, 2013, 4:04 p.m.||Open|
Posted (Jan. 25, 2013, 4:04 p.m.)
Hi David, please accept my apology for this late review. Your changes looks good to me, I have only few small comments: 1) Would you mind adding couple of test cases for introduced functionality? I can imagine two test cases for import - one with and second without the "sqoop.bigdecimal.format.string" parameter. I think that both can be simply to text file, just testing that we get what we're expecting to get. It would be also nice to have export test case that will verify that we can export files generated with "sqoop.bigdecimal.format.string". 2) Would you mind adding few sentences to our user guide to document this behavior? I know that various properties are not yet well documented, but I would like to improve that :-)
I believe that we should use method getNullNonStringValue() for big decimal column type as it's not a string based column, right?
Review request changed
Updated (Feb. 5, 2013, 3:53 a.m.)
- changed from pending to submitted