Add the --bulk-load-dir option to support the HBase doBulkLoad function

Review Request #13052 - Created July 30, 2013 and updated

Submitter: Zhancheng Deng
Reviewers: Sqoop (group); jarcec, vasanthkumar (people)
Repository: sqoop-trunk
SQOOP-1032: Add the --bulk-load-dir option to support the HBase doBulkLoad function 

 
Issues: 13 total, 13 open, 0 resolved, 0 dropped
Review request changed
Updated (July 30, 2013, 5:32 a.m.)
  • changed from (empty) to "Add the --bulk-load-dir option to support the HBase doBulkLoad function"
src/java/com/cloudera/sqoop/hbase/HBasePutProcessor.java (425b0f4)
src/java/org/apache/sqoop/SqoopOptions.java (01805f9)
src/java/org/apache/sqoop/hbase/HBasePutProcessor.java (9ceb5bd)
src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java (5ccf311)
src/java/org/apache/sqoop/manager/SqlManager.java (2a4992d)
src/java/org/apache/sqoop/mapreduce/HBaseBulkImportJob.java (PRE-CREATION)
src/java/org/apache/sqoop/mapreduce/HBaseBulkImportMapper.java (PRE-CREATION)
src/java/org/apache/sqoop/tool/BaseSqoopTool.java (0eca991)
Posted (Sept. 2, 2013, 2:30 p.m.)
Hi Zhancheng,
thank you for working on this JIRA, appreciated! Would you mind adding test cases to ensure that this functionality works as expected?
  1. If I'm not mistaken, integration testing of bulk loading is still very hard, if not impossible, because configureIncrementalLoad needs to read the partition list, which fails with the combination of a mini cluster and LocalJobRunner. I'm not advocating for no tests, but a compromise here might be to extract the bulk loading operations and test the rest of the logic in HBaseBulkImportJob.
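To illustrate the compromise, here is a minimal sketch of what extracting the bulk loading step could look like. It is not taken from the patch: `BulkLoader`, `BulkImportFlow` and `RecordingLoader` are hypothetical names, and the real implementation of the interface would wrap the HBase bulk load calls. The point is only that the remaining job logic can then be exercised with a stub, without a mini cluster.

```java
import java.util.ArrayList;
import java.util.List;

public class BulkImportFlowSketch {

  /** Hypothetical seam: a real implementation would wrap the HBase bulk load calls. */
  interface BulkLoader {
    void load(String bulkLoadDir, String tableName);
  }

  /** Hypothetical stand-in for the part of the import job that does not need HBase. */
  static class BulkImportFlow {
    private final BulkLoader loader;

    BulkImportFlow(BulkLoader loader) {
      this.loader = loader;
    }

    /** Runs the post-MapReduce step by delegating to the injected loader. */
    void complete(String bulkLoadDir, String tableName) {
      loader.load(bulkLoadDir, tableName);
    }
  }

  /** Test double that records calls instead of touching a cluster. */
  static class RecordingLoader implements BulkLoader {
    final List<String> calls = new ArrayList<String>();

    @Override
    public void load(String bulkLoadDir, String tableName) {
      calls.add(bulkLoadDir + " -> " + tableName);
    }
  }

  public static void main(String[] args) {
    RecordingLoader loader = new RecordingLoader();
    new BulkImportFlow(loader).complete("/tmp/bulk", "employees");
    System.out.println(loader.calls);
  }
}
```

A unit test would then assert on `loader.calls` rather than on the contents of an HBase table.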
Let's not alter the classes in com.cloudera package. They are already deprecated and should not be used.
Nit: Trailing white space.
Nit: Trailing white space.
The comment seems to be off.
Nit: Trailing white space.
Nit: Trailing white space.
Nit: Trailing white space.
Nit: Trailing white space.
The user has already specified the bulk load directory at this point, so it would be better to die quickly here.
The user has already specified the bulk load directory at this point, so it would be better to die quickly than to run the slower non-bulk import.
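A rough sketch of the fail-fast behaviour meant in the two comments above. `requireBulkLoadDir` is a hypothetical helper, not code from the patch, and the choice of exception type is an assumption. It throws as soon as the directory is missing instead of quietly falling back to the regular import.

```java
public class BulkLoadDirCheckSketch {

  /**
   * Hypothetical helper: returns the directory if it is set, otherwise throws
   * rather than silently falling back to the slower non-bulk import.
   */
  static String requireBulkLoadDir(String bulkLoadDir) {
    if (bulkLoadDir == null || bulkLoadDir.trim().isEmpty()) {
      throw new IllegalStateException(
          "Bulk load was requested but no bulk load directory is set");
    }
    return bulkLoadDir;
  }

  public static void main(String[] args) {
    System.out.println(requireBulkLoadDir("/tmp/bulk"));
    try {
      requireBulkLoadDir(null);
    } catch (IllegalStateException e) {
      System.out.println("Failed fast: " + e.getMessage());
    }
  }
}
```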
Nit: Trailing white space.
Nit: Trailing white space.
Nit: Trailing white space.
Jarcec