Sqoop2: HdfsExportPartitioner is not always respecting maximal number of partitions

Review Request #10143 - Created March 26, 2013 and updated

Vasanth kumar RJ
sqoop-844
Reviewers
Sqoop
sqoop-sqoop2
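HdfsExportPartitioner does not always respect the maximum number of partitions.
This patch modifies the partitioning logic: the maximum split size is now rounded up when the input size is not evenly divisible by the maximum number of partitions.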
To reproduce the failure on the base code before applying this patch, set NUMBER_OF_ROWS_PER_FILE = 1 in TestHdfsExtract.java and run the test case added by this patch. The number of partitions returned is greater than the requested maximum.

With the fix, the number of partitions created is less than or equal to the requested maximum.
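
The rounding change can be illustrated with a small standalone sketch. This is not the partitioner's code: the class name SplitSizeSketch, its helper methods, and the example numbers are hypothetical, and the chunk count assumes a simplified model in which the input is one contiguous byte range cut into pieces of at most the split size. The real partitioner also groups blocks by node and rack, which this sketch ignores.

```java
// Hypothetical illustration only; names and numbers are not from the patch.
class SplitSizeSketch {

    // Base behaviour: plain integer (floor) division.
    static long floorSplitSize(long numInputBytes, long maxPartitions) {
        return numInputBytes / maxPartitions;
    }

    // Patched behaviour: round up by one when there is a remainder.
    static long ceilSplitSize(long numInputBytes, long maxPartitions) {
        long size = numInputBytes / maxPartitions;
        if (numInputBytes % maxPartitions != 0) {
            size += 1;
        }
        return size;
    }

    // Simplified model (assumption): number of chunks needed to cover
    // numInputBytes when each chunk holds at most splitSize bytes.
    // splitSize must be positive.
    static long countChunks(long numInputBytes, long splitSize) {
        return (numInputBytes + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long numInputBytes = 10;
        long maxPartitions = 4;

        long floorSize = floorSplitSize(numInputBytes, maxPartitions);
        long ceilSize = ceilSplitSize(numInputBytes, maxPartitions);

        // Floor: split size 2, so 5 chunks, which exceeds the maximum of 4.
        System.out.println("floor: size=" + floorSize
                + " chunks=" + countChunks(numInputBytes, floorSize));
        // Ceiling: split size 3, so 4 chunks, which stays within the maximum.
        System.out.println("ceil:  size=" + ceilSize
                + " chunks=" + countChunks(numInputBytes, ceilSize));
    }
}
```

With 10 bytes and at most 4 partitions, floor division gives a split size of 2 and therefore 5 chunks, while the rounded-up size of 3 gives 4 chunks.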
execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsExportPartitioner.java
Revision 115ca54 New Change
[70 unchanged lines collapsed] public class HdfsExportPartitioner extends Partitioner {
 
     try {
       long numInputBytes = getInputSize(conf);
       maxSplitSize = numInputBytes / context.getMaxPartitions();
 
+      if(numInputBytes % context.getMaxPartitions() != 0 ) {
+        maxSplitSize += 1;
+       }
+
       long minSizeNode = 0;
       long minSizeRack = 0;
       long maxSize = 0;
 
       // the values specified by setxxxSplitSize() takes precedence over the
[468 unchanged lines collapsed]
execution/mapreduce/src/test/java/org/apache/sqoop/job/TestHdfsExtract.java
Revision 62f3a03 New Change
 