Review request for SQOOP-1033 "CombineFileInputFormat does not work with paths not on default FS like ASV"

Review Request #10988 - Created May 8, 2013 and updated

Submitter: Shuaishuai Nie
Branch: trunk
Bugs: SQOOP-1033
Reviewers: Sqoop
Repository: sqoop-trunk
CombineFileInputFormat does not work with ASV. The problem surfaced in Sqoop, which failed to export files stored in ASV. CombineFileInputFormat strips the scheme and authority components from each path, after which the path is assumed to be on the default file system.
Sqoop has its own copy of CombineFileInputFormat, but the copies in core were updated as well for consistency.
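
The stripping behaviour described above can be illustrated without Hadoop on the classpath. The following is a minimal sketch using only java.net.URI; the class name, the helper stripToPath, and the ASV location are hypothetical and chosen for illustration. It mirrors the effect of building a new Path from toUri().getPath(): only the path component survives, so nothing remains to say which file system the file lives on.

```java
import java.net.URI;

public class SchemeStrippingDemo {

    // Keeps only the path component of a location, dropping scheme and authority.
    // This mimics the effect of `new Path(p.toUri().getPath())` in the old code.
    static String stripToPath(String location) {
        return URI.create(location).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical ASV location, for illustration only.
        String location = "asv://container@account.example.net/export/part-00000";

        // The result no longer says which file system the file is on,
        // so later code would resolve it against the default file system.
        System.out.println(stripToPath(location));
    }
}
```

Once the scheme and authority are gone, any later lookup has to assume the default file system, which matches the export failure reported for ASV.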

The same issue has already been resolved in the following JIRAs:
https://issues.apache.org/jira/browse/MAPREDUCE-2704
https://issues.apache.org/jira/browse/MAPREDUCE-1806
Tested manually on ASV.

Diff revision 3 (Latest)


--- src/java/org/apache/sqoop/mapreduce/CombineFileInputFormat.java	(revision 7d2be38)
+++ src/java/org/apache/sqoop/mapreduce/CombineFileInputFormat.java	(new change)
@@ -222,11 +222,15 @@
     // Convert them to Paths first. This is a costly operation and
     // we should do it first, otherwise we will incur doing it multiple
     // times, one time each for each pool in the next loop.
     List<Path> newpaths = new LinkedList<Path>();
     for (int i = 0; i < paths.length; i++) {
-      Path p = new Path(paths[i].toUri().getPath());
+      FileSystem fs = paths[i].getFileSystem(conf);
+
+      //the scheme and authority will be kept if the path is
+      //a valid path for a non-default file system
+      Path p = fs.makeQualified(paths[i]);
       newpaths.add(p);
     }
     paths = null;

     // In one single iteration, process all the paths in a single pool.
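
The intent of the change above, keeping the scheme and authority when a path already names its file system, can be sketched with plain java.net.URI. This is a simplified analogue and not Hadoop's makeQualified implementation: the class name, the helper qualify, and both example URIs are hypothetical, and the sketch assumes absolute paths only.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class QualifyDemo {

    // Simplified analogue of qualifying a path (assumes an absolute path):
    // a location that already carries a scheme is returned unchanged, and
    // only a bare path is attached to the default file system.
    static URI qualify(URI path, URI defaultFs) {
        if (path.getScheme() != null) {
            return path; // already names its own file system
        }
        try {
            return new URI(defaultFs.getScheme(), defaultFs.getAuthority(),
                           path.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical default file system and ASV location, for illustration only.
        URI defaultFs = URI.create("hdfs://namenode:8020");
        URI onAsv = URI.create("asv://container@account.example.net/export/part-00000");
        URI bare = URI.create("/export/part-00000");

        System.out.println(qualify(onAsv, defaultFs)); // scheme and authority are kept
        System.out.println(qualify(bare, defaultFs));  // default file system is filled in
    }
}
```

In the patch itself the qualifying file system is obtained per path via paths[i].getFileSystem(conf), so a path on a non-default file system is qualified by its own file system instead of the default one.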