SQOOP-777. Sqoop2: Pluggable Intermediate Data Format

Review Request #12936 - Created July 25, 2013 and updated

Hari Shreedharan
SQOOP-777
Reviewers
Sqoop
sqoop-sqoop2
Implemented a pluggable intermediate data format that decouples the internal representation of the data from the connector and the output formats. A connector can implement and support whichever format is most efficient for it. Also separated out SqoopWritable so that the intermediate data format can be used independently of (current) Hadoop.
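
To make the idea concrete for reviewers, here is a minimal sketch of what a pluggable format contract could look like. The names RecordFormatSketch and NaiveCsvFormatSketch and all method names are hypothetical and are not taken from IntermediateDataFormat.java in this patch; the implementation is deliberately naive and does no quoting or escaping.

```java
// Hypothetical contract: a record can be written in one representation
// (objects or text) and read back in the other.
interface RecordFormatSketch {
    // Store one record as an array of column values.
    void setObjectData(Object[] columns);

    // Return the record as an array of column values.
    Object[] getObjectData();

    // Store one record from its text encoding.
    void setTextData(String text);

    // Return the record in its text encoding.
    String getTextData();
}

// Naive comma-separated implementation, for illustration only.
// It keeps the text form internally and converts on demand.
final class NaiveCsvFormatSketch implements RecordFormatSketch {
    private String text = "";

    @Override
    public void setObjectData(Object[] columns) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.valueOf(columns[i]));
        }
        text = sb.toString();
    }

    @Override
    public Object[] getObjectData() {
        // A limit of -1 keeps trailing empty columns. Values come back
        // as strings because this sketch carries no schema.
        return text.split(",", -1);
    }

    @Override
    public void setTextData(String text) {
        this.text = text;
    }

    @Override
    public String getTextData() {
        return text;
    }
}
```

Under this assumption a connector would call setObjectData on the extract side, while the execution engine would only move the text form around, so neither side needs to know the other's preferred representation.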

I ran a full build; all tests, including the integration tests, pass. I have not added any new tests yet, but I will add unit tests for the new classes. I have also not tried running this on an actual cluster, so things may be broken. I'd like some initial feedback based on the current patch.

I also implemented escaping of characters. Some work remains to support the binary format, but it is mostly integration; the basic implementation is in place.
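
As an illustration of the kind of escaping a text-based format needs, here is a minimal sketch, assuming backslash escapes for the separator, the backslash itself, and line breaks. The class CsvEscapeSketch and the chosen escape set are hypothetical; the rules actually used in CSVIntermediateDataFormat.java may differ.

```java
final class CsvEscapeSketch {
    private CsvEscapeSketch() {}

    // Escape the backslash, the field separator, and line breaks so that
    // a field never contains a raw comma or newline.
    static String escape(String field) {
        StringBuilder sb = new StringBuilder(field.length() + 8);
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case ',':  sb.append("\\,");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverse of escape(): a backslash always consumes the next character.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c != '\\' || i + 1 == escaped.length()) {
                // Ordinary character, or a dangling backslash kept as-is.
                sb.append(c);
                continue;
            }
            char next = escaped.charAt(++i);
            switch (next) {
                case 'n': sb.append('\n'); break;
                case 'r': sb.append('\r'); break;
                default:  sb.append(next); // escaped backslash or separator
            }
        }
        return sb.toString();
    }
}
```

Escaping the backslash itself is what keeps the round trip unambiguous: a literal backslash followed by the letter n is written as two backslashes and an n, so it cannot be confused with an escaped newline.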

 

Diff revision 5 (Latest)


  1. pom.xml
  2. common/pom.xml
  3. common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
  4. common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
  5. common/src/main/java/org/apache/sqoop/schema/type/Column.java
  6. connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
  7. connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
  8. connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
  9. connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
  10. connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
  11. connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
  12. connector/connector-sdk/pom.xml
  13. connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
  14. connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
  15. connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
  16. core/src/main/java/org/apache/sqoop/framework/JobManager.java
  17. core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
  18. execution/mapreduce/pom.xml
  19. execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
  20. execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
This diff has been split across 2 pages: 1 2 >
pom.xml
Revision 5ea0633 New Change
@@ -144,16 +144,10 @@
             <artifactId>commons-io</artifactId>
             <version>${commons-io.version}</version>
           </dependency>
 
           <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>guava</artifactId>
-            <version>${guava.version}</version>
-          </dependency>
-
-          <dependency>
             <groupId>org.apache.hadoop</groupId>
             <artifactId>hadoop-core</artifactId>
             <version>${hadoop.1.version}</version>
             <scope>provided</scope>
           </dependency>
@@ -326,10 +320,15 @@
         <groupId>commons-lang</groupId>
         <artifactId>commons-lang</artifactId>
         <version>${commons-lang.version}</version>
       </dependency>
       <dependency>
+        <groupId>com.google.guava</groupId>
+        <artifactId>guava</artifactId>
+        <version>${guava.version}</version>
+      </dependency>
+      <dependency>
         <groupId>javax.servlet</groupId>
         <artifactId>servlet-api</artifactId>
         <version>${servlet.version}</version>
       </dependency>
       <dependency>
common/pom.xml
Revision db11b5b New Change
 
common/src/main/java/org/apache/sqoop/etl/io/DataReader.java
Revision 3e1adc7 New Change
 
common/src/main/java/org/apache/sqoop/etl/io/DataWriter.java
Revision d81364e New Change
 
common/src/main/java/org/apache/sqoop/schema/type/Column.java
Revision 8b630b2 New Change
 
connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcConnector.java
Revision e0da80f New Change
 
connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcExportInitializer.java
Revision 7212843 New Change
 
connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportInitializer.java
Revision 96818ba New Change
 
connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/util/InitializationUtils.java
New File
 
connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestExportLoader.java
Revision aa1c4ff New Change
 
connector/connector-generic-jdbc/src/test/java/org/apache/sqoop/connector/jdbc/TestImportExtractor.java
Revision a7ed6ba New Change
 
connector/connector-sdk/pom.xml
Revision 4056e14 New Change
 
connector/connector-sdk/src/main/java/org/apache/sqoop/connector/CSVIntermediateDataFormat.java
New File
 
connector/connector-sdk/src/main/java/org/apache/sqoop/connector/IntermediateDataFormat.java
New File
 
connector/connector-sdk/src/test/java/org/apache/sqoop/connector/CSVIntermediateDataFormatTest.java
New File
 
core/src/main/java/org/apache/sqoop/framework/JobManager.java
Revision d0a087d New Change
 
core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java
Revision 53d0039 New Change
 
execution/mapreduce/pom.xml
Revision f9a2a0e New Change
 
execution/mapreduce/src/main/java/org/apache/sqoop/execution/mapreduce/MapreduceExecutionEngine.java
Revision 767080c New Change
 
execution/mapreduce/src/main/java/org/apache/sqoop/job/JobConstants.java
Revision 7fd9a01 New Change
 