PIG-3015 Rewrite of AvroStorage

Review Request #8104 - Created Nov. 17, 2012 and updated

Joseph Adler
PIG-3015
Reviewers: pig, cheolsoo
pig-git
The current AvroStorage implementation has several problems: it requires an old version of Avro, it copies data far more often than necessary, and its code is verbose and complicated. (A pet peeve of mine: old versions of Avro don't support Snappy compression.)

I rewrote AvroStorage from scratch to fix these problems. In early tests, the new implementation was significantly faster, and the code is much simpler. The rewrite also let me add support for Trevni.

This is the latest version of the patch, including TrevniStorage and test cases for AvroStorage. (Test cases for TrevniStorage are still missing.)
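
To illustrate the "copies data much more than needed" point, here is a minimal sketch of the general difference between eagerly copying a record and wrapping it in a read-through view. It is not code from the patch: the names `WrapperSketch`, `eagerCopy`, and `RecordView` are hypothetical, a plain `List<Object>` stands in for an Avro record, and only the Java standard library is used. The wrapper file names in this patch (for example AvroTupleWrapper.java) suggest a similar approach, but that is an assumption, since their contents are not shown here.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the patch.
class WrapperSketch {

    /** Eager approach: copies every field into a new list up front. */
    static List<Object> eagerCopy(List<Object> record) {
        return new ArrayList<>(record);
    }

    /** Wrapper approach: a read-only view that delegates to the backing record. */
    static final class RecordView extends AbstractList<Object> {
        private final List<Object> backing;

        RecordView(List<Object> backing) {
            this.backing = backing;
        }

        @Override
        public Object get(int index) {
            // No copy is made; the field is read from the backing record on demand.
            return backing.get(index);
        }

        @Override
        public int size() {
            return backing.size();
        }
    }

    public static void main(String[] args) {
        // A plain list stands in for a deserialized record.
        List<Object> record = new ArrayList<>(Arrays.<Object>asList("alice", 42, 3.5));

        List<Object> copy = eagerCopy(record);      // allocates and copies all fields
        List<Object> view = new RecordView(record); // allocates only the wrapper

        // Changing the backing record shows which one shares storage with it.
        record.set(1, 43);
        System.out.println("copy sees: " + copy.get(1));
        System.out.println("view sees: " + view.get(1));
    }
}
```

The copy is a snapshot taken when it was created, while the view keeps reading from the original record, so no per-field copying happens until a field is actually requested.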

 
build.xml
Revision 7d468a0 New Change
[707 lines collapsed]
                 <include name="joda-time-${joda-time.version}.jar" />
                 <include name="guava-${guava.version}.jar" />
                 <include name="protobuf-java-${protobuf-java.version}.jar" />
                 <include name="automaton-${automaton.version}.jar" />
                 <include name="avro-${avro.version}.jar" />
+                <include name="avro-mapred-${avro.version}.jar" />
                 <include name="commons*.jar" />
                 <include name="log4j*.jar" />
                 <include name="slf4j*.jar" />
                 <include name="jsp-api*.jar" />
                 <include name="jansi-${jansi.version}.jar" />
[1046 lines collapsed]
ivy.xml
Revision 70e8d50 New Change
 
ivy/libraries.properties
Revision 317564f New Change
 
src/org/apache/pig/builtin/AvroStorage.java
New File
 
src/org/apache/pig/builtin/TrevniStorage.java
New File
 
src/org/apache/pig/impl/util/AvroBagWrapper.java
New File
 
src/org/apache/pig/impl/util/AvroMapWrapper.java
New File
 
src/org/apache/pig/impl/util/AvroRecordReader.java
New File
 
src/org/apache/pig/impl/util/AvroRecordWriter.java
New File
 
src/org/apache/pig/impl/util/AvroStorageDataConversionUtilities.java
New File
 
src/org/apache/pig/impl/util/AvroStorageSchemaConversionUtilities.java
New File
 
src/org/apache/pig/impl/util/AvroTupleWrapper.java
New File
 
test/commit-tests
Revision 5081fbc New Change
 
test/unit-tests
Revision 0f18a0e New Change
 
test/org/apache/pig/builtin/TestAvroStorage.java
New File
 
test/org/apache/pig/builtin/avro/createTests.bash
New File
 
test/org/apache/pig/builtin/avro/createests.py
New File
 
test/org/apache/pig/builtin/avro/code/pig/directory_test.pig
New File
 
test/org/apache/pig/builtin/avro/code/pig/identity.pig
New File
 
test/org/apache/pig/builtin/avro/code/pig/identity_ai1_ao2.pig
New File
 
This diff has been split across 3 pages.