PIG-3585 Implement union in Tez

Review Request #15931 - Created Dec. 1, 2013 and submitted

Submitter: Cheolsoo Park
Bug: PIG-3585
Reviewers: pig (group); abain, daijy, mwagner, rohini
Repository: pig-git
This patch implements union as follows: load vertices -> broadcast edges -> union vertex.

The changes include:
* In the front-end, TezCompiler converts POUnion into a new vertex and connects it to its predecessors with broadcast edges.
* In the back-end, a new POPackage class called POBroadcastTezLoad is added. This class implements the TezLoad interface; it pulls every record from the ShuffledUnorderedKVInputs in order and unions them.
* A new e2e test case is added.
* ant test-tez passes.
* All e2e tests pass.
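
The back-end behaviour described above (drain each input in order and emit every record) can be illustrated with a minimal sketch. This is not the code from the patch: `UnionSketch`, `UnionIterator`, and `union` are hypothetical names, and plain `java.util.Iterator`s stand in for the Tez inputs, whose actual reader API is not shown in this review.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical illustration only: unions several record streams by reading them in input order. */
public class UnionSketch {

    /** Walks the inputs one after another, moving on once the current input is exhausted. */
    static final class UnionIterator<T> implements Iterator<T> {
        private final Iterator<Iterator<T>> inputs;
        private Iterator<T> current = null;

        UnionIterator(List<Iterator<T>> inputs) {
            this.inputs = inputs.iterator();
        }

        @Override
        public boolean hasNext() {
            // Skip over exhausted (or empty) inputs until a record is found.
            while (current == null || !current.hasNext()) {
                if (!inputs.hasNext()) {
                    return false;
                }
                current = inputs.next();
            }
            return true;
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return current.next();
        }
    }

    /** Collects the union of all inputs into a list (bag semantics: duplicates are kept). */
    static <T> List<T> union(List<Iterator<T>> inputs) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = new UnionIterator<>(inputs);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two stand-in "load vertex" outputs feeding one "union vertex".
        List<Iterator<String>> inputs = Arrays.asList(
                Arrays.asList("a1", "a2").iterator(),
                Arrays.asList("b1").iterator());
        System.out.println(union(inputs)); // prints [a1, a2, b1]
    }
}
```

The sketch keeps duplicates and does no grouping by key, which matches the usual bag semantics of a union; how the patch itself handles keys and tuples inside POBroadcastTezLoad is not visible in this review.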
src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/Packager.java
Revision e49de40 New Change
@@ -22,11 +22,11 @@ public class Packager implements Serializable, Cloneable {
     protected boolean[] readOnce;
 
     protected DataBag[] bags;
 
     public static enum PackageType {
-        GROUP, JOIN
+        GROUP, JOIN, UNION
     };
 
     // The key being worked on
     Object key;
 
src/org/apache/pig/backend/hadoop/executionengine/tez/POBroadcastTezLoad.java
Revision e69de29 New Change
 
src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java
Revision 9a2b499 New Change
 
src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
Revision 529bf30 New Change
 
src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
Revision e3f5a5d New Change
 
src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
Revision dcd6a5a New Change
 
test/e2e/pig/tests/tez.conf
Revision 7fd5fb1 New Change
 