PIG-3541 Add diagnostic information to TezStats

Review Request #15031 - Created Oct. 29, 2013 and updated

Submitter: Cheolsoo Park
Branch: tez
Bugs: PIG-3541
Reviewers
Groups: pig
People: daijy, mwagner, rohini
Repository: pig-git
This patch includes the following changes:
* Implement Input/OutputStats for Tez, which makes DUMP work. Counters cannot yet be retrieved from the Tez DAG, so only filenames are reported.
* Add the error message from DAGStatus.getDiagnostics() for a failed DAG. Backend error messages and stack traces cannot yet be retrieved from the Tez DAG, so only the ID of the failed vertex is reported.
* Factor out into PigStats a few methods and fields that both MR and Tez can use. Duplicate code between SimplePigStats and TezStats is now minimal.
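
As a rough illustration of the second item, the sketch below joins a list of diagnostic lines into a single error message. It is not code from the patch: the class and method names are hypothetical, and it assumes only that the diagnostics arrive as a list of strings, with a fallback text chosen for this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the patch: turns diagnostic lines into one message.
final class DiagnosticsSketch {

    // Fallback text is an assumption made for this sketch.
    static final String NO_DIAGNOSTICS = "No diagnostics available";

    /** Joins the diagnostic lines with newlines, or returns the fallback if there are none. */
    public static String buildErrorMessage(List<String> diagnostics) {
        if (diagnostics == null || diagnostics.isEmpty()) {
            return NO_DIAGNOSTICS;
        }
        return String.join("\n", diagnostics);
    }

    public static void main(String[] args) {
        // The two lines are taken from the failed-DAG report in this review request.
        List<String> diagnostics = Arrays.asList(
            "Vertex failed vertex_1383071498815_0006_1_01",
            "DAG failed due to vertex failure. failedVertices:1 killedVertices:0");
        System.out.println(buildErrorMessage(diagnostics));
    }
}
```

Keeping the join in one place means the caller only decides where the message is shown, which fits the goal of sharing code between the MR and Tez stats classes.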

* Updated TestTezLauncher by adding asserts for input/output stats.
* Ran ant test-tez.
* Verified reports for succeeded and failed DAGs:

  Success!
            Input(s): Successfully read records from: "hdfs://localhost:57063/user/cheolsoop/foo"                         
           Output(s): Successfully stored records in: "/user/cheolsoop/13"

  Failed!
        ErrorMessage: Vertex failed vertex_1383071498815_0006_1_01                                                        
                    : DAG failed due to vertex failure. failedVertices:1 killedVertices:0                                 

            Input(s): Failed to read data from "hdfs://localhost:57063/user/cheolsoop/foo"                                
           Output(s): Failed to produce result in "/user/cheolsoop/14" 
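
The layout of the reports above can be approximated with a short sketch. It is not the patch's code: the class and method names are hypothetical, and the 20-column right-aligned label is inferred from the sample output rather than taken from the source.

```java
// Hypothetical formatter, not part of the patch: reproduces the report layout shown above.
final class ReportSketch {

    /** Right-aligns the label in 20 columns, then appends ": " and the message. */
    public static String line(String label, String message) {
        return String.format("%20s: %s", label, message);
    }

    /** Input line when only the location is known (no record counters). */
    public static String inputLine(String location, boolean succeeded) {
        String message = succeeded
            ? "Successfully read records from: \"" + location + "\""
            : "Failed to read data from \"" + location + "\"";
        return line("Input(s)", message);
    }

    /** Output line when only the location is known (no record counters). */
    public static String outputLine(String location, boolean succeeded) {
        String message = succeeded
            ? "Successfully stored records in: \"" + location + "\""
            : "Failed to produce result in \"" + location + "\"";
        return line("Output(s)", message);
    }

    public static void main(String[] args) {
        System.out.println(inputLine("hdfs://localhost:57063/user/cheolsoop/foo", false));
        System.out.println(outputLine("/user/cheolsoop/14", false));
    }
}
```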
Ship it!
Posted (Oct. 31, 2013, 4:19 a.m.)
We need to figure out why the diagnostics do not contain the real error message. The patch itself looks good. Let's commit it first.
  1. Thank you, Daniel!
    
    I opened a Tez JIRA about the diagnostic information: https://issues.apache.org/jira/browse/TEZ-591