HIVE-5325: Implement statistics providing ORC writer and reader interfaces

Review Request #14243 - Created Sept. 20, 2013 and updated

Submitter: Prasanth_J
Bugs: HIVE-5325
Reviewers
  Groups: hive
  People: ashutoshc, omalley
Repository: hive-git
HIVE-5324 adds new interfaces that the ORC reader and writer can implement to provide statistics. This change implements them. Writer-provided statistics are used to update table- and partition-level statistics in the metastore. Reader-provided statistics can be used for reducer estimation, CBO, etc. when metastore statistics are absent.
ORC-related unit and qfile tests pass.
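To make the reducer-estimation use case in the description concrete, here is a minimal sketch of how a raw data size reported by a reader might be turned into a reducer count. It is not taken from this patch: the class name, method name, and parameters (`rawDataSizeBytes`, `bytesPerReducer`, `maxReducers`) are all hypothetical, and the actual planner logic in Hive may differ.

```java
// Hypothetical sketch, not part of the patch under review.
public class ReducerEstimateSketch {

  /**
   * Estimates a reducer count from a raw data size.
   * All parameter names are illustrative assumptions.
   */
  public static int estimateReducers(long rawDataSizeBytes,
                                     long bytesPerReducer,
                                     int maxReducers) {
    if (bytesPerReducer <= 0 || maxReducers <= 0) {
      throw new IllegalArgumentException("limits must be positive");
    }
    if (rawDataSizeBytes <= 0) {
      // No usable statistics: fall back to a single reducer.
      return 1;
    }
    // Ceiling division written to avoid overflow on very large sizes.
    long needed = rawDataSizeBytes / bytesPerReducer
        + (rawDataSizeBytes % bytesPerReducer == 0 ? 0 : 1);
    return (int) Math.min(needed, (long) maxReducers);
  }

  public static void main(String[] args) {
    // 2.5 GB of raw data at 1 GB per reducer needs three reducers.
    System.out.println(estimateReducers(2_500_000_000L, 1_000_000_000L, 999));
  }
}
```

The point of the sketch is only that a size figure available from the file itself is enough to drive such an estimate when the metastore has no statistics for the table.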
ql/src/java/org/apache/hadoop/hive/ql/io/orc/BinaryColumnStatistics.java
New File

/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.hive.ql.io.orc;

/**
 * Statistics for binary columns.
 */
public interface BinaryColumnStatistics extends ColumnStatistics {
  long getSum();
}
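
The interface above exposes only `getSum()`, and the visible diff does not say what is summed. The sketch below assumes it is the total byte length of the non-null values written to the column. It is a stand-alone illustration, not the `ColumnStatisticsImpl` change from this review: the nested `ColumnStatistics` stand-in (with just `getNumberOfValues()`), the `SimpleBinaryStatistics` class, and its `update` and `merge` methods are hypothetical.

```java
// Hypothetical sketch; the real implementation lives in ColumnStatisticsImpl.
public class BinaryStatsSketch {

  /** Minimal stand-in for the ORC ColumnStatistics interface (assumption). */
  public interface ColumnStatistics {
    long getNumberOfValues();
  }

  /** Mirrors the interface added by this review request. */
  public interface BinaryColumnStatistics extends ColumnStatistics {
    long getSum();
  }

  /** Accumulates a value count and the total length of binary values. */
  public static final class SimpleBinaryStatistics
      implements BinaryColumnStatistics {
    private long count;
    private long sum;

    /** Records one column value; nulls are not counted. */
    public void update(byte[] value) {
      if (value == null) {
        return;
      }
      count++;
      sum += value.length;
    }

    /** Folds another accumulator (for example, another stripe) into this one. */
    public void merge(SimpleBinaryStatistics other) {
      count += other.count;
      sum += other.sum;
    }

    @Override
    public long getNumberOfValues() {
      return count;
    }

    @Override
    public long getSum() {
      return sum;
    }
  }

  public static void main(String[] args) {
    SimpleBinaryStatistics stats = new SimpleBinaryStatistics();
    stats.update(new byte[] {1, 2, 3});
    stats.update(null);
    stats.update(new byte[5]);
    System.out.println(stats.getNumberOfValues() + " values, "
        + stats.getSum() + " bytes");
  }
}
```

Under that assumption, a writer could report the sum as the raw data size of a binary column without rereading the data, which fits the description's goal of updating metastore statistics at write time.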
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java
Revision 6268617 New Change
 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java
Revision 6f8ca73 New Change
 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
Revision e034ca0 New Change
 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringColumnStatistics.java
Revision 72e779a New Change
 
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
Revision c0b55ce New Change
 
ql/src/java/org/apache/hadoop/hive/ql/util/JavaDataModel.java
Revision e3eec02 New Change
 
ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto
Revision edbf822 New Change
 
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
Revision e6569f4 New Change
 
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java
Revision b93db84 New Change
 
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcSerDeStats.java
New File
 
ql/src/test/resources/orc-file-dump.out
Revision fac5326 New Change
 
ql/src/test/resources/orc-file-dump-dictionary-threshold.out
Revision 003c132 New Change
 