HIVE-3159: Update AvroSerde to determine schema of new tables

Review Request #11925 - Created June 18, 2013 and updated

Submitter: Mohammad Islam
Branch: trunk
Bugs: HIVE-3159
Reviewers
Groups: hive
People: ashutoshc, jghoman
Repository: hive-git
Problem:
Hive does not support creating an Avro-based table with a plain HQL CREATE TABLE command. It currently requires the user to specify an Avro schema literal or a schema file name.
In many cases this is very inconvenient for the user.
Some of the unsupported use cases:
1. CREATE TABLE ... <Avro-SERDE etc.> AS SELECT ... FROM <NON-AVRO FILE>
2. CREATE TABLE ... <Avro-SERDE etc.> AS SELECT ... FROM <AVRO TABLE>
3. CREATE TABLE without specifying an Avro schema.
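
To make the third use case concrete, here is a minimal sketch of how an Avro record schema could be derived from a table's column names and Hive types when the user supplies no schema. It is not the code in this patch: the class `AvroSchemaSketch` and the methods `avroPrimitive` and `recordSchema` are hypothetical, only a handful of primitive types are covered, and the schema is assembled as a JSON string so the example needs no Avro dependency. Each field is emitted as a union of null and the mapped type, since any Hive column may hold NULL.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

class AvroSchemaSketch {

    // Hypothetical helper: maps a Hive primitive type name to an Avro primitive type name.
    static String avroPrimitive(String hiveType) {
        switch (hiveType.toLowerCase(Locale.ROOT)) {
            case "boolean": return "boolean";
            case "int":     return "int";
            case "bigint":  return "long";
            case "float":   return "float";
            case "double":  return "double";
            case "string":  return "string";
            case "binary":  return "bytes";
            default:
                // Complex types are deliberately left out of this sketch.
                throw new IllegalArgumentException("Unsupported Hive type: " + hiveType);
        }
    }

    // Builds an Avro record schema as JSON. Every field is a ["null", T] union
    // with a null default, so a Hive NULL can be stored in any column.
    static String recordSchema(String recordName, Map<String, String> columns) {
        StringJoiner fields = new StringJoiner(",");
        for (Map.Entry<String, String> col : columns.entrySet()) {
            fields.add("{\"name\":\"" + col.getKey() + "\","
                + "\"type\":[\"null\",\"" + avroPrimitive(col.getValue()) + "\"],"
                + "\"default\":null}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName
            + "\",\"fields\":[" + fields + "]}";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the columns in declaration order.
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "bigint");
        columns.put("name", "string");
        System.out.println(recordSchema("example_table", columns));
    }
}
```

Running `main` prints a record schema with two nullable fields, `id` as `["null","long"]` and `name` as `["null","string"]`. The null branch comes first because Avro requires a field's default value to match the first branch of a union.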

Wrote a new Java test class for the new Java class and added a new test case to an existing Java test class. In addition, there are four .q files that test multiple use cases.
Total: 29, Open: 29, Resolved: 0, Dropped: 0
Description | From | Last Updated | Status
Please maintain reasonable limits on line length (80 or 100 characters works for me). | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Please put each clause on its own line: CREATE TABLE xxx ROW FORMAT SERDE '....' STORED AS INPUTFORMAT '....' OUTPUTFORMAT ... | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Line length | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Line length. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
s/Defintion/Definition/ | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Please throw the exception here instead of returning null. This method should never return null. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
What's the significance of "org.apache.hive.auto_gen_schema"? I can't find any other references to this namespace in the code. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
table or partition properties, or both? | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Missing ASF license header. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
To me "TypeInfoToSchema" sounds like the name of method, not a class. Please either change the name to AvroSchemaUtils or ... | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
map<string, any-type> | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Why Hashtable instead of HashMap? | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
People who don't already know how Avro encodes nullable values will find this method easier to understand if the parameter ... | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Instead of adding assertions to each of these methods I think it would be cleaner to specify the expected type ... | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
The code snippet "tag + "_" + tInfo.getCategory().name()" gets repeated a lot. I think this should be moved to a ... | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Is it possible to change the LHS to List? | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Missing ASF license header. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Formatting. | Carl Steinbach | Jan. 11, 2014, 7:39 a.m. | Open
Please populate the source table with data before running a CTAS statement against it (the current example is functionally equivalent ... | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
Why have both avro_no_schema_test.q and avro_without_schema.q? Why does avro_without_schema.q create the table, but not load it with data or select ... | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
"Moreover, since any type could potentially hold a NULL value, all corresponding Avro schema should be a union of null ... | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
wrapWithUnion is not used. | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
wrapWithUnion is not used. | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
wrapWithUnion is not used. | Carl Steinbach | Jan. 11, 2014, 8:05 a.m. | Open
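
Two of the comments above, the `map<string, any-type>` note and the request to throw rather than return null, can be illustrated together. The sketch below is an assumption about what such handling could look like, not the patch's code: the class `AvroMapSketch` and the method `mapSchema` are hypothetical. It relies only on the fact that Avro map keys are always strings, so a Hive map with any other key type has no direct Avro equivalent.

```java
class AvroMapSketch {

    // Hypothetical helper: wraps an already-built Avro value schema (as JSON)
    // in an Avro map schema. Avro map keys are implicitly strings, so the
    // Hive key type is validated here and never appears in the output.
    static String mapSchema(String hiveKeyType, String avroValueSchemaJson) {
        if (!"string".equalsIgnoreCase(hiveKeyType)) {
            // Fail loudly instead of returning null to the caller.
            throw new IllegalArgumentException(
                "Avro maps require string keys, got: " + hiveKeyType);
        }
        return "{\"type\":\"map\",\"values\":" + avroValueSchemaJson + "}";
    }

    public static void main(String[] args) {
        // A Hive map<string,bigint> column becomes an Avro map of long values.
        System.out.println(mapSchema("string", "\"long\""));
    }
}
```

Because the method either returns a schema or throws, callers never need a null check, which is the property the reviewer asks for.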
Review request changed
Updated (Jan. 31, 2014, 2:18 a.m.)
Addressed Carl's review comments