
HIVE-3840 -hive cli null representation in output is inconsistent

Review Request #10312 - Created April 5, 2013 and updated

Thejas Nair
unit tests updated
Review request changed
Updated (April 5, 2013, 6:32 p.m.)
Posted (April 5, 2013, 8:58 p.m.)


I couldn't understand why this change would affect the ordering of rows. Do you have any insight?
  1. In ASCII order, the characters run A .. Z [ \ ] .. a .. z.
    'NULL' therefore sorts before values starting with '[', while 'null' sorts after them.
    In fact, this makes things more SQL compliant. The SQL ORDER BY clause requires nulls to sort together, either all at the beginning or all at the end. The behavior of primitive types and complex types will now be consistent.
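
The ordering effect described above can be reproduced with a plain string sort. This is only a minimal sketch: the class and method names are hypothetical, and it sorts already-serialized strings with Java's default String comparison rather than going through Hive's actual sorting of CLI output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not part of the Hive code base.
class NullOrderingSketch {
    // Sorts serialized values by natural String order. For these ASCII
    // characters that is the same as ASCII order: 'N' (78) < '[' (91) < 'n' (110).
    static List<String> sortSerialized(String... values) {
        List<String> sorted = new ArrayList<>(Arrays.asList(values));
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Lowercase null lands after the array values.
        System.out.println(sortSerialized("[1,2]", "null", "[3]"));
        // Uppercase NULL lands before them.
        System.out.println(sortSerialized("[1,2]", "NULL", "[3]"));
    }
}
```

With 'NULL' used for both primitive and complex columns, nulls end up on the same side of the sorted output regardless of column type.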
It seems SerDes could have stored a custom representation of null in their properties, which would then have been used. Aren't we taking away that capability now?
  1. The change now lets you customize the null string, which wasn't the case earlier.
This is a bit confusing to me. JSON_NULL is still null, so how come NULL is being printed now?
  1. The DelimitedJSONSerDe class, which is used for the serialized output of the hive cli, calls buildJSONString(StringBuilder sb, Object o, ObjectInspector oi, String nullStr).
    All other uses of this functionality that don't care about customizing the top-level null still have public static String getJSONString(Object o, ObjectInspector oi). In that case, the default of JSON_NULL is used.
    Note that only the *top level* null representation has been changed, i.e. if ObjectInspector.getXXX() returns null, then the custom null representation is used. Where there are nulls nested within a value (e.g. one of the values in a struct is null), the JSON-style lowercase null continues to be used.
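
The top-level versus nested distinction can be sketched as follows. This is a simplified stand-in, not the patch itself: the class `TopLevelNullSketch` and its `toJson` methods are hypothetical, plain Java lists replace Hive's ObjectInspector machinery, and string escaping is omitted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; it does not use Hive's classes.
class TopLevelNullSketch {
    static final String JSON_NULL = "null";

    // Analogue of the variant that takes a custom null string:
    // only a null at the top level is rendered as nullStr.
    static String toJson(Object o, String nullStr) {
        if (o == null) {
            return nullStr;
        }
        StringBuilder sb = new StringBuilder();
        appendNested(sb, o);
        return sb.toString();
    }

    // Analogue of the two-argument variant: defaults to JSON_NULL.
    static String toJson(Object o) {
        return toJson(o, JSON_NULL);
    }

    // Nested values always use the JSON-style lowercase null.
    private static void appendNested(StringBuilder sb, Object o) {
        if (o == null) {
            sb.append(JSON_NULL);
        } else if (o instanceof List) {
            sb.append('[');
            boolean first = true;
            for (Object element : (List<?>) o) {
                if (!first) {
                    sb.append(',');
                }
                first = false;
                appendNested(sb, element);
            }
            sb.append(']');
        } else if (o instanceof String) {
            sb.append('"').append(o).append('"'); // no escaping in this sketch
        } else {
            sb.append(o);
        }
    }

    public static void main(String[] args) {
        System.out.println(toJson(null, "NULL"));                       // top-level null
        System.out.println(toJson(Arrays.asList(1, null, "a"), "NULL")); // nested null
        System.out.println(toJson(null));                               // default null string
    }
}
```

Keeping the custom string out of the nested case means the output for complex values remains valid JSON, while the column-level null matches what the CLI prints for primitive types.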
Ship it!
Posted (April 5, 2013, 9:46 p.m.)
Ship It!