Review Request #883 - Created June 10, 2011 and updated
Follow-up for HIVE-2147.
Added test cases for new API.
I think this should be renamed to "PartitionEventType" to make it clear that it applies to partitions only. If in the future we need to introduce event types for tables, indexes, etc., then we should add separate enums for those event types as well.
This should also throw UnknownDBException and UnknownTableException. The same goes for isPartitionMarkedForEvent.
Collections aren't required to satisfy an ordering property, so we have to assume the output of this logging statement is ambiguous, e.g. "[a, b]" versus "[b, a]". We should disambiguate this by passing in the part_vals map and logging the key/value pairs instead of just the values.
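One way to make that log output unambiguous is sketched below. It assumes part_vals is available as a Map<String, String> of partition column to value; the class name PartitionSpecFormatter and the "key=value/key=value" layout are hypothetical, not something in the patch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: renders a partition spec deterministically for logging. */
final class PartitionSpecFormatter {
    private PartitionSpecFormatter() {}

    /** Returns "key1=val1/key2=val2" with entries sorted by key. */
    static String format(Map<String, String> partVals) {
        StringBuilder sb = new StringBuilder();
        // Copying into a TreeMap fixes the iteration order by key, so the
        // output no longer depends on the iteration order of the input map.
        for (Map.Entry<String, String> e : new TreeMap<>(partVals).entrySet()) {
            if (sb.length() > 0) {
                sb.append('/');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```

With this, a spec of {ds=2011-06-10, country=us} is logged as "country=us/ds=2011-06-10" regardless of how the caller built the map.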
Missing exceptions: UnknownDBException and UnknownTableException.
Checking to see if the DB and Table exist should be done in the same database transaction as the rest of the operation. If you do it here there's no guarantee that the db/table will still exist when ms.markPartitionForEvent() is called.
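To illustrate the point, here is a minimal sketch in which the existence check and the write happen inside the same critical section. SketchStore and its nested UnknownTableException are hypothetical stand-ins, not Hive's ObjectStore or the Thrift-generated exception, and the object lock only plays the role that the database transaction would play in the real store.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for the metastore's backing store. */
final class SketchStore {
    /** Local stand-in for the real exception type. */
    static final class UnknownTableException extends Exception {
        UnknownTableException(String msg) {
            super(msg);
        }
    }

    private final Set<String> tables = new HashSet<>();
    private final Set<String> markedPartitions = new HashSet<>();

    synchronized void createTable(String db, String table) {
        tables.add(db + "." + table);
    }

    synchronized void dropTable(String db, String table) {
        tables.remove(db + "." + table);
    }

    /**
     * The table check and the insert run under one lock, so the table
     * cannot be dropped between the check and the write.
     */
    synchronized void markPartitionForEvent(String db, String table, String partSpec)
            throws UnknownTableException {
        String key = db + "." + table;
        if (!tables.contains(key)) {
            throw new UnknownTableException(key);
        }
        markedPartitions.add(key + "/" + partSpec);
    }

    /** Same pattern for the read path: check and lookup are one atomic step. */
    synchronized boolean isPartitionMarkedForEvent(String db, String table, String partSpec)
            throws UnknownTableException {
        String key = db + "." + table;
        if (!tables.contains(key)) {
            throw new UnknownTableException(key);
        }
        return markedPartitions.contains(key + "/" + partSpec);
    }
}
```

If the check lived in the caller instead, a concurrent dropTable could run between the check and markPartitionForEvent, which is the race described above.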
Should we add an InvalidPartitionException and UnknownPartitionException? Seems like those are both valid exceptions in this situation.
Same issue here as before. These checks need to get pushed into ms.isPartitionMarkedForEvent().
I think the name of this method is misleading. You're marking a single partition done, not a set of partitions, right? Also, in this context being "done" means that the load operation on that partition has completed, so it would be good to include "load" in the name of the method and event class, e.g. "LoadPartitionDoneEvent" and "onLoadPartitionDone".
Is it possible to use org.apache.hadoop.hive.metastore.api.EventType instead of int? Another approach is to create an MPartitionEvent base class, and then subclass that with MPartitionLoadDoneEvent, etc., and use eventType as the internal type discriminator for JDO.
You need to supply schema upgrade scripts for Derby and MySQL. Please either do that in this ticket or open a follow-up ticket and assign it to yourself.
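A rough sketch of the first option: keep the int column for persistence but expose the enum in the model's accessors. The names PartitionEventType, LOAD_DONE, the code value 1, and MPartitionEventSketch are assumptions for illustration; the real enum is generated from the Thrift IDL and may differ.

```java
/** Hypothetical enum; the real one would come from the Thrift definition. */
enum PartitionEventType {
    LOAD_DONE(1);

    private final int code;

    PartitionEventType(int code) {
        this.code = code;
    }

    int getCode() {
        return code;
    }

    /** Maps a persisted int back to the enum, rejecting unknown codes. */
    static PartitionEventType fromCode(int code) {
        for (PartitionEventType t : values()) {
            if (t.code == code) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown event type code: " + code);
    }
}

/** Hypothetical model class: stores an int, exposes the enum. */
final class MPartitionEventSketch {
    // Persisted as an int so the table schema does not change.
    private int eventType;

    void setEventType(PartitionEventType type) {
        this.eventType = type.getCode();
    }

    PartitionEventType getEventType() {
        return PartitionEventType.fromCode(eventType);
    }
}
```

Callers then never see a raw int, and an unrecognized value in the table fails loudly at read time instead of being passed around silently.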
It looks like it's possible for this table to hold more than one "MarkPartitionDone" event for the same partition, but is that a legal state? If it is, how do you know when the load operation for a partition is still in progress?
Can you subclass this with a remote and embedded version?
Any reason in particular why you switched to always running this test in local mode? If we can only test one scenario, then I think there's more value in focusing on the standalone client/server setup.