FLUME-1972 ElasticSearchSink now allows the id of the document to be provided by a custom class
Review Request #10380 - Created April 9, 2013 and discarded
Elasticsearch allows clients to specify the id of the newly added document - and will generate one by default if none is supplied. If the user knows that the incoming event has a unique id then that id can be used as the elasticsearch document id. This change enhances the ElasticSearchSink to allow users to provide a custom class to decide what the id should be.
All unit tests and integration tests pass. A snapshot using commit 5b9d31f1ad228 and the patch for flume-1972 has passed our internal integration tests using customisations.
Posted (April 10, 2013, 11:04 a.m.)
Mr. Sargisson, Thanks for posting this review. From an elasticsearch HTTP client perspective, the patch looks good. However, in your patch, the only class that implements ElasticSearchIdProvider is ElasticSearchNullIdProvider and it returns a null id which means ES is still going to have to generate one when the document is indexed. Based on the description of the patch reviewed, it would have been nice to see another example class that implements ElasticSearchIdProvider that actually generates a non-null document id so that this default behavior can be overriden. So maybe generating a document id based on the original timestamp in the event optionally in combination of other headers would be nice. Nevertheless, this is a good effort. Thanks.
Review request changed
Updated (April 10, 2013, 5 p.m.)
Posted (April 10, 2013, 6:07 p.m.)
Sometimes, typos do occur in the configuration files for these custom class names. I think it would be nice to see if clazz is not null and also assignable before we try to instantiate an object for the provider. You should also log that the code was not able to create an instance from class specified in the config file. This will make things easier to troubleshoot when things go wrong. Take a look at an example here for some tips: org.apache.flume.serialization.EventSerializerFactory
Posted (April 11, 2013, 10:10 a.m.)
Review request changed
Updated (April 26, 2013, 9:30 p.m.)
- changed from pending to discarded
Flume-2015 is a better patch.