FLUME-1425: Create a SpoolDirectory Source and Client
Review Request #6377 - Created Aug. 4, 2012 and updated
This patch adds a spooling-directory-based source. The idea is that a user can have a spool directory where files are deposited for ingestion into flume. Once a file has been ingested, it is renamed so that it is clearly marked as done, and the implementation guarantees at-least-once delivery semantics similar to those achieved within flume itself, even across failures and restarts of the JVM running the code. This helps fill the gap for people who want reliable delivery of events into flume but don't want to write their application directly against the flume API. They can simply drop log files off in a spooldir (using shell scripts or some other automated process) and let flume ingest them asynchronously. Unlike the prior iteration, this patch implements a first-class source. It also extends the avro client to support spooling in a similar manner.
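To make the deliver-then-rename ordering concrete, here is a minimal sketch in plain Java. It is not the code in this patch: the class name `SpoolSketch`, the `drain` method, the `.COMPLETED` suffix, and the `Consumer` standing in for a flume channel are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SpoolSketch {
    // Hypothetical marker suffix; the suffix used by the patch is not stated here.
    static final String DONE_SUFFIX = ".COMPLETED";

    /**
     * Hands the lines of every unfinished file in spoolDir to the sink,
     * renaming each file once its lines have been accepted.
     * Returns the number of lines delivered.
     */
    static int drain(Path spoolDir, Consumer<List<String>> sink) throws IOException {
        List<Path> pending;
        try (Stream<Path> entries = Files.list(spoolDir)) {
            pending = entries
                    .filter(Files::isRegularFile)
                    // Skip files already marked as done on an earlier pass.
                    .filter(p -> !p.getFileName().toString().endsWith(DONE_SUFFIX))
                    .sorted()
                    .collect(Collectors.toList());
        }
        int delivered = 0;
        for (Path file : pending) {
            List<String> lines = Files.readAllLines(file);
            // Deliver first. If the sink throws or the JVM dies before the
            // rename, the file keeps its original name and is read again on
            // the next pass: duplicates are possible, loss is not.
            sink.accept(lines);
            Path done = file.resolveSibling(file.getFileName() + DONE_SUFFIX);
            Files.move(file, done, StandardCopyOption.ATOMIC_MOVE);
            delivered += lines.size();
        }
        return delivered;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("spool");
        Files.write(dir.resolve("app.log"), List.of("event-1", "event-2"));
        List<String> received = new ArrayList<>();
        int n = drain(dir, received::addAll);
        System.out.println(n + " events delivered: " + received);
    }
}
```

The point of the ordering is that the rename acts as the commit marker: a crash between delivery and rename causes a re-send rather than a dropped file, which is what at-least-once means here. A real source would also need to handle files that are still being written, which this sketch ignores.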
I added extensive unit tests, and I also built this and tried it out against a stub flume agent. The JIRA has a configuration file for an agent that prints Avro events to the command line, which is helpful when testing this.