FLUME-1425: Create a SpoolDirectory Source and Client

Review Request #6377 - Created Aug. 4, 2012 and updated

Patrick Wendell
trunk
FLUME-1425
Reviewers
Flume
flume-git
This patch adds a spooling-directory-based source. The idea is that a user can have a spool directory where files are deposited for ingestion into Flume. Once ingested, the files are clearly renamed, and the implementation guarantees at-least-once delivery semantics similar to those achieved within Flume itself, even across failures and restarts of the JVM running the code.

This helps fill the gap for people who want reliable delivery of events into Flume but don't want to write their application directly against the Flume API. They can simply drop log files off in a spooldir (using some shell scripts or another automated process) and let Flume ingest them asynchronously.

Unlike the prior iteration, this patch implements a first-class source. It also extends the avro client to support spooling in a similar manner.
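
The spool-directory flow described above (read a deposited file, deliver its contents, then rename the file) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this patch: the class name `SpoolDirSketch`, the method `drainOnce`, and the `.COMPLETED` suffix are all hypothetical choices made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical illustration; not the classes added by this patch. */
class SpoolDirSketch {
  // Assumed marker suffix for files whose contents were fully delivered.
  static final String DONE_SUFFIX = ".COMPLETED";

  /**
   * Delivers the lines of every unfinished file to the sink, then renames
   * the file. Returns the number of files completed in this pass.
   */
  static int drainOnce(Path spoolDir, Consumer<List<String>> sink) throws IOException {
    List<Path> pending = new ArrayList<>();
    try (DirectoryStream<Path> files = Files.newDirectoryStream(spoolDir)) {
      for (Path f : files) {
        boolean done = f.getFileName().toString().endsWith(DONE_SUFFIX);
        if (Files.isRegularFile(f) && !done) {
          pending.add(f);
        }
      }
    }
    Collections.sort(pending); // deterministic order for the example

    int completed = 0;
    for (Path f : pending) {
      List<String> lines = Files.readAllLines(f, StandardCharsets.UTF_8);
      // If the sink throws, the rename below never happens, so the file is
      // picked up again on the next pass: at-least-once, duplicates possible.
      sink.accept(lines);
      Files.move(f, f.resolveSibling(f.getFileName() + DONE_SUFFIX));
      completed++;
    }
    return completed;
  }
}
```

Renaming only after the sink has accepted the data is what gives the at-least-once flavor in this sketch: a crash between delivery and rename causes a re-send rather than a loss.
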
I added extensive unit tests, and I also built and exercised this using a stub Flume agent. The JIRA has a configuration file for an agent that prints Avro events to the command line, which is helpful when testing this.
Issue summary: Total: 40 | Open: 7 | Resolved: 33 | Dropped: 0

Description | From | Last Updated | Status
This inner class should be declared as private static | Mike Percy | Aug. 23, 2012, 10:11 a.m. | Open
Can we make this a static inner class by passing shared objects to the constructor as params? | Mike Percy | Aug. 23, 2012, 10:11 a.m. | Open
Why make it static? A SpoolDirectoryRunnable will only be created when there is a SpoolDirectorySource that is using it... right? | Patrick Wendell | Oct. 11, 2012, 6:02 p.m. | Open
This test is failing for me. Not sure why, haven't dug into it much yet. | Mike Percy | Oct. 30, 2012, 6:22 a.m. | Open
It's failing on this assert | Mike Percy | Nov. 1, 2012, 4:32 a.m. | Open
fails: testBehaviorWithEmptyFile(org.apache.flume.client.avro.TestSpoolingFileLineReader) Time elapsed: 0.005 sec <<< FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.flume.client.avro.TestSpoolingFileLineReader.testBehaviorWithEmptyFile(TestSpoolingFileLineReader.java:396) | Alexander Alten-Lorenz | Nov. 6, 2012, 9:15 a.m. | Open
fails: 2012-11-06 10:04:43,339 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:196)] Found line longer than 15 characters, cannot make progress. 2012-11-06 10:04:43,339 (main) [ERROR ... | Alexander Alten-Lorenz | Nov. 6, 2012, 9:15 a.m. | Open
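
The open issues above about declaring the runnable `private static` and passing shared objects through the constructor can be illustrated with a small sketch. The names here (`NestedRunnableSketch`, `PollRunnable`, the counter and batch size) are hypothetical stand-ins, not the classes or fields in the patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the source class discussed in the review. */
class NestedRunnableSketch {
  private final AtomicInteger eventsSeen = new AtomicInteger();

  /**
   * Static nested class: it holds no hidden reference to the enclosing
   * instance, so everything it touches must arrive through the constructor.
   */
  private static class PollRunnable implements Runnable {
    private final AtomicInteger counter;
    private final int batchSize;

    PollRunnable(AtomicInteger counter, int batchSize) {
      this.counter = counter;
      this.batchSize = batchSize;
    }

    @Override
    public void run() {
      counter.addAndGet(batchSize); // pretend one batch was processed
    }
  }

  /** The outer class still creates the runnable, handing over shared state. */
  Runnable newRunnable(int batchSize) {
    return new PollRunnable(eventsSeen, batchSize);
  }

  int eventsSeen() {
    return eventsSeen.get();
  }
}
```

A non-static inner class would also work when the runnable is only ever created by its enclosing source, which is the point raised in the reply. The static form mainly makes the runnable's dependencies explicit and avoids the implicit reference to the outer instance.
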
Review request changed
Updated (Nov. 6, 2012, 6:34 a.m.)
This is a small patch that should fix the unit test issues. I need someone to run on Mac to confirm that this works.
Ship it!
Posted (Nov. 6, 2012, 8:27 a.m.)
Patch applied, OSX ML

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.flume.api.TestFailoverRpcClient
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.597 sec
Running org.apache.flume.api.TestLoadBalancingRpcClient
Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.793 sec
Running org.apache.flume.api.TestNettyAvroRpcClient
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.845 sec
Running org.apache.flume.api.TestRpcClientFactory
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.279 sec
Running org.apache.flume.event.TestEventBuilder
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.045 sec

Results :

Tests run: 35, Failures: 0, Errors: 0, Skipped: 0
Posted (Nov. 6, 2012, 9:15 a.m.)

fails:
testBehaviorWithEmptyFile(org.apache.flume.client.avro.TestSpoolingFileLineReader)  Time elapsed: 0.005 sec  <<< FAILURE!
java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:92)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at org.junit.Assert.assertTrue(Assert.java:54)
        at org.apache.flume.client.avro.TestSpoolingFileLineReader.testBehaviorWithEmptyFile(TestSpoolingFileLineReader.java:396)
fails:
2012-11-06 10:04:43,339 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:196)] Found line longer than 15 characters, cannot make progress.
2012-11-06 10:04:43,339 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:200)] Invalid line starts with: reallyreallyreallyreallyreally
2012-11-06 10:04:43,342 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:196)] Found line longer than 15 characters, cannot make progress.
2012-11-06 10:04:43,342 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:200)] Invalid line starts with: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2012-11-06 10:04:43,344 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:196)] Found line longer than 15 characters, cannot make progress.
2012-11-06 10:04:43,344 (main) [ERROR - org.apache.flume.client.avro.SpoolingFileLineReader.readLines(SpoolingFileLineReader.java:200)] Invalid line starts with: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
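
The log lines above come from a reader that refuses lines longer than a configured limit (15 characters in this test run). As a rough illustration of that kind of check, and not the patch's `SpoolingFileLineReader`, here is a minimal sketch. The class `BoundedLineSketch`, its exception type, and the method signature are assumptions made for the example; it treats only `\n` as a line terminator.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a line-length limit; not the patch's reader. */
class BoundedLineSketch {
  /** Thrown when a line exceeds the configured limit. */
  static class LineTooLongException extends IOException {
    LineTooLongException(int max, String prefix) {
      super("Found line longer than " + max + " characters; starts with: " + prefix);
    }
  }

  /**
   * Reads up to maxLines lines, never buffering more than maxLength + 1
   * characters of any single line. Callers should pass a buffered Reader.
   */
  static List<String> readLines(Reader in, int maxLines, int maxLength) throws IOException {
    List<String> out = new ArrayList<>();
    StringBuilder line = new StringBuilder();
    int c;
    while (out.size() < maxLines && (c = in.read()) != -1) {
      if (c == '\n') {
        out.add(line.toString());
        line.setLength(0);
      } else {
        line.append((char) c);
        if (line.length() > maxLength) {
          // Fail fast instead of buffering an unbounded line.
          throw new LineTooLongException(maxLength, line.toString());
        }
      }
    }
    if (line.length() > 0 && out.size() < maxLines) {
      out.add(line.toString()); // final line without a trailing newline
    }
    return out;
  }
}
```

In this sketch an empty input simply yields an empty list. How the patch's reader treats an empty file is exactly what `testBehaviorWithEmptyFile` exercises, and the source here does not show that logic.
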