FLUME-2030 - Documentation of Configuration Changes JMSSource, HBaseSink, AsyncHBaseSink and ElasticSearchSink

Review Request #10817 - Created April 28, 2013 and updated

Submitter: Israel Ekpo
Branch: flume-1.4
Bugs: FLUME-1886, FLUME-1889, FLUME-1994, FLUME-2030
Reviewers: Flume
Repository: flume-git

- Updated the user guide to illustrate replacing FQCNs with lowercased enum constants for built-in sources and sinks.
- Added the system requirements for running Flume.
- Added documentation encouraging users to migrate from 0.9.x to 1.x to take advantage of the improvements available in Flume NG.
- Added documentation to inform users that Apache Flume is not limited to log data aggregation.
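
As a rough illustration of the first item, an agent configuration might change along these lines. The agent and component names (`a1`, `r1`, `k1`) are hypothetical, and the short aliases `jms` and `hbase` are assumed to be the lowercased enum constants the updated guide refers to.

```properties
# Hypothetical agent "a1"; component names are illustrative only.

# Before: built-in components referenced by fully qualified class name.
# a1.sources.r1.type = org.apache.flume.source.jms.JMSSource
# a1.sinks.k1.type = org.apache.flume.sink.hbase.HBaseSink

# After: the same components referenced by their lowercased aliases
# (assumed values; check the user guide for the exact names).
a1.sources.r1.type = jms
a1.sinks.k1.type = hbase
```
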
N/A
Posted (April 29, 2013, 4:10 p.m.)

   

  
flume-ng-doc/sphinx/FlumeUserGuide.rst (Diff revision 1)
 
 
"not only restricted" doesn't make sense in this context. Consider "is not restricted to".
  1. Thanks, I will fix this.
    
flume-ng-doc/sphinx/FlumeUserGuide.rst (Diff revision 1)
 
 
Commas need to be inserted for correct grammar. "including, but not limited to, network"
  1. This will be addressed.
flume-ng-doc/sphinx/FlumeUserGuide.rst (Diff revision 1)
 
 
Correct me if I'm wrong, but won't Flume have a problem if the size of each event gets too large? I'm not sure how large is too large, but the software is designed with a log line in mind, and a log line might only reach a few KB.
  1. As with SpoolDirectorySource, a limit on event size can be configured so that events exceeding that threshold are truncated.
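
A sketch of such a limit for a spooling directory source, assuming the `deserializer.maxLineLength` property of the line deserializer is available in the Flume version in use. The agent name, source name, directory, and value below are all hypothetical.

```properties
# Hypothetical agent "a1" with a spooling directory source "r1".
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /var/log/flume-spool

# Assumed property: lines longer than this many characters are truncated.
a1.sources.r1.deserializer.maxLineLength = 2048
```
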
flume-ng-doc/sphinx/FlumeUserGuide.rst (Diff revision 1)
 
 
"Sufficient" is meaningless to Operations staff. The default service script uses 20MB - which works okay-ish unless you use the memory channel. (By okay-ish I mean that I've seen Flume stop with an OOM which was most likely a memory leak.)
  1. I see what you mean. I will think more about this and come up with reasonable values that make sense for typical use cases. Do you have any recommendations?