FLUME-2202. AsyncHBaseSink should coalesce increments to reduce RPC roundtrips

Review Request #14454 - Created Oct. 2, 2013 and updated

Submitter: Hari Shreedharan
Bugs: FLUME-2202
Reviewers: Flume
Repository: flume-git
Added a new config option to coalesce increments.
All current tests pass. Added 2 new tests.
flume-ng-doc/sphinx/FlumeUserGuide.rst
Revision 5a59b56 New Change
@@ -1848,34 +1848,36 @@
 
 AsyncHBaseSink
 ''''''''''''''
 
 This sink writes data to HBase using an asynchronous model. A class implementing
-AsyncHbaseEventSerializer
-which is specified by the configuration is used to convert the events into
+AsyncHbaseEventSerializer which is specified by the configuration is used to convert the events into
 HBase puts and/or increments. These puts and increments are then written
-to HBase. This sink provides the same consistency guarantees as HBase,
+to HBase. This sink uses the `Asynchbase API <https://github.com/OpenTSDB/asynchbase>`_ to write to
+HBase. This sink provides the same consistency guarantees as HBase,
 which is currently row-wise atomicity. In the event of Hbase failing to
 write certain events, the sink will replay all events in that transaction.
 The type is the FQCN: org.apache.flume.sink.hbase.AsyncHBaseSink.
 Required properties are in **bold**.
 
-================  ============================================================  ====================================================================================
-Property Name     Default                                                       Description
-================  ============================================================  ====================================================================================
-**channel**       --
-**type**          --                                                            The component type name, needs to be ``asynchbase``
-**table**         --                                                            The name of the table in Hbase to write to.
-zookeeperQuorum   --                                                            The quorum spec. This is the value for the property ``hbase.zookeeper.quorum`` in hbase-site.xml
-znodeParent       /hbase                                                        The base path for the znode for the -ROOT- region. Value of ``zookeeper.znode.parent`` in hbase-site.xml
-**columnFamily**  --                                                            The column family in Hbase to write to.
-batchSize         100                                                           Number of events to be written per txn.
-timeout           60000                                                         The length of time (in milliseconds) the sink waits for acks from hbase for
-                                                                                all events in a transaction.
-serializer        org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
-serializer.*      --                                                            Properties to be passed to the serializer.
-================  ============================================================  ====================================================================================
+===================  ============================================================  ====================================================================================
+Property Name        Default                                                       Description
+===================  ============================================================  ====================================================================================
+**channel**          --
+**type**             --                                                            The component type name, needs to be ``asynchbase``
+**table**            --                                                            The name of the table in Hbase to write to.
+zookeeperQuorum      --                                                            The quorum spec. This is the value for the property ``hbase.zookeeper.quorum`` in hbase-site.xml
+znodeParent          /hbase                                                        The base path for the znode for the -ROOT- region. Value of ``zookeeper.znode.parent`` in hbase-site.xml
+**columnFamily**     --                                                            The column family in Hbase to write to.
+batchSize            100                                                           Number of events to be written per txn.
+coalesceIncrements   false                                                         Should the sink coalesce multiple increments to a cell per batch. This might give
+                                                                                   better performance if there are multiple increments to a limited number of cells.
+timeout              60000                                                         The length of time (in milliseconds) the sink waits for acks from hbase for
+                                                                                   all events in a transaction.
+serializer           org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
+serializer.*         --                                                            Properties to be passed to the serializer.
+===================  ============================================================  ====================================================================================
 
 Note that this sink takes the Zookeeper Quorum and parent znode information in
 the configuration. Zookeeper Quorum and parent node configuration may be
 specified in the flume configuration file. Alternatively, these configuration
 values are taken from the first hbase-site.xml file in the classpath.
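
The documented behaviour of coalesceIncrements (merging multiple increments to the same cell within one batch) can be illustrated with a minimal sketch. This is not the code from the patch: the actual change in AsyncHBaseSink.java is not shown on this page and may differ. The names IncrementCoalescer, CellIncrement and coalesce are hypothetical, and a cell is assumed to be identified by row and column qualifier only (table and column family are fixed per sink, so they are left out).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these names do not come from the Flume patch.
final class IncrementCoalescer {

    /** One pending increment to a single cell, identified by row and qualifier. */
    static final class CellIncrement {
        final String row;
        final String qualifier;
        final long amount;

        CellIncrement(String row, String qualifier, long amount) {
            this.row = row;
            this.qualifier = qualifier;
            this.amount = amount;
        }
    }

    /**
     * Merges all increments in a batch that target the same cell into a single
     * increment carrying the summed amount. Cells keep the order in which they
     * first appear in the batch.
     */
    static List<CellIncrement> coalesce(List<CellIncrement> batch) {
        Map<List<String>, Long> totals = new LinkedHashMap<>();
        for (CellIncrement inc : batch) {
            // Key on (row, qualifier); a List has value-based equals/hashCode.
            totals.merge(Arrays.asList(inc.row, inc.qualifier), inc.amount, Long::sum);
        }
        List<CellIncrement> merged = new ArrayList<>();
        for (Map.Entry<List<String>, Long> e : totals.entrySet()) {
            List<String> cell = e.getKey();
            merged.add(new CellIncrement(cell.get(0), cell.get(1), e.getValue()));
        }
        return merged;
    }
}
```

With this approach a batch that holds many increments to a handful of counters produces one request per distinct cell rather than one per event, which is the case the option's description says might perform better.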
flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/AsyncHBaseSink.java
Revision 5e297b1 New Change
 
flume-ng-sinks/flume-ng-hbase-sink/src/main/java/org/apache/flume/sink/hbase/HBaseSinkConfigurationConstants.java
Revision 7fdc75b New Change
 
flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/IncrementAsyncHBaseSerializer.java
New File
 
flume-ng-sinks/flume-ng-hbase-sink/src/test/java/org/apache/flume/sink/hbase/TestAsyncHBaseSink.java
Revision a0c04eb New Change
 