SQOOP-683 Documenting sqoop.mysql.export.sleep.ms - easy throttling feature for direct MySQL exports
Review Request #7880 - Created Nov. 5, 2012 and updated
Code review for SQOOP-683, see https://issues.apache.org/jira/browse/SQOOP-683.
Converted to XML with asciidoc, the affected part:
<simpara>Sometimes you need to export large amounts of data with Sqoop to a live MySQL cluster that is under high load serving random queries from the users of your product. While data consistency issues during the export can be easily solved with a staging table, there is still a problem: the performance impact caused by the heavy export.</simpara>
<simpara>First off, the MySQL resources dedicated to the import process can affect the performance of the live product, both on the master and on the slaves. Second, even if the servers can handle the import with no significant performance impact (mysqlimport should be relatively "cheap"), importing big tables can cause serious replication lag in the cluster, risking data inconsistency.</simpara>
<simpara>With <literal>-D sqoop.mysql.export.sleep.ms=time</literal>, where <emphasis>time</emphasis> is a value in milliseconds, you can pause the export process each time it has transferred the number of bytes specified in <literal>sqoop.mysql.export.checkpoint.bytes</literal>. The pause lets the server relax between checkpoints and gives the replicas time to catch up. Experiment with different settings of these two parameters to achieve an export pace that doesn’t endanger the stability of your MySQL cluster.</simpara>
<important><simpara>Note that any arguments to Sqoop that are of the form <literal>-D parameter=value</literal> are Hadoop <emphasis>generic arguments</emphasis> and must appear before any tool-specific arguments (for example, <literal>--connect</literal>, <literal>--table</literal>, etc.). Don’t forget that these parameters only work with the <literal>--direct</literal> flag set.</simpara></important>
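
The two parameters described above could be combined roughly as in the following sketch. It only prints the command (a dry run) so the argument order can be checked before anything touches the cluster. The host, database, table name, export directory, and both numeric values are hypothetical placeholders, not recommendations. The throughput figure assumes one pause per checkpoint, as the text describes, and ignores the transfer time itself, so it is an upper bound rather than a prediction.

```shell
#!/bin/sh
# Hypothetical starting values; tune both for your own cluster.
CHECKPOINT_BYTES=33554432   # pause after every 32 MiB transferred
SLEEP_MS=500                # length of each pause, in milliseconds

# Generic -D arguments come first; tool-specific arguments
# (--connect, --table, ...) follow. --direct is required for
# these two parameters to have any effect.
# The connect string, table, and export directory are placeholders.
SQOOP_CMD="sqoop export \
  -D sqoop.mysql.export.checkpoint.bytes=${CHECKPOINT_BYTES} \
  -D sqoop.mysql.export.sleep.ms=${SLEEP_MS} \
  --connect jdbc:mysql://db.example.com/shop \
  --table orders \
  --export-dir /user/etl/orders \
  --direct"

# Rough ceiling on the export pace: one checkpoint's worth of bytes
# per pause interval (transfer time is ignored).
MAX_BYTES_PER_SEC=$(( CHECKPOINT_BYTES * 1000 / SLEEP_MS ))

echo "$SQOOP_CMD"
echo "pace ceiling: ${MAX_BYTES_PER_SEC} bytes/s"
```

Raising `SLEEP_MS` or lowering `CHECKPOINT_BYTES` lowers the ceiling. Watching replication lag while adjusting one value at a time is probably the simplest way to find a safe pace.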