
Export snapshot

Review Request #7137 - Created Sept. 17, 2012 and submitted

Matteo Bertozzi
jmhsieh, jyates
Export snapshot to another cluster
 - Copy .snapshot/name references folder
 - Copy hfiles using MR job (we can't use distcp, because we need the HFileLink logic)
 - HLogs
 - Mark the snapshot as "ready" in the other cluster
$ hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot test-snapshot -copy-to hdfs://srv2/hbase/ 
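As a rough illustration of the per-file copy step in the list above, here is a minimal sketch of a stream copy that reports bytes as they are written. It uses only java.io; the class name ProgressCopy and the ProgressListener interface are hypothetical and are not taken from the patch. In an MR job the listener would presumably increment a job counter or report progress, but that wiring is an assumption and is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ProgressCopy {
    /** Hypothetical callback, invoked after each chunk is written. */
    public interface ProgressListener {
        void bytesCopied(long count);
    }

    /**
     * Copies everything from in to out using a fixed-size buffer and
     * notifies the listener after every chunk. Returns the total number
     * of bytes copied. The caller is responsible for closing the streams.
     */
    public static long copy(InputStream in, OutputStream out,
                            int bufferSize, ProgressListener listener)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
            listener.bytesCopied(read);
        }
        return total;
    }
}
```

Reporting per chunk rather than per file keeps progress visible while a single large file is still being copied, which is one possible reason to prefer a hand-written loop over a whole-file copy helper.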
Description | From | Last Updated | Status
Since snapshotFiles is sorted by file size, this would make the leading input file having deterministically larger size compared to ... | Ted Yu | Sept. 18, 2012, 5:52 p.m. | Open
a nice to have would be to do this in parallel with the hfile copy. not necessary given the way ... | Jesse Yates | Sept. 24, 2012, 5:51 p.m. | Open
hbase prefix these? (I could be convinced otherwise..) hbase.snapshot.export.*? | Jonathan Hsieh | Oct. 8, 2012, 5:44 p.m. | Open
we are just copying snapshot metadata here right? | Jonathan Hsieh | Oct. 15, 2012, 1:56 p.m. | Open
Why write your own rather than use org.apache.hadoop.fs.FileUtil.copy()? You can update the written bytes when finishing each file copy, ... | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
so it creates balanced groups in terms of file sizes, not numbers of files? | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
nit: I'm not a big fan of the nested ?: practice, it tends to be more confusing than just using ... | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
I feel like this should be a different parameter. The JobClient/Submission stuff is an internal mechanism for the job to ... | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
this is a really long method - maybe break it up into a couple submethods? Not sure if it makes ... | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
Have you considered using org.apache.commons.Options? It's a cleaner way to handle the options parsing and auto-format the help, though a ... | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
is this necessary? It's not a common paradigm that I've seen and seems to be handled via run() | Jesse Yates | Oct. 16, 2012, 12:12 a.m. | Open
When are we going to see this util? | Jesse Yates | Oct. 25, 2012, 12:33 a.m. | Open
Does this overwrite if there is a file there but not the same file? | Jonathan Hsieh | Nov. 9, 2012, 6:24 p.m. | Open
nit: verfy Would be nice to explain what you are doing in the comment. something along the line of the ... | Jonathan Hsieh | Nov. 12, 2012, 5:13 p.m. | Open
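
Several comments above ask about grouping files so that groups are balanced by total size rather than by file count. A minimal sketch of one way to do that follows: sort the files by size, largest first, and always add the next file to the group with the smallest running total. This is a generic greedy heuristic and may differ from what the patch does; the names BalancedGroups and SizedFile are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BalancedGroups {
    /** Hypothetical holder for a file path and its length in bytes. */
    public static final class SizedFile {
        public final String path;
        public final long size;

        public SizedFile(String path, long size) {
            this.path = path;
            this.size = size;
        }
    }

    /**
     * Splits files into ngroups lists whose total sizes are roughly equal.
     * Files are visited from largest to smallest and each one goes to the
     * group that currently has the smallest total (ties go to the lowest
     * group index).
     */
    public static List<List<SizedFile>> balance(List<SizedFile> files, int ngroups) {
        if (ngroups <= 0) {
            throw new IllegalArgumentException("ngroups must be positive");
        }
        List<SizedFile> sorted = new ArrayList<>(files);
        sorted.sort(Comparator.comparingLong((SizedFile f) -> f.size).reversed());

        List<List<SizedFile>> groups = new ArrayList<>();
        for (int i = 0; i < ngroups; i++) {
            groups.add(new ArrayList<>());
        }
        long[] totals = new long[ngroups];

        for (SizedFile file : sorted) {
            int smallest = 0;
            for (int g = 1; g < ngroups; g++) {
                if (totals[g] < totals[smallest]) {
                    smallest = g;
                }
            }
            groups.get(smallest).add(file);
            totals[smallest] += file.size;
        }
        return groups;
    }

    /** Sum of the sizes of all files in one group. */
    public static long totalSize(List<SizedFile> group) {
        long total = 0;
        for (SizedFile file : group) {
            total += file.size;
        }
        return total;
    }
}
```

With this approach the group that receives the largest file does not automatically end up with the most bytes, because later, smaller files are steered toward the lighter groups.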
Review request changed
Updated (Jan. 8, 2013, 5:33 p.m.)
  • changed from pending to submitted