Review Board 1.7.22


HBASE-6241 - HBaseCluster interface for interacting with the cluster from system tests

Review Request #5653 - Created June 29, 2012 and updated

enis
HBASE-6241
Reviewers
hbase
hbase-git
Introduces HBaseCluster and ClusterManager interfaces, RealHBaseCluster and MiniHBaseCluster extends HBaseCluster. ClusterManager provides services for starting / killing daemons. 

Some high-level notes on the patch:
- uses hbase-it module, and adds a new test there called IntegrationTestDataIngestWithChaosMonkey. This class runs LoadTestTool with a chaos monkey(http://www.codinghorror.com/blog/2011/04/working-with-the-chaos-monkey.html). Chaos monkey is very sipmle right now, just does selects a random RS, kills and restarts it.
- Introduces HBaseCluster, RealHBaseCluster and changes MiniHBaseCluster to extends HBaseCluster.
- Adds a ClusterManager interface, and a default HBase shell scripts based HBaseClusterManager. These are internal-classses and tests does not directly refer to them, so we can improve on them, maybe add another implementation when BIGTOP-635 is done.
- Adds an IntegrationTestsDriver class as a driver for running integration tests from command line. You can do "bin/hbase --config <hbase_conf_dir> o.a.h.h.ITD" to run all the integration tests on a real cluster. mvn verify runs them on mini cluster. I'll open another issue for mvn verify on real clusters.

This patch is now close to what I have in mind as an API for HBASE-6241, and HBASE-6201. We can start with this, and incrementally improve on this base, once we convert more unit tests to be system tests, and add more system tests. ClusterManager is also a simple candidate for BIGTOP-635 (but we can deal with these later on, since those parts are not exposed)

tested with mvn verify in hbase-it module (runs the test as integration test)
set up a cluster of 8 nodes (using hbase trunk, and starting the cluster with bin/start-hbase.sh scripts), run the tests with: 
bin/hbase --config <conf_dir>  org.apache.hadoop.hbase.IntegrationTestsDriver 
Total:
77
Open:
71
Resolved:
6
Dropped:
0
Status:
From:
Description From Last Updated Status
This change needs to make it into release notes. Its a significant change. Is it necessary? Its needed doing safe ... Michael Stack June 29, 2012, 9:38 p.m. Open
Why we need this annotation when we have a distinct naming pattern? Michael Stack June 29, 2012, 9:38 p.m. Open
Why should I know whether its a real or otherwise cluster? This should be hidden? Can't I call startCluster and ... Michael Stack June 29, 2012, 9:38 p.m. Open
What is this Shim stuff? Why not just call getHBaseCluster or whatever it is in HBaseTestUtility before this change? Why ... Michael Stack June 29, 2012, 9:38 p.m. Open
ditto Michael Stack June 29, 2012, 9:38 p.m. Open
We want this in a test? Michael Stack June 29, 2012, 9:38 p.m. Open
Is there a configurable rate at which chaos monkey runs? Is there a report on how much chaos it caused? ... Michael Stack June 29, 2012, 9:38 p.m. Open
Is getHBaseClusterShim even a good name for this? Shim means something else to me. Michael Stack June 29, 2012, 9:38 p.m. Open
See util.Threads for a method that does this. Michael Stack June 29, 2012, 9:38 p.m. Open
We need annotation on top of special naming? Michael Stack June 29, 2012, 9:38 p.m. Open
Should this be in hbase-server? Could it go into hbase-it? Michael Stack June 29, 2012, 9:38 p.m. Open
This Interface looks like the core Interface for a generic "Server" ... something from guice or IoC kinda container. Should ... Michael Stack June 29, 2012, 9:38 p.m. Open
Can it be in hbase-it rather than in hbase-server? I suppose it can't if minihbasecluster is to inherit it. So, ... Michael Stack June 29, 2012, 9:38 p.m. Open
Real is a bad name...as though the minihbasecluster was imaginary. Michael Stack June 29, 2012, 9:38 p.m. Open
Is this right? Should it be a protocol too? Maybe this change hasn't happened yet. Michael Stack June 29, 2012, 9:38 p.m. Open
We are going to shell out to do this stuff? Unless minihbasecluster. Michael Stack June 29, 2012, 9:38 p.m. Open
This is in hadoop 1.0? Its well tested? Michael Stack June 29, 2012, 9:38 p.m. Open
Do we have to have all this stuff in hbase? Is it in hadoop or in some external project already? Michael Stack June 29, 2012, 9:38 p.m. Open
Bash is a requirement for hbase/hadoop? This is portable shell? That -d on the cut and the -f? Michael Stack June 29, 2012, 9:38 p.m. Open
To run all commands? Michael Stack June 29, 2012, 9:38 p.m. Open
Real cluster is not a good name. Michael Stack June 29, 2012, 9:38 p.m. Open
Will Maven let you write IntegrationTests that are in hbase-it depending on classes over in src/test rather than src/main? Michael Stack June 29, 2012, 9:38 p.m. Open
public? Michael Stack June 29, 2012, 9:38 p.m. Open
isRealCluster is not the name of a boolean, its the name of the method you call to ask about the ... Michael Stack June 29, 2012, 9:38 p.m. Open
Michael Stack June 29, 2012, 9:38 p.m. Open
Ain't this going to be useless unless its called before construction? Should say so? Or be in the constructor for ... Michael Stack June 29, 2012, 9:38 p.m. Open
Is this needed? Seems like something that shoudl be on HBaseCluster rather than on this class. And even then, seems ... Michael Stack June 29, 2012, 9:38 p.m. Open
Michael Stack June 29, 2012, 9:38 p.m. Open
This is functionality that can only be done on minihbasecluster? Michael Stack June 29, 2012, 9:38 p.m. Open
Seems wrong? Should return the HBaseCluster Interface? Not the mini? It breaks too many tests changing this? Michael Stack June 29, 2012, 9:38 p.m. Open
This name seems wrong as said above. Michael Stack June 29, 2012, 9:38 p.m. Open
This looks good. Michael Stack June 29, 2012, 9:38 p.m. Open
Should these methods throw exceptions if they fail rather than true or false? Michael Stack June 29, 2012, 9:38 p.m. Open
Yeah, I think this will change but whoever commits first... Michael Stack June 29, 2012, 9:38 p.m. Open
This good in a test? Let it out? Michael Stack June 29, 2012, 9:38 p.m. Open
Distributed instead of Real? Michael Stack June 29, 2012, 9:38 p.m. Open
THis is a repeated method. Just use whats in util.Threads. Michael Stack June 29, 2012, 9:38 p.m. Open
This method fails badly if we cannot restore? Michael Stack June 29, 2012, 9:38 p.m. Open
Do we have to depend on another module? What would we have to copy over? Michael Stack Aug. 22, 2012, 5:30 a.m. Open
Why this setter when its passed on construction? Michael Stack Aug. 22, 2012, 5:30 a.m. Open
Why this method? If really necessary, could do instanceof? Michael Stack Aug. 22, 2012, 5:30 a.m. Open
This class cannot be in hbase-server/src/test because of mvn limitations? Michael Stack Aug. 22, 2012, 5:30 a.m. Open
The search for server doesn't have to be repeated every iteration of while. Sameer Vaishampayan Aug. 24, 2012, 12:46 a.m. Open
Should this be restore or reset ? I believe that it actually checks the current config and then shuts/starts services ... Sameer Vaishampayan Aug. 24, 2012, 12:46 a.m. Open
Its kinda funny we have to define these. I poked around thinking that they must be defined somewhere else but ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Do we need to get port in here so multiple of same service type on the one host? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
this method says abort of a regionserver but its killing it.... normal not much of an issue but over in ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Ditto (see above comment on abort regionserver) Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Do we have to close this protocol when done w/ it? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
FYI, ProtobufUtil#toServerName (but this is fine too) Michael Stack Sept. 8, 2012, 5:13 a.m. Open
This is some fancy dancing! Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Dup code? Break out method that master and regionserver difference calculation could share? No biggie. Michael Stack Sept. 8, 2012, 5:13 a.m. Open
This is so I can override default config? And following method is so can set alternate options to whats in ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Off topic, you have an idea how this could work for bigtop? (It looks like good idea putting this stuff ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Have you fellas done this once before? This is thorough? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
This is good for now. Michael Stack Sept. 8, 2012, 5:13 a.m. Open
If you are making new patch , put space after comma here and below (ill-formatted log lines make me think ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
This all works? (It looks like it could) Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Good doc Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Good doc Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Because we don't have means of figuring what servers we can start a service on? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Does the mini hbase cluster not implement the HBaseClusterInterface so you could call restoreInitialStatus on it? (Or they are too ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Hmm.. Seems odd to do this post-construction. You see us using same instance first to do mini and then distributed? ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
This is great Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Is this pass-through or not yet implemented? Could options be passed and ignored? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Code example on how to set it running? Would adding a main be of use? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Looks like a bug fix Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Yeah suggest rename of this method to killRegionServer Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Can HTU not implement HBaseCluster so you don't have to set it? Or am I missing point and the passed ... Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Thanks for the note Michael Stack Sept. 11, 2012, 12:02 a.m. Open
Is this right package? How is this different From Stoppable we currently have? Michael Stack Sept. 8, 2012, 5:13 a.m. Open
Review request changed
Updated (Sept. 13, 2012, 12:57 a.m.)
Incorporated review comments: 
 - rename SimpleStoppable -> StoppableImplementation. 
 - rename HBaseCluster.abortRegionServer() -> killRegionServer(). It uses HRS.kill() instead of HRS.abort(). Also renamed killMaster(). There is no HM.kill() so mini cluster still is implemented using abort(). 
 - Added file filter for TestCheckTestClasses.findTestClasses()
 - Renamed HBaseIntegrationTestingUtility -> IntegrationTestingUtility
 - Clarified documentation on some places.