Review Board 1.7.22


FLUME-721: Webapps 'autofindport' feature does not work

Review Request #1604 - Created Aug. 19, 2011 and submitted

Jonathan Hsieh
flume-721
Reviewers
Flume
aprabhakar, esammer
flume-git
This was previously posted at https://review.cloudera.org/r/1894/ and pushed because of timeout.  The patch actually was later then reverted because it had some problems with imports.  Since the previous patch had timed-out as opposed to being actually reviewed, I'm reposting for review.


  FLUME-721: Webapps 'autofindport' feature does not work
    
    This refactors the internal http server so that context are created by a callback object.
ran tests, they passed.
Ship it!
Posted (Aug. 19, 2011, 11:27 p.m.)
We now expose Jetty dependencies (and enforce them) on consumers of InternalHttpServer. We've further deviated from normal war deployment complicating things for developers. Once again, we need to modify the core of Flume to add / remove / update its web applications which completely defeats the purpose of the original refactoring. The correct way to solve the missing log view and stacks servlet errors was to just create two wars (or even one) with the applications and drop it into the webapps directory; none of that works anymore. I disagree with the approach entirely but I'm positive it works as advertised. If we were going to just rewrite this back to StatusHttpServer, we should have just reverted the InternalHttpServer patch (because that's what this and the last change effectively do).
  1. I understand that you are saying that there is ideal approach that I'm not using.  I'm also very happy to take a patch that updates the implementation to use "stardard" war deployment.  Admittedly I'm not an expert on standard war deployments (arvind knows more about this).  What is vital to me is that we the ability to provide the basic mechanism to debug a user's running system.  To me this is a vital requirement that trumps idealism.
    
    The ideal version would also be user-friendly, developer-friendly, and "standard".  While you claim that 0.9.4 version is developer friendly, as a developer I tend to disagree -- we've lost the ability to debug jsp based pages in a IDE (if possible please update the old instructions on how to do this), lost the ability to debug jsp's live (again instructions please), lost unit tests, lost stack dumps, and lost the autoincrementing port feature that enables one to test on a single machine.  
    
    The main improvement I see the 0.9.4 version and this version of the InternalHttpServer is a better decoupling.  The 0.9.4 version still depended on FlumeConfiguration and this version no longer depends on it to look for specific wars in a particular directory (it provides statics that instantiators could call/use).   
    
The whole point of InternalHttpServer was to eliminate Jetty references from FlumeNode.
  1. My goal was to eliminate refrences of Flume from InternalHttpServer.  I feel that Flume needs to be able to inject flume specific servlets that will be served.
A nicer way to handle this would be with dependencies injection. This just hard codes the binding here (rather than in InternalHttpServer) so it doesn't buy us anything in terms of abstraction or control.
  1. If you want to select a list of wars via some specification here (like jetty's xml file -- http://docs.codehaus.org/display/JETTY/Jetty+Xml+Configuration+Syntax+Reference ) that is definitely possible.   This seems like something reasonable to do when we add https:// or the ability to disable certain servlets.
    
    In this case, there are only two instances (FlumeNode FlumeMaster) so I hard code.  (I have a rule of 3 -- copy and paste up to 3x, afterwards refactor).
    
    
A clean way to do this would be to configure a factory and have it return configured instances of InternalHttpServer. What's missing is a DI container to externalize the wiring of contexts -> http server -> flume node.
  1. patches welcome (see rule of 3 above)
flume-core/src/main/java/com/cloudera/util/InternalHttpServer.java (Diff revision 1)
 
 
 
 
 
 
 
 
 
 
 
 
By removing this we've effectively moved back out of the normal deployment model. I don't understand why we can't adhere to conventions.
  1. The first case is an unused code path that would potentially cause unexpected behavior (if I put in the webapps dir as opposed to a war file, it would load th master and node war which is not what is supposed to happen).  
Further exposure of Jetty dependency.
  1. This is a limitation if we want the ability to auto-increment the port selected.  I tried reopening the after a bind attempt failed and the contexts did not get setup properly.