OOZIE-667 Change the way Oozie brings in Hadoop JARs into the build
Review Request #3726 - Created Feb. 1, 2012 and submitted
The mechanism in this patch makes it easy to package and use new versions of Hadoop without complicating Oozie's POMs. The mr1 and mr2 profiles are gone from Oozie; it now refers to one of the versions in hadooplibs. New Maven modules under hadooplibs define the hadoop-client/hadoop-test POMs for different Hadoop versions. Note that because of HADOOP-8009, Hadoop will start providing a hadoop-client artifact (even for already released versions). We will still need the corresponding hadooplibs module so that the assembly (as done in this patch) can pull into the Oozie distro the client-side JARs for the supported/tested versions of Hadoop. Note that this can also be used to simplify the logic of addtowar.sh, which will no longer have to be aware of the JAR dependencies of different Hadoop versions, or of Hadoop JARs at all.
Tested with 1.0.0 and 0.24.0-SNAPSHOT (several test case failures here, but the mr2 integration is still a work in progress).
Posted (Feb. 2, 2012, 7:59 a.m.)
So it is WIP, right? So we are now bundling multiple Hadoop JARs. As each new version of Hadoop comes out, we will need to add a new POM file and make the associated changes in other POM files. Moreover, in some cases, to support a newly released Hadoop version we might need to release a new Oozie version that contains only new Hadoop packages. In addition, the Oozie tarball size will increase substantially. I think we need to be a little cautious about this.
Review request changed
Updated (Feb. 2, 2012, 10:37 p.m.)
Updated patch that does option #2: it creates a separate tarball for the hadooplibs. Answering Mohammad's last question: for each version of Hadoop that we want to test or provide hadooplibs for, we have to create the corresponding oozie-hadoop artifact. This includes minor and patch releases.
Posted (Feb. 3, 2012, 3:11 a.m.)
Posted (Feb. 3, 2012, 8:38 a.m.)
+1 after addressing the doc comments for clarifications.
It is not necessary to use the Oozie-bundled Hadoop libs. For example, suppose Hadoop 0.24.2 is released and we are late packaging it, or do not package it for some reason. There should be a way for the user to collect the necessary Hadoop JARs and put them into libext/. I know this is obvious; I just want the docs to mention in some way that users can bring their own Hadoop libraries and copy them into libext.
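To illustrate the manual route for the docs, here is a minimal sketch of collecting Hadoop JARs into libext/. It is not part of the patch: the helper name `copy_hadoop_jars` and both directory arguments are hypothetical, and it assumes that every `*.jar` under the given Hadoop directory is wanted, which may pull in more than the client side needs.

```python
import shutil
from pathlib import Path


def copy_hadoop_jars(hadoop_dir, libext_dir):
    """Copy every *.jar found under hadoop_dir into libext_dir (flattened).

    Hypothetical helper for illustration only; Oozie does not ship it.
    JARs with the same file name in different subdirectories overwrite
    each other, so the last one in sorted path order wins.
    """
    src = Path(hadoop_dir)
    dest = Path(libext_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    # sorted() materializes the listing before any file is copied.
    for jar in sorted(src.rglob("*.jar")):
        if jar.is_file():
            shutil.copy2(jar, dest / jar.name)
            copied.append(jar.name)
    return copied
```

A user would call it with their own paths, for example `copy_hadoop_jars("/path/to/hadoop", "/path/to/oozie/libext")`; both paths here are placeholders.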