
MAPREDUCE-2324 Job should fail if a reduce task can't be scheduled anywhere

Review Request #1164 - Created July 21, 2011 and updated

Robert Evans
naisbitt, tgraves, tlipcon
Job should fail if a reduce task can't be scheduled anywhere. V2 of the patch.
Ran unit tests and manual tests on a single-node cluster.
Posted (July 21, 2011, 7:52 p.m.)


I think a default of 0.8 or so would probably make more sense -- just in case one of the TTs is in some bad state where it isn't heartbeating, we don't want to wait forever.
Since this is a set of trackers, not attempts, a better name might be failedReduceSchedulingTrackers, or something similar.
this key should probably be defined as a constant in MRJobConfig, right?
branches/branch-0.20-security/src/mapred/org/apache/hadoop/mapred/ (Diff revision 1)
refactor this into a new method?
I think the operator precedence is off here.

The (int) cast binds more tightly than the arithmetic that follows it, so reduceInputAttemptFactor is truncated on its own first, and any value < 1.0 ends up as 0.
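
To illustrate the precedence issue, here is a minimal sketch. The actual expression from the patch is not shown in this review, so the multiplication and the trackerCount parameter are assumptions made up for the example; only reduceInputAttemptFactor comes from the diff.

```java
public class CastPrecedenceDemo {
    // Hypothetical shape of the bug: the cast applies to the factor alone,
    // so a factor below 1.0 is truncated to 0 before the multiplication.
    static int buggyLimit(double reduceInputAttemptFactor, int trackerCount) {
        return (int) reduceInputAttemptFactor * trackerCount;
    }

    // Parenthesizing the product makes the cast apply to the final result.
    static int fixedLimit(double reduceInputAttemptFactor, int trackerCount) {
        return (int) (reduceInputAttemptFactor * trackerCount);
    }

    public static void main(String[] args) {
        System.out.println(buggyLimit(0.5, 10)); // prints 0
        System.out.println(fixedLimit(0.5, 10)); // prints 5
    }
}
```

If the intent is to round rather than truncate the product, Math.round or Math.ceil on the parenthesized expression may be a better fit.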
style: add space between if and (
I think we mostly avoid the first person in error messages. Use "Tried to schedule..." rather than "We tried...".
StringUtils.humanReadableInt might be useful here.
jobId is the job, not the task, right?
typos: "failes", "then that"
Isn't the default input limit unlimited? Why do we need this?
We should check that the job's failure info contains the correct type of error message (i.e., that it didn't fail due to some other error).