Review requests for mahout

Summary Submitter Posted (Ascending)
README.txt has lines > 80 characters wide. Shortening for reviewboard demogoguery. jake.mannix November 17th, 2011, 4:10 p.m.
MAHOUT-827 Another version of RecommenderJob that broadcasts the similarity matrix ssc November 20th, 2011, 7:47 p.m.
New implementation for LDA: Collapsed Variational Bayes (0th derivative approximation), with map-side model caching jake.mannix November 28th, 2011, 5:35 a.m.
Matrix methods for DistributedRowMatrix cendrillon November 29th, 2011, 2:31 p.m.
MAHOUT-918 Parallelized SGD in MapReduce issay December 8th, 2011, 3:50 p.m.
Support for Randomizing Input in SplitInput Class cendrillon December 9th, 2011, 5:53 p.m.
Row mean job for PCA cendrillon December 12th, 2011, 9:27 a.m.
MAHOUT-922-2: add DistributedCache broadcast to B' files for AB' job and R-hat files for B' job dlyubimov December 20th, 2011, 5:21 a.m.
MAHOUT-944 LuceneIndexToSequenceFiles (lucene2seq) utility frankscholten January 26th, 2012, 1:13 a.m.
PCA options for SSVD dlyubimov February 11th, 2012, 12:09 p.m.
Refactored CanopyDriver to use ClusterClassificationDriver paritoshranjan March 5th, 2012, 2:15 p.m.
MAHOUT-822: Mahout needs to be made compatible with Hadoop .23 releases tcp March 8th, 2012, 10:19 a.m.
Simple patch to reduce our checkstyle warnings tcp March 8th, 2012, 12:16 p.m.
Added outlier removal capability to Canopy Clustering. paritoshranjan March 9th, 2012, 2:13 a.m.
Refactored clustering out of K-Means and Dirichlet Clustering. paritoshranjan March 14th, 2012, 10:49 p.m.
Mahout-991 Converted K-Means, Canopy, FuzzyKMeans, Dirichlet and MeanShift to emit ClusterWritable paritoshranjan March 23rd, 2012, 4:20 a.m.
Used ClusterIterator and ClusteringPolicy to buildClusters for KMeans. Removed KMeansClusterer, KMeansReducer, KMeansMapper and KMeansCombiner, along with their unit tests. paritoshranjan March 31st, 2012, 1:27 a.m.
MAHOUT-989. Refactored FuzzyKMeans buildClusters phase to use FuzzyKMeansClusteringPolicy and ClusterIterator. paritoshranjan April 1st, 2012, 3:10 a.m.
MAHOUT-990, Changed DirichletClustering to do buildClusters using ClusterIterator. paritoshranjan April 4th, 2012, 3:48 a.m.
mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches rvs April 4th, 2012, 9:25 a.m.
MAHOUT-1067 dlyubimov September 28th, 2012, 9 a.m.
Basic Iterable for OpenKeyTypeValueTypeHashMap jake.mannix March 12th, 2013, 1:34 p.m.
MAHOUT-1181: Adds StreamingKMeans MapReduce classes dfilimon March 29th, 2013, 10:16 p.m.
MAHOUT-1162: Adding BallKMeans and StreamingKMeans classes dfilimon March 29th, 2013, 10:37 p.m.
MAHOUT-1156: Adding nearest neighbor Searchers dfilimon March 29th, 2013, 10:45 p.m.
Fixes MAHOUT-1186 jake.mannix April 6th, 2013, 3:07 a.m.
MAHOUT-1180: Multinomial<T> throws ConcurrentModificationException when iterating and setting probabilities dfilimon April 9th, 2013, 8:19 p.m.
MAHOUT-1189: CosineDistanceMeasure doesn't return 0 for two 0 vectors dfilimon April 10th, 2013, 12:09 a.m.
MAHOUT-1178 Improve Lucene support in Mahout gokhancapan April 12th, 2013, 12:48 a.m.
Fix for Vector iterators and perf enhancement for RASV robinanil April 15th, 2013, 10:08 a.m.
Refactors Benchmarking code to be cleaner, adds randomization and dependency to prevent dead code elimination robinanil April 16th, 2013, 11:32 a.m.
Speed up Vector Operations robinanil April 16th, 2013, 2:28 p.m.
MAHOUT-1202: Speed up Vector Operations dfilimon April 21st, 2013, 1:30 a.m.
MAHOUT-1194 Allow to change java target version during the build jarcec April 22nd, 2013, 11:30 a.m.
MAHOUT-1216: Add locality sensitive hashing and a LocalitySensitiveHash searcher dfilimon May 16th, 2013, 5:17 p.m.
MAHOUT-1217: Nearest neighbor searchers sometimes fail to remove points dfilimon May 17th, 2013, 6:46 p.m.
MAHOUT-1222: Fix total weight in FastProjectionSearch dfilimon May 21st, 2013, 12:19 a.m.
MAHOUT-1223: Point skipped in StreamingKMeans when iterating through centroids from a reducer dfilimon May 21st, 2013, 12:29 a.m.
MAHOUT-1224: Add the option of running a StreamingKMeans pass in the Reducer before BallKMeans dfilimon May 21st, 2013, 12:36 a.m.
Add input resplitting tool and cluster evaluation tool dfilimon May 21st, 2013, 7:59 p.m.
Add Iterable<Element> all(), and Iterable<Element> nonZeroes() to Vector, remove iterator() and iterateNonZeroes() jake.mannix May 24th, 2013, 8:22 a.m.
Final patch for Mahout-833 smarthi June 10th, 2013, 6:57 a.m.
Cleanup LDA code ssc June 14th, 2013, 7:34 a.m.
MAHOUT-1267 Remove object instantiations from RowSimilarityJob ssc June 19th, 2013, 3:02 p.m.
#MAHOUT-1273: Single Pass Algorithm for Penalized Linear Regression with Cross Validation on MapReduce kunyang August 1st, 2013, 7:40 a.m.
mahout-1265: add multilayer perceptron. yexijiang August 8th, 2013, 12:34 p.m.
Frequent Pattern Set Mining for Mahout smoens November 14th, 2013, 9:59 p.m.
Quick fix for M-1030 akm December 4th, 2013, 8 a.m.
[MAHOUT-1349] Added logic to make array larger of dictionary size and largest index plus one. akm December 8th, 2013, 4:28 a.m.
MAHOUT-1508 Performance problems with sparse matrices ssc April 12th, 2014, 8:17 p.m.