Review requests for mahout

Summary Submitter Posted Last Updated
MAHOUT-1508 Performance problems with sparse matrices ssc April 12th, 2014, 11:17 a.m.
mahout-1265: add multilayer perceptron. yexijiang August 8th, 2013, 3:34 a.m.
[MAHOUT-1349] Added logic to size the array to the larger of the dictionary size and the largest index plus one. akm December 7th, 2013, 7:28 p.m.
Quick fix for M-1030 akm December 3rd, 2013, 11 p.m.
Frequent Pattern Set Mining for Mahout smoens November 14th, 2013, 12:59 p.m.
#MAHOUT-1273: Single Pass Algorithm for Penalized Linear Regression with Cross Validation on MapReduce kunyang July 31st, 2013, 10:40 p.m.
MAHOUT-1194 Allow to change java target version during the build jarcec April 22nd, 2013, 2:30 a.m.
Final patch for Mahout-833 smarthi June 9th, 2013, 9:57 p.m.
MAHOUT-1267 Remove object instantiations from RowSimilarityJob ssc June 19th, 2013, 6:02 a.m.
Cleanup LDA code ssc June 13th, 2013, 10:34 p.m.
MAHOUT-1224: Add the option of running a StreamingKMeans pass in the Reducer before BallKMeans dfilimon May 20th, 2013, 3:36 p.m.
Add Iterable<Element> all(), and Iterable<Element> nonZeroes() to Vector, remove iterator() and iterateNonZeroes() jake.mannix May 23rd, 2013, 11:22 p.m.
Add input resplitting tool and cluster evaluation tool dfilimon May 21st, 2013, 10:59 a.m.
MAHOUT-1223: Point skipped in StreamingKMeans when iterating through centroids from a reducer dfilimon May 20th, 2013, 3:29 p.m.
MAHOUT-1222: Fix total weight in FastProjectionSearch dfilimon May 20th, 2013, 3:19 p.m.
MAHOUT-1217: Nearest neighbor searchers sometimes fail to remove points dfilimon May 17th, 2013, 9:46 a.m.
MAHOUT-1216: Add locality sensitive hashing and a LocalitySensitiveHash searcher dfilimon May 16th, 2013, 8:17 a.m.
MAHOUT-1181: Adds StreamingKMeans MapReduce classes dfilimon March 29th, 2013, 1:16 p.m.
MAHOUT-1162: Adding BallKMeans and StreamingKMeans classes dfilimon March 29th, 2013, 1:37 p.m.
MAHOUT-1156: Adding nearest neighbor Searchers dfilimon March 29th, 2013, 1:45 p.m.
MAHOUT-1202: Speed up Vector Operations dfilimon April 20th, 2013, 4:30 p.m.
MAHOUT-1189: CosineDistanceMeasure doesn't return 0 for two 0 vectors dfilimon April 9th, 2013, 3:09 p.m.
MAHOUT-1180: Multinomial<T> throws ConcurrentModificationException when iterating and setting probabilities dfilimon April 9th, 2013, 11:19 a.m.
Refactors Benchmarking code to be cleaner, adds randomization and dependency to prevent dead code elimination robinanil April 16th, 2013, 2:32 a.m.
Speed up Vector Operations robinanil April 16th, 2013, 5:28 a.m.
MAHOUT-1178 Improve Lucene support in Mahout gokhancapan April 11th, 2013, 3:48 p.m.
Fix for Vector iterators and perf enhancement for RASV robinanil April 15th, 2013, 1:08 a.m.
Fixes MAHOUT-1186 jake.mannix April 5th, 2013, 6:07 p.m.
Basic Iterable for OpenKeyTypeValueTypeHashMap jake.mannix March 12th, 2013, 4:34 a.m.
MAHOUT-1067 dlyubimov September 28th, 2012, midnight
mahout script shouldn't rely on HADOOP_HOME since that was deprecated in all major Hadoop branches rvs April 4th, 2012, 12:25 a.m.
MAHOUT-990, Changed DirichletClustering to do buildClusters using ClusterIterator. paritoshranjan April 3rd, 2012, 6:48 p.m.
MAHOUT-989. Refactored FuzzyKMeans buildClusters phase to use FuzzyKMeansClusteringPolicy and ClusterIterator. paritoshranjan March 31st, 2012, 6:10 p.m.
Used ClusterIterator and ClusteringPolicy to buildClusters for KMeans. Removed KMeansClusterer, KMeansReducer, KMeansMapper and KMeansCombiner, along with their unit tests. paritoshranjan March 30th, 2012, 4:27 p.m.
Mahout-991 Converted K-Means, Canopy, FuzzyKMeans, Dirichlet and MeanShift to emit ClusterWritable paritoshranjan March 22nd, 2012, 7:20 p.m.
Refactored clustering out of K-Means and Dirichlet Clustering. paritoshranjan March 14th, 2012, 1:49 p.m.
MAHOUT-822: Mahout needs to be made compatible with Hadoop 0.23 releases tcp March 8th, 2012, 1:19 a.m.
Added outlier removal capability to Canopy Clustering. paritoshranjan March 8th, 2012, 5:13 p.m.
Simple patch to reduce our checkstyle warnings tcp March 8th, 2012, 3:16 a.m.
MAHOUT-944 LuceneIndexToSequenceFiles (lucene2seq) utility frankscholten January 25th, 2012, 4:13 p.m.
Refactored CanopyDriver to use ClusterClassificationDriver paritoshranjan March 5th, 2012, 5:15 a.m.
PCA options for SSVD dlyubimov February 11th, 2012, 3:09 a.m.
Support for Randomizing Input in SplitInput Class cendrillon December 9th, 2011, 8:53 a.m.
MAHOUT-922-2: add DistributedCache broadcast to B' files for AB' job and R-hat files for B' job dlyubimov December 19th, 2011, 8:21 p.m.
Row mean job for PCA cendrillon December 12th, 2011, 12:27 a.m.
MAHOUT-918 Parallelized SGD in MapReduce issay December 8th, 2011, 6:50 a.m.
Matrix methods for DistributedRowMatrix cendrillon November 29th, 2011, 5:31 a.m.
New implementation for LDA: Collapsed Variational Bayes (0th derivative approximation), with map-side model caching jake.mannix November 27th, 2011, 8:35 p.m.
MAHOUT-827 Another version of RecommenderJob that broadcasts the similarity matrix ssc November 20th, 2011, 10:47 a.m.
README.txt has lines > 80 characters wide. Shortening for reviewboard demagoguery. jake.mannix November 17th, 2011, 7:10 a.m.