Datasets:
  http://archive.ics.uci.edu/ml/datasets/twenty+newsgroups
  http://archive.ics.uci.edu/ml/datasets/iris
- MapReduce job that counts the words in the files of the twenty newsgroups dataset and reports the number of occurrences of each word.
- Inverted index program over the same directory as the previous exercise, restricted to the top 15 keywords.
- MapReduce job that computes the mean sepal length for each species in the iris dataset.
(The program name is the number above)
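
As a rough illustration of the three exercises listed above, here is a plain-Python sketch of their map and reduce steps. It is not the code of the programs in this directory and it does not use Hadoop or any MapReduce library: `run_job` is a hypothetical in-memory stand-in for the framework, and every function name is made up for this example. It assumes the inverted index receives (filename, text) pairs plus a keyword set chosen elsewhere, and that each iris record is a comma-separated line with sepal length first and species last.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[a-z']+")


def run_job(records, mapper, reducer):
    """Hypothetical in-memory driver: map every record, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in sorted(groups.items())}


# Word count: each record is one line of text.
def wc_mapper(line):
    for word in WORD_RE.findall(line.lower()):
        yield word, 1


def wc_reducer(word, counts):
    return sum(counts)


# Inverted index: each record is a (filename, text) pair.
# The keyword set is passed in; how it is chosen is left to the caller.
def make_index_mapper(keywords):
    def index_mapper(record):
        filename, text = record
        for word in set(WORD_RE.findall(text.lower())):
            if word in keywords:
                yield word, filename
    return index_mapper


def index_reducer(word, filenames):
    return sorted(set(filenames))


# Mean sepal length: each record is a line such as
# "5.1,3.5,1.4,0.2,Iris-setosa" (assumed column order).
def iris_mapper(line):
    fields = line.strip().split(",")
    if len(fields) == 5:  # skip blank or malformed lines
        yield fields[4], float(fields[0])


def mean_reducer(species, lengths):
    return sum(lengths) / len(lengths)


if __name__ == "__main__":
    docs = [("a.txt", "the cat sat"), ("b.txt", "the dog sat down")]
    print(run_job([text for _, text in docs], wc_mapper, wc_reducer))
    print(run_job(docs, make_index_mapper({"the", "dog"}), index_reducer))
    print(run_job(["5.1,3.5,1.4,0.2,Iris-setosa"], iris_mapper, mean_reducer))
```

In a real MapReduce run the grouping done inside `run_job` is handled by the framework's shuffle phase, so only the mapper and reducer functions would carry over.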