OCRUG Hackathon 2021-04 Data Set: MovieLens

For this hackathon, we will be using the MovieLens data set. It consists of movie ratings from nearly 300,000 users of the MovieLens service. Additional data tables provide movie tags, genres, and movie "genome" information.

The data set is contained in multiple .csv files that can be linked together using various ID fields. Teams are not required to use all of the data files -- they can use any or all of them. Teams can also incorporate outside data as they wish.
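As a rough illustration of linking files through an ID field, the sketch below joins ratings to movie titles on a shared movieId column and averages the ratings per title. The file names (movies.csv, ratings.csv) and column names (movieId, title, genres, userId, rating, timestamp) are assumptions here, so check them against the README for the version you download. The rows are made-up stand-ins, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-ins for movies.csv and ratings.csv.
# Column names are assumed; the rows are invented for illustration.
MOVIES_CSV = """movieId,title,genres
1,Toy Story (1995),Adventure|Animation|Children
2,Jumanji (1995),Adventure|Children|Fantasy
"""

RATINGS_CSV = """userId,movieId,rating,timestamp
1,1,4.0,964982703
2,1,5.0,964982224
2,2,3.0,964982931
"""


def mean_rating_by_title(movies_file, ratings_file):
    """Join ratings to movies on movieId and average the ratings per title."""
    # Lookup table: movieId -> title
    titles = {row["movieId"]: row["title"] for row in csv.DictReader(movies_file)}

    ratings = defaultdict(list)
    for row in csv.DictReader(ratings_file):
        title = titles.get(row["movieId"])
        if title is not None:  # skip ratings with no matching movie
            ratings[title].append(float(row["rating"]))

    return {title: sum(vals) / len(vals) for title, vals in ratings.items()}


result = mean_rating_by_title(io.StringIO(MOVIES_CSV), io.StringIO(RATINGS_CSV))
print(result)
```

To run this against the real files, pass open file handles instead of the io.StringIO objects, for example open("movies.csv", newline="", encoding="utf-8"). For the full data set, a data-frame library may be more practical than plain dictionaries.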

Data Download

The data sets should be downloaded from the GroupLens website:

MovieLens Latest Datasets

There are two versions of the data set:

Small - a subset of the full dataset (1 MB, zipped); could be useful if you have limited compute resources or want to test your analysis on a small version of the data first.
Full - the full data set (265 MB, zipped)

Notes

Teams can use either the small or full data set for their submitted presentation, but be sure to specify which you used.
Each data set has a README file describing the data -- be sure to read these in detail!
- Small Dataset README
- Full Dataset README
Be sure to reference the data set in the last slide of your presentation.

Data Set Usage

The MovieLens data set has been approved for educational/non-commercial use.

Approved Usage Request
