diff --git a/notebooks/src/301-priprava_pregled.ipynb b/notebooks/src/301-priprava_pregled.ipynb
index 31af8aa..4742e59 100644
--- a/notebooks/src/301-priprava_pregled.ipynb
+++ b/notebooks/src/301-priprava_pregled.ipynb
@@ -65,28 +65,24 @@
    "source": [
     "### Podatki\n",
     "\n",
-    "V nalogi boste pregledali in pripravili podatke gledanosti Hollywoodskih filmov\n",
-    "zbirke [MovieLens](https://grouplens.org/datasets/movielens/) v obdobju **1995-2016**.\n",
+    "V nalogi boste pregledali in pripravili podatke gledanosti filmov zbirke [MovieLens](https://grouplens.org/datasets/movielens/).\n",
     "\n",
-    "Iste podatke boste uporabili v vseh nalogah, zato jih dodobra spoznajte. Gre za podatkovno zbirko za\n",
-    "vrednotenje priporočilnih sistemov, ki vsebuje gledalce ter njihove ocene za posamezni film na lestvici 1 do 5. \n",
-    "Poleg osnovne matrike uporabnikov in ocen vsebuje še dodatne podatke o filmih (npr. žanr, datum, oznake,\n",
-    "igralci).\n",
+    "Iste podatke boste uporabili v vseh nalogah, zato jih dodobra spoznajte. Gre za podatkovno zbirko za vrednotenje priporočilnih sistemov, ki vsebuje gledalce ter njihove ocene za posamezni film na lestvici 1 do 5. \n",
+    "Poleg osnovne matrike uporabnikov in ocen vsebuje še dodatne podatke o filmih (npr. žanr, datum, oznake, igralci).\n",
     "\n",
-    "Podatki so v mapi `./podatki/ml-latest-small`. Podatkovna zbirka vsebuje naslednje datoteke:\n",
+    "Podatke poberite [s spleta](https://files.grouplens.org/datasets/movielens/ml-32m.zip) in jih odpakirajte v mapo `./data`, kjer vas že čaka datoteka `cast.csv`. Podatkovna zbirka vsebuje naslednje datoteke:\n",
     "\n",
     "* ratings.csv: podatki o uporabnikih in ocenah,\n",
     "* movies.csv: podatki o žanrih filmov,\n",
     "* cast.csv: podatki o igralcih,\n",
-    "* tags.csv: podatki o oznakah (ang. \emph{tags}),\n",
+    "* tags.csv: podatki o oznakah (ang. *tags*),\n",
     "* links.csv: povezave na sorodne podatkovne zbirke.\n",
     "\n",
+    "Pred pričetkom reševanja naloge si dobro oglejte podatke in datoteko **README.txt**. \n",
     "\n",
-    "Pred pričetkom reševanja naloge si dobro oglejte podatke in datoteko **README.txt**. Podrobnosti o zbirki lahko preberete na [spletni strani](http://files.grouplens.org/datasets/movielens/ml-latest-small-README.html).\n",
+    "Pripravite metode za nalaganje podatkov v ustrezne podatkovne strukture. Te vam bodo prišle prav tudi pri nadaljnjih nalogah.\n",
     "\n",
-    "Pripravite metode za nalaganje podatkov v ustrezne podatkovne strukture. Te vam bodo prišle\n",
-    "prav tudi pri nadaljnjih nalogah.\n",
-    "Bodite pozorni na velikost podatkov."
+    "Podatkov **NE** nalagajte na svoje repozitorije."
    ]
   },
   {
@@ -99,26 +95,24 @@
    "source": [
     "### Data\n",
     "\n",
-    "In the task you will review and prepare Hollywood movie ratings from\n",
-    "the [MovieLens](https://grouplens.org/datasets/movielens/) collection from the period **1995-2016**.\n",
+    "In the task you will review and prepare movie ratings from the [MovieLens](https://grouplens.org/datasets/movielens/) collection.\n",
     "\n",
-    "The same data is used in all assignments, so you should get to know the data well. This is a database for\n",
-    "evaluating recommendations systems that include viewers and their ratings on a scale of 1 to 5.\n",
-    "In addition to the basic user and rating matrix, it includes also movie information (e.g., genre, date, tags, players).\n",
+    "The same data is used in all assignments, so you should get to know the data well. This is a dataset for evaluating recommendation systems that includes viewers and their ratings on a scale of 1 to 5.\n",
+    "In addition to the basic user and rating matrix, it also includes movie information (e.g., genre, date, tags, actors).\n",
     "\n",
-    "The dataset is in folder `./podatki/ml-latest-small`. The database contains the following files:\n",
+    "Download the dataset [from the web](https://files.grouplens.org/datasets/movielens/ml-32m.zip) and unpack it in the folder `./data`, where you already have the file `cast.csv`. The dataset contains the following files:\n",
     "\n",
     "* ratings.csv: user data and ratings,\n",
     "* movies.csv: movie genre information,\n",
-    "* cast.csv: player information,\n",
-    "* tags.csv: tag information (\emph{tags}),\n",
+    "* cast.csv: actor information,\n",
+    "* tags.csv: tag information,\n",
     "* links.csv: links to related databases.\n",
     "\n",
     "Before starting to solve the task, take a good look at the data and read the **README.txt** file. You can learn about the details on the [website](http://files.grouplens.org/datasets/movielens/ml-latest-small-README.html).\n",
     "\n",
-    "Prepare methods for loading data into the appropriate data structures. They will come in handy\n",
-    "also for further tasks.\n",
-    "Pay attention to the size of the data."
+    "Prepare methods for loading data into the appropriate data structures. They will also come in handy in later tasks.\n",
+    "\n",
+    "**Don't** upload the data to your repositories."
    ]
   },
   {