Lab5 fixes #53

Deaponn · 2024-12-27T15:21:46Z

The first commit fixes some typos.
The second commit fixes the code so that it uses every paragraph in the `train_data` and `dev_data` datasets; details below.

Sometimes an article in `train_data` or `dev_data` has more than one paragraph.
In that case, `data["paragraph"][0]` uses only the first paragraph and drops the rest.
The check
`set([len(article["paragraphs"]) for article in train_data])` outputs `{1, 2}`,
and the same check on `dev_data` gives the same result.
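A minimal sketch of that check on toy data, for illustration only. The nested layout (`paragraphs`, `context`, `qas`) is an assumption based on the SQuAD-style format; the actual structure of `train_data` is not shown in this PR, and `toy_articles` is a hypothetical stand-in.

```python
# Hypothetical toy articles in an assumed SQuAD-like layout.
toy_articles = [
    {"title": "A", "paragraphs": [
        {"context": "c1", "qas": [{"question": "q1"}]},
    ]},
    {"title": "B", "paragraphs": [
        {"context": "c2", "qas": [{"question": "q2"}]},
        {"context": "c3", "qas": [{"question": "q3"}, {"question": "q4"}]},
    ]},
]

# Same check as in the description: distinct paragraph counts per article.
paragraph_counts = {len(article["paragraphs"]) for article in toy_articles}

# Number of paragraphs that indexing with [0] would silently skip.
skipped = sum(len(article["paragraphs"]) - 1 for article in toy_articles)

print(paragraph_counts, skipped)
```

Any value above 1 in `paragraph_counts` means that taking only index 0 loses data.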

Fixed the code to use every paragraph in the data source. Changes:

In the cell right below the header "Ładowanie danych" ("Loading data"), line 1376:
Train data articles: 8553 -> 11624
Dev data articles: 1402 -> 1453
Train questions: 41577 -> 56618
Dev questions: 6809 -> 7060

In the cell calculating all_contexts, line 1415:
len(all_contexts): 9955 -> 13077

In the cell that reshapes the dataset into the Pytanie: Kontekst: ("Question: Context:") form, line 1466:
Total count in train/dev: 75605/12372 -> 102805/12824
Positive count in train/dev: 34028/5563 -> 46187/5764
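The increase in these counts follows from flattening over all paragraphs instead of taking only the first one per article. The sketch below shows the idea on toy data; it is not the notebook's actual code. The helper names (`first_paragraph_only`, `all_paragraphs`, `count_questions`) and the `paragraphs`/`context`/`qas` keys are assumptions made for illustration.

```python
# Hypothetical toy articles in an assumed SQuAD-like layout.
toy_articles = [
    {"paragraphs": [
        {"context": "c1", "qas": [{"question": "q1"}]},
    ]},
    {"paragraphs": [
        {"context": "c2", "qas": [{"question": "q2"}]},
        {"context": "c3", "qas": [{"question": "q3"}, {"question": "q4"}]},
    ]},
]


def first_paragraph_only(articles):
    """Old behaviour: one paragraph per article, the rest are dropped."""
    return [article["paragraphs"][0] for article in articles]


def all_paragraphs(articles):
    """Fixed behaviour: every paragraph of every article."""
    return [p for article in articles for p in article["paragraphs"]]


def count_questions(paragraphs):
    """Total number of questions attached to the given paragraphs."""
    return sum(len(p["qas"]) for p in paragraphs)


before = first_paragraph_only(toy_articles)
after = all_paragraphs(toy_articles)

print(len(before), "->", len(after))
print(count_questions(before), "->", count_questions(after))
```

On the toy data the paragraph and question counts both grow after the change, which mirrors the direction of the differences reported above.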


Deaponn added 2 commits December 27, 2024 16:19

Fix typos

81e445f

Deaponn changed the title ~~Fix typos~~ Lab5 fixes Dec 30, 2024
