Build Channel for Funda Wande #13

Closed
jredrejo opened this issue Feb 15, 2024 · 12 comments

@jredrejo
Member

Overview

Build public channel for https://fundawande.org/learning-resources

Description and outcomes

Content is in pdf format

License: CC-BY (page 3 of the workbooks)
Copyright owner: Funda Wande

@jredrejo
Member Author

@radinamatic
Member

I updated the channel to the most recent version where the note stated that 2 corrupted PDF files have been replaced, and it did fix one that I noted in my first test, but I can still see at least 2 others:

[Screenshots: fundawande, fundawande2]

@jredrejo changed the title from "Build Channel for Fund Awande" to "Build Channel for Funda Wande" on Mar 1, 2024
@jredrejo
Member Author

jredrejo commented Mar 1, 2024

@radinamatic thank you. The Funda Wande site was really slow and some files seem to have been downloaded with problems. I've detected some, but I haven't opened all 118 PDF files to check every one of them.
I've re-downloaded the two you found and created a new version of the channel. Please let me know if you find any others.

@radinamatic
Member

So I downloaded the full channel, but to expedite checking if any of the PDF files are corrupted (I also did not have the patience to go through 100+) I used another application that found 6 others (they are marked corrupted and maybe corrupted in the attached report).

CorruptedPDFinder_results.txt

[Screenshot: corrupted-PDFs]

I confirmed that they cannot be opened in Firefox, so the report is probably accurate. Do you have the means of figuring out which files are those from the Studio resource file name? I can't think of the way to do it myself in Kolibri... 🤔

There are also other commands you could use that I would have tried had I not decided to test this channel in Windows... 🤦🏽‍♀️
Could these be run after you download the files from the source site, and prior to uploading them to Studio?

It comes to mind that checking that downloaded PDF files are not corrupted (and maybe retrying the download) would be a useful addition to the chef workflow in any case. Does ricecooker have something to that effect, @rtibbles?
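
For illustration, a pre-upload check of the kind suggested above could look roughly like the sketch below. It is not ricecooker API: `looks_like_complete_pdf`, `download_with_check`, and the `fetch` callable are hypothetical names, and the check is only a heuristic (a `%PDF-` header near the start and an `%%EOF` marker near the end), which should catch truncated downloads but not every kind of corruption.

```python
from pathlib import Path
from typing import Callable


def looks_like_complete_pdf(path: Path, window: int = 1024) -> bool:
    """Heuristic check: '%PDF-' near the start and '%%EOF' near the end.

    A truncated download usually lacks the trailing '%%EOF' marker.
    Passing this check does not prove the file is fully valid.
    """
    size = path.stat().st_size
    if size < 8:
        return False
    with path.open("rb") as f:
        head = f.read(window)
        f.seek(max(0, size - window))
        tail = f.read()
    return b"%PDF-" in head and b"%%EOF" in tail


def download_with_check(
    fetch: Callable[[Path], None], dest: Path, attempts: int = 3
) -> bool:
    """Call the hypothetical fetch(dest) until the file passes the check.

    Returns True once a download passes, False if every attempt fails.
    """
    for _ in range(attempts):
        fetch(dest)
        if dest.exists() and looks_like_complete_pdf(dest):
            return True
    return False
```

A chef script could call `download_with_check` for each PDF before adding it to the channel tree, and log the files that still fail after the last attempt so they can be inspected by hand.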

@radinamatic
Member

Some other points for improvement:

  1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?

    [Screenshot: fundawande3]
  2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.

  3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.

    [Screenshot: fundawande4]

    I understand not all may be available in English, but could we at least have them in the proper ascending order, even if not fully sequential?

@jredrejo
Member Author

jredrejo commented Mar 4, 2024

> So I downloaded the full channel, but to expedite checking if any of the PDF files are corrupted (I also did not have the patience to go through 100+) I used another application that found 6 others (they are marked corrupted and maybe corrupted in the attached report).
>
> CorruptedPDFinder_results.txt
>
> [Screenshot: corrupted-PDFs]
>
> I confirmed that they cannot be opened in Firefox, so the report is probably accurate. Do you have the means of figuring out which files are those from the Studio resource file name? I can't think of the way to do it myself in Kolibri... 🤔
>
> There are also other commands you could use that I would have tried had I not decided to test this channel in Windows... 🤦🏽‍♀️ Could these be run after you download the files from the source site, and prior to uploading them to Studio?
>
> It comes to mind that checking that downloaded PDF files are not corrupted (and maybe retrying the download) would be a useful addition to the chef workflow in any case. Does ricecooker have something to that effect, @rtibbles?

Thank you. After checking them, only 3 of them were corrupted.
Also, the ricecooker check is a good idea. Could you open an issue in ricecooker for it? If not, I'll file it.

@jredrejo
Member Author

jredrejo commented Mar 4, 2024

> Some other points for improvements:
>
>   1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?
>     [Screenshot: fundawande3]

There were some trailing spaces in the name of the topic, coming from errors in the original page. Fixed.
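
As a rough illustration of this kind of fix (not the actual chef code), topic titles can be normalized before resources are grouped, so that "Reading Academy " and "Reading Academy" end up in the same folder. The function names and the `(topic_title, resource_name)` input shape are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize_title(raw: str) -> str:
    """Strip leading/trailing whitespace and collapse inner runs of it."""
    return " ".join(raw.split())


def group_by_topic(items: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (topic_title, resource_name) pairs under normalized titles.

    The pair format is hypothetical; a real chef would build topic nodes.
    """
    topics: Dict[str, List[str]] = defaultdict(list)
    for title, resource in items:
        topics[normalize_title(title)].append(resource)
    return dict(topics)
```

With this normalization, a scraped title with a trailing space no longer produces a second folder with a visually identical name.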

>   2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.

Actually, I don't know why a dash is better than a slash, but as I don't have an opinion on it, I've changed it.

>   3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.
>     [Screenshot: fundawande4]
>     I understand not all may be available in English, but could we at least have them in the proper ascending order, even if not fully sequential?

In fact, they were sorted by module, but the module was not visible in the names, so I've added it.
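
A minimal sketch of the idea, under stated assumptions: the `Resource` fields, the `Module N - ...` title format, and the tie-break on the resource number within a module are all hypothetical, since the thread only says that resources are sorted by module and that the module was added to the names.

```python
from typing import Iterable, List, NamedTuple


class Resource(NamedTuple):
    module: int   # module the resource belongs to (assumed to be an integer)
    number: int   # resource number as shown on the source site
    title: str


def display_title(r: Resource) -> str:
    """Make the sort key visible by prefixing the module (format is assumed)."""
    return f"Module {r.module} - {r.number}. {r.title}"


def ordered_titles(resources: Iterable[Resource]) -> List[str]:
    """Sort by module first, then by resource number within the module."""
    ordered = sorted(resources, key=lambda r: (r.module, r.number))
    return [display_title(r) for r in ordered]
```

Showing the module in the title explains why resource 55 can legitimately appear before resource 2: they belong to different modules.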

@radinamatic
Member

> Thank you. After checking them, only 3 of them were corrupted.

I updated the channel locally, but there are still 6 PDF files that are reported as corrupted in all browsers (Firefox, Chrome and Edge).
CorruptedPDFinder_results.txt

One weird thing happened during the update: it looked like it was performed in 2 separate tasks, one of which failed, but upon another check to import more, the channel was apparently fully downloaded with all resources on device 🤷🏽‍♀️

[Screenshots: task check, 2024-03-04_20-03-11, 2024-03-04_22-30-22]

> Also, the ricecooker check is a good idea. Could you open an issue in ricecooker for it? If not, I'll file it.

Done! 🙂

@radinamatic
Member

> > Some other points for improvements:
> >
> >   1. There are 2 folders named Reading Academy, which is confusing. One contains videos, and the other PDF files. Given that those two types of resources are presented together in the rest of the folders, can we be fully consistent here and have one Reading Academy folder with both?
>
> There were some trailing spaces in the name of the topic, coming from errors in the original page. Fixed.

Excellent! ✔️

> >   2. I can understand why we may not want to create deep nested subfolders for Term 1, Term 2 etc. inside Literacy Workbooks and Teaching Guides, but instead prepend the respective term to the resource name. Could we use the dash - instead of the slash / to do that? I believe it would improve the readability.
>
> Actually, I don't know why a dash is better than a slash, but as I don't have an opinion on it, I've changed it.

I don't have hard empirical data on this, but I suspect that all but nerdy computer people would find separating with dashes easier to read 😛
Thank you for changing that! 🙏🏽

> >   3. The order of the resources inside the Reading for Meaning Course folder is a bit scattered. After the modules the resources are numbered, but start with 55, then 2, 14, 23, 139, 122... And it ends with 234, 1, 7, 142.
>
> In fact, they were sorted by module, but the module was not visible in the names, so I've added it.

Ok, thank you, looks less unordered now! 👍🏽

@radinamatic
Member

I re-checked all the files that kept being reported as corrupted and maybe corrupted, and was able to open them through Kolibri, so the content checks out. Good work! 👏🏽 💯 :shipit:

@jredrejo
Member Author

@rtibbles waiting for your technical review after the good to go passes from @radinamatic & @revanthvle

@rtibbles
Member

rtibbles commented May 7, 2024

This is complete - technical pieces are fine.

@rtibbles rtibbles closed this as completed May 7, 2024