Replace multiple comma-separated languages with 'mul' #1181

Open
wants to merge 1 commit into
base: main
from

Conversation

Optimus-NP

Fixes #724

  • Refactored language code handling to replace multiple comma-separated values with 'mul'.
  • Single-language entries remain unchanged.
  • Improved consistency in language representation.
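
The rule described in the bullets above can be sketched as a small standalone function. This is only an illustration: `collapseLanguageCodes` is a hypothetical name, not a libkiwix function, and it assumes the input is the comma-separated string of language codes stored for a book. `mul` is the ISO 639-2 code for "multiple languages".

```cpp
#include <string>

// Hypothetical helper (not part of libkiwix): collapses a comma-separated
// list of language codes into the single tag shown for a book.
std::string collapseLanguageCodes(const std::string& commaSeparatedLangs)
{
  // A comma means at least two codes are present, so report "mul".
  if (commaSeparatedLangs.find(',') != std::string::npos) {
    return "mul";
  }
  // A single code (or an empty string) is returned unchanged.
  return commaSeparatedLangs;
}
```

Under these assumptions, `collapseLanguageCodes("eng,bam")` yields `mul`, while `collapseLanguageCodes("fra")` stays `fra`.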

Testing

/usr/local/bin/kiwix-serve --port=8080 --library /tmp/zimlibrary/linux_library.xml

I used this command to start the server locally, after making sure that the ZIM library has at least one book with multiple language codes.
Please check the screenshots below to verify that it works as intended.

Screenshot 2025-02-23 002830
Screenshot 2025-02-23 002732

@kelson42
Collaborator

Thank you for your PR... We will have a look at it.

That said, it's a mystery why the Wikipedia in Bambara is tagged as containing English content!

@Optimus-NP
Author

Thank you for your PR... We will have a look at it.

Great, thanks so much.

That said, it's a mystery why the Wikipedia in Bambara is tagged as containing English content!

In my example test case, I linked the Wikipedia in Bambara to both the eng and bam language codes. And yes, it is a mystery how the TED ZIM files show up on the website: Kiwix Library NoJS endpoint

Collaborator

@veloman-yunkan veloman-yunkan left a comment

For full consistency across the jsful and nojs versions you should also add a tooltip (via the title attribute of the <div> tag) that shows the language names on hover.

@@ -72,7 +72,7 @@ std::string HTMLDumper::dumpPlainHTML(kiwix::Filter filter) const
       contentId = urlEncode(nameMapper->getNameForId(bookId));
     } catch (...) {}
     const auto bookDescription = bookObj.getDescription();
-    const auto langCode = bookObj.getCommaSeparatedLanguages();
+    const auto langCode = (bookObj.getCommaSeparatedLanguages().find(',') != std::string::npos) ? "mul" : bookObj.getCommaSeparatedLanguages();
Collaborator

Long line. Please try to abide by the Thiruvananthapuram Convention on coding guidelines.

Author

Thanks for the feedback, comment addressed.

@Optimus-NP
Author

For full consistency across the jsful and nojs versions you should also add a tooltip (via the title attribute of the <div> tag) that shows the language names on hover.

Thanks for the suggestion, incorporated it in the code.

nojshover

const auto langList = bookObj.getCommaSeparatedLanguages();
const auto langCodes = kiwix::split(langList, ",", true, false);
std::string langCode = "";
std::string languageSelfName = "Undetermind";
Collaborator

  1. To avoid issues with translation, it is better to initialize this variable to "???".
  2. The singular form used in the name of the variable can be misleading. I would rename it to bookLanguages.

Author

Thank you, addressed the comment.

const auto bookIconUrl = rootLocation + "/catalog/v2/illustration/" + bookId + "/?size=48";
const auto tags = bookObj.getTags();
const auto downloadAvailable = (bookObj.getUrl() != "");
std::string faviconAttr = "style=background-image:url(" + bookIconUrl + ")";
std::string languageAttr = "title=" + languageSelfName + " aria-label=" + languageSelfName;
Collaborator

The values of HTML attributes should be quoted. This is best solved by updating static/templates/no_js_library_page.html - instead of passing languageAttr, the title and aria-label attributes should appear explicitly in the template and their value should be passed via a differently named parameter (as proposed in the other comment).
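
To illustrate why quoting matters: an unquoted HTML attribute value ends at the first whitespace character, so a language name containing a space would be cut short. The sketch below shows one way to build a quoted, escaped attribute in C++. Both function names are hypothetical and exist only for this illustration; the approach recommended in the comment above (putting the attributes directly in the template) remains the preferred fix.

```cpp
#include <string>

// Hypothetical helper for illustration only: escapes the characters that
// are significant inside a double-quoted HTML attribute value.
std::string escapeAttributeValue(const std::string& value)
{
  std::string escaped;
  for (const char c : value) {
    switch (c) {
      case '&': escaped += "&amp;";  break;
      case '"': escaped += "&quot;"; break;
      case '<': escaped += "&lt;";   break;
      case '>': escaped += "&gt;";   break;
      default:  escaped += c;
    }
  }
  return escaped;
}

// Hypothetical helper: builds name="value" with the value quoted and escaped.
std::string makeQuotedAttribute(const std::string& name,
                                const std::string& value)
{
  return name + "=\"" + escapeAttributeValue(value) + "\"";
}
```

With this sketch, a value such as `Bahasa Indonesia` is emitted as `title="Bahasa Indonesia"` and survives intact, whereas the unquoted form `title=Bahasa Indonesia` would be parsed as the value `Bahasa` followed by a separate attribute named `Indonesia`.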

Author

That's great feedback; addressed.

Comment on lines 75 to 91
-const auto langCode = bookObj.getCommaSeparatedLanguages();
+const auto langList = bookObj.getCommaSeparatedLanguages();
+const auto langCodes = kiwix::split(langList, ",", true, false);
+std::string langCode = "";
+std::string languageSelfName = "Undetermind";
+
+if (langCodes.size() > 1) {
+  std::vector<std::string> mulLanguages;
+  langCode = "mul";
+  for (const auto& lang : langCodes) {
+    mulLanguages.push_back(getLanguageSelfName(lang));
+  }
+  languageSelfName = kiwix::join(mulLanguages, ",");
+} else if (langCodes.size() == 1) {
+  langCode = langCodes[0];
+  languageSelfName = getLanguageSelfName(langCode);
+}
+languageSelfName = kiwix::toTitle(languageSelfName);
Collaborator

The function containing this change was already longer than typical human short-term memory can accommodate and this change makes the situation worse. Please encapsulate this enhancement in a couple of helper functions makeLanguagesShortString() and makeLanguagesFullString() so that the relevant code in dumpPlainHTML() reads (BTW, note the use of Book::getLanguages() instead of Book::getCommaSeparatedLanguages()):

    const auto bookLangs = book.getLanguages();

    booksData.push_back(kainjow::mustache::object{
      ...
      {"langShortString", makeLanguagesShortString(bookLangs)},
      {"langFullString",  makeLanguagesFullString(bookLangs)},
      ...
    });

Note that the switch from langCode to langShortString suggests a further UX enhancement (not for this PR, though, since it has to be discussed first): if the list of languages is limited to two (or maybe even three) languages, all of their codes may be shown instead of being replaced with mul.
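
A minimal sketch of what the two helpers requested above might look like follows. It assumes the language list is available as a vector of strings. `lookupSelfName` is a stub standing in for the real language-name lookup (`getLanguageSelfName` in the PR) and knows only two codes. Returning an empty string for an empty list is also an assumption, not something the review specifies.

```cpp
#include <string>
#include <vector>

// Stub standing in for the real language-name lookup used by the PR;
// it is hypothetical and only recognises two codes.
std::string lookupSelfName(const std::string& langCode)
{
  if (langCode == "eng") return "English";
  if (langCode == "deu") return "Deutsch";
  return "";
}

// Short form shown on the book tile: the code itself for a single
// language, "mul" when there are several.
std::string makeLanguagesShortString(const std::vector<std::string>& langs)
{
  if (langs.empty()) {
    return "";  // assumption: nothing to display for an empty list
  }
  if (langs.size() == 1) {
    return langs.front();
  }
  return "mul";
}

// Full form used for the tooltip: comma-separated language names,
// in the same order as the input codes.
std::string makeLanguagesFullString(const std::vector<std::string>& langs)
{
  std::string result;
  for (const auto& lang : langs) {
    if (!result.empty()) {
      result += ",";
    }
    result += lookupSelfName(lang);
  }
  return result;
}
```

Keeping the two strings in separate functions leaves room for the UX variation mentioned above (showing two or three codes instead of `mul`) without touching the tooltip logic.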

Author

Indeed, adding the helper function is much cleaner.

langCode = langCodes[0];
languageSelfName = getLanguageSelfName(langCode);
}
languageSelfName = kiwix::toTitle(languageSelfName);
Collaborator

Shouldn't toTitle() be applied to each individual language?

Author

The attached demo already shows an example with multiple languages, and it displays as expected. For consistency, however, I now apply toTitle() to each individual language.

@kelson42
Collaborator

kelson42 commented Mar 4, 2025

Sorry for the late feedback, but from the user's perspective, it should work exactly as it does on the JS homepage https://library.kiwix.org/#q=&category=ted

@Optimus-NP
Author

Sorry for the late feedback, but from the user's perspective, it should work exactly as it does on the JS homepage https://library.kiwix.org/#q=&category=ted

Thanks for the review.

Collaborator

@veloman-yunkan veloman-yunkan left a comment

Please squash your changes into a single commit

@Optimus-NP Optimus-NP force-pushed the kiwix-tools_issues_724 branch from f12e219 to da16aea Compare March 6, 2025 15:57
@Optimus-NP
Author

Please squash your changes into a single commit

Addressed this comment.

@Optimus-NP
Author

@kelson42 Requesting approval on the PR.

Collaborator

@veloman-yunkan veloman-yunkan left a comment

LGTM. Ok to merge.

@Optimus-NP
Author

@kelson42 Can you please help me understand this failure?
If it is safe to merge, please go ahead and merge it.

@kelson42
Collaborator

kelson42 commented Mar 6, 2025

Many things in the CI do not pass (and this is not only packaging). I cannot merge like this.

@Optimus-NP
Author

Many things in the CI do not pass (and this is not only packaging). I cannot merge like this.

Thank you, I will look into the non-packaging failures.

@Optimus-NP Optimus-NP force-pushed the kiwix-tools_issues_724 branch from da16aea to 7e80399 Compare March 8, 2025 18:17
- Refactored language code handling to replace multiple comma-separated values with 'mul'.
- Single-language entries remain unchanged.
- created the helper function to get the lang tag
- ensured the consistent behaviour for js and nojs version for the kiwix library view
@Optimus-NP Optimus-NP force-pushed the kiwix-tools_issues_724 branch from 7e80399 to fe1d3d9 Compare March 8, 2025 19:37
@Optimus-NP
Author

Many things in the CI do not pass (and this is not only packaging). I cannot merge like this.

I figured out that the nojs tests were failing in the library server test. I have now fixed them.

@Optimus-NP
Author

@kelson42 @veloman-yunkan Requesting approval on the PR.

@kelson42
Collaborator

kelson42 commented Mar 9, 2025

@veloman-yunkan This is still failing, I guess we face something which might be unrelated to this PR?

@Optimus-NP
Author

@veloman-yunkan This is still failing, I guess we face something which might be unrelated to this PR?

Thank you for approving and running these workflows. I looked into the test case failure at the link; below is the diff observed.

With diff:
@@ -100,5 +100,5 @@
             </a>
             <div class=\"book__meta\">
-              <div class=\"book__languageTag\" title=\"\" aria-label=\"\">fra</div>
+              <div class=\"book__languageTag\" title=\"Fran\xC3\xA7" "ais\" aria-label=\"Fran\xC3\xA7" "ais\">fra</div>
               <div class=\"book_tags\"><div class=\"book_tags--wrapper\">
                   <span class=\"tag__link\" aria-label='unittest' title='unittest'>unittest</span>
@@ -125,5 +125,5 @@
             </a>
             <div class=\"book__meta\">
-              <div class=\"book__languageTag\" title=\"\" aria-label=\"\">eng</div>
+              <div class=\"book__languageTag\" title=\"English\" aria-label=\"English\">eng</div>
               <div class=\"book_tags\"><div class=\"book_tags--wrapper\">
                   <span class=\"tag__link\" aria-label='public_tag_without_a_value' title='public_tag_without_a_value'>public_tag_without_a_value</span>
@@ -150,5 +150,5 @@
             </a>
             <div class=\"book__meta\">
-              <div class=\"book__languageTag\" title=\",\" aria-label=\",\">mul</div>
+              <div class=\"book__languageTag\" title=\"\xD0\xA0\xD1\x83\xD1\x81\xD1\x81\xD0\xBA\xD0\xB8\xD0\xB9,English\" aria-label=\"\xD0\xA0\xD1\x83\xD1\x81\xD1\x81\xD0\xBA\xD0\xB8\xD0\xB9,English\">mul</div>
               <div class=\"book_tags\"><div class=\"book_tags--wrapper\">
                   <span class=\"tag__link\" aria-label='public_tag_with_a_value:value_of_a_public_tag' title='public_tag_with_a_value:value_of_a_public_tag'>public_tag_with_a_value:value_of_a_public_tag</span>
@@ -169,3 +169,3 @@
     </body>
 </html>
-"

If I look at the difference, it seems that the full language name is missing, indicating that the getLanguageSelfName function returned an empty response. As you mentioned, this issue doesn’t appear to be caused by my PR.

@kelson42, could we proceed with merging this while we track the failure separately in a new GitHub issue?

Successfully merging this pull request may close these issues.

"nojs" mode does not handle properly the ZIM with multiple languages
3 participants