Investigate using EPUB for ebooks #1390

Open
j9t opened this issue Oct 28, 2020 · 6 comments
Labels: development (Building the Almanac tech stack), enhancement (New feature or request)

@j9t

j9t commented Oct 28, 2020

The 2019 ebook refers to and depends on links, but none seem to be present at least in the version on Google Play Books.

Screenshots attached for both mobile and desktop. I haven’t checked the whole book, but so far no link seems accessible. (This doesn’t seem intended. If it was, please consider links at least for author information and for anything indicated as a link, as with expressions starting with “http”.)

[Screenshots: Screen Shot 2020-10-28 at 17 35 41, Screenshot_20201028-162047, Screenshot_20201028-162237]

Also some centering issues as noted in #1391

@rviscomi added this to the 2019 Backlog milestone Oct 28, 2020
@rviscomi added the bug (Something isn't working) and development (Building the Almanac tech stack) labels Oct 28, 2020
@rviscomi

rviscomi commented Oct 28, 2020

@tunetheweb seems like the URLs in the footnotes were all removed by Books. Any ideas on how to work around that?

@tunetheweb

Oh, looks like they are gone in the PDF version too: https://almanac.httparchive.org/static/pdfs/web_almanac_2019_en.pdf

The links still work (at least on PDFs) but the footnotes showing the URL are gone.

Will take a look.

@tunetheweb

I tell a lie - they are still there online. Phew.

@rviscomi, I don't show footnotes when the full URL is shown, as that seems a bit redundant.

So my profile for example at the bottom of the HTTP/2 chapter has this:

Barry Pollard profile

But only includes the hidden link to my book, at the bottom of the page:

Footnote 46

There is no footnote URL for my social media icons, nor for the Twitter account and website in the text. This was for presentational reasons, as otherwise we ended up with loads and loads of footnotes that looked really untidy. To me the URL is obvious from those links, so it felt better to hide the footnotes.

It appears Google Books does include the footnotes, so that's good. @j9t I presume that is what you are showing in your 3rd screenshot, with Una's links showing?

However, it looks like it removes the clickable links themselves 😞, both from the original link and from the footnotes. The PDF version has clickable links both in the text and in the footnotes, which is much nicer.

I suspect this is to do with the auto conversion of PDF to EPUB. I did look at converting our PDF to EPUB (using Calibre) but didn't get nice results. The table of contents for example is below:

Calibre EPUB conversion

So on one hand I'm impressed that Google Books did such a good job on converting so it looks nice. But unfortunately it doesn't retain links. I don't think we can solve this until we find a decent PDF -> EPUB conversion tool.
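
When comparing converters, one rough way to check whether a given EPUB retained its links is to count the anchors inside it. A minimal sketch, assuming the EPUB stores its chapters as `.xhtml`/`.html` files inside the ZIP container (the usual layout); `LinkCounter` and `count_epub_links` are hypothetical names for illustration, not part of the Almanac code:

```python
import zipfile
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Collects href values of <a> elements in one (X)HTML document."""

    def __init__(self):
        super().__init__()
        self.external = []  # http(s) links to websites
        self.internal = []  # everything else, e.g. "#fn46" or "chapter2.xhtml"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return  # anchor without a target, not a clickable link
        if href.startswith(("http://", "https://")):
            self.external.append(href)
        else:
            self.internal.append(href)


def count_epub_links(epub_file):
    """Return (external, internal) link counts for an EPUB path or file object."""
    external = internal = 0
    with zipfile.ZipFile(epub_file) as archive:
        for name in archive.namelist():
            # Assumption: content documents use one of these extensions.
            if not name.lower().endswith((".xhtml", ".html", ".htm")):
                continue
            parser = LinkCounter()
            parser.feed(archive.read(name).decode("utf-8", errors="replace"))
            parser.close()
            external += len(parser.external)
            internal += len(parser.internal)
    return external, internal
```

Running `count_epub_links("converted.epub")` (a placeholder file name) on the output of each candidate tool would show quickly whether the links survived, without having to page through the book in a reader.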

@tunetheweb changed the title from “2019 ebook: Fix or add links” to “2019 ebook not formatting correctly in Google Play Books due to conversion from PDF to EPUB” Oct 28, 2020
@tunetheweb changed the title from “2019 ebook not formatting correctly in Google Play Books due to conversion from PDF to EPUB” to “2019 ebook not formatting correctly in Google Play Books” Oct 28, 2020
@j9t

j9t commented Oct 28, 2020

On the chance that I can contribute somehow:

  1. For PDF to EPUB conversion there must be many tools. Could some other tool do the job?

  2. What’s the source material: Markdown, HTML? I can’t tell how much work it would be to switch, but Leanpub is one example where that conversion works really well, generating decently formatted books in PDF, EPUB, and MOBI from HTML or Markdown. I swear by it, and the output is definitely compatible with Google Books.

Just on the chance this can be useful. I can tell that much work already went into this.

@tunetheweb

Just found this note: https://support.google.com/books/partner/answer/107073?hl=en&ref_topic=3238502

Hyperlinks
If your PDF contains hyperlinks, either to other parts of the same book or to external websites, please note that the links will be disabled when your book is processed.

Looks like it is supported for EPUB though: https://support.google.com/books/partner/answer/3316879?hl=en&ref_topic=3238502

The source is HTML (https://almanac.httparchive.org/en/2019/ebook) and CSS Paged Media. We then use PrinceXML to convert to PDF.

I'm open to ideas on better EPUB converters. Calibre seemed to be a recommended free one last time I looked but, as I say, the results weren't great.
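
Since the source is already HTML, one option would be to package it as EPUB directly rather than going through PDF. The sketch below shows roughly what an EPUB 3 container involves, using only the Python standard library. It is not how the Almanac is built: `build_epub` is a hypothetical helper, the single-chapter layout and the `OEBPS/` paths are assumptions, the body is expected to be well-formed XHTML, and the output has not been validated with EPUBCheck.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">{identifier}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter" href="chapter.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>{title}</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="chapter.xhtml">{title}</a></li></ol>
    </nav>
  </body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body>
{body}
  </body>
</html>
"""


def build_epub(title, body_xhtml, identifier, modified="2020-10-28T00:00:00Z"):
    """Return the bytes of a single-chapter EPUB whose links stay real <a> elements."""
    safe_title = escape(title)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry must come first and must be stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", PACKAGE_OPF.format(
            identifier=escape(identifier), title=safe_title, modified=modified))
        archive.writestr("OEBPS/nav.xhtml", NAV_XHTML.format(title=safe_title))
        archive.writestr("OEBPS/chapter.xhtml", CHAPTER_XHTML.format(
            title=safe_title, body=body_xhtml))
    return buffer.getvalue()
```

The point of the sketch is only that hyperlinks in the HTML source are carried over unchanged as `<a href>` elements, which is the part that gets lost when going through PDF. A real ebook would still need multiple chapters, stylesheets, images, and fonts added to the manifest.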

@rviscomi modified the milestones: 2019 Backlog, 2020 Backlog Nov 11, 2020
@rviscomi changed the title from “2019 ebook not formatting correctly in Google Play Books” to “Investigate using EPUB for ebooks” Dec 1, 2020
@rviscomi

rviscomi commented Dec 1, 2020

Renaming the issue to focus on the EPUB format, which should solve the centering issue in #1391 and the clickable links issue reported here.

@rviscomi added the enhancement (New feature or request) label and removed the bug (Something isn't working) label Dec 1, 2020

3 participants