[YouTube] Add support for poTokens #11955

Merged
merged 19 commits into from
Feb 5, 2025

Conversation

Stypox
Member

@Stypox Stypox commented Jan 25, 2025

What is it?

  • Bugfix (user facing)

Description of the changes in your PR

General information about poTokens and about this PR structure:

  • YouTube now requires integrity checks to access their clients. The most "vulnerable" client is the WEB client, since they can't enforce integrity checks on all web browsers, so that's the only client (for now) that we have found a way to obtain an integrity token for.
  • In order to obtain a poToken, we need to run BotGuard, an obfuscated virtual machine implemented in JavaScript that performs the integrity checks and gives us an integrity token. In order to make the integrity checks succeed, we need to run this VM in an environment that resembles a browser as much as possible. The integrity token can be used to generate multiple poTokens. Two network requests are needed: Create to obtain the VM code, GenerateIT to obtain the integrity token after running the VM code. See the README here for the detailed steps.
  • PoTokenGenerator is the base class for all poToken generators. It has a factory method that allows asynchronously obtaining a new instance of a PoTokenGenerator, and then two methods to generate a poToken given a specific identifier, and a method to check if the integrity token has expired.
  • PoTokenWebView is currently the only implementation of PoTokenGenerator, but we might want to add other implementations in the future, e.g. ones that do not rely on WebView.
  • PoTokenProviderImpl implements the extractor interface and is supposed to take care of possibly multiple PoTokenGenerators (although right now there is only one based on WebView). It takes care of retrying in case of problems, recreates a new PoTokenGenerator if the current one expired, and finally returns a PoTokenResult. A PoTokenResult contains two poTokens: one for the specific requested video id (used to fetch the player), and another that can be generated only once as the first thing and is specific to a visitor data (used in streaming urls).
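
The provider behaviour described in the last bullet can be sketched roughly as follows. This is a minimal JavaScript illustration and not the Kotlin code in this PR: `createFakeGenerator`, `createProvider`, `getPoTokens`, the synchronous generator and the single retry are all assumptions made for the example.

```javascript
// Hypothetical stand-in for a WebView-backed generator. The real one is
// obtained asynchronously; this one is synchronous to keep the sketch short.
function createFakeGenerator(lifetimeMs, now) {
  var createdAt = now();
  return {
    generatePoToken: function (identifier) { return "pot(" + identifier + ")"; },
    isExpired: function () { return now() - createdAt >= lifetimeMs; }
  };
}

// Hypothetical provider: owns one generator, recreates it once it has
// expired, and retries one time if anything throws (retry count assumed).
function createProvider(newGenerator, visitorData) {
  var generator = null;
  var streamingPot = null;

  function ensureGenerator() {
    if (generator === null || generator.isExpired()) {
      generator = newGenerator();
      // The visitor-data token is generated once, as the first token
      // requested from a fresh generator, and then reused.
      streamingPot = generator.generatePoToken(visitorData);
    }
  }

  return {
    getPoTokens: function (videoId) {
      var lastError = null;
      for (var attempt = 0; attempt < 2; ++attempt) {
        try {
          ensureGenerator();
          return {
            playerPot: generator.generatePoToken(videoId), // per video id
            streamingPot: streamingPot,                    // per visitor data
            visitorData: visitorData
          };
        } catch (error) {
          lastError = error;
          generator = null; // drop the broken generator before retrying
        }
      }
      throw lastError;
    }
  };
}
```

The point of the split is that callers only ever see the provider, so generator expiry and recreation stay an internal detail.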

TODO:

  • The JavaScript poToken implementation comes from https://github.com/LuanRT/BgUtils
  • Obtaining a poToken via WebView
  • Obtaining a poToken with something like HtmlUnit not doable unfortunately
  • Handling devices that don't have a WebView (needs to be tested)
  • Passing the poToken to the extractor when requested
  • Passing the poToken to player network requests (not sure if needed?)
  • Change challenge fetching to InnerTube and sending integrity result to youtube.com like the desktop website does, this avoids requirement of the jnn-pa.googleapis.com domain - partially done, InnerTube attestation endpoint is not used yet
  • Understand whether we need to change user agent everywhere

You can test whether the poTokens generated work also using the latest yt-dlp commit from their git repo (older commits won't work!), this way (take PLAYER_POT, STREAMING_POT and VISITOR_DATA from logcat):

yt-dlp "https://www.youtube.com/watch?v=i_SsnRdgitA" --extractor-args 'youtube:player_client=web;player-skip=webpage,configs;po_token=web.player+PLAYER_POT,web.gvs+STREAMING_POT;visitor_data=VISITOR_DATA'
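
If you are testing several token sets, a small helper can assemble the `--extractor-args` value. This is only a convenience sketch: `buildYtDlpExtractorArgs` is a hypothetical name, and the string format is copied from the command above rather than from yt-dlp's documentation.

```javascript
// Builds the --extractor-args value used in the command above from the
// three values read from logcat. The function name is hypothetical.
function buildYtDlpExtractorArgs(playerPot, streamingPot, visitorData) {
  return "youtube:player_client=web;player-skip=webpage,configs;" +
    "po_token=web.player+" + playerPot + ",web.gvs+" + streamingPot + ";" +
    "visitor_data=" + visitorData;
}
```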

Fixes the following issue(s)

Relies on the following changes

APK testing

The APK can be found by going to the "Checks" tab below the title. On the left pane, click on "CI", scroll down to "artifacts" and click "app" to download the zip file which contains the debug APK of this PR. You can find more info and a video demonstration on this wiki page.

Due diligence

@Stypox
Member Author

Stypox commented Jan 26, 2025

Now the PR builds fine based on TeamNewPipe/NewPipeExtractor#1247, you can download the APK which uses poTokens! Let us know if you notice any issues.

private val TAG = PoTokenWebView::class.simpleName
private const val GOOGLE_API_KEY = "AIzaSyDyT5W0Jh49F30Pqqtyfdf7pDLFKLJoAnw"
private const val REQUEST_KEY = "O43z0dpjhgX20SCx4KAo"
private const val USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.3"

Could this be Firefox ESR like in DownloaderImpl?

Member Author

No, for some reason it does not work with the Firefox user agent. It would work with the curl user agent though, I don't know why...

@ale5000-git ale5000-git Jan 28, 2025

Perhaps checks on Firefox are stricter.
Just a deduction of what may be the cause:
Some web servers may also check the case of header names (User-Agent vs. user-agent; this also applies to other headers), and even the order in which the headers are sent to the server may matter.
The WebView is more likely to have an implementation similar to Chrome, so it will be more likely to fail with Firefox User-Agent since the implementation differs completely.

@gechoto

gechoto commented Jan 27, 2025

Would it be possible to move the po token implementation to a library?

Currently this is in NewPipe (the app repo) which makes it inaccessible by other apps which also have the need for po tokens.

This will lead to a lot of duplicate code because it needs to be implemented over and over again for each YT client app.

Would be cool if this can be maintained in just one place (and multiple apps could benefit like it is already the case with NewPipeExtractor).

@Figim

Figim commented Jan 27, 2025

Would it be possible to move the po token implementation to a library?

Currently this is in NewPipe (the app repo) which makes it inaccessible by other apps which also have the need for po tokens.

This will lead to a lot of duplicate code because it needs to be implemented over and over again for each YT client app.

Would be cool if this can be maintained in just one place (and multiple apps could benefit like it is already the case with NewPipeExtractor).

You can recreate this PR in your own application.

This simply connects to the extractor to support poToken streams. You will need to do this separately in your application. It should have been like this.

@gechoto

gechoto commented Jan 27, 2025

You can recreate this PR in your own application.

my point was this would be inefficient

If you want to implement this over and over again for each app - sure, go ahead.

Keep in mind that this will likely not be "done" after the initial implementation.
YT will probably try to break this solution every few months.

You will have to update the implementation in many places again. And again. And again...
What a great way to waste time.

If this was implemented in just one place as a library it would be easier for more developers to share efforts.
To me this sounds like a reasonable thing to discuss - if possible.

Comment on lines 48 to 55
// an asynchronous function runs in the background and it will eventually call
// `vmFunctionsCallback`, however we need to manually tell JavaScript to pass
// control to the things running in the background by interrupting this async
// function in any way, e.g. with a delay of 1ms. The loop is most probably not
// needed but is there just because.
for (let i = 0; i < 10000 && !this.vmFunctions.asyncSnapshotFunction; ++i) {
await new Promise(f => setTimeout(f, 1))
}
Contributor

I … don’t think this is how async works. The timeout is just gonna be scheduled on a new task, but the code before the loop still runs on a microtask on the previous task.

Member Author

Yes, but this.vm.a seems to start a standalone task in the background or something like that, and we need to explicitly pass control back to the event loop by pausing this async execution, for the background task to finish executing.

Member Author

The loop actually executes only once as far as I know, I still put a loop because you never know
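
To make the disagreement above concrete, here is a small standalone sketch of the behaviour being discussed. It does not use BotGuard: `startBackgroundWork` is a hypothetical stand-in that, like the VM call is assumed to do, returns immediately and delivers its result from a later task. The callback cannot run until the async function suspends on something that is itself scheduled as a task, such as a timer.

```javascript
// Hypothetical stand-in for the VM start call: returns at once and invokes
// the callback later from a separate task (simulated here with a timer).
function startBackgroundWork(callback) {
  setTimeout(function () { callback("snapshot-function"); }, 0);
}

async function waitForCallback() {
  var result = null;
  startBackgroundWork(function (value) { result = value; });

  // Nothing has yielded to the event loop yet, so the callback has not run.
  var seenBeforeYield = result;

  // Awaiting a timer suspends this function and hands control back to the
  // event loop, which lets the pending callback task run.
  var iterations = 0;
  for (var i = 0; i < 10000 && result === null; ++i) {
    await new Promise(function (resolve) { setTimeout(resolve, 1); });
    iterations++;
  }
  return { seenBeforeYield: seenBeforeYield, result: result, iterations: iterations };
}
```

Calling `waitForCallback()` resolves with `seenBeforeYield` still `null` and `result` set, which matches the observation that the wait is needed even though the loop rarely needs more than one pass.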

@Profpatsch
Contributor

Can there be an architecture overview of this somewhere? From a skim of the code I don’t get any idea of what problem this solves or how the solution is structured.

@Stypox
Member Author

Stypox commented Jan 27, 2025

  • YouTube now requires integrity checks to access their clients. The most "vulnerable" client is the WEB client, since they can't enforce integrity checks on all web browsers, so that's the only client (for now) that we have found a way to obtain an integrity token for.
  • In order to obtain a poToken, we need to run BotGuard, an obfuscated virtual machine implemented in JavaScript that performs the integrity checks and gives us an integrity token. In order to make the integrity checks succeed, we need to run this VM in an environment that resembles a browser as much as possible. The integrity token can be used to generate multiple poTokens. Two network requests are needed: Create to obtain the VM code, GenerateIT to obtain the integrity token after running the VM code. See the README here for the detailed steps.
  • PoTokenGenerator is the base class for all poToken generators. It has a factory method that allows asynchronously obtaining a new instance of a PoTokenGenerator, and then two methods to generate a poToken given a specific identifier, and a method to check if the integrity token has expired.
  • PoTokenWebView is currently the only implementation of PoTokenGenerator, but we might want to add other implementations in the future, e.g. ones that do not rely on WebView.
  • PoTokenProviderImpl implements the extractor interface and is supposed to take care of possibly multiple PoTokenGenerators (although right now there is only one based on WebView). It takes care of retrying in case of problems, recreates a new PoTokenGenerator if the current one expired, and finally returns a PoTokenResult. A PoTokenResult contains two poTokens: one for the specific requested video id (used to fetch the player), and another that can be generated only once as the first thing and is specific to a visitor data (used in streaming urls).

Let me know which places are not documented enough.

@Profpatsch
Contributor

@Stypox I think it would be good to include this documentation into the source code somewhere, maybe in the interface module.

@Profpatsch
Contributor

So that people who want to understand the code later don’t have to find this PR and look through lots of issues first.

@AudricV AudricV added the bug (Issue is related to a bug), ASAP (Issue needs to be fixed as soon as possible) and youtube (Service, https://www.youtube.com/) labels Jan 31, 2025
@AudricV AudricV changed the title from "PoToken implementation to solve 403 errors" to "[YouTube] Add support for PoTokens" Jan 31, 2025
@Profpatsch
Contributor

As a general comment, I don’t think it’s wise to add interfaces before even having need of a second implementation, it just makes the code more indirect and harder to read than necessary.

Contributor

@Profpatsch Profpatsch left a comment

I get a “Couldn’t get HLS manifest” extraction exception, but the stream extraction seems to work just fine, so LGTM, I’d say, so we can create a release.

@AudricV AudricV self-assigned this Jan 31, 2025
app/build.gradle Outdated
implementation 'com.github.TeamNewPipe:NewPipeExtractor:v0.24.4'
implementation 'com.github.FireMasterK:NewPipeExtractor:d2cbd09089e8af933738f98b671ad58236a79d6e'
Contributor

Oh, I missed that the NewPipeExtractor dependency still points to an unofficial repo; that needs to be fixed before merge.

@Stypox
Member Author

Stypox commented Jan 31, 2025

As a general comment, I don’t think it’s wise to add interfaces before even having need of a second implementation, it just makes the code more indirect and harder to read than necessary.

Yes I totally agree with you, but when I started working on this I hoped I would be able to create two implementations

This avoids failures to fetch the BotGuard challenge and send its
result when the jnn-pa.googleapis.com domain is blocked (for example
by Pi-hole lists or DNS servers).

That's what the official website uses to send the challenge execution
result, however it uses InnerTube to fetch the challenge. Embeds
still use the jnn-pa.googleapis.com domain.

Also rename the makeJnnPaGoogleapisRequest method appropriately.
@gechoto gechoto mentioned this pull request Feb 3, 2025
3 tasks
FineFindus added a commit to FineFindus/LibreTube that referenced this pull request Feb 3, 2025
Implements support for locally generating PoTokens using the device
webview. This is a direct port of
TeamNewPipe/NewPipe#11955 to native kotlin.
@Stypox
Member Author

Stypox commented Feb 4, 2025

@spvkgn thanks for reporting, please try the latest APK, I should have fixed that. Your WebView seems to use a really old version of JavaScript that doesn't have the globalThis object, but in our case this should return the same global scope anyway when called inside of a global function.
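
For readers who have not run into this before, the fallback being described looks roughly like the sketch below. `getGlobalScope` is a hypothetical name and this is not the exact code from the PR; it assumes the function is defined in non-strict (sloppy) code and called as a plain function, which is what makes `this` refer to the global object on engines without `globalThis`.

```javascript
// Hypothetical helper: prefer globalThis where it exists, otherwise rely on
// `this` being the global object in a plain call of a sloppy-mode function.
function getGlobalScope() {
  if (typeof globalThis !== "undefined") {
    return globalThis;
  }
  // Only reached on old engines. In strict mode `this` would be undefined
  // here, so this fallback assumes non-strict code.
  return this;
}
```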

Some old Android devices have a broken WebView implementation that can't execute the poToken code. This is now detected, and getWebClientPoToken returns null instead of throwing an error in such a case, to allow the extractor to try to extract the video data even without a poToken.
@spvkgn

spvkgn commented Feb 4, 2025

@Stypox Thanks, I checked and it works now.

FineFindus added a commit to FineFindus/LibreTube that referenced this pull request Feb 4, 2025
Implements support for locally generating PoTokens using the device
webview. This is a direct port of
TeamNewPipe/NewPipe#11955 to native Kotlin.

Closes: libre-tube#7065
@Stypox
Member Author

Stypox commented Feb 4, 2025

I reworked the JavaScript code to not use any new JS features to achieve compatibility with as many WebView implementations as possible (including the API 21 emulator I was testing this on, which previously did not work). To do this, I moved all conversion functions to Kotlin (since those do not need to run in JS, so it's better if they are written in Kotlin where errors are caught at compile time and/or with nice stacktraces) and used Promise.then() instead of async functions.

While at it, I added more ways to detect if the WebView is not able to actually run the code, by making it so that the poToken generator returns null if there are any uncaught (syntax) errors. This makes the extractor aware of the fact that no poTokens can be generated, and tries other methods. Note that uncaught errors are thrown only while parsing JS, because in all places where JS is actually executed there is a try{}catch(){} around it.
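
As a rough illustration of the two ideas above (plain `Promise.then()` chains instead of `async` functions, and turning any failure into `null`), here is a small sketch. `generateOrNull`, `loadScript` and `runVm` are hypothetical names and do not correspond to functions in this PR.

```javascript
// Hypothetical pipeline written without async/await or arrow functions.
// Any error thrown or rejected along the way is converted to null, so the
// caller can treat null as "no poToken available" and try something else.
function generateOrNull(loadScript, runVm) {
  return Promise.resolve()
    .then(function () { return loadScript(); })
    .then(function (script) { return runVm(script); })
    .then(undefined, function () { return null; });
}
```

Starting the chain from `Promise.resolve()` means that even a synchronous throw in `loadScript` ends up in the rejection handler instead of escaping as an uncaught error.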

I tested on API 21 emulated, API 34 emulated, and Android 14 on-device, and it worked everywhere (with playable videos).

Please test this ASAP, and report any issues. I will make a release tomorrow morning.


The compat version of this method might be needed to support older versions of Android.

Thank you, solved.

@Figim

Figim commented Feb 4, 2025

Everything works great!😊🤗

@PepeTux

PepeTux commented Feb 5, 2025

Tested on a
Moto G8 Power, Android 11: ✅
Moto G82, Android 13: ✅
TV Box Rockchip, Android 11: ✅

Great, thanks 👍

Using commit 9f83b385a since JitPack is buggy...
@tzagim

tzagim commented Feb 5, 2025

works great on A15!
TNX

@Stypox
Member Author

Stypox commented Feb 5, 2025

I am merging this now since we need to do a new release. This PR probably needs some refinement, and will especially need to adapt to future extractor changes (because the current extractor API around poTokens is going to be changed), but works well enough. Let's hope all poToken implementations behave well!

@Stypox Stypox merged commit dd223af into TeamNewPipe:dev Feb 5, 2025
7 checks passed
@bodtx

bodtx commented Feb 5, 2025

Working great on redmi note 8 pro 🙏

Labels
ASAP (Issue needs to be fixed as soon as possible); bug (Issue is related to a bug); size/giant (PRs with more than 750 changed lines); youtube (Service, https://www.youtube.com/)
Development

Successfully merging this pull request may close these issues.

[YouTube] HTTP error 403 for playback or download