-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lrclib lyrics #5406
base: lyrics-refactor-tests
Are you sure you want to change the base?
Fix lrclib lyrics #5406
Conversation
d4bed72
to
829192d
Compare
cb8929f
to
c2807f0
Compare
The build on I will address this in a separate PR and rebase this one accordingly once the fix is merged. Note: this issue popped up now because I added a new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a much better implementation now, well done. I especially like that you've managed to remove the test resources -- I wasn't expecting 2000 lines of lyrics in there.
…5407) See my comment under #5406 for context > The build on win32 is failing to install reflink because it's [only supported until Python 3.7](https://gitlab.com/rubdos/pyreflink/-/blob/master/setup.py?ref_type=heads). > > I will address this in a separate PR and rebase this one accordingly once the fix is merged. > > Note: this issue popped up now because I added a new requests-mock dependency which invalidated cached dependencies.
c2807f0
to
64c0439
Compare
dc02c94
to
622ed3c
Compare
622ed3c
to
3eb04ba
Compare
1ff3ff2
to
4d481d7
Compare
9518b5d
to
27453e3
Compare
Hey, LRCLIB author here 👋. It would be great if you could make the Also, the If you have any ideas for further improvements to LRCLIB's API, feel free to let me know! |
Hi @tranxuanthang, thanks for popping in! Absolutely, that's no problem at all. This should make most lyrics queries even speedier on our side. Now that we're on this topic, I think it may be a good idea to also add caching: for example, if we're getting lyrics for two separate files
Ideally we should only ask for @tranxuanthang thanks for a reliable and performant API! |
e914d59
to
73e4a5a
Compare
e70f61b
to
278279e
Compare
8d6762f
to
16188fa
Compare
278279e
to
4970dda
Compare
16188fa
to
79b2af5
Compare
Snap, I wish I'd seen your PR earlier! It does, however I'm not sure when will it get reviewed and merged in, so let's get your PR merged in first! Also note that the lyrics tests / configuration issue you experienced while working on that PR has been fixed here: #5452 |
4970dda
to
09863b8
Compare
79b2af5
to
b879e9c
Compare
Okay, sounds like a plan to me! This is probably more suitable for your other PR (#5474), but I'll mention this here. In the comments of my PR, I noted that there's opportunity for a bigger change regarding synced lyrics and fallbacks. If I recall correctly, with how things currently are, it's first come, first serve with regards to choosing lyrics from the enabled backends. In other words, if a plain-only backend fetches lyrics before we try a backend that supports synced lyrics, the plain ones will be used. That's not great if we prefer synced lyrics. The idea for solving this that I had was:
Is anything like this in your plans? Thoughts on it in general? Seeing how you're working a lot on this plugin, I think you'd be well suited to tackle this. |
I like the concept of this! Though we need to be realistic here: lrclib is the only provider that supports synced lyrics 😅 Well at least this simplifies engineering a solution, since we never need to jump between the backends. Now if we zoom into LRCLib, you will find that the results are currently scored this way: @cached_property
def dist(self) -> tuple[bool, float]:
"""Distance/score of the given lyrics item.
Return a tuple with the following values:
1. Absolute difference between lyrics and target duration
2. Boolean telling whether synced lyrics are available.
Best lyrics match is the one that has the closest duration to
``target_duration`` and has synced lyrics available.
"""
return self.duration_dist, not self.synced It prioritizes lyrics that have duration closer to the music file. Presence of synced lyrics comes secondary. Though now that I think of this, I think we may be able to swap these two around: return not self.synced, self.duration_dist And thus prioritize results that have synced lyrics available. I guess the danger is what if we receive some wrong synced lyrics where the duration is vastly different?. Results that have mismatching durations actually get ignored early on as the algorithm is fairly strict about what's allowed: it tolerates ±5% difference: DURATION_DIFF_TOLERANCE = 0.05
@cached_property
def is_valid(self) -> bool:
"""Return whether the lyrics item is valid.
Lyrics duration must be within the tolerance defined by
:attr:`DURATION_DIFF_TOLERANCE`.
"""
return (
abs(self.duration - self.target_duration)
<= self.target_duration * self.DURATION_DIFF_TOLERANCE
) This means that when we calculate |
So far. 🙃 Regarding the algorithm, I think if you first disqualify matches that are too short/long and then rank by synced-ness, that makes sense. That's all good. The issue I'm thinking of is more for when there are multiple synced lyric providers. Imagine there's LCRLib and bilRCL, both can provide synced lyrics. As it is now, if I understand it correctly, if LRCLib is first in the
|
I see! You're correct, the plugin now picks any kind of lyrics returned by the first source that returns something.
This makes sense - I am myself after synced lyrics if possible, so if there was another source that provides them, I would indeed try to engineer a way to retrieve them.
This looks like a good approach. This of course depends on knowing in advance which sources provide synced lyrics, so that we can try them first.
I'd say we may even want to have a separate source priority list for non-synced lyrics: for example, I found that I prefer lyrics from LRCLib only if they are synced. Otherwise, I'd rather get them from Genius where they've been reviewed by the community. Alternatively, if users want synced lyrics, we firstly query backends that provide them (respecting their order in the sources configuration), and then query the rest if synced weren't found. This way, if I have the following configuration: lyrics:
sources: [genius, lrclib, tekstowo, bilrcl]
synced: yes The plugin tries LRCLib and bilRCL first and returns the first valid synced lyrics, if found. If not, it tries Genius, then (cached) plain lyrics from LRCLib, then Tekstowo, and finally bilRCL. |
I think your suggestions make perfect sense! If you were to implement it like that, I wouldn't be mad at all. Though this probably warrants a bigger discussion with more maintainers, I would imagine. At any rate, I'll leave this idea with you, since you're working a lot on this plugin and you're way better suited for the task, if you choose to accept it, than I am. |
a4111b8
to
44b4b46
Compare
b879e9c
to
645c12e
Compare
a4b45e9
to
2b80596
Compare
645c12e
to
121ae5f
Compare
Adjust the base URL to perform a '/search' instead of attempting to '/get' specific lyrics where we're unlikely to find lyrics for the specific combination of album, artist, track names and the duration (see https://lrclib.net/docs). Since we receive an array of matching lyrics candidates, rank them by their duration similarity to the item's duration, and whether they contain synced lyrics.
2b80596
to
02386cc
Compare
121ae5f
to
788a1d7
Compare
Fixes #5102
LRCLib lyrics backend fixes
Bug Fixes
lrclib
source. If lyrics for a specific album, artist, and title combination are not found, the plugin now searches for the artist and title and picks the most relevant result, scoring them bysources
configuration to prioritizelrclib
over other sources for faster and more reliable results.Code Improvements
fetch
method in all backends.LRCLyrics
andLRCLibItem
classes to encapsulate lyrics data and improve code structure.LRCLib
backend. These will be added to the rest of the backends in a separate PR.Tests
To Do
docs/
to describe it.)docs/changelog.rst
to the bottom of one of the lists near the top of the document.)