Weird audio issues with high audio per packet settings #6415
Comments
I have encountered a similar issue when working on a project based on libmumble. Turns out the Mumble client struggles with packets that contain more than 10ms (480 samples at 48000 Hz) of audio. Try to send 480 samples (per channel): the issue should not be present. |
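The relationship between frame duration and sample count mentioned above can be sketched as follows. The helper name `samples_per_channel` is hypothetical and is not part of Mumble or libopus; it only illustrates the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Mumble or libopus function):
// number of samples per channel in one frame of `frame_ms` milliseconds.
inline std::int32_t samples_per_channel(std::int32_t sample_rate_hz,
                                        std::int32_t frame_ms) {
    assert(sample_rate_hz > 0 && frame_ms > 0);
    // Multiply first so that rates such as 44100 Hz are not truncated early.
    return sample_rate_hz * frame_ms / 1000;
}
```

At 48000 Hz this gives 480 samples for 10 ms and 2880 samples for 60 ms, which matches the figures used in this thread.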
Good to know. I was kinda going crazy thinking I was doing something wrong somewhere. Yeah, setting it to 10 ms seems to work fine. 20 is okay too mostly, but 40 and 60 are basically unusable. I usually like to use at least 20 ms since I find it seems to keep the audio clarity stable. I notice little hiccups or stutters with 10 that are nowhere near as bad as this though. |
Thank you for confirming. This is definitely something that has to be fixed in the client. Ideally it should handle all frame sizes that are supported by libopus. Line 22 in dfc8dac
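For reference, the libopus documentation for opus_encode() lists 120, 240, 480, 960, 1920 and 2880 samples per channel as the permitted frame sizes at 48 kHz (2.5, 5, 10, 20, 40 and 60 ms). A minimal sketch of a validity check, assuming only that set of durations; `is_valid_opus_frame` is a hypothetical name, not a libopus or Mumble function:

```cpp
#include <cassert>

// Hypothetical helper: true if `samples` per channel corresponds to one of
// the 2.5, 5, 10, 20, 40 or 60 ms frame durations at `sample_rate_hz`.
inline bool is_valid_opus_frame(int samples, int sample_rate_hz) {
    assert(sample_rate_hz > 0);
    // Durations in tenths of a millisecond, so that 2.5 ms stays integral.
    static const int kDurationsTenthsMs[] = {25, 50, 100, 200, 400, 600};
    for (int d : kDurationsTenthsMs) {
        // samples == rate * (d / 10 ms) / 1000, rearranged to avoid division.
        if (static_cast<long long>(samples) * 10000 ==
            static_cast<long long>(sample_rate_hz) * d) {
            return true;
        }
    }
    return false;
}
```

A receiver that handles every size in this list would not depend on the sender's audio-per-packet setting.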
mumble/src/mumble/AudioInput.h Lines 219 to 224 in dfc8dac
Unfortunately, an option to change them was never added, and as a result problems like this one went unnoticed until third-party clients (e.g. bots) started popping up. |
So wait, I'm a little confused. I thought this setting in the client adjusts it? Are you saying the client just thinks all incoming audio is 10 ms? I made my bot mimic this functionality. I was defaulting it to 20 ms like Mumble does on a fresh install, but I can adjust it via a method call. https://github.com/bkacjios/lua-mumble/blob/54c2272e2f0ba06bacf87c2fdc21441d14505140/mumble/mumble.h#L81 |
The setting itself adjusts the number of 10ms chunks to send per packet, not the actual number of audio frames: mumble/src/mumble/AudioInput.h Lines 235 to 239 in dfc8dac
mumble/src/mumble/AudioInput.cpp Lines 694 to 728 in dfc8dac
mumble/src/mumble/AudioInput.cpp Lines 1105 to 1140 in dfc8dac
Another issue in the code above is that it doesn't clamp the number of chunks to match Putting that aside, let's see what is going on in the audio output section: mumble/src/mumble/AudioOutputSpeech.cpp Lines 58 to 153 in dfc8dac
As you can see, the code assumes that incoming frames are always 10ms. This is wrong because (at least in our case) the Opus encoder always produces packets that contain a single encoded frame. Basically, there is no concept of "chunks" in encoded packets, regardless of the client's audio input settings. Finally, just to add some more confusion to the mix and possibly clarifying it: xiph/opus#315 |
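Rather than assuming 10 ms, a receiver can read the duration from the packet itself. libopus offers opus_packet_get_nb_samples() for this. The sketch below reimplements the same idea without the library, following the TOC byte layout in RFC 6716 section 3.1, so that the logic is visible. The function names are hypothetical, and this is an illustration under those assumptions rather than the fix adopted in Mumble.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Samples per channel in ONE frame, derived from the TOC byte.
inline int toc_samples_per_frame(std::uint8_t toc, int sample_rate_hz) {
    assert(sample_rate_hz > 0);
    if (toc & 0x80) {                        // CELT-only: 2.5, 5, 10 or 20 ms
        const int shift = (toc >> 3) & 0x03;
        return (sample_rate_hz << shift) / 400;
    }
    if ((toc & 0x60) == 0x60) {              // Hybrid: 10 or 20 ms
        return (toc & 0x08) ? sample_rate_hz / 50 : sample_rate_hz / 100;
    }
    const int size = (toc >> 3) & 0x03;      // SILK-only: 10, 20, 40 or 60 ms
    return size == 3 ? sample_rate_hz * 60 / 1000
                     : (sample_rate_hz << size) / 100;
}

// Number of frames in the packet, or -1 if the packet is too short.
inline int packet_frame_count(const std::uint8_t* data, std::size_t len) {
    if (data == nullptr || len < 1) return -1;
    switch (data[0] & 0x03) {
        case 0:  return 1;                   // one frame
        case 1:
        case 2:  return 2;                   // two frames
        default: return len < 2 ? -1 : (data[1] & 0x3F);  // count in byte 1
    }
}

// Total samples per channel in the packet, or -1 if it is invalid.
inline int packet_samples(const std::uint8_t* data, std::size_t len,
                          int sample_rate_hz) {
    const int frames = packet_frame_count(data, len);
    if (frames < 0) return -1;
    const int samples = frames * toc_samples_per_frame(data[0], sample_rate_hz);
    // An Opus packet may not carry more than 120 ms of audio.
    if (samples * 25 > sample_rate_hz * 3) return -1;
    return samples;
}
```

With this, a packet holding a single 60 ms frame reports 2880 samples at 48000 Hz and can be sized correctly in the output buffer instead of being treated as 480.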
Ahhh.. I understand now. My bot has a timer that determines how often it should send the audio data. So if I set the bot's audio packet size to 60, it's encoding 2880 frames into one audio packet every 60 ms. I'm guessing I should send 6 individual packets with 480 encoded frames in one go? |
Yup! |
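The workaround agreed on here, splitting a longer buffer into 10 ms pieces that are each encoded and sent separately, could look roughly like this on the sending side. `split_into_frames` is a hypothetical helper; the actual encode and send calls are left out because they depend on the bot's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: cut interleaved 16-bit PCM into frames of
// `samples_per_channel` samples each. A trailing partial frame is dropped
// here; a real sender would keep it buffered for the next tick.
inline std::vector<std::vector<std::int16_t>> split_into_frames(
        const std::vector<std::int16_t>& pcm,
        std::size_t samples_per_channel,
        std::size_t channels) {
    assert(samples_per_channel > 0 && channels > 0);
    const std::size_t frame_len = samples_per_channel * channels;
    std::vector<std::vector<std::int16_t>> frames;
    for (std::size_t off = 0; off + frame_len <= pcm.size(); off += frame_len) {
        const auto first = pcm.begin() + static_cast<std::ptrdiff_t>(off);
        const auto last = first + static_cast<std::ptrdiff_t>(frame_len);
        frames.emplace_back(first, last);
    }
    return frames;
}
```

For 60 ms of stereo audio at 48000 Hz (2880 samples per channel), calling this with 480 samples per channel yields six frames of 960 interleaved values each.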
Am I missing something? When my bot receives audio from me speaking, it isn't getting 10 ms chunks if I adjust my client to have a 60 ms delay. Doesn't this go against what you said? Shouldn't I still be receiving 6 individual 240 bytes every 0.06 seconds?
Compared to the results when I have it set to 10 ms.
|
Are you referring to the size of the encoded packet or the raw audio data? |
Yeah, this was the encoded packet I receive from a speaking client. I'm starting to think we had a misunderstanding here.
My bot resamples all playing audio to 48000 Hz, since that's what Mumble expects. I always encode 480 samples per 10 ms. The issue I am having is with sending more than 10 ms per packet (replicating the audio per packet setting in an official client). If I set it to 20, I will encode (480 * 2) bytes of PCM data and send that over. Is that not correct? |
With libopus you can encode either 16 bit (2 bytes) signed integer or 32 bit (4 bytes) float samples. You can choose by calling either opus_encode() or opus_encode_float(). 20ms (0.02s) of mono 16 bit audio data at 48000Hz would be:
48000 * 0.02 * 2 = 1920 bytes
Multiply the result by the number of channels (3840 bytes for stereo). |
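The byte arithmetic above can be written out as a small helper. `pcm_bytes` is a hypothetical name used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: size in bytes of a raw PCM buffer.
// bytes = samples per channel * channels * bytes per sample
inline std::size_t pcm_bytes(int sample_rate_hz, int frame_ms, int channels,
                             std::size_t bytes_per_sample) {
    assert(sample_rate_hz > 0 && frame_ms > 0 && channels > 0);
    const std::size_t samples_per_channel =
        static_cast<std::size_t>(sample_rate_hz) *
        static_cast<std::size_t>(frame_ms) / 1000;
    return samples_per_channel * static_cast<std::size_t>(channels) *
           bytes_per_sample;
}
```

For example, `pcm_bytes(48000, 20, 1, sizeof(std::int16_t))` is 1920 and the stereo equivalent is 3840. Note that the `frame_size` argument of opus_encode() is counted in samples per channel (960 for 20 ms at 48000 Hz), not in bytes.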
Description
Hello, I am the author of lua-mumble, a module for allowing someone to create a bot via the Lua scripting language.
I've been updating it to support the new protobuf UDP packets that were introduced in 1.5 and have been running into this weird issue. I have a test bot that is configured with a 60 ms audio frame per packet with a 96000 bitrate. It is also configured to output stereo audio. The server has a max bitrate of 558000.
I noticed that sometimes I was having weird audio issues with the bot I couldn't explain. Sometimes I would start the bot and the audio would be stuttering from the start, other times I would start it and it would sound perfect. The only way I was able to reproduce it consistently is by setting the audio packet size to 60 on the bot and muting/unmuting my client.
2024-05-10.09-01-27.mp4
I even managed to get this to happen when having my bot not play any music at all. Instead I had it loop back my audio so I could hear myself through the bot. The stuttering issues were also present. It almost seems like my client's decoder is trying to decode it in some other mode or something, but honestly I'm not too sure what is going on here.
Would anyone perhaps know what is going on here? I'm still unsure if this is an issue with the way I am encoding audio or an issue with how mumble is decoding audio.
Steps to reproduce
Have a client start transmitting audio with the audio per packet setting at 60 ms.
Mute and unmute the listening client. The transmitted audio will now be stuttering.
Mumble version
1.5.628
Mumble component
Client
OS
Windows
Reproducible?
Yes
Additional information
No response
Relevant log output
No response
Screenshots
No response