
@ARoxdale ARoxdale commented Jul 13, 2025

Second attempt at a fix for the long-standing PushSamplesRequestBufs issue. Fixes #74.

The root cause of the issue is the interaction between the sound.cpp code, the frame skipper, and Allegro.

  • Because the output framerate was set by fskipper, the elapsed emulated seconds differed from the true output seconds needed by Allegro:
    output_seconds = CORRECTION_RATIO * emulated_seconds
  • The required correction ratio has to be determined from the frame skipper. How this is done depends on the frame skip mode; unthrottled mode needs to use the estimated output FPS.
  • Also, to cope with very low (10Hz) framerates, the sound buffer sample count needed to be increased.
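As a rough illustration of the correction described in the list above, the sketch below computes how many samples the audio output consumes while one emulated frame is on screen. The function names and the 44100 Hz sample rate used in the usage note are assumptions for illustration, not code from the patch; in unthrottled mode `output_hz` would be the frame skipper's estimated FPS.

```cpp
#include <cmath>

// Hypothetical helper (not from the patch): ratio between emulated time and
// real output time. machine_hz is the emulated TV frequency, output_hz is the
// rate at which frames are actually presented.
double EmulatedToOutputRatio(double machine_hz, double output_hz)
{
    return machine_hz / output_hz;
}

// Samples the audio output consumes during one emulated frame:
//   output_seconds = CORRECTION_RATIO * emulated_seconds
int OutputSamplesPerFrame(int sample_rate, double machine_hz, double output_hz)
{
    const double emulated_seconds = 1.0 / machine_hz;
    const double output_seconds =
        EmulatedToOutputRatio(machine_hz, output_hz) * emulated_seconds;
    return static_cast<int>(std::lround(sample_rate * output_seconds));
}
```

With an assumed 44100 Hz sample rate and a 60Hz machine, this gives 735 samples per frame at 60Hz output, 1470 at 30Hz and 4410 at 10Hz, which suggests why a small fixed buffer runs out at very low framerates.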

The solution isn't perfect and PushSamplesRequestBufs errors can still be seen when changing the framerate, but in testing at fixed framerates this appears stable.

I'm not confident this solution is truly robust given the complexity of the sound code, but it appears to be a working fix, finally.

ARoxdale added 3 commits July 13, 2025 00:20
…s errors at lower framethrottled rates.

Previously number of samples requested at 10Hz, 20Hz exceeded capacity of buffers.
Complex interactions of allegro requests and buffer code make deeper analysis difficult.
…Estimated FPS rate of fskipper needs to be used instead of machine frequency due to lack of allegro timer or other limiters when using unthrottled mode
…mplesRequestBufs errors will occur at low framerates (20Hz and especially 10Hz) due to buffers not being large enough
@ARoxdale
Author

ARoxdale commented Jul 18, 2025

I've reverted the sound buffer sample capacity to 1024 to avoid the unacceptable increase in latency of the fix.
This means that only extremely low framerates (20Hz and especially 10Hz) still exhibit the problem.

One workaround could be a variable-capacity sample buffer that increases capacity (and latency) only at the lower framerates, although the sound lag is still noticeable even there. But this feels like a bit of a kludge.
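A minimal sketch of what such a variable-capacity scheme could look like, assuming a base capacity of 1024 samples that is doubled until one output frame fits. The doubling policy, the function names and the 44100 Hz rate in the usage note are illustrative assumptions, not what the patch does.

```cpp
#include <cmath>

// Hypothetical: smallest capacity, reached by doubling base_capacity, that
// holds the samples consumed during one output frame.
int BufferCapacityForFps(int sample_rate, double output_hz, int base_capacity = 1024)
{
    const int needed = static_cast<int>(std::ceil(sample_rate / output_hz));
    int capacity = base_capacity;
    while (capacity < needed)
        capacity *= 2;
    return capacity;
}

// Latency contributed by one full buffer, in milliseconds.
double BufferLatencyMs(int capacity, int sample_rate)
{
    return 1000.0 * capacity / sample_rate;
}
```

At an assumed 44100 Hz, 60Hz output stays at 1024 samples (about 23 ms), while 10Hz output would need 8192 samples (about 186 ms), which illustrates the latency cost mentioned above.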

I don't think this constitutes a solution so much as a duct-tape fix. I will try to think about it a bit more over the weekend, but this may not be fully solvable without going way deeper than I'm currently able to.

Edit: It occurs to me that another solution might be, instead of increasing the samples generated in the buffer at lower framerates, to get Allegro to consume more samples. Not sure if this works for higher framerates, but I will try it and see if it at least works.

…ng extremely low framerates where output delay exceeds sound buffer latency capacity
@ARoxdale
Author

ARoxdale commented Jul 20, 2025

I added a timeout to the Allegro wait on the framerate timer throttle queue, followed by a Sound_Update() call.
This deals with the extremely long delays that occur at lower throttled framerates (<10Hz).
This has successfully removed the sound skips that were occurring at 10Hz frameskip.
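The idea can be sketched independently of Allegro as a wait loop that services the sound system every time the wait times out. Here `wait_timed` is a stand-in for a timed wait on the throttle event queue (in Allegro 5 that kind of call is `al_wait_for_event_timed`), and `WaitForFrameTick` is a hypothetical name. This is an illustration of the pattern, not the code in the patch.

```cpp
#include <functional>

// wait_timed(timeout) stands in for a timed wait on the throttle queue:
// it returns true when the frame tick arrived and false on timeout.
// sound_update stands in for Sound_Update().
// Returns how many times the wait timed out before the tick arrived.
int WaitForFrameTick(const std::function<bool(double)>& wait_timed,
                     const std::function<void()>& sound_update,
                     double timeout_seconds)
{
    int timeouts = 0;
    while (!wait_timed(timeout_seconds))
    {
        // The frame tick is late: keep the audio buffers fed meanwhile.
        sound_update();
        ++timeouts;
    }
    return timeouts;
}
```

The timeout would presumably be chosen shorter than the sound buffer's latency, so that a long frame delay never starves the audio stream.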

There are a small number of remaining PushSamplesRequestBufs() errors when loading certain new roms (Sonic 2) and apparently during sound changeover from the bootup logo. Apart from these the results appear stable.

I have only tested this on SMS roms.

@ARoxdale ARoxdale requested a review from maxim-zhao July 28, 2025 19:19
@maxim-zhao
Collaborator

I usually leave it to Omar to merge here, but it all LGTM

@ocornut ocornut added the audio label Aug 13, 2025
@ocornut
Owner

ocornut commented Aug 13, 2025

If I try sample playing roms such as "Music Station" or "Alex Kidd: The Lost Stars" the samples are noticeably too highly pitched with this patch.

@ocornut
Owner

ocornut commented Aug 13, 2025

> If I try sample playing roms such as "Music Station" or "Alex Kidd: The Lost Stars" the samples are noticeably too highly pitched with this patch.

Hmm, actually it only happened because I was emulating a PAL/SECAM system while the emulator was running at 60 Hz.

According to this factor:

const double emulated_to_output_ratio = g_machine.TV->screen_frequency / (double)output_frequency;

Would it make sense to support that use case without effectively pitching the sound?
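For a sense of scale, here is a small sketch of the pitch shift this ratio would imply, assuming the playback pitch scales with output_frequency / screen_frequency. That assumption and the helper names are illustrative only.

```cpp
#include <cmath>

// Hypothetical: factor by which sample playback is sped up when a machine
// emulated at emulated_hz is presented at output_hz.
double PitchFactor(double emulated_hz, double output_hz)
{
    return output_hz / emulated_hz;
}

// Convert a frequency ratio to a musical interval in semitones.
double PitchShiftSemitones(double factor)
{
    return 12.0 * std::log2(factor);
}
```

For a 50 Hz PAL machine shown at 60 Hz the factor is 1.2, a bit over three semitones up, which would fit with the samples sounding noticeably high.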

@ARoxdale
Author

ARoxdale commented Aug 14, 2025

> Would it make sense to support that use case without effectively pitching the sound?

Yes. I will need to fix this. This must be to do with the FM Sound unit somehow.
The FM unit probably has a more physically realistic interaction with the sound rates. Up to now, the fix has been getting away with the regular sound producing samples with constant frequencies. I'll try looking into it further.

It turns out this issue affects samples being played by the PSG chip only.

The "Music Station" rom is helpful

Edit: The pitching is definitely present and increases noticeably as you change framerate.
What I can't understand is the following:

  • When you reduce the framerate, to say 30Hz, this increases the emulated_to_output_ratio, and output_seconds is now double the emulated seconds.
  • So the number of samples we request from the PSG chip is now double the emulated amount.
  • But for some reason this decreases the pitch of the output PSG sound.

I don't understand why simply asking for more samples changes the pitch of the sound from the sound unit. It is behaving as if the sound were truly "slowed down", like running a tape at half speed. But why should this be the case if all we did was ask for more sound samples from the sound unit? There must be something complex going on here within the FM chip emulator.

Before this change, the sound output would cause skips and PushSamplesRequestBufs() errors, but the pitch would be unchanged. I don't currently understand why pitch is affected. I will try to see if there is a setting in the FM emulator that has to be altered somehow.

Edit2: The FM module may need to be updated in the Sound_UpdateClockSpeed() functions. Maybe this function holds a simpler key to an overall fix.

Edit 3: This issue affects samples generated by the PSG chip. I'll need to revisit some assumptions and check how these samples are being generated.
It is possible to edit the throttle values in Sound_UpdateClockSpeed() so that both the samples and the regular sound and music pitches are shifted downwards. In effect, everything sounds like a slowed down old tape. But I think this solution would be incorrect / break least surprise in 2025, where most time shifting systems in video make an effort not to shift the pitch of sounds.

Edit 4: Recorded sound samples are a tricky case.

For ordinary sounds/music, the PSG chip registers are simply set to generate constantly oscillating tones. You then ask for enough samples to cover a given number of output seconds and you are done. Pitch remains unchanged.

But to generate recorded voice samples, the chip does not use tones. Instead the PSG chip volume is periodically set to a constant (non-oscillating) volume level directly, with volume changes only occurring when registers are changed.
The timing of register changes creates the voice sampled sound.
Notably, a change in constant volume only occurs once in every call to PSG_WriteSamples().

This explains the change in pitch when the number of samples was increased at lower framerates (e.g. 30Hz). The volumes were held constant for longer, decreasing the number of volume changes per second and hence lowering the perceived frequency.
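A toy model of this effect, assuming each volume write is followed by one sample-generation call that holds the level for `samples_per_call` samples. `RenderVolumeWrites` and `SquareWaveHz` are hypothetical helpers, not Meka functions.

```cpp
#include <cstddef>
#include <vector>

// Each entry in `volumes` is one register write; the level is then held
// constant for samples_per_call output samples (one call per write).
std::vector<int> RenderVolumeWrites(const std::vector<int>& volumes, int samples_per_call)
{
    std::vector<int> out;
    out.reserve(volumes.size() * static_cast<std::size_t>(samples_per_call));
    for (int v : volumes)
        out.insert(out.end(), static_cast<std::size_t>(samples_per_call), v);
    return out;
}

// Perceived frequency when the writes alternate between two levels:
// one full period spans two writes.
double SquareWaveHz(int sample_rate, int samples_per_call)
{
    return sample_rate / (2.0 * samples_per_call);
}
```

Doubling `samples_per_call` leaves the sequence of writes untouched but doubles how long each level is held, so the same writes come out an octave lower, like a tape played at half speed.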

But there is another catch with samples. The timing is controlled by the CPU, during calls to SN76489_Write() in Out_SMS(), I think. In practice this means that voice samples can only be generated for at most 1/60 of an emulated second. This is the case in the current master branch: at lower framerates, say 30Hz, a voice sample is generated for 1/60th of a second at the correct pitch. But for the remaining 1/60th of a second that Allegro needs filled with samples in SoundStream_Update(), there is only one large call to PSG_WriteSamples(), during which the PSG volume is constant and the voice sample does not play. This leads to starts and stops in the voice sample playback.

This is quite a tricky issue to solve given the fundamentally different ways ordinary music tones and voice sample sounds are generated. Note, it is possible to pitch down regular music as well by using SN76489_SetClock(), but I haven't found a way to do this acceptably.

A more drastic option would be to somehow record and repeat the PSG write commands to fill in time during lower framerates, and cut these off at higher framerates, restoring the proper state of the PSG chip when starting the next frame. Maybe something so radical might not be needed if I can figure it out.

@maxim-zhao
Collaborator

I think it is sensible to have the samples pitch change when the system speed is incorrect.

@ARoxdale
Author

ARoxdale commented Aug 21, 2025

> I think it is sensible to have the samples pitch change when the system speed is incorrect.

  • It would be consistent if the ordinary background sounds and music changed pitch as well. This may be possible by changing the sound clock rate. It would be the "old school tape" / chipmunk voice effect.
  • But nowadays, for video and audio speed changes on, say, YouTube videos, it's more common for the pitch of sounds to be kept the same. This uses some complicated time stretching algorithms: https://en.wikipedia.org/wiki/Audio_time_stretching_and_pitch_scaling. I think today this would be the "expected" behavior for most users. It was for me when I was just dealing with the background music.

My current thinking on how to achieve constant pitch time stretching (and possibly allow the chipmunk effect too), is to make the following changes to the sound system. I think this will at least work for the PSG chip.

  1. Create per frame sound buffers. Record "sound samples" into the frames, or equivalently, PSG chip states and sample count durations for that state.
  2. When playback is requested by allegro, take the sound buffers for each frame.
    • If the framerate is over the normal rate (frame durations are shorter), take only a part of the sound buffer for each frame. Append these cropped samples for all frames to produce the output.
    • [123][123][123] -> [12][12][12]
    • If the framerate is lower than the normal rate (frame durations are longer), increase the buffer sample count for each frame by looping back to the beginning of the frame, repeating until the required number of samples is reached. Append the extended outputs for all frames to produce the output.
    • [123][123][123] -> [12312][12312][12312]
  3. Just chopping up/repeating the sound like this will produce popping artifacts. But, if instead we store the PSG chip states, then I think those can be used on the fly to generate the samples allegro requires. In other words, we don't generate any actual samples until Allegro requests them, and we generate them from a "midi" sequence of PSG chip states, potentially cropping or extending those sequences if required.
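The crop/loop step from the list above can be sketched as a single function that either truncates a frame's recorded entries or wraps around to the start until enough entries are produced. `FitFrame` is a hypothetical name, and plain `int` entries stand in for whatever is recorded per frame (samples or PSG state changes).

```cpp
#include <cstddef>
#include <vector>

// Crop or loop one frame's recorded entries so that exactly `wanted`
// entries come out: [1 2 3] -> [1 2] or [1 2 3] -> [1 2 3 1 2].
std::vector<int> FitFrame(const std::vector<int>& frame, std::size_t wanted)
{
    std::vector<int> out;
    if (frame.empty())
        return out; // nothing recorded for this frame, nothing to repeat
    out.reserve(wanted);
    for (std::size_t i = 0; i < wanted; ++i)
        out.push_back(frame[i % frame.size()]);
    return out;
}
```

Applied to raw samples this would produce the popping artifacts mentioned in step 3; applied to a recorded sequence of PSG states, the same indexing would select which states to replay before generating samples from them.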

I believe this method could work to "time extend" the sounds consistently. I think it could also be adjusted at the output stage to produce the "chipmunk" effect as well (by adjusting the number of samples being output for each PSG state change).
I think there will also be an advantage in that the emulated sound will be more distinguished from the output sound.

I'm not 100% sure whether this will work, but I'd like to give it a go.
I think the frameskipper module can be used to give a concept of "frame number", or that could be put in the SMS_TYPE struct. This will end up using more memory for the sound obviously.

@ocornut
Owner

ocornut commented Aug 21, 2025

> I think it is sensible to have the samples pitch change when the system speed is incorrect.

In theory perhaps, but our general UI design plus database override (certain games are forced to PAL mode for compatibility, yet speed is still 60 Hz, which is somewhat nice in some situations, yet somewhat incorrect) doesn’t make it very sensible?

I would assume most modern emulators have had to deal with issues like this, and there are probably good code references, if not whole libraries, that may be used for it.

@maxim-zhao
Collaborator

I think Meka should be a bit more like Emulicious and have NTSC/60Hz and PAL/50Hz as the "100%" speed, and let people pick relative percentage speeds. But that's a much bigger change than this. So I would say it is fine as it is, and the bigger change can be done separately.
