Realtime CLI spectrogram example #987
This is really cool and I can definitely see it landing somewhere in the Rust audio ecosystem. I'm wondering if Rodio would be the best place to get it landed? Normally I'd think so but then this PR also adds the point of raw access on WASAPI. That's something Rodio does not have access to, and actually, So offering additional WASAPI knobs is interesting, though I'm not a fan of the approach with the environment variable. What else can we think of that's more idiomatic - in the sense of a builder pattern, host/stream configuration options, or the like? I can imagine that other hosts also could have knobs that are worthwhile exposing, so I'd be interested to see what we could conjure up. Then maybe split the PR into a spectrograph for Rodio and host options in |
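As a rough illustration of the builder-style alternative raised in the comment above, here is a minimal sketch. Every name in it (`InputStreamOptions`, `InputStreamOptionsBuilder`, `raw_capture`) is hypothetical and invented for this sketch; none of them is part of cpal or Rodio.

```rust
/// Hypothetical per-stream options; nothing here is part of cpal's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct InputStreamOptions {
    /// Ask the host for unprocessed capture (no AGC or noise suppression).
    /// Defaults to `false`, so the behavior stays opt-in.
    pub raw_capture: bool,
}

/// Hypothetical builder that collects options before a stream is created.
#[derive(Debug, Default)]
pub struct InputStreamOptionsBuilder {
    options: InputStreamOptions,
}

impl InputStreamOptionsBuilder {
    /// Starts from the defaults (raw capture disabled).
    pub fn new() -> Self {
        Self::default()
    }

    /// Enables or disables the raw-capture request.
    pub fn raw_capture(mut self, enabled: bool) -> Self {
        self.options.raw_capture = enabled;
        self
    }

    /// Finishes the builder and returns the plain options value.
    pub fn build(self) -> InputStreamOptions {
        self.options
    }
}
```

A caller would write `InputStreamOptionsBuilder::new().raw_capture(true).build()` and pass the result to whatever stream-construction function ends up accepting it. Hosts that have no equivalent knob could simply ignore the field.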
Forgive me if I'm misunderstanding, but is this implementing a way to create a spectrogram—built into I like the idea of a fast way to generate a spectrogram, and I would 100% use that in the near future, but I feel like that dilutes the goal of If this is just for an example though, that, I believe, is a really good feature to introduce, and showcases a little bit more of what CPAL can do for non-input-to-output features. |
@roderickvd Thanks for the ideas! I'll split the PR in the near future. I agree that an environment variable is a clumsy way to force raw input with WASAPI; would a feature flag, for example, be a better option in this case? I'll look into the Rodio option and might open a PR there later. @wgibbs-rs The spectrogram would be just an example, with no integration into cpal :) |
Cool. Yes, a feature flag could also work. Thinking out loud, is there any reason why a user would not want it? Or would this be going into too much of opinionated territory? |
If you are referring to the changes in WASAPI's build_input_stream_raw_inner function, it could well be what a user actually expects, but I would not add this without a way to explicitly enable it just yet. |
I share the feeling that we could make it opt-in for now and consider transitioning to making it the default later. Just to rationalize it though, what would be pros/cons? |
The primary purpose of this PR is to add an example that demonstrates another way to use an audio input stream, while also offering a tool to visualize that stream. Although the spectrogram example is fairly complex in both code size and features, it may be useful for someone like me who is just entering the world of lower-level audio. The code currently uses the OS's default input device.
While developing this example I noticed that on Windows, with the WASAPI host, AGC or noise suppression quickly comes into play when visualizing realtime audio input, even though build_input_stream_raw_inner should supposedly give a raw audio stream if the OS or driver agrees to that. To get truly raw input audio on Windows with WASAPI, I added a way to request a raw audio stream, enabled through a new environment variable (stored globally in a OnceLock). With that in place I was able to disable AGC/noise suppression on Windows 11 and get a truly raw stream, which let the spectrogram run indefinitely without disturbance from OS- or driver-level filters. I did not face similar challenges on macOS, where the audio stream was seemingly untouched, or at least not affected at runtime. On Windows, possible use cases include longer-running audio recordings where the volume and quality should stay constant, or cases where one would like to handle those adjustments oneself. Windows seems to start lowering the input volume after a certain period of inactivity.
The spectrogram example has been briefly tested on real devices: Mac Mini M4 (macOS 15.5 Sequoia), Linux, and Windows 11.
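For readers unfamiliar with the pattern, the environment-variable opt-in described above (read once, cached globally in a `OnceLock`) might look roughly like the following sketch. The variable name `EXAMPLE_WASAPI_RAW_INPUT` and both function names are assumptions made for illustration; they are not the names used in this PR or in cpal.

```rust
use std::sync::OnceLock;

/// Hypothetical variable name, used only for this sketch.
const RAW_INPUT_ENV: &str = "EXAMPLE_WASAPI_RAW_INPUT";

/// Interprets an optional environment value as an opt-in switch.
/// Only "1", "true" and "yes" (case-insensitive, trimmed) enable it.
fn parse_opt_in(value: Option<&str>) -> bool {
    matches!(
        value.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true") | Some("yes")
    )
}

/// Reads the variable on first call and caches the result for the
/// lifetime of the process; later changes to the environment are ignored.
fn raw_input_requested() -> bool {
    static FLAG: OnceLock<bool> = OnceLock::new();
    *FLAG.get_or_init(|| parse_opt_in(std::env::var(RAW_INPUT_ENV).ok().as_deref()))
}
```

The stream-building code would call `raw_input_requested()` and only ask the host for raw capture when it returns `true`. Because the value is process-global and fixed after the first read, it cannot differ between streams, which is one of the drawbacks that motivates the discussion of a feature flag or explicit configuration option.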
The example is built with existing dependencies; the only change to Cargo.toml so far is the addition of libc to the macOS dev-dependencies, which allows building the TUI app with minimal dependencies.
Check the comments in the code for additional information. This example has been reviewed over multiple runs with a number of different LLMs, such as Claude 4 Opus.
Br.
Matias