Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using WASI io::stdin::read_to_end and defining the WASI context in add_to_linker_sync results in a hang #9723

Open
jeffcharles opened this issue Dec 3, 2024 · 5 comments
Labels
bug Incorrect behavior in the current implementation that needs fixing

Comments

@jeffcharles
Copy link
Contributor

Thanks for filing a bug report! Please fill out the TODOs below.

Note: if you want to report a security issue, please read our security policy!

Test Case

check_read_to_end.wasm.gz

The source code for the Wasm module looks like:

use std::io::{self, Read};

#[export_name = "wizer.initialize"]
pub fn initializer() {
    let mut buf = vec![];
    io::stdin().read_to_end(&mut buf).unwrap();
}

The source code for the host looks like:

use wasmtime::{Engine, Linker, Module, Store};
use wasmtime_wasi::{pipe::MemoryInputPipe, preview1::WasiP1Ctx, WasiCtxBuilder};

fn main() {
    let engine = Engine::default();
    let module = Module::from_file(
        &engine,
        "check_read_to_end.wasm",
    )
    .unwrap();
    let mut linker = Linker::new(&engine);
    let mut store = Store::new(&engine, Data { wasi: None });
    wasmtime_wasi::preview1::add_to_linker_sync(&mut linker, |cx: &mut Data| {
        cx.wasi = Some(
            WasiCtxBuilder::new()
                .inherit_stdout()
                .inherit_stderr()
                .stdin(MemoryInputPipe::new(vec![1]))
                .build_p1(),
        );
        cx.wasi.as_mut().unwrap()
    })
    .unwrap();
    let instance = linker.instantiate(&mut store, &module).unwrap();
    instance
        .get_typed_func::<(), ()>(&mut store, "wizer.initialize")
        .unwrap()
        .call(&mut store, ())
        .unwrap();
}

struct Data {
    wasi: Option<WasiP1Ctx>,
}

Steps to Reproduce

  1. Ungzip the Wasm file (or compile it) and move it to check_read_to_end.wasm.
  2. cargo run the main program for the host.

Expected Results

The function call to run and exit.

Actual Results

The function call hangs.

It looks like the wasmtime_wasi::preview1::add_to_linker_sync's closure is run infinitely and in my debugger I can see the MemoryInputPipe's read being called infinitely.

Versions and Environment

Wasmtime version or commit: 27.0.0 and also 23.0.3

Operating system: MacOS

Architecture: AArch64

Extra Info

Anything else you'd like to add?

Normally I'd define the WASI context on the store but I'm using Wizer which doesn't expose a public API that I can use to do that so this is a workaround I tried. However, I found it very unexpected that this causes a hang.

@jeffcharles jeffcharles added the bug Incorrect behavior in the current implementation that needs fixing label Dec 3, 2024
@pchickey
Copy link
Contributor

pchickey commented Dec 4, 2024

I think the bug here is that you are creating a new WasiP1Ctx in the closure to add_to_linker, which is a “getter” function into your Ctx type, when the usual place to create it is in Store::new - if you change that does the behavior resolve?

@jeffcharles
Copy link
Contributor Author

Yes it does resolve. However, as I mentioned in the issue:

Normally I'd define the WASI context on the store but I'm using Wizer which doesn't expose a public API that I can use to do that so this is a workaround I tried.

I did file bytecodealliance/wizer#123 to ask for an example of how to define a custom WASI context when using Wizer.

It might be worth augmenting the documentation around add_to_linker_sync and similar methods to clarify the closure may execute one or more times during an invocation of an exported Wasm method. That was definitely missing from my mental model and surprised me.

@pchickey
Copy link
Contributor

pchickey commented Dec 4, 2024

Sorry, I missed that context at the end of your post. Yes, we should improve the docs to describe that those linker closures shouldn’t perform side effects.

@alexcrichton
Copy link
Member

I'll admit I'm a bit lost about what to do about this. The problem here is that you're re-initializing WasiCtx so the guest itself is repeatedly reading and getting more data. The guest is waiting for a 0 return value signaling "eof" but every time it reads the context is reinitialized which provides more data.

We could certainly provide more notes in add_to_linker and friends about the intent of the closure, but would that really help here? The documentation already says that the closure should be a "projection" which isn't as clear as it could be, but regardless that's a pretty niche location to document this and I suspect when the app here "hangs" you probably wouldn't go consult add_to_linker_sync as the first thing to see what went wrong.

So all-in-all I agree that the docs of add_to_linker_sync could be improved, but would that really meaningfully change the experience here?

@jeffcharles
Copy link
Contributor Author

The problem here is that you're re-initializing WasiCtx so the guest itself is repeatedly reading and getting more data. The guest is waiting for a 0 return value signaling "eof" but every time it reads the context is reinitialized which provides more data.

So technically it's the MemoryInputPipe::new(...) inside the closure that's introducing the problem for me, not the context reinitialization (even if it's not ideal).

would that really help here?

That's admittedly subjective. In my case, it would have. I actually was looking at how this function was called first when my program started hanging because my initial implementation was locking a mutex inside this closure to retrieve the byte array I wanted to pass to stdin and I wanted to check that I didn't introduce a deadlock which is what I thought was happening initially. And then I got confused when I noticed the closure was being called more than once because that didn't align with my prior mental model.

Don't get me wrong, I would rather not re-initialize a WasiCtx in that closure, that's just a result of me using Wizer and Wizer seemingly not letting me set the stdin for its WasiCtx. But, it really wasn't clear to me that add_to_linker_sync would be called while inside a Wasm call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior in the current implementation that needs fixing
Projects
None yet
Development

No branches or pull requests

3 participants