faiss-napi package segfault from version 1.2.2 to 1.2.5 #18071

Closed
asilvas-godaddy opened this issue Mar 11, 2025 · 9 comments
Labels: atw · crash (An issue that could cause a crash) · napi (Compatibility with the native layer of Node.js)

Comments

@asilvas-godaddy

asilvas-godaddy commented Mar 11, 2025

How can we reproduce the crash?

Reverting to version 1.2.1 was required to stop the crash.

Relevant log output

Bun v1.2.5 (013fdddc) macOS Silicon
macOS v14.7.4
Args: "bun" "scripts/build-integrations.ts"
Features: Bun.stderr(4) Bun.stdin(2) Bun.stdout(4) fetch(4) jsc transpiler_cache(5) tsconfig(5) workers_spawned napi_module_register process_dlopen 
Builtins: "abort-controller" "bun:jsc" "bun:main" "node:assert" "node:buffer" "node:child_process" "node:crypto" "node:events" "node:fs" "node:fs/promises" "node:http" "node:https" "node:module" "node:os" "node:path" "node:process" "node:querystring" "node:stream" "node:string_decoder" "node:tty" "node:url" "node:util" "node:util/types" "node:worker_threads" "node-fetch" "node:v8" "node:http2" 
Elapsed: 1417ms | User: 408ms | Sys: 126ms
RSS: 0.18GB | Peak: 0.18GB | Commit: 0.90GB | Faults: 59

Stack Trace (bun.report)

Bun v1.2.5 (013fddd) on macos aarch64 [AutoCommand]

Segmentation fault at address 0x00000018

  • 6 unknown/js code
  • long long Zig::NapiClass_ConstructorFunction<false>
  • 1 unknown/js code
  • jsc_llint_commonCallOp__llintOpWithMetadata__llintOpWithReturn__llintOp__commonOp__fn__fn__makeReturn__fn__fn__fn__684_callHelper__dispatch_LowLevelInterpreter64_asm_2535
  • llint_call_javascript

Features: transpiler_cache, tsconfig, workers_spawned, napi_module_register, process_dlopen, Bun.stderr, Bun.stdin, Bun.stdout, fetch, jsc

asilvas-godaddy added the crash label on Mar 11, 2025
github-actions bot added the macOS and runtime labels on Mar 11, 2025
Contributor

Thank you for reporting this crash.

For Bun's internal tracking, this issue is BUN-E2H.

@Electroid
Contributor

We rewrote a large chunk of N-API in 1.2.5. @190n, could you take a look?

Electroid added the napi label and removed the macOS and runtime labels on Mar 11, 2025
@Electroid
Contributor

@asilvas-godaddy Is it possible to share some code so it's easier for us to reproduce?

190n self-assigned this on Mar 11, 2025
@190n
Collaborator

190n commented Mar 11, 2025

I tried running the example code in the faiss-napi README, and I do get a C++ exception:

panic(main thread): A C++ exception occurred

But the same thing happens in Node, too:

libc++abi: terminating due to uncaught exception of type faiss::FaissException: Error in virtual void faiss::IndexIVFFlat::add_core(idx_t, const float *, const int64_t *, const int64_t *) at /Users/runner/work/faiss-node/faiss-node/deps/faiss/faiss/IndexIVFFlat.cpp:51: Error: 'is_trained' failed

I would love to see a code example that reproduces the segfault if you can share one.

Also, do versions 1.2.3 and 1.2.4 have this segfault?

@asilvas-godaddy
Author

I'm having a hard time producing an isolated repro, but I'll keep trying.

Yes, versions 1.2.2, 1.2.3, 1.2.4, and 1.2.5 all had the same issue. The exceptions are not the concern unless they are what causes the segfault.

@Jarred-Sumner
Collaborator

@asilvas-godaddy can you try running with BUN_JSC_useGC=0 to help us rule out whether or not it's related to finalizers in NAPI functions?
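
One way to run that check is sketched below. The helper name run_without_gc is hypothetical (it is not part of Bun); it only sets BUN_JSC_useGC=0 for a single command, so the rest of the shell session keeps its normal environment. The script path in the comment is the one from the Args line of the crash report.

```shell
#!/bin/bash
# Hypothetical helper: run one command with BUN_JSC_useGC=0 in its environment.
# env(1) scopes the variable to that command only; later commands are unaffected.
run_without_gc() {
  env BUN_JSC_useGC=0 "$@"
}

# Example invocation (path taken from the crash report above):
#   run_without_gc bun scripts/build-integrations.ts
#   echo "exit status: $?"
```

If the script stops crashing under this wrapper but still crashes without it, finalizers become a more plausible suspect; if it crashes either way, they become less likely. Neither outcome is conclusive on its own.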

@190n
Collaborator

190n commented Mar 17, 2025

Also, if you have a script you can run that will either crash or not (even if it's somewhat complex), you could try bisecting to narrow down commits that might be responsible:

  • Clone the Bun repo

  • Start bisecting changes between 1.2.1 and 1.2.2: git bisect start bun-v1.2.2 bun-v1.2.1

  • Write a runner script with something like the following:

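    #!/bin/bash
    
    # Download the CI build of Bun for the commit under test.
    sha=$(git rev-parse HEAD)
    if ! bunx bun-pr "$sha"; then
      echo "could not download artifact for this commit, trying a nearby one"
      # Exit status 125 tells git bisect to skip this commit.
      exit 125
    fi
    
    # Collapse any failure (including a crash) to 1 so git bisect marks the commit bad.
    "bun-$sha" your-script.js || exit 1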

    Then run git bisect run ./runner.sh. This will try using bun-pr to download a build of Bun for each commit from our CI. It won't be perfectly accurate, since some commits don't have builds because their CI runs got cancelled, but you won't have to compile Bun a bunch of times, and it may at least narrow down the range of possible commits. I tested using this script to find the commit for a behavior change I made in node:timers fixes #16855 and it narrowed it down to either that PR or one other commit.
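
The || exit 1 at the end of the runner matters because git bisect run decides what to do from the script's exit status: 0 means good, 125 means skip, any other status from 1 to 127 means bad, and anything else aborts the bisect. A shell usually reports a process killed by SIGSEGV as 139 (128 + 11), which lands in the abort range; whether Bun's crash handler exits with that exact status is an assumption here, not something the thread confirms. A minimal sketch of the mapping, where bisect_verdict is a hypothetical helper used only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the verdict git bisect run draws from an exit status.
bisect_verdict() {
  if [ "$1" -eq 0 ]; then
    echo good
  elif [ "$1" -eq 125 ]; then
    echo skip
  elif [ "$1" -ge 1 ] && [ "$1" -le 127 ]; then
    echo bad
  else
    # Statuses of 128 and above (for example 139) stop the bisect entirely.
    echo abort
  fi
}

for status in 0 1 125 139; do
  echo "exit $status -> $(bisect_verdict "$status")"
done
```

Normalizing every failure to 1 keeps a crashing commit classified as bad instead of halting the run partway through.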

@kravetsone

Looks like a good idea for an interactive bun-bisect package.

@asilvas-godaddy
Author

@asilvas-godaddy can you try running with BUN_JSC_useGC=0 to help us rule out whether or not it's related to finalizers in NAPI functions?

I would, but the issue magically vanished and I can no longer reproduce it. I'll reopen the issue if it comes back. Appreciate all the replies!


5 participants