invokeai terminated by signal SIGSEGV #49

Closed
muni-corn opened this issue Oct 11, 2023 · 2 comments


@muni-corn

hi! i'm trying to get invokeai running on my setup but i'm running into an address boundary error.

2023-10-11 07:49:52.269848: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
/nix/store/6nyknk2dj5kxial6ymksbpgqhcmw2x7c-python3.10-pytorch-lightning-1.9.0/lib/python3.10/site-packages/pytorch_lightning/utilities/distributed.py:258: LightningDeprecationWarning: `pytorch_lightning.utilities.distributed.rank_zero_only` has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from `pytorch_lightning.utilities` instead.
  rank_zero_deprecation(
* Initializing, be patient...
>> Initialization file /home/muni/invokeai/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 2.3.1.post2
>> InvokeAI runtime directory is "/home/muni/invokeai"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type cuda
>> xformers not installed
>> Initializing NSFW checker
fish: Job 1, 'nix run github:nixified-ai/flak…' terminated by signal SIGSEGV (Address boundary error)

i have an AMD RX 7600 GPU (gfx1102). let me know what other information i can provide to help!

@muni-corn
Author

sorry; this may be unrelated to nix. i was able to get a backtrace and it seems related to AMD and HIP:

#0  0x00007fffa5417085 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#1  0x00007fffa541ae47 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#2  0x00007fffa5427ce9 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#3  0x00007fffa53a6009 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#4  0x00007fffa53a61a0 in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#5  0x00007fffa528c6fe in ?? () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#6  0x00007fffa5313941 in hipMemcpyWithStream () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libamdhip64.so
#7  0x00007fffa6f632ba in c10::hip::memcpy_and_sync(void*, void*, long, hipMemcpyKind, ihipStream_t*) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_hip.so
#8  0x00007fffa6f4f749 in at::native::copy_kernel_cuda(at::TensorIterator&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_hip.so
#9  0x00007fffcddcd6ea in at::native::copy_impl(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#10 0x00007fffcddcea61 in at::native::copy_(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#11 0x00007fffce8f7896 in at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#12 0x00007fffce0b5919 in at::native::_to_copy(at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) ()
   from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#13 0x00007fffcec0497a in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>), &at::(anonymous namespace)::(anonymous namespace)::wrapper___to_copy>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat> > >, at::Tensor (at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
#14 0x00007fffce50572d in at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>, bool, c10::optional<c10::MemoryFormat>) () from /nix/store/n3p8ykaypcx80g1c8pifd9jfvbkr1hbz-python3.10-torch-1.13.1/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so
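
The backtrace above bottoms out in a tensor copy to the device (at::native::_to_copy down to hipMemcpyWithStream). A minimal sketch for checking whether that copy alone crashes, outside InvokeAI, might look like the following. It assumes the same torch build is importable from the interpreter that runs it; the helper name probe_device_copy, the exit code 3, and the status strings are hypothetical and only for illustration. The copy runs in a child process so that a SIGSEGV shows up as a return code instead of killing the caller.

```python
import signal
import subprocess
import sys

# Child script: performs one host-to-device tensor copy, the operation the
# backtrace ends in. Exit code 3 is an arbitrary marker for "no device".
CHILD = """
import faulthandler
faulthandler.enable()  # dump the Python-level stack to stderr on a fatal signal
import torch
if not torch.cuda.is_available():
    raise SystemExit(3)
x = torch.zeros(4).to("cuda")  # ROCm builds of PyTorch address HIP devices as "cuda"
torch.cuda.synchronize()
print(x.cpu().sum().item())
"""


def probe_device_copy(timeout=120):
    """Run the copy in a subprocess and classify how it ended."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", CHILD],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    if proc.returncode == 0:
        return "ok"
    if proc.returncode == 3:
        return "no-device"
    # On POSIX, a negative return code -N means the child died from signal N.
    if proc.returncode == -signal.SIGSEGV:
        return "segfault"
    return "error"  # e.g. torch is not importable in this interpreter


if __name__ == "__main__":
    print(probe_device_copy())
```

If this prints "segfault", the crash reproduces without InvokeAI or the NSFW checker, which would point further toward the torch/HIP layer than the application.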

@MatthewCroughan
Member

Yeah, the AMD stuff is very buggy, and there's nothing we can do about that on our side: we only build that code, we don't write or patch it. If there are any patches we can apply, let me know, and we can do that.
