AIDE.sh: line 10: 1647 Illegal instruction (core dumped) #44
Comments
Hello Francesco, Thanks for opening the issue. I was about to try and reproduce it but could not find any issue in my tests. Is this a new Docker installation, or do you perhaps have an existing volume and project where, for some reason, an incorrect configuration entry has been made in the past? |
Hi :) |
Disabling models allows me to run the container:
|
This will bypass the import of available AI models. You will be able to use AIDE as a labeling tool, but not for training models (none will be visible). It might well be that PyTorch has a conflict on your machine (see below); in this case the reason why disabling models works is that no (Py-) Torch modules are imported this way.
This could indeed be the culprit. Browsing through PyTorch's issue tracker, I see multiple mentions of this problem, with AVX2 instructions repeatedly finding their way into the code base. I do not have a CPU at hand that lacks AVX (or AVX2) support to try things out, but perhaps you can get it to work with a different version of PyTorch? Detectron2 lists 1.6.0 as the minimum version of PyTorch, but perhaps a newer one resolved this issue? This should be straightforward to check in a new Conda or virtualenv environment and by simply calling |
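The AVX hypothesis can also be checked without importing PyTorch at all. Below is a minimal sketch, assuming a Linux host (or VM) that exposes `/proc/cpuinfo`; the helper names `parse_cpu_flags` and `missing_simd` are hypothetical and not part of AIDE or PyTorch. It only reports which of the AVX/AVX2 flags the (virtual) CPU advertises; it does not prove that a given PyTorch build needs them.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # All cores normally report the same flags; the first entry suffices.
            return set(value.split())
    return set()


def missing_simd(flags, wanted=("avx", "avx2")):
    """List the wanted SIMD flags that the CPU does not advertise."""
    return [name for name in wanted if name not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux only
    if cpuinfo.exists():
        missing = missing_simd(parse_cpu_flags(cpuinfo.read_text()))
        print("missing SIMD flags:", ", ".join(missing) if missing else "none")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

Running this inside the container or VM that crashes, and again on a machine where AIDE starts, would show whether the two differ in AVX support.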
Hi and nice to meet you. FYI, I ran into the same issue on a local VirtualBox Ubuntu 18.04 instance. It does not occur on my other two VM instances: an Ubuntu 18.04 instance provided by AllianceCan.ca (OpenStack), which is my lab box #2, and a GCP instance, which is the production box. My GCP and VBox instances have 4 GB of memory allocated; my AllianceCan.ca instance has 3 GB.

The fault is in util/helpers.py:get_class_executable, at execFile = importlib.import_module(classPath).

I did not investigate much, as this does not block me as long as it works on my other instances (GCP and AllianceCan.ca). And anyway, we do not use this module yet; we are only using the annotation interface.

I tried wrapping the statement in try / except, but no exception is thrown. The actual function that crashes is "return _bootstrap._gcd_import(name[level:], package, level)" in "def import_module(name, package=None):", but I cannot tell which package, probably the standard Python library.

Reproduced in Docker in the vbox instance using :

Or just simply in dev mode with PyCharm (remote debug to the local vbox instance), or directly from the command line on the vbox from a conda venv.

ssh://[email protected]:7722/home/vince/anaconda3/envs/aide/bin/python3 -u /home/vince/.pycharm_helpers/pydev/pydevd.py --multiprocess --qt-support=auto --client 127.0.0.1 --port 44833 --file setup/assemble_server.py --launch=1 --check_v1=0 --migrate_db=0 --force_migrate=0 --verbose=1

I forked main/master version 2 in ~June 2021 and did not sync with the latest changes (too many changes and conflicts).

Hope that helps a bit! If you need anything, let me know. Best regards, Vincent |
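The try / except observation fits the "Illegal instruction" in the issue title: the interpreter is killed by a signal (SIGILL on Linux) instead of raising a Python exception, so nothing inside the same process can catch it. One way to detect this is to attempt the import in a child interpreter and inspect its exit status. The following is a minimal sketch, assuming a POSIX system; `probe_import` is a hypothetical helper and not part of AIDE.

```python
import signal
import subprocess
import sys


def probe_import(module_name, timeout=120.0):
    """Import module_name in a child interpreter and classify the outcome."""
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        return f"killed by {name}"
    # Positive return code: an ordinary Python error; report its last line.
    lines = proc.stderr.strip().splitlines()
    return "python exception: " + (lines[-1] if lines else "unknown")


if __name__ == "__main__":
    # Assumption: torch is the module suspected of crashing the interpreter.
    print(probe_import("torch"))
```

A parent process that gets "killed by SIGILL" back could then skip loading the model modules instead of crashing itself, although whether that fits AIDE's start-up code is left open here.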
I tested with the latest main branch (v2.0) on my Ubuntu 18.04 LTS VirtualBox VM (no CUDA/GPU) in remote debug mode with PyCharm (not Docker), and got the same issue: when the AIController module is set, the error occurs. |
I tested with the v3.0 branch and installed the non-CUDA builds of torch and torchvision (torch==1.12.1 torchvision==0.13.1 instead of torch==1.12.1+cu113 torchvision==0.13.1+cu113) on the same Ubuntu 18.04 LTS VirtualBox VM (no CUDA/GPU), in remote debug mode with PyCharm (not Docker), and there is no issue; it starts fine. |
I just ran
cd docker && docker-compose up
on master. The software seems to crash here:
aerial_wildlife_detection/util/helpers.py, line 100 in 08954b5
path is set to ai.models.detectron2.AlexNet.