Problem when setting up docker environment #8

@L-Kernegger

Platform: Nvidia Jetson Nano with Edge TPU installed
Docker version: Docker version 20.10.21, build 20.10.21-0ubuntu1~18.04.3
Ubuntu version:
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
(this version is from the official Nvidia Image)

When executing the command:

nvidia@nvidia-desktop:/mnt/sd/SHMT$ sudo sh scripts/docker_setup_partition.sh 
[sudo] password for nvidia: 
[gpgtpu_partition] - building docker image from dockerfile...
[+] Building 2.5s (23/23) FINISHED                                              
 => [internal] load build definition from Dockerfile                       0.0s
 => => transferring dockerfile: 38B                                        0.0s
 => [internal] load .dockerignore                                          0.0s
 => => transferring context: 2B                                            0.0s
 => [internal] load metadata for nvcr.io/nvidia/l4t-base:r32.4.4           2.1s
 => [auth] nvidia/l4t-base:pull token for nvcr.io                          0.0s
 => [opencv_base 1/4] FROM nvcr.io/nvidia/l4t-base:r32.4.4@sha256:e9d0631  0.0s
 => [internal] load build context                                          0.0s
 => => transferring context: 81B                                           0.0s
 => CACHED [opencv_base 2/4] COPY ./opencv_install_deps.sh opencv_install  0.0s
 => CACHED [opencv_base 3/4] RUN ./opencv_install_deps.sh                  0.0s
 => CACHED [opencv_base 4/4] RUN echo $(ls -lh /usr/include/$(uname -i)-l  0.0s
 => CACHED [build1 1/4] RUN ln -snf /usr/share/zoneinfo/$CONTAINER_TIMEZO  0.0s
 => CACHED [build1 2/4] RUN  sh -c "echo '/usr/local/cuda/lib64' >> /etc/  0.0s
 => CACHED [build1 3/4] RUN  ldconfig                                      0.0s
 => CACHED [build1 4/4] RUN  apt-get install -y build-essential cmake git  0.0s
 => CACHED [build2 1/7] COPY update_sources.sh /                           0.0s
 => CACHED [build2 2/7] RUN /update_sources.sh                             0.0s
 => CACHED [build2 3/7] RUN dpkg --add-architecture armhf                  0.0s
 => CACHED [build2 4/7] RUN dpkg --add-architecture arm64                  0.0s
 => CACHED [build2 5/7] RUN DEBIAN_FRONTEND=noninteractive apt-get instal  0.0s
 => CACHED [build2 6/7] RUN apt-get install -y libeigen3-dev &&     sudo   0.0s
 => CACHED [build2 7/7] RUN apt-get install -y python-scipy                0.0s
 => CACHED [final 1/2] RUN echo "export LD_LIBRARY_PATH=/usr/local/cuda/l  0.0s
 => CACHED [final 2/2] WORKDIR /home                                       0.0s
 => exporting to image                                                     0.2s
 => => exporting layers                                                    0.0s
 => => writing image sha256:9c2344e9f541d38be18d04500859cdae3703ba3380271  0.0s
 => => naming to docker.io/library/gpgtpu_partition_image                  0.0s
gpgtpu_partition_container
gpgtpu_partition_container
[gpgtpu_partition] - build docker container...
42388e49673538748e6f8931cbbb0404455be5e92d7e6f72e319c3c2b5d3a4bd
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #1: error running hook: exit status 1, stdout: , stderr: time="2024-12-03T05:12:28-05:00" level=info msg="Symlinking /mnt/sd/docker-data/overlay2/ecdbbf8f1a0d9147b218c4edbdb70b3d82fe18e175fcd02ed6c9ba34933b475b/merged/etc/vulkan/icd.d/nvidia_icd.json to /usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json"
time="2024-12-03T05:12:28-05:00" level=error msg="failed to create link [/usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json /etc/vulkan/icd.d/nvidia_icd.json]: failed to create symlink: failed to remove existing file: remove /mnt/sd/docker-data/overlay2/ecdbbf8f1a0d9147b218c4edbdb70b3d82fe18e175fcd02ed6c9ba34933b475b/merged/etc/vulkan/icd.d/nvidia_icd.json: device or resource busy": unknown.
nvidia@nvidia-desktop:/mnt/sd/SHMT$ 

The folder in which this symlink is supposed to be created is recreated every time the script runs, and lsof shows no process holding the original file or either link path:

nvidia@nvidia-desktop:/mnt/sd/SHMT$ lsof | grep /usr/lib/aarch64-linux-gnu/tegra/nvidia_icd.json
nvidia@nvidia-desktop:/mnt/sd/SHMT$ lsof | grep /etc/vulkan/icd.d/nvidia_icd.json
nvidia@nvidia-desktop:/mnt/sd/SHMT$ lsof | grep /mnt/sd/docker-data/overlay2/ecdbbf8f1a0d9147b218c4edbdb70b3d82fe18e175fcd02ed6c9ba34933b475b/merged/etc/vulkan/icd.d/nvidia_icd.json
nvidia@nvidia-desktop:/mnt/sd/SHMT$ 
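
lsof only lists open file handles, so it would not reveal whether the path is a mount point, which is another situation in which removing a file fails with "device or resource busy". A minimal sketch of that extra check follows. It assumes a Linux mountinfo table (field 5 is the mount point), and the helper name is_mount_point is hypothetical, not part of the SHMT scripts. It would have to be run against the mount table that actually applies to the path (host or container), which I have not verified for this setup.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if PATH_TO_CHECK is listed as a
# mount point in a mountinfo-style table, fail (exit 1) otherwise.
# Usage: is_mount_point PATH_TO_CHECK [MOUNTINFO_FILE]
# MOUNTINFO_FILE defaults to /proc/self/mountinfo.
is_mount_point() {
    path=$1
    mountinfo=${2:-/proc/self/mountinfo}
    # Field 5 of each mountinfo line is the mount point.
    # Note: mountinfo escapes spaces in paths as \040; this sketch
    # assumes a path without spaces or backslashes.
    awk -v p="$path" '$5 == p { found = 1 } END { exit !found }' "$mountinfo"
}

# Example: check the path named in the error message.
if is_mount_point /etc/vulkan/icd.d/nvidia_icd.json; then
    echo "nvidia_icd.json is a mount point"
else
    echo "nvidia_icd.json is not a mount point in this mount table"
fi
```

If the path does show up as a mount point, that would suggest something is bind-mounting the file before the hook tries to replace it, which lsof cannot show.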

Things I already tried that didn't work:
reinstalling docker
using a different storage driver (vfs instead of overlay2)
removing the file while the script is running using another script
mounting a copy of the file to make sure it isn't in use

I hope that this repository is still maintained and that somebody knows how to fix this.
