[BUG] agent: symbol lookup error: /opt/datadog-agent/bin/agent/agent: undefined symbol: nvmlVgpuTypeGetCapabilities #32419

Open
rawlingsj opened this issue Dec 20, 2024 · 1 comment

@rawlingsj
Agent Environment
datadog-agent version 7.60.1
Linux

Describe what happened:

After compiling from source I've not been able to run the agent:

/opt/datadog-agent/bin/agent/agent: symbol lookup error: /opt/datadog-agent/bin/agent/agent: undefined symbol: nvmlVgpuTypeGetCapabilities

It looks like there's a relatively recent Go module dependency on go-nvml, which is then used here. AFAICT this is wired in at compile time, so adding runtime config to disable the GPU monitor doesn't seem to work.
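
A `symbol lookup error` at startup can be a sign that the binary asks the dynamic loader to bind every symbol immediately rather than lazily. As a rough sketch for checking that, assuming binutils' `readelf` is installed (the `binds_now` helper and the `AGENT_BIN` variable are names made up for illustration, not part of the agent's tooling):

```shell
#!/bin/sh
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds
# if the dynamic section requests immediate (non-lazy) symbol binding.
binds_now() {
    grep -Eq 'BIND_NOW|Flags:.* NOW'
}

# Path taken from the error message above; override AGENT_BIN as needed.
AGENT_BIN="${AGENT_BIN:-/opt/datadog-agent/bin/agent/agent}"

# Only inspect the binary if it exists and readelf is available.
if [ -f "$AGENT_BIN" ] && command -v readelf >/dev/null 2>&1; then
    if readelf -d "$AGENT_BIN" | binds_now; then
        echo "immediate binding requested (BIND_NOW / NOW flag present)"
    else
        echo "no immediate-binding flag found in the dynamic section"
    fi
fi
```

If the flag is present, the loader resolves every function symbol before `main` runs, so a missing NVML symbol would fail at startup regardless of any runtime config.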

Describe what you expected:

For the GPU monitor to be optional if the NVIDIA libraries are not available.

Steps to reproduce the issue:

We're building the agent using:

invoke -e agent.build \
        --bundle process-agent \
        --bundle trace-agent \
        --bundle system-probe \
        --bundle security-agent \
        --exclude-rtloader \
        --no-development \
        --bundle-ebpf \
        --embedded-path /usr/lib

Additional environment details (Operating System, Cloud provider, etc):

This is a package built on Wolfi OS using Melange. The build pipeline is a little complex, but if more context is wanted this may help: https://github.com/wolfi-dev/os/blob/f4727bc/datadog-agent.yaml

rawlingsj added a commit to wolfi-dev/os that referenced this issue Dec 20, 2024
- add libpcap
- symlink so build avoids downloading libpcap and uses system lib
- regen dep bump patch
- build with python 3.12 as integration dependencies require it
	ERROR: Package 'datadog-slurm' requires a different Python: 3.11.11 not in '>=3.12'
- compile datadog-agent-nvml with python 3.12
- patch to disable gpu monitor as causing test failures, upstream issue tracked DataDog/datadog-agent#32419

Signed-off-by: James Rawlings <[email protected]>
@gjulianm
Contributor

Hi, I see that you might be overriding the ldflags in the config. In our build code we're adding -extldflags=-Wl,-z,lazy to the ldflags argument, and this flag in your code might be overriding it. Could you try adding -ldflags="-extldflags=-Wl,-z,lazy" to the GOFLAGS variable?
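
A minimal sketch of that suggestion, assuming the build picks up `GOFLAGS` from the environment (the `LAZY_LDFLAGS` variable name is only for illustration):

```shell
#!/bin/sh
# Flag quoted from the comment above: ask the external linker for lazy binding.
LAZY_LDFLAGS='-ldflags=-extldflags=-Wl,-z,lazy'

# Append to any existing GOFLAGS instead of replacing them.
export GOFLAGS="${GOFLAGS:+$GOFLAGS }$LAZY_LDFLAGS"

# Show the resulting value before starting the build.
echo "GOFLAGS=$GOFLAGS"

# Then run the usual build (the `invoke -e agent.build ...` command from the report).
```

Note that flags given directly on the `go build` command line are applied after `GOFLAGS` and override it, so this only helps if nothing later in the pipeline passes its own `-ldflags`.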

powersj pushed a commit to wolfi-dev/os that referenced this issue Dec 20, 2024
datadog-agent and datadog-agent-nvml: various fixes to work with latest

- add libpcap
- symlink so build avoids downloading libpcap and uses system lib
- regen dep bump patch
- build with python 3.12 as integration dependencies require it
`ERROR: Package 'datadog-slurm' requires a different Python: 3.11.11 not in '>=3.12'`
- compile datadog-agent-nvml with python 3.12
- patch to disable new gpu monitor as causing test failures, it's not
expected to be enabled by default, upstream issue tracked
DataDog/datadog-agent#32419

---------

Signed-off-by: wolfi-bot <[email protected]>
Signed-off-by: James Rawlings <[email protected]>
Co-authored-by: wolfi-bot <[email protected]>
Co-authored-by: Hunter Harris <[email protected]>
Co-authored-by: James Rawlings <[email protected]>