Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOCS-2930: Add first_run flow for Docker modules #4187

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

JessamyT
Copy link
Collaborator

@JessamyT JessamyT commented Apr 3, 2025

No description provided.

@viambot viambot added the safe to build This pull request is marked safe to build from a trusted zone label Apr 3, 2025
Copy link

netlify bot commented Apr 3, 2025

Deploy Preview for viam-docs ready!

Name Link
🔨 Latest commit 3776863
🔍 Latest deploy log https://app.netlify.com/sites/viam-docs/deploys/67f01edb8fe7e80008db06b0
😎 Deploy Preview https://deploy-preview-4187--viam-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
Lighthouse
Lighthouse
1 paths audited
Performance: 31 (🟢 up 6 from production)
Accessibility: 100 (no change from production)
Best Practices: 100 (no change from production)
SEO: 92 (no change from production)
PWA: 70 (no change from production)
View the detailed breakdown and full score reports

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link
Member

@penguinland penguinland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! All my feedback is either minor or non-actionable.

Use cases for this include:

- Your module has complex system dependencies that cannot be installed on a machine.
- You have a large bundle and want to use layer caching to reduce the size of the download.
Copy link
Member

@penguinland penguinland Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "bundle" seems odd, but I can't think of a better word. If no one else can think of a better word either, leave as-is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose that depends on what's meant by bundle? Does the layer caching help because an image with the same base container and possibly some layers can be used between different installed modules? That's the only way I can think this would be helpful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The people who love docker will talk your ear off about how useful layer caching is: the idea is that if you have a bunch of small layers that add specific things to your docker image, you can reuse these layers in different projects to get different images without needing to create each thing multiple times. This is an argument for docker use in general, not something Viam-specific.

I think the bundle is supposed to refer to the collection of layers. I don't know enough about docker to know if that's the standard term; it might be. Even if it's not, I think the meaning is clear by the end of the sentence, I just stumbled a little in the middle.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abe-winter I borrowed your words that were perhaps not intended for official documentation. Is there a word other than "bundle" you'd recommend here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe 'artifact'?

```sh {id="terminal-prompt" class="command-line" data-prompt="$"}
#!/usr/bin/env bash

if [[ -n "$VIAM_TEST_FAIL_RUN_FIRST" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this env var. Is it supposed to be in all first_run scripts? What does it do? Maybe I should add it to mine... 😳

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question--just pulled this in from the setup phase guide on confluence which in turn links to https://github.com/viam-labs/wifi-sensor/blob/7823b6ad3edcbbbf20b06c34b3181453f5f3f078/first_run.sh

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also am unclear on what this is for here. Can you add an explanation in? Does this get set to true if a previous run failed? Or is this jsut for testing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the RDK, this env var is mentioned in 2 places: a test and an example first_run.sh. I have cloned about 60 Viam-related repos locally, and it's not mentioned in any of the others. I've searched Github for it, and can only find copies of those 2 locations. I wonder if there was some plan to add this in, but it never happened and these are just the vestiges of a feature that never was.

I think we should remove this paragraph from the example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking around. Will remove.

It will only execute once per module or per version of the module.

1. (Optional) After a first run script runs successfully, Viam adds a marker file with a `.first_run_succeeded` suffix in the module’s data directory on disk.
It has the location and form: `/root/.viam/packages/data/module/<MODULE_ID>-<VERSION>-<ARCH>.first_run_succeeded`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect there should be a slash before .first_run_succeeded, but haven't checked on that personally. Maybe it's correct as-is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm also from Maxim's guide--not sure if there's a reasonably practical way to check for sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😳 I went to one of my machines with a module with a first_run script to check, and got surprised that instead of having such a file anywhere, I have a meta.json does not exist, skipping first run message scattered throughout my logs! I can't help, and I've got some modules to go fix.

Thanks for pointing me towards this bug!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eep! I don't have any Docker modules to test this on... @cheukt since it looks like you've been maintaining the confluence instructions, can you confirm?

penguinland added a commit to viam-modules/viam-mlmodelservice-triton that referenced this pull request Apr 3, 2025
Jessamy had questions about first_run scripts in viamrobotics/docs#4187, I went to check on it for her, and got surprised that instead of running `first_run.sh`, this module instead logs `meta.json does not exist, skipping first run` and continues without running it! I've filed https://viam.atlassian.net/browse/RSDK-10382 so it's harder to get into this position next time.

I haven't tried this: local modules don't start up without meta.json, so I think this is only tryable once we deploy to the registry. However, I've run `make -f Makefile.module module.tar.gz` and checked that it successfully puts meta.json in the .tar.gz file.
Copy link
Collaborator

@npentrel npentrel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused at a few points.
I also think it would be really helpful to link to a full example. I think soleng used docker in a few projects, is there none we can link to?

#!/usr/bin/env bash

if [[ -n "$VIAM_TEST_FAIL_RUN_FIRST" ]]; then
echo "Sorry, I've failed you."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this error indicate an issue with the module? Maybe better to change to something that indicates where the error is

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure where it is. Since this is an edge case documented within a hidden page, I figured it made sense to act quickly and use the existing example provided by Maxim. Sounds like I'll remove the whole paragraph anyway.

```sh {id="terminal-prompt" class="command-line" data-prompt="$"}
#!/usr/bin/env bash

if [[ -n "$VIAM_TEST_FAIL_RUN_FIRST" ]]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also am unclear on what this is for here. Can you add an explanation in? Does this get set to true if a previous run failed? Or is this jsut for testing?

exit 1
fi

docker pull mongo:6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this pulls the image but then how does the container get run? I think we probably want to link to an example of that. Also

And should this run with https://github.com/viam-soleng/viam-docker-manager or is this entirely separate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this example is right: in first_run.sh, you pull the image, and then in your module's main executable, you run the image, and that second part should be obvious for anyone developing with docker. We could add in such an example, but it's trivial:

#!/bin/bash
set -euxo pipefail
# This example module runs inside the docker image you should have downloaded in first_run.sh
SOCKET_DIR=`dirname $1`
VIAM_DIR=`realpath ~/.viam`
exec docker run \
     --rm \
     --runtime=nvidia --gpus=all \
     -v /etc/passwd:/etc/passwd \
     -v /etc/group:/etc/group \
     -v $SOCKET_DIR:$SOCKET_DIR \
     -v $VIAM_DIR:$VIAM_DIR \
     ghcr.io/viam-modules/viam-mlmodelservice-triton/jetpack6:0.9.0 \
     LD_PRELOAD=libjemalloc.so.2 VIAM_MODULE_DATA="$VIAM_MODULE_DATA" /opt/viam/bin/viam_mlmodelservice_triton "$@" 2>&1

Copy link
Member

@penguinland penguinland Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(that example is from the Triton module, and although there are lots of options specified for the docker line, anyone developing with docker should know which ones they need to set.)

It's really just

#!/bin/bash
exec docker run <whatever-the-module-author-intended-to-put-here>

@JessamyT
Copy link
Collaborator Author

JessamyT commented Apr 4, 2025

RE examples to point to, it looks like https://github.com/abe-winter/python-container-module doesn't actually use a first_run script as such? And https://github.com/viam-labs/wifi-sensor/tree/RSDK-8886-setup-phase is what Maxim pointed to as an example but the main branch of wifi-sensor doesn't use it. Based on the thread it sounds like the sol-eng uses would've been a bit atypical. @abe-winter do you recommend a particular example? Or @penguinland is https://github.com/viam-modules/viam-mlmodelservice-triton/tree/main like you mentioned a good one, and if so, should the examples in this PR be changed to match it?

@penguinland
Copy link
Member

penguinland commented Apr 4, 2025

I've been mentioning the Triton one because that's the only docker module I work with. TBH, it's kind of a pain: the way Drew set it up, in the repo there are merely templates of the module script and first_run script, and you need to generate the actual ones from those templates. My hope is there's a simpler example out there, but I'm not aware of one (because I haven't looked).

@JessamyT JessamyT changed the title DOCS-3068, DOCS-3114: Add first_run flow for Docker modules DOCS-2930: Add first_run flow for Docker modules Apr 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to build This pull request is marked safe to build from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants