Submission Project Docker Microcredential #4

Open. rabuono wants to merge 29 commits into main.

@rabuono rabuono commented Apr 22, 2025

Submission of the project.
Updates were made according to the Deliverables in README.md.
The HPC portion was run on both the VIB Data Core Cluster and HPC Ugent.

rabuono and others added 29 commits April 8, 2025 10:26
Containerize training the machine learning model
Containerize serving of the machine learning model
Contributor

@abotzki abotzki left a comment

Thanks for your submission @rabuono

# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models

EXPOSE 8080
Contributor

Is there a particular reason for defining a port?

Author

I missed that.
No, there is no need to expose that port there, as the next step does not rely on network communication to exchange the necessary data.
Thanks for spotting it.

# TODO: Run the training script that generates the model
CMD [...]
# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models
Contributor

Could be omitted

Author

👍
It was an overzealous directory creation


# Pull Docker images using Apptainer and save logs
echo "Building training image..."
apptainer build --fakeroot model-train.sif $TRAIN_IMG > $LOG_DIR/train_image_build.log 2>&1
Contributor

Log files can also be defined via #SBATCH directives.

Author

@rabuono rabuono Apr 23, 2025

Thanks!
Indeed.
The intent was to collect the output in more specifically named log files for later use.
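
For reference, a minimal sketch of what the SBATCH-based logging suggested above could look like. The `--output` and `--error` options and the `%x` (job name) and `%j` (job ID) filename patterns are standard Slurm. The `logs/` directory, the placeholder image reference, and the `build_image` helper are assumptions made for illustration and are not taken from the PR. Slurm does not create the log directory itself, so it should exist before the job is submitted.

```shell
#!/bin/bash
# Slurm writes the job's stdout and stderr to these files.
# %x expands to the job name, %j to the job ID.
#SBATCH --job-name=apptainer_build
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

# Assumption: placeholder image reference; the real TRAIN_IMG is not shown in this thread.
TRAIN_IMG="${TRAIN_IMG:-docker://example/sklearn_train:v1}"

# Hypothetical helper: with the directives above, the build output lands in the
# job log without a per-command "> file 2>&1" redirection.
build_image() {
    echo "Building training image..."
    apptainer build --fakeroot "$1" "$2"
}

# Only build when running inside a Slurm job (Slurm sets SLURM_JOB_ID).
if [ -n "${SLURM_JOB_ID:-}" ]; then
    build_image model-train.sif "$TRAIN_IMG"
fi
```

Per-step files such as train_image_build.log can still be kept alongside the job log by retaining the explicit redirection on individual commands.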

Contributor

@rabuono which difference between the shell scripts for the two HPC systems produces the xattr warning?

(base) albot@Alexanders-MacBook-Pro project_docker_microcredential % diff image_build.sh image_build_ugent.sh
2,3c2,4
< #SBATCH --job-name=apptainer_build
< #SBATCH --partition=debug_28C_56T_750GB
---
>
> #SBATCH --job-name=job_submission
> #SBATCH --partition=donphan
5c6
< #SBATCH --time=01:00:00
---
> #SBATCH --time=00:30:00

Author

Hey!
The scripts differ essentially only in the sbatch statements.
While I have not dug deeper into it, I suspect that the xattr warnings come from differences in how the two HPC systems are deployed.
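
One way to check that claim is to compare the two scripts with the #SBATCH lines and blank lines filtered out. The sketch below does this with a hypothetical helper, `diff_ignoring_sbatch`, which is not part of the PR. It only shows whether the script bodies match and does not by itself explain the xattr warnings.

```shell
# Hypothetical helper: diff two job scripts while ignoring #SBATCH directives
# and blank lines. Returns 0 when the remaining lines are identical,
# 1 when they differ, 2 when the temporary files could not be created.
diff_ignoring_sbatch() {
    sbatch_tmp_a=$(mktemp) && sbatch_tmp_b=$(mktemp) || return 2
    # Drop directive lines and empty lines from each script.
    grep -v -e '^#SBATCH' -e '^[[:space:]]*$' "$1" > "$sbatch_tmp_a"
    grep -v -e '^#SBATCH' -e '^[[:space:]]*$' "$2" > "$sbatch_tmp_b"
    diff "$sbatch_tmp_a" "$sbatch_tmp_b"
    sbatch_status=$?
    rm -f "$sbatch_tmp_a" "$sbatch_tmp_b"
    return "$sbatch_status"
}

# Usage with the two scripts from this thread:
#   diff_ignoring_sbatch image_build.sh image_build_ugent.sh \
#       && echo "Only the SBATCH directives differ"
```

If the helper reports no difference, the warning is more likely tied to the cluster environment than to the script contents.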

# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models

EXPOSE 8080
Author

Suggested change (remove this line):
- EXPOSE 8080

# docker build -t sklearn_train:v1 -f Dockerfile.train .

# run command:
# docker run --rm -p 8080:8080/tcp -v ./app/models:/app/models sklearn_train:v1
Author

Suggested change
- # docker run --rm -p 8080:8080/tcp -v ./app/models:/app/models sklearn_train:v1
+ # docker run --rm -v ./app/models:/app/models sklearn_train:v1

No port needs to be published, so the -p flag is dropped.
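
Since the training container hands over its result through the bind-mounted directory rather than over the network, a quick check after the run is whether that directory received a file. A minimal sketch, assuming the image tag and mount path from the comment above; the `MODELS_DIR` variable and both helper functions are hypothetical and not part of the PR. An absolute host path is used because, as far as I know, older Docker CLI releases do not accept relative paths in `-v`.

```shell
# Assumption: host directory that receives the trained model.
MODELS_DIR="${MODELS_DIR:-$(pwd)/app/models}"

# Hypothetical helper: run the training image without publishing any port.
run_training() {
    mkdir -p "$MODELS_DIR"
    docker run --rm -v "$MODELS_DIR:/app/models" sklearn_train:v1
}

# Hypothetical helper: succeeds if the given directory contains at least one entry.
models_present() {
    [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Usage:
#   run_training && models_present "$MODELS_DIR" && echo "Model written to $MODELS_DIR"
```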

2 participants