Submission Project Docker Microcredential #4
base: main
Containerize training the machine learning model
Containerize serving of the machine learning model
…f Dockerfile.infer comment
…ent to Project_steps.md
Apptainer logs
Thanks for your submission @rabuono
# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models

EXPOSE 8080
Is there a particular reason for defining a port?
I missed that.
No, there is no need to expose that port there, as the next step does not rely on network communication to exchange the necessary data.
Thanks for spotting it.
# TODO: Run the training script that generates the model
CMD [...]
# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models
Could be omitted
👍
It was an overzealous directory creation
# Pull Docker images using Apptainer and save logs
echo "Building training image..."
apptainer build --fakeroot model-train.sif $TRAIN_IMG > $LOG_DIR/train_image_build.log 2>&1
Logs can also be defined via #SBATCH statements (e.g. --output and --error).
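As a rough illustration of that suggestion, the sketch below sets the log destinations with `#SBATCH --output` and `--error` instead of redirecting each command. The `logs/` directory and the file name pattern are assumptions, not taken from the submission; `%x` and `%j` are the standard sbatch placeholders for job name and job ID. The `build_image` function is a hypothetical wrapper added only to keep the example self-contained, and it skips the build when `TRAIN_IMG` is unset or Apptainer is not installed.

```shell
#!/bin/bash
#SBATCH --job-name=apptainer_build
#SBATCH --time=01:00:00
# %x expands to the job name and %j to the job ID.
#SBATCH --output=logs/%x_%j.out
#SBATCH --error=logs/%x_%j.err

# Hypothetical wrapper around the build step from the submission script.
build_image() {
    # stdout/stderr already go to the files named above, so no redirection here.
    echo "Building training image..."
    if [ -n "${TRAIN_IMG:-}" ] && command -v apptainer >/dev/null 2>&1; then
        apptainer build --fakeroot model-train.sif "$TRAIN_IMG"
    else
        echo "TRAIN_IMG unset or apptainer missing; skipping build" >&2
    fi
}

build_image
```

The `logs/` directory would need to exist before the job is submitted, since Slurm is not expected to create it.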
Thanks!
Indeed.
The attempt was to combine them in more specifically named log files for future consumption.
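One way to keep specifically named log files without repeating the redirection on every line is a small helper. This is only a sketch: `run_logged` is a hypothetical function name, the default `./logs` location is an assumption, and the `echo` call stands in for the real `apptainer build` command.

```shell
#!/bin/bash
# Assumed default; the submission script defines its own LOG_DIR.
LOG_DIR="${LOG_DIR:-./logs}"
mkdir -p "$LOG_DIR"

# Run a command and capture stdout and stderr in "$LOG_DIR/<name>.log".
run_logged() {
    local name="$1"
    shift
    "$@" > "$LOG_DIR/${name}.log" 2>&1
}

# Stand-in for: apptainer build --fakeroot model-train.sif "$TRAIN_IMG"
run_logged train_image_build echo "stand-in for the apptainer build command"
```

Quoting `"$LOG_DIR"` inside the helper also keeps the redirection working if the path contains spaces.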
@rabuono what is the difference between the shell scripts for the two HPC systems that produces the xattr warning?
(base) albot@Alexanders-MacBook-Pro project_docker_microcredential % diff image_build.sh image_build_ugent.sh
2,3c2,4
< #SBATCH --job-name=apptainer_build
< #SBATCH --partition=debug_28C_56T_750GB
---
>
> #SBATCH --job-name=job_submission
> #SBATCH --partition=donphan
5c6
< #SBATCH --time=01:00:00
---
> #SBATCH --time=00:30:00
Hey!
The difference is basically in the sbatch statements.
While I have not dug deeper into it, I suspect that the xattr warnings come from differences in deployment of the two HPC systems.
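To confirm that only the scheduler directives differ, the comparison can be restricted to the `#SBATCH` lines. The sketch below is illustrative: `sbatch_diff` is a hypothetical helper, and the two temporary files contain only the directive lines visible in the diff above, not the full scripts.

```shell
#!/bin/bash
# Compare only the #SBATCH directive lines of two job scripts.
sbatch_diff() {
    local a b status
    a=$(mktemp)
    b=$(mktemp)
    grep '^#SBATCH' "$1" > "$a"
    grep '^#SBATCH' "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Stand-in files holding only the lines shown in the review diff.
tmpdir=$(mktemp -d)
printf '%s\n' '#!/bin/bash' \
    '#SBATCH --job-name=apptainer_build' \
    '#SBATCH --partition=debug_28C_56T_750GB' \
    '#SBATCH --time=01:00:00' > "$tmpdir/image_build.sh"
printf '%s\n' '#!/bin/bash' '' \
    '#SBATCH --job-name=job_submission' \
    '#SBATCH --partition=donphan' \
    '#SBATCH --time=00:30:00' > "$tmpdir/image_build_ugent.sh"

result=$(sbatch_diff "$tmpdir/image_build.sh" "$tmpdir/image_build_ugent.sh")
echo "$result"
rm -rf "$tmpdir"
```

If this comparison accounts for every difference, the xattr warning is more likely to come from the cluster environment than from the script body.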
# Create the models directory with higher permissions
RUN mkdir -p /app/models && chmod 777 /app/models

EXPOSE 8080
EXPOSE 8080
# docker build -t sklearn_train:v1 -f Dockerfile.train .

# run command:
# docker run --rm -p 8080:8080/tcp -v ./app/models:/app/models sklearn_train:v1
# docker run --rm -p 8080:8080/tcp -v ./app/models:/app/models sklearn_train:v1
# docker run --rm -v ./app/models:/app/models sklearn_train:v1
No port needs to be published, so the -p flag can be dropped.
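A minimal sketch of the port-free run command follows. It only prints the command rather than executing it, so it works without Docker installed; `train_cmd` is a hypothetical helper name. It also swaps the relative `./app/models` path for an absolute one, on the assumption that some older Docker releases do not accept relative bind-mount paths.

```shell
#!/bin/bash
# Image tag taken from the build comment in Dockerfile.train.
IMAGE="sklearn_train:v1"
# Absolute host path for the bind mount (assumption: run from the project root).
HOST_MODELS="$(pwd)/app/models"

# Print the training command. There is no -p flag because the model is handed
# over through the bind-mounted directory, not over the network.
train_cmd() {
    echo docker run --rm -v "${HOST_MODELS}:/app/models" "$IMAGE"
}

train_cmd
```

Running the printed command should leave the trained model in `app/models` on the host, which is what the serving step then picks up.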
Submission of project.
Updates made according to Deliverables in README.md.
The HPC portion was run on both the VIB Data Core Cluster and HPC Ugent.