Assigment Dries #7
Open
ddebeer wants to merge 7 commits into vib-tcp:main from ddebeer:main
f3c8b9e changed path in train.py (ddebeer)
79bfd17 dockerfile created (ddebeer)
07dae85 docker command added as comment (ddebeer)
f076523 order corrected and comment addedt (ddebeer)
b111274 final part of assignment (ddebeer)
d771464 path change in server.py (ddebeer)
7c7c797 readme update (ddebeer)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,29 @@ | ||
FROM python:3.9-slim | ||
|
||
|
||
# Set a working directory | ||
WORKDIR /app | ||
|
||
|
||
# Copy the requirements.txt file to the working directory | ||
COPY requirements.txt . | ||
COPY server.py . | ||
FROM python:3.9-slim | ||
CMD ["python", "server.py"] | ||
|
||
|
||
# Install the Python dependencies | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
|
||
# indicate which port should be exposed in the image | ||
EXPOSE 8080 | ||
|
||
|
||
# Copy the training script (server.py) to the working directory | ||
COPY server.py . | ||
|
||
|
||
# run the server script that generates a server | ||
CMD ["python", "server.py"] | ||
|
||
|
||
# Command to build the image: | ||
# docker build . --tag server:v01 -f Dockerfile.infer |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,34 @@ | ||
FROM <base imagae> | ||
FROM python:3.9-slim | ||
|
||
# TODO: Set a working directory | ||
|
||
# TODO: Copy the requirements.txt file to the working directory | ||
# Set a working directory | ||
WORKDIR /app | ||
|
||
# TODO: Install the Python dependencies | ||
|
||
# TODO: Copy the training script (train.py) to the working directory | ||
# Copy the requirements.txt file to the working directory | ||
COPY requirements.txt ./ | ||
|
||
|
||
# Install the Python dependencies | ||
RUN apt update && apt -y upgrade | ||
RUN apt install -y wget | ||
RUN pip install --no-cache-dir -r requirements.txt | ||
|
||
|
||
# Copy the training script (train.py) to the working directory | ||
COPY train.py ./ | ||
|
||
|
||
# Setup an app user so the container doesn't run as the root user | ||
# RUN useradd app | ||
# USER app | ||
|
||
|
||
|
||
# run the training script that generates the model | ||
CMD ["python", "train.py"] | ||
|
||
|
||
# Command to build the image: | ||
# docker build . --tag train:v01 -f Dockerfile.train | ||
|
||
# TODO: Run the training script that generates the model | ||
CMD [...] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# How to use Apptainer on the HPC | ||
|
||
## Step 1 connect to HPC | ||
|
||
1. Open WinSCP and connect to the HPC | ||
|
||
2. Open a PuTTY terminal to communicate with the HPC | ||
|
||
|
||
## Step 2 create a batch file | ||
|
||
1. Create a batch file (`.sh`) | ||
|
||
* start with `#!/bin/bash` | ||
|
||
* Specify job options via `#SBATCH --<job option>=<your choice>` [See documentation](https://docs.hpc.ugent.be/Windows/running_batch_jobs/#defining-and-submitting-your-job) | ||
|
||
* Change the working directory, for instance to `$VSC_DATA` or `$VSC_SCRATCH` | ||
|
||
* Use `module purge` to purge all loaded modules | ||
|
||
* Set the cache directory for Apptainer: `export APPTAINER_CACHEDIR=$VSC_SCRATCH` | ||
|
||
2. Pull container images via `apptainer pull <name_version.sif> <location>` | ||
|
||
|
||
## Step 3 Save/copy batch file to DATA folder | ||
|
||
* copy via WinSCP | ||
* upload via the hpc portal website | ||
* create directly using vi | ||
|
||
|
||
## Step 4 submit batch file as a job | ||
|
||
* submit using `sbatch <batch_file.sh>` | ||
* check running processes with `squeue` | ||
|
||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,44 +1,46 @@ | ||
# Project Docker Microcredential | ||
micro-credential VIB/UGent - Reproducible data analysis | ||
|
||
In this project, you will train, run and serve a machine learning model using Docker. Furthermore, you will store the Docker images on your own account on Docker hub. Using the image of the training step, you will build an Apptainer image on the HPC of UGent. | ||
In this project, you will train, run and serve a machine learning model using Docker. | ||
Furthermore, you will store the Docker images on your own account on Docker hub. | ||
Using the image of the training step, you will build an Apptainer image on the HPC of UGent. | ||
|
||
## Deliverables | ||
|
||
- [ ] Clone this repository to your personal github account | ||
- [ ] Containerize training the machine learning model | ||
- [ ] Containerize serving of the machine learning model | ||
- [ ] Train and run the machine learning model using Docker | ||
- [ ] Run the Docker container serving the machine learning model | ||
- [ ] Store the Docker images on your personal account on Docker Hub | ||
- [ ] Provide the resulting Dockerfiles in GitHub | ||
- [ ] Build an Apptainer image on a HPC of your choice | ||
- [ ] Provide the logs of the slurm job in GitHub | ||
- [ ] Document the steps in a text document in GitHub | ||
- [x] Clone this repository to your personal github account | ||
- [x] Containerize training the machine learning model | ||
- [x] Containerize serving of the machine learning model | ||
- [x] Train and run the machine learning model using Docker | ||
- [x] Run the Docker container serving the machine learning model | ||
- [x] Store the Docker images on your personal account on Docker Hub | ||
- [x] Provide the resulting Dockerfiles in GitHub | ||
- [x] Build an Apptainer image on a HPC of your choice | ||
- [x] Provide the logs of the slurm job in GitHub | ||
- [x] Document the steps in a text document in GitHub | ||
|
||
## Proposed steps - containerize and run training the machine learning model | ||
|
||
Complete file named `Dockerfile.train` | ||
|
||
- Copy requirements.txt and install dependencies | ||
- Copy train.py to the working directory | ||
- Set the command to run train.py | ||
- Run the training of the model on your computer | ||
- Document the command as comment in the Dockerfile | ||
- Store the created Dockerfile in your cloned github repository | ||
+ Copy requirements.txt and install dependencies | ||
+ Copy train.py to the working directory | ||
+ Set the command to run train.py | ||
+ Run the training of the model on your computer | ||
+ Document the command as comment in the Dockerfile | ||
+ Store the created Dockerfile in your cloned github repository | ||
|
||
## Proposed steps - containerize and serve the machine learning model | ||
|
||
- Correct the order of the instructions in the Dockerfile.infer | ||
- Document the steps in the Dockerfile.infer as comments | ||
- Document the succesful `docker run` command in the Dockerfile.infer as a comment | ||
+ Correct the order of the instructions in the Dockerfile.infer | ||
+ Document the steps in the Dockerfile.infer as comments | ||
+ Document the succesful `docker run` command in the Dockerfile.infer as a comment | ||
|
||
## Proposed steps - store images on Dockerhub and build an Apptainer image on the HPC | ||
|
||
- Create an account on Dockerhub | ||
- Store the built images on your account | ||
- Create a shell script on the HPC of your preference | ||
- Store the shell script in your cloned github repository | ||
+ Create an account on Dockerhub | ||
+ Store the built images on your account | ||
+ Create a shell script on the HPC of your preference | ||
+ Store the shell script in your cloned github repository | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
#!/bin/bash | ||
#SBATCH --partition=donphan | ||
#SBATCH --mem=8G | ||
#SBATCH --time=00:30:00 | ||
#SBATCH [email protected] | ||
#SBATCH --ntasks=1 | ||
|
||
|
||
module purge | ||
|
||
export APPTAINER_CACHEDIR=$VSC_SCRATCH | ||
|
||
cd $VSC_DATA | ||
echo Start Job | ||
apptainer pull server_v0.sif docker://ddebeer/server:v01 | ||
echo halfway | ||
apptainer pull train_v0.sif docker://ddebeer/train:v01 | ||
echo end Job |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,7 @@ | |
app = Flask(__name__) | ||
|
||
# Check if the model file exists and wait until it does | ||
model_path = '/app/models/iris_model.pkl' | ||
model_path = '/app/iris_model.pkl' | ||
Review comment: OK, works as well |
||
|
||
while not os.path.exists(model_path): | ||
print(f"Waiting for model file at {model_path}...") | ||
|
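The hunk above shows only the start of the wait loop in server.py; the rest of the loop is not visible in this diff. As a minimal sketch of the same polling idea, and not the repository's code, a bounded variant might look like the following. The helper name `wait_for_file` and the `timeout` and `interval` parameters are hypothetical and were chosen only for illustration.

```python
import os
import time


def wait_for_file(path, timeout=60.0, interval=1.0):
    """Poll until `path` exists; return True if found, False on timeout.

    Hypothetical helper: the name, timeout, and interval are illustrative
    assumptions and are not part of server.py.
    """
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            # Give up instead of looping forever when the model never appears.
            return False
        print(f"Waiting for model file at {path}...")
        time.sleep(interval)
    return True
```

Under these assumptions, the server could call `wait_for_file(model_path)` before loading the model and exit with an error message if it returns `False`, which makes a missing model file visible in the container logs instead of leaving the container waiting indefinitely.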
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
The following modules were not unloaded: | ||
(Use "module --force purge" to unload all): | ||
|
||
1) env/vsc/donphan 3) env/software/donphan | ||
2) env/slurm/donphan 4) cluster/donphan | ||
Start Job | ||
INFO: Converting OCI blobs to SIF format | ||
INFO: Starting build... | ||
Copying blob sha256:05802d3ba2ead9e590dd748b23da547106549ef0fa66bbe6cf14583d1450db04 | ||
Copying blob sha256:8a628cdd7ccc83e90e5a95888fcb0ec24b991141176c515ad101f12d6433eb96 | ||
Copying blob sha256:74018f7cfa8f2965fd86b13c38f71417bc846e071a5f5bb5ae569ccb5a6e7248 | ||
Copying blob sha256:a0b0cfc480ce03c723a597904bcfbf28c71438c689e6d5097c2332835f67a40c | ||
Copying blob sha256:97d21b95fb00ac3b08975ab6f8709f3a7e35a05d75e2f9a70fa95348279dac27 | ||
Copying blob sha256:7c0a46d2d00fd6b3bfbaf17d1a66701c9f045b106b2b77d30308d83b4997e91a | ||
Copying blob sha256:722a684821197aa750a57327cce12518f2aff787af1bf353c4ddae4beae7ad44 | ||
Copying blob sha256:179efd259301554f31db82f1ab6362ef03724ce395462216777ef0636ad6a7c0 | ||
Copying config sha256:9302d5dd202ce9285b6d62d0f0fb5d8e95b7e6a278db8bdd7ba0baf6961cc324 | ||
Writing manifest to image destination | ||
2025/04/24 15:09:22 info unpack layer: sha256:8a628cdd7ccc83e90e5a95888fcb0ec24b991141176c515ad101f12d6433eb96 | ||
2025/04/24 15:09:23 info unpack layer: sha256:74018f7cfa8f2965fd86b13c38f71417bc846e071a5f5bb5ae569ccb5a6e7248 | ||
2025/04/24 15:09:23 info unpack layer: sha256:a0b0cfc480ce03c723a597904bcfbf28c71438c689e6d5097c2332835f67a40c | ||
2025/04/24 15:09:24 info unpack layer: sha256:97d21b95fb00ac3b08975ab6f8709f3a7e35a05d75e2f9a70fa95348279dac27 | ||
2025/04/24 15:09:24 info unpack layer: sha256:7c0a46d2d00fd6b3bfbaf17d1a66701c9f045b106b2b77d30308d83b4997e91a | ||
2025/04/24 15:09:24 info unpack layer: sha256:05802d3ba2ead9e590dd748b23da547106549ef0fa66bbe6cf14583d1450db04 | ||
2025/04/24 15:09:24 info unpack layer: sha256:722a684821197aa750a57327cce12518f2aff787af1bf353c4ddae4beae7ad44 | ||
2025/04/24 15:09:27 info unpack layer: sha256:179efd259301554f31db82f1ab6362ef03724ce395462216777ef0636ad6a7c0 | ||
INFO: Creating SIF file... | ||
halfway | ||
INFO: Converting OCI blobs to SIF format | ||
INFO: Starting build... | ||
Copying blob sha256:05802d3ba2ead9e590dd748b23da547106549ef0fa66bbe6cf14583d1450db04 | ||
Copying blob sha256:8a628cdd7ccc83e90e5a95888fcb0ec24b991141176c515ad101f12d6433eb96 | ||
Copying blob sha256:74018f7cfa8f2965fd86b13c38f71417bc846e071a5f5bb5ae569ccb5a6e7248 | ||
Copying blob sha256:a0b0cfc480ce03c723a597904bcfbf28c71438c689e6d5097c2332835f67a40c | ||
Copying blob sha256:97d21b95fb00ac3b08975ab6f8709f3a7e35a05d75e2f9a70fa95348279dac27 | ||
Copying blob sha256:7c0a46d2d00fd6b3bfbaf17d1a66701c9f045b106b2b77d30308d83b4997e91a | ||
Copying blob sha256:91f69e43c7b2a4191a9e05dd6f01c34b0993d76608c14e955f33397cb915ed5f | ||
Copying blob sha256:d6713be2e29283b75210dbd238dbd8a40037de079961bf7afa4865cd8687ef0e | ||
Copying blob sha256:0d13e422e987d4968e849a9fcaab036164e43e03c0d8c69d4394ad8dc1a01c7b | ||
Copying blob sha256:fc73ac045e6437103a9183eb474e41a67bbc65874896418df1049c6cb1cc9ecb | ||
Copying config sha256:64f10bf38baaa55ff5b37c771a34fe67de171b8c98ea87a7ffb73b2b506adcaf | ||
Writing manifest to image destination | ||
2025/04/24 15:10:25 info unpack layer: sha256:8a628cdd7ccc83e90e5a95888fcb0ec24b991141176c515ad101f12d6433eb96 | ||
2025/04/24 15:10:26 info unpack layer: sha256:74018f7cfa8f2965fd86b13c38f71417bc846e071a5f5bb5ae569ccb5a6e7248 | ||
2025/04/24 15:10:26 info unpack layer: sha256:a0b0cfc480ce03c723a597904bcfbf28c71438c689e6d5097c2332835f67a40c | ||
2025/04/24 15:10:27 info unpack layer: sha256:97d21b95fb00ac3b08975ab6f8709f3a7e35a05d75e2f9a70fa95348279dac27 | ||
2025/04/24 15:10:27 info unpack layer: sha256:7c0a46d2d00fd6b3bfbaf17d1a66701c9f045b106b2b77d30308d83b4997e91a | ||
2025/04/24 15:10:27 info unpack layer: sha256:05802d3ba2ead9e590dd748b23da547106549ef0fa66bbe6cf14583d1450db04 | ||
2025/04/24 15:10:27 info unpack layer: sha256:91f69e43c7b2a4191a9e05dd6f01c34b0993d76608c14e955f33397cb915ed5f | ||
2025/04/24 15:10:28 info unpack layer: sha256:d6713be2e29283b75210dbd238dbd8a40037de079961bf7afa4865cd8687ef0e | ||
2025/04/24 15:10:28 info unpack layer: sha256:0d13e422e987d4968e849a9fcaab036164e43e03c0d8c69d4394ad8dc1a01c7b | ||
2025/04/24 15:10:31 info unpack layer: sha256:fc73ac045e6437103a9183eb474e41a67bbc65874896418df1049c6cb1cc9ecb | ||
INFO: Creating SIF file... | ||
end Job |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,6 @@ | |
model = clf.fit(iris.data, iris.target_names[iris.target]) | ||
|
||
#Save the trained model to the shared volume (make sure to use the correct path) | ||
joblib.dump(model, '/app/models/iris_model.pkl') | ||
joblib.dump(model, '/app/iris_model.pkl') | ||
Review comment: OK, works as well |
||
|
||
print("Model training complete and saved as iris_model.pkl") |
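Writing to `/app/models/iris_model.pkl` only succeeds if the `/app/models` directory exists in the container (for example because a volume is mounted there), whereas `/app` is created by the `WORKDIR /app` instruction. The sketch below shows one way to make a save step independent of that, by creating the parent directory first. It is an illustration under stated assumptions: `save_model` is a hypothetical helper, and the standard-library `pickle` module stands in for the `joblib.dump` call used in train.py so that the example has no third-party dependencies.

```python
import pickle
from pathlib import Path


def save_model(model, path):
    """Create the parent directory if needed, then serialize `model`.

    Hypothetical helper. pickle is a standard-library stand-in for the
    joblib.dump call in train.py, used only to keep this sketch
    dependency-free.
    """
    target = Path(path)
    # Make sure e.g. /app/models exists before writing into it.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as fh:
        pickle.dump(model, fh)
    return target
```

With a helper like this, either path from the diff would work regardless of whether a volume is mounted at the target directory.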
Interesting suggestion, which we could apply in another edition.