3 changes: 3 additions & 0 deletions .gitignore
@@ -3,6 +3,9 @@
.prod.env
.test.env

# Certificates
certs/

# IDE
.idea/

11 changes: 11 additions & 0 deletions Dockerfile
@@ -1,5 +1,16 @@
FROM python:3.13.7-trixie

# Install CA infrastructure
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*

# Add AIoD cert and build custom bundle
RUN mkdir -p /certs
COPY certs/aiod-insight-centre.crt /certs/aiod-insight-centre.crt
RUN cat /etc/ssl/certs/ca-certificates.crt /certs/aiod-insight-centre.crt > /certs/custom-ca-bundle.crt

# Make Python requests use the custom CA bundle
ENV REQUESTS_CA_BUNDLE=/certs/custom-ca-bundle.crt

RUN useradd -m appuser
USER appuser
WORKDIR /home/appuser
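The bundle-building step in the Dockerfile above simply concatenates the system CA bundle with the extra certificate. As an illustration only (not part of the image build, and using throwaway temp files in place of the real certificates), the same operation can be sketched in Python:

```python
from pathlib import Path
import tempfile

def build_custom_bundle(system_bundle: Path, extra_cert: Path, out: Path) -> None:
    # Mirrors `cat /etc/ssl/certs/ca-certificates.crt /certs/....crt > bundle.crt`:
    # PEM bundles are just certificates appended back to back.
    out.write_bytes(system_bundle.read_bytes() + extra_cert.read_bytes())

# Throwaway files standing in for the real system bundle and AIoD certificate
tmp = Path(tempfile.mkdtemp())
(tmp / "system.crt").write_text("-----BEGIN CERTIFICATE-----\nsystem\n-----END CERTIFICATE-----\n")
(tmp / "extra.crt").write_text("-----BEGIN CERTIFICATE-----\nextra\n-----END CERTIFICATE-----\n")
build_custom_bundle(tmp / "system.crt", tmp / "extra.crt", tmp / "bundle.crt")
```

Pointing `REQUESTS_CA_BUNDLE` at the resulting file makes the `requests` library trust both the system CAs and the extra certificate, without modifying anything under `/etc/ssl`.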
14 changes: 7 additions & 7 deletions README.md
@@ -1,5 +1,5 @@
# AI-on-Demand Hugging Face connector
Collects dataset metadata from [Hugging Face](https://huggingface.co) and uploads it to AI-on-Demand.
# AI-on-Demand OpenML connector
Collects dataset metadata from [OpenML](https://www.openml.org) and uploads it to AI-on-Demand.

This package is not intended to be used directly by others, but may serve as an example of how to build a connector for the AI-on-Demand platform.
For more information on how to test this connector locally as a springboard for developing your own connector, reference the [Development](#Development) section below.
@@ -27,7 +27,7 @@ the default configuration can be found in the [`/script/config.prod.toml`](/scri
You will also need to have the 'Client Secret' for the client, which can be obtained from the keycloak administrator.
The client secret must be provided to the Docker container as an environment variable or in a dotenv file *similar to* [`script/.local.env`](script/.local.env) but named `script/.prod.env`.

Please contact the Keycloak service maintainer to obtain said credentials you need if you are in charge of deploying this Hugging Face connector.
If you are in charge of deploying this OpenML connector, please contact the Keycloak service maintainer to obtain these credentials.

## Running the Connector
You will need to mount the aiondemand configuration to `/home/appuser/.aiod/config.toml` and provide environment variables directly with `-e` or by mounting the dotenv file at `/home/appuser/.aiod/openml/.env`. The [`script/run.sh`](script/run.sh) script is a convenience wrapper that does this automatically.
@@ -36,17 +36,17 @@ Any following arguments are interpreted as arguments to the main script.
For the latest commandline arguments, use `docker run aiondemand/openml-connector --help`.
Some example invocations that use the `script/run.sh` script:

- `script/run.sh local --mode id --value 61 --app-log-level debug` syncs one specific openml dataset, and produces debug logs for the connector only.
- `script/run.sh test --mode since --value 100 --root-log-level debug` syncs all datasets with identifier 100 or greater (in ascending order).
- `script/run.sh local --mode id --value 61 --app-log-level debug` syncs one specific OpenML dataset, and produces debug logs for the connector only.
- `script/run.sh test --mode since --value 100 --root-log-level debug` syncs all datasets with identifier `>= 100` (in ascending identifier order).
- `script/run.sh prod --mode all --root-log-level info` indexes all datasets on OpenML, producing info logs for the connector and all its dependencies (this is the default).
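The `--mode`/`--value` pairing in the invocations above can be sketched with `argparse`. This is a hypothetical reconstruction of the connector's command line for illustration, not its actual implementation:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical sketch: --mode selects the sync strategy, --value supplies
    # the dataset id (mode=id) or the lower bound on ids (mode=since).
    parser = argparse.ArgumentParser(prog="openml-connector")
    parser.add_argument("--mode", choices=["id", "since", "all"], default="all")
    parser.add_argument("--value", type=int,
                        help="dataset id (mode=id) or lower bound (mode=since)")
    parser.add_argument("--app-log-level", default="info")
    parser.add_argument("--root-log-level", default="info")
    return parser

# Corresponds to: script/run.sh test --mode since --value 100
args = build_parser().parse_args(["--mode", "since", "--value", "100"])
```

Note that `argparse` exposes `--app-log-level` as `args.app_log_level`, so the two log-level flags can be routed to the connector's own logger and the root logger respectively.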

## Development
You can test the connector when running the [metadata catalogue](https://github.com/aiondemand/aiod-rest-api) locally.
The default configurations for this setup can be found in the [`.local.env`](script/.local.env) and [`config.local.toml`](script/config.local.toml) files.

When connecting to the AI-on-Demand test or production server, you will need a dedicated client registered in the Keycloak instance connected to the REST API you want to upload data to.
See [this form]() to apply for a client. The client will need to have a `platform_X` role attached, where `X` is the name of the platform from which you register assets.
See [this form]() to apply for a client. The client will need to have a `platform_X` role attached, where `X` is the name of the platform from which you register assets. For this connector that will typically be `platform_openml`.
When a client is created, you will need its 'Client ID' and 'Client Secret' and update the relevant configuration and environment files accordingly.
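With the 'Client ID' and 'Client Secret' in hand, the client authenticates via the standard OAuth2 client-credentials grant against Keycloak. The sketch below only builds the token request (nothing is sent), and the realm name `aiod` is an assumption for illustration; check the actual auth server configuration:

```python
def token_request(auth_server: str, realm: str,
                  client_id: str, client_secret: str) -> tuple[str, dict]:
    # Standard Keycloak token endpoint layout; the realm name is an assumption.
    url = f"{auth_server.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    data = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    return url, data

url, data = token_request(
    "https://ai4europe.test.fedcloud.eu/aiod-auth/",
    "aiod",              # hypothetical realm name
    "platform_openml",
    "SECRET",            # placeholder for the real client secret
)
```

POSTing `data` as a form body to `url` would return a JSON payload containing an `access_token` to use as a Bearer token against the REST API.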

## Disclaimer
This project is not affiliated with OpenML in any way.
This project is not affiliated with OpenML in any way.
4 changes: 2 additions & 2 deletions script/config.test.toml
@@ -1,3 +1,3 @@
api_server = 'https://test.openml.org/aiod/'
api_server = 'https://aiod.insight-centre.org/'
auth_server = 'https://ai4europe.test.fedcloud.eu/aiod-auth/'
client_id = 'platform_openml'
client_id = 'platform_openml'