Merged
Changes from 41 commits
Commits
46 commits
74324fe
update tabular.py
zaRizk7 Sep 11, 2025
38e6429
update ehr.py
zaRizk7 Sep 11, 2025
d7350e7
reorder param validator for ehr_path in ehr.py
zaRizk7 Sep 11, 2025
97e7a14
remove comment in demo script for tabular.py
zaRizk7 Sep 11, 2025
1ca9143
update cxr.py
zaRizk7 Sep 11, 2025
b76932e
remove omics modules from hackathon
zaRizk7 Sep 11, 2025
6b25e1d
update cxr.py
zaRizk7 Sep 11, 2025
700249b
update ecg.py
zaRizk7 Sep 11, 2025
fb7c8f4
update echo.py
zaRizk7 Sep 11, 2025
4c036c3
update ehr.py
zaRizk7 Sep 11, 2025
0380419
update molecule.py
zaRizk7 Sep 11, 2025
6c356ff
update protein.py
zaRizk7 Sep 11, 2025
bbfe351
update supervised_labels.py
zaRizk7 Sep 11, 2025
b4bf037
update tabular.py
zaRizk7 Sep 11, 2025
9bbd776
update text.py
zaRizk7 Sep 11, 2025
7e761d9
update ehr.py
zaRizk7 Sep 11, 2025
07970c0
update text.py
zaRizk7 Sep 11, 2025
8184c64
update tabular.py
zaRizk7 Sep 11, 2025
fdf338f
update dataset.py
zaRizk7 Sep 11, 2025
799b4a4
remove utils.py
zaRizk7 Sep 11, 2025
63fe37e
include __all__ for all modules
zaRizk7 Sep 11, 2025
db5cf20
include dropbox api to dependencies
zaRizk7 Sep 11, 2025
c6aa21e
include mit license
zaRizk7 Sep 11, 2025
2205919
include script to download the dataset from dropbox in dropbox_downlo…
zaRizk7 Sep 11, 2025
ea838c8
update dataset.py
zaRizk7 Sep 12, 2025
ace563d
update for load_data module
zaRizk7 Sep 12, 2025
e8faaf2
add tests/dropbox_download.py
zaRizk7 Sep 12, 2025
9c9dda2
include test cases for dataset and load_data module
zaRizk7 Sep 12, 2025
ad78ad7
include noqa to allow skipping test if the library is not there
zaRizk7 Sep 12, 2025
ff60d06
include MIT license for build
zaRizk7 Sep 12, 2025
8b500ff
update tabular.py and test_dataset.py
zaRizk7 Sep 12, 2025
98cdcc9
run pre-commit on tabular.py
zaRizk7 Sep 12, 2025
cfeb795
update pyproject.toml
zaRizk7 Sep 12, 2025
4b58859
minimize test.yml footprint
zaRizk7 Sep 12, 2025
150b554
minimize pre-commit-config
zaRizk7 Sep 12, 2025
ba08515
reorganize readme.md
zaRizk7 Sep 12, 2025
c422248
Merge branch 'main' into standardize-api
zaRizk7 Sep 12, 2025
afaef7a
test.yml update
zaRizk7 Sep 12, 2025
db5c454
update api access for dropbox data download
zaRizk7 Sep 12, 2025
fda48e3
update conditionals for dropbox data download
zaRizk7 Sep 12, 2025
1840157
enable continue on error for dropbox download
zaRizk7 Sep 12, 2025
7451568
Update mmai25_hackathon/load_data/supervised_labels.py
zaRizk7 Sep 12, 2025
f865f1f
update test_supervised_labels.py
zaRizk7 Sep 12, 2025
48ebd79
update cxr.py
zaRizk7 Sep 12, 2025
30a5f30
update text.py
zaRizk7 Sep 12, 2025
fdca46e
include filter_rows for all read_tabular dependent functions
zaRizk7 Sep 12, 2025
84 changes: 41 additions & 43 deletions .github/workflows/test.yml
@@ -14,64 +14,62 @@ on:

jobs:
test:
name: Test (${{ matrix.os }}, python version ${{ matrix.python-version }})
runs-on: ${{ matrix.os }}
env:
DROPBOX_APP_KEY: ${{ secrets.DROPBOX_APP_KEY }}
DROPBOX_APP_SECRET: ${{ secrets.DROPBOX_APP_SECRET }}
DROPBOX_REFRESH_TOKEN: ${{ secrets.DROPBOX_REFRESH_TOKEN }}
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
strategy:
matrix:
# os: [ubuntu-latest, windows-latest]
os: [ubuntu-latest]
python-version: ["3.10", "3.11", "3.12"] # list of Python versions to test
include:
- os: ubuntu-latest
path: ~/.cache/pip
# - os: windows-latest
# path: ~\AppData\Local\pip\Cache

python-version: ["3.10", "3.11", "3.12"]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Python using Miniconda
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
python-version: ${{ matrix.python-version }}
miniconda-version: latest

- name: Cache pip dependencies
id: cache_pip
uses: actions/cache@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
path: ${{ matrix.path }}
key: ${{ runner.os }}-python${{ matrix.python-version }}-pip-20250908-${{ hashFiles('**/pyproject.toml', '**/requirements*.txt') }}
restore-keys: |
${{ runner.os }}-python${{ matrix.python-version }}-pip-20250908-
python-version: ${{ matrix.python-version }}
cache: "pip"
cache-dependency-path: |
pyproject.toml
requirements*.txt

- name: Install from pyproject (single list)
- name: Install dependencies
env:
PIP_EXTRA_INDEX_URL: https://download.pytorch.org/whl/cpu
run: |
python -m pip install --upgrade pip
python -m pip install -e .
python -m pip install -e .[dev]
# PyG compiled extensions (need the wheel index)
python -m pip install \
-f https://data.pyg.org/whl/torch-2.6.0+cpu.html \
torch_scatter==2.1.2 torch_sparse==0.6.18 torch_cluster==1.6.3 torch_spline_conv==1.2.2
shell: bash -l {0}

# - name: Run tests with thread limits
# id: run_tests
# run: |
# export OMP_NUM_THREADS=1
# export MKL_NUM_THREADS=1
# export NUMEXPR_NUM_THREADS=1
# pytest --cov=mmai25_hackathon
# shell: bash -l {0}
- name: Download datasets from Dropbox (optional)
if: ${{ env.DROPBOX_REFRESH_TOKEN != '' }}
continue-on-error: true
run: |
python -m tests.dropbox_download \
"/MMAI25Hackathon" \
"MMAI25Hackathon" \
--app-key "$DROPBOX_APP_KEY" \
--app-secret "$DROPBOX_APP_SECRET" \
--refresh-token "$DROPBOX_REFRESH_TOKEN" \
--unzip

- name: Run tests
env:
OMP_NUM_THREADS: "1"
MKL_NUM_THREADS: "1"
NUMEXPR_NUM_THREADS: "1"
run: pytest --cov=mmai25_hackathon

# - name: Determine coverage
# run: |
# coverage xml
# shell: bash -l {0}
- name: Generate coverage XML
run: coverage xml

# - name: Report coverage
# uses: codecov/codecov-action@v4
# with:
# token: ${{ secrets.CODECOV_TOKEN }}
- name: Upload coverage to Codecov (optional)
if: ${{ env.CODECOV_TOKEN != '' }}
uses: codecov/codecov-action@v4
with:
token: ${{ env.CODECOV_TOKEN }}
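
The optional download step above passes two positional paths and four flags to `tests.dropbox_download`, and only runs when `DROPBOX_REFRESH_TOKEN` is non-empty. The script itself is not shown in this part of the diff, so the following is only a minimal sketch of a command-line interface that would accept that invocation. The argument names `remote_path` and `local_path` and the helpers `build_parser` and `should_download` are assumptions for illustration, and the sketch makes no Dropbox API calls.

```python
import argparse
from typing import Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that accepts the invocation used in the workflow."""
    parser = argparse.ArgumentParser(prog="dropbox_download")
    # Positional names are assumed; the workflow only shows two path-like values.
    parser.add_argument("remote_path", help="Dropbox folder, e.g. /MMAI25Hackathon")
    parser.add_argument("local_path", help="Local destination directory")
    # Flags copied from the workflow step.
    parser.add_argument("--app-key", default=None)
    parser.add_argument("--app-secret", default=None)
    parser.add_argument("--refresh-token", default=None)
    parser.add_argument("--unzip", action="store_true")
    return parser


def parse(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)


def should_download(env: Mapping[str, str]) -> bool:
    """Mirror the workflow condition: run only if the refresh token is set."""
    return env.get("DROPBOX_REFRESH_TOKEN", "") != ""
```

Keeping the gate in one small function makes it easy to reuse the same condition locally, for example `should_download(os.environ)` before attempting a download.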
49 changes: 15 additions & 34 deletions .pre-commit-config.yaml
@@ -2,48 +2,29 @@ repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v6.0.0
hooks:
- id: check-added-large-files
args: ["--maxkb=300"]
- id: fix-byte-order-marker
- id: check-case-conflict
- id: check-merge-conflict
- id: check-yaml
- id: end-of-file-fixer
- id: forbid-new-submodules
- id: mixed-line-ending
- id: trailing-whitespace
- id: debug-statements
- id: check-yaml
- id: requirements-txt-fixer
- repo: https://github.com/pycqa/flake8.git
rev: 6.1.0
hooks:
- id: flake8
args: [ --config=setup.cfg ]

- repo: https://github.com/psf/black
rev: 23.11.0
rev: 25.1.0
hooks:
- id: black
language_version: python3
additional_dependencies: [ 'click==8.0.4' ]
args: ["--line-length=120"]

- repo: https://github.com/pycqa/isort
rev: 5.11.2
rev: 6.0.1
hooks:
- id: isort
name: isort
entry: python -m isort
args: [ --settings-path, ./pyproject.toml ]
language: system
types: [ python ]
# - repo: https://github.com/astral-sh/ruff-pre-commit
# # Ruff version.
# rev: v0.12.11
# hooks:
# # Run the linter.
# - id: ruff-check
# args: [--fix]
# # Run the formatter.
# - id: ruff-format
args: ["--profile=black", "--line-length=120"]

- repo: https://github.com/PyCQA/flake8
rev: 7.3.0
hooks:
- id: flake8
additional_dependencies: []
args: ["--max-line-length=120"]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.7.1
rev: v1.18.1
hooks:
- id: mypy
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 mmai-hackathon

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
149 changes: 54 additions & 95 deletions README.md
@@ -16,127 +16,86 @@ This repository provides the base source code for the MultimodalAI'25 workshop H

## Installation

### Prerequisite
The steps below are linear and work with `venv`, `conda`, or `uv`. Pick one method and follow it end‑to‑end.

Before installing other dependencies, install pykale with all optional dependencies (full extras) from git:
### 1) Clone and create an environment

```bash
pip install "git+https://github.com/pykale/pykale@main[full]"
```

You can set up your development environment using one of the following methods: `venv`, `conda`, or `uv`.
git clone https://github.com/pykale/mmai-hackathon.git
cd mmai-hackathon

### Main Installation Steps
# conda (recommended)
conda create -n mmai-hackathon python=3.11 -y
conda activate mmai-hackathon

1. **Clone the repository:**

```bash
git clone https://github.com/pykale/mmai-hackathon.git
cd mmai-hackathon
```
# venv (alternative)
# python3 -m venv .venv && source .venv/bin/activate

2. **Set up a virtual environment (recommended):**
# uv (alternative)
# uv venv .venv && source .venv/bin/activate
```

```bash
python3 -m venv .venv
source .venv/bin/activate
```
### 2) Install dependencies (with tests)

3. **Install dependencies:**
```bash

```bash
pip install --upgrade pip
# Install pykale with all optional dependencies (full extras) from git first
pip install "git+https://github.com/pykale/pykale@main[full]"
pip install -e .
```
# Recommended for development and testing (includes pytest, coverage, linters)
pip install -e ".[dev]"

#### Installing torch-geometric (pyg) and its extensions
# If you only need runtime dependencies (not recommended for contributors):
# pip install -e .
```

To install torch-geometric (`pyg`) and its required extensions (such as `torch-scatter`, `torch-sparse`, etc.), use the following command with the appropriate URL for your PyTorch and CUDA version:
If you use features that depend on PyG (graph loaders, SMILES), install torch‑geometric wheels that match your Torch/CUDA.
The snippet below detects your installed Torch and CUDA, constructs the correct find‑links URL, and installs the wheels:

```bash
pip install torch-geometric torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://data.pyg.org/whl/torch-2.6.0+cpu.html
# Inspect Torch / CUDA (optional)
python - <<'PYINFO'
import torch
print('Torch:', torch.__version__)
print('CUDA version:', torch.version.cuda)
print('CUDA available:', torch.cuda.is_available())
PYINFO

# Install PyG wheels matching your Torch/CUDA
PYG_INDEX=$(python - <<'PYG'
import torch
torch_ver = torch.__version__.split('+')[0]
cuda = torch.version.cuda
if cuda:
    cu_tag = f"cu{cuda.replace('.', '')}"
else:
    cu_tag = 'cpu'
print(f"https://data.pyg.org/whl/torch-{torch_ver}+{cu_tag}.html")
PYG
)
echo "Using PyG wheel index: $PYG_INDEX"
pip install torch-geometric torch-scatter torch-sparse torch-cluster torch-spline-conv -f "$PYG_INDEX"
```
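
The URL construction in the shell snippet above can also be written as a small pure function, which is easier to check without Torch installed. This is a sketch only: the function name `pyg_wheel_index` is an assumption, and it simply reproduces the string formatting used in the README snippet. It does not verify that a wheel index actually exists for a given Torch/CUDA pair.

```python
from typing import Optional


def pyg_wheel_index(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the PyG find-links URL from a Torch version and CUDA version.

    torch_version: value of torch.__version__, e.g. "2.6.0+cpu"
    cuda_version:  value of torch.version.cuda, e.g. "12.4", or None for CPU builds
    """
    # Drop any local build suffix such as "+cpu" or "+cu124".
    base = torch_version.split("+")[0]
    # "12.4" becomes "cu124"; CPU-only builds use the "cpu" tag.
    tag = f"cu{cuda_version.replace('.', '')}" if cuda_version else "cpu"
    return f"https://data.pyg.org/whl/torch-{base}+{tag}.html"


if __name__ == "__main__":
    print(pyg_wheel_index("2.6.0+cpu", None))
    # prints https://data.pyg.org/whl/torch-2.6.0+cpu.html
```

With Torch installed, the same function could be called as `pyg_wheel_index(torch.__version__, torch.version.cuda)`.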

Replace the URL with the one matching your PyTorch and CUDA version. For more details and the latest URLs, see the official torch-geometric installation guide: https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html

---

You can also use the following environment-specific guides:

### Using conda (Anaconda/Miniconda)

1. **Create and activate a conda environment:**

```bash
conda create -n mmai-hackathon python=3.10
conda activate mmai-hackathon
```

2. **Install dependencies:**

```bash
pip install -e .
```

### Using uv (Ultra-fast Python package manager)

Assuming `uv` is already installed:

1. **Create and activate a uv virtual environment:**
More details: https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html

```bash
uv venv .venv
source .venv/bin/activate
```
### 3) (Optional) Pre‑commit hooks

2. **Install dependencies:**

```bash
uv pip install -e .
```

---

1. **Clone the repository:**

```bash
git clone https://github.com/pykale/mmai-hackathon.git
cd mmai-hackathon
```

2. **Set up a virtual environment (recommended):**

```bash
python3 -m venv .venv
source .venv/bin/activate
```

3. **Install dependencies:**

```bash
pip install --upgrade pip
pip install -e .
```

4. **(Optional) Install pre-commit hooks:**

```bash
pre-commit install
```
```bash
pre-commit install
```

5. **Run tests:**
### 4) Run tests

```bash
pytest
```
```bash
pytest
```

## Notes

- The project restricts Python versions to 3.10–3.12 as specified in `.python-version` and `pyproject.toml`.
- For more information about the dependencies, see `pyproject.toml`.
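
As a quick way to confirm that an environment falls inside the supported range noted above, a check along the following lines could be run before installing. It is a minimal sketch: the names `SUPPORTED_MIN`, `SUPPORTED_MAX`, and `is_supported` are assumptions, and the authoritative constraint remains the one in `pyproject.toml`.

```python
import sys
from typing import Optional, Sequence

# Range stated in the Notes section (3.10 to 3.12 inclusive).
SUPPORTED_MIN = (3, 10)
SUPPORTED_MAX = (3, 12)


def is_supported(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if the (major, minor) version lies in the supported range."""
    info = sys.version_info if version is None else version
    major_minor = (info[0], info[1])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    status = "supported" if is_supported() else "not supported"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```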

Tip: Integration tests optionally use real data. In CI, datasets are downloaded with `python -m tests.dropbox_download "/MMAI25Hackathon" "MMAI25Hackathon" --unzip` when a Dropbox token is configured.
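
One way to make tests that need real data optional is to skip them when the dataset directory is missing or empty. The sketch below uses only the standard library `unittest` module; the environment variable `MMAI25_DATA_DIR`, the helper `data_available`, and the test class are hypothetical and are not taken from the repository's test suite.

```python
import os
import unittest

# Hypothetical override; defaults to the local folder name used in the CI command.
DATA_DIR = os.environ.get("MMAI25_DATA_DIR", "MMAI25Hackathon")


def data_available(path: str = DATA_DIR) -> bool:
    """Return True if path is an existing directory with at least one entry."""
    if not os.path.isdir(path):
        return False
    with os.scandir(path) as entries:
        return any(entries)


@unittest.skipUnless(data_available(), "dataset not downloaded; skipping integration test")
class TestWithRealData(unittest.TestCase):
    def test_dataset_dir_is_populated(self) -> None:
        # Runs only when the dataset folder exists and is non-empty.
        self.assertTrue(os.listdir(DATA_DIR))
```

Since `pytest` collects `unittest.TestCase` classes, a skip of this kind would be reported as skipped rather than failed when the data is absent.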

## Authors

- Shuo Zhou (<[email protected]>)