Merged PR 32883: Pymarian improvements
List of changes, updates, and fixes to pymarian:

* Renamed model IDs to match Hugging Face (e.g., comet22-da -> wmt22-comet-da)
* Renamed the CLI to make it shorter: pymarian-evaluate -> pymarian-eval
* Renamed pymarian.evaluate.py -> pymarian.eval.py to reflect the CLI name
* Moved the functional code from pymarian.eval.py into an `Evaluator` class (goal: allow reuse of an `Evaluator` object for scoring many small files, e.g., the WMT metrics task)
* Use mmap-ed `*.bin` models instead of `*.npz`
* Download `*.bin` and `*.spm` files individually instead of a `.tgz`. Future plan is to support quantized / GEMM models. Downloading a `.tgz` works, but it gets too expensive since we don't need all variants of a model (`.npz`, `.bin`, fp32, fp16, avx512, ...)
* Use a file-locking mechanism (based on `portalocker`) to avoid race conditions between parallel download processes (see the first sketch below)
* Added optional `-v/--vocab` argument to pymarian-eval
* Added `--fields|-f` argument: supports `src mt ref` or a subsequence of it; raises an error when required fields are missing and ignores extra fields
* pymarian build improvements: strict check that the Python version matches between the package and the native extension; also removes the custom logic for extension detection and uses `EXT_SUFFIX` from `sysconfig` instead (see the second sketch below)
* Added `--like` argument for local models
* Ran black and isort to fix code formatting issues
* pypdl for parallel downloads
* Added regression tests for pymarian

Other scripts:
* Added `convert-all-models.sh`: converts PyTorch models to Marian `.npz`, converts `.npz` to `.bin`, and creates a directory structure compatible with pymarian-eval
* Added `compare.sh` to compare metric scores between the original implementations and pymarian
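The commit only states that parallel downloads are guarded with `portalocker`; the snippet below is a minimal sketch of that general pattern, not pymarian's actual download code. The cache path handling and the `fetch` helper are hypothetical.

```python
# Minimal sketch of lock-guarded downloading with portalocker.
# NOTE: illustration of the pattern only, not pymarian's code;
# the lock file naming and the fetch() helper are hypothetical.
from pathlib import Path

import portalocker  # pip install portalocker


def download_once(url: str, dest: Path, fetch) -> Path:
    dest.parent.mkdir(parents=True, exist_ok=True)
    lock_file = dest.with_suffix(dest.suffix + ".lock")
    # Only one process at a time enters this block; others block on the lock.
    with portalocker.Lock(str(lock_file), timeout=600):
        if not dest.exists():          # another process may have finished already
            tmp = dest.with_suffix(dest.suffix + ".tmp")
            fetch(url, tmp)            # hypothetical download helper
            tmp.rename(dest)           # rename marks the download as complete
    return dest
```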
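For the build change, the commit says extension detection now relies on `EXT_SUFFIX` from `sysconfig` rather than custom logic. A tiny illustrative check of that idea (not the actual build script; the `_pymarian` module name is an assumption):

```python
# Illustrative only: EXT_SUFFIX identifies the native extension built for the
# *current* interpreter, which is what ties the wheel to a Python version.
import sysconfig

ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
# e.g. '.cpython-310-x86_64-linux-gnu.so' on CPython 3.10 / Linux
print("expected native extension suffix:", ext_suffix)

# A wheel built for a different Python version carries a different suffix,
# so checking for the module name + ext_suffix enforces the version match.
expected_name = "_pymarian" + ext_suffix  # module name is an assumption
print("expected extension file name:", expected_name)
```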
Thamme Gowda committed on Feb 15, 2024 · 1 parent 22ed792 · commit 9e40ac3
Showing 40 changed files with 1,514 additions and 923 deletions.
@@ -0,0 +1,7 @@
+/regression-tests
+/build*
+/.pytest_cache
+/.vscode
+/dist
+/doc
+.history*
@@ -1,2 +1,4 @@
-bins/
-tmp.*
+/bins
+tmp.*
+/workspace
+/marian-metric
@@ -1,36 +1,41 @@
-# Marian Evaluate
+# Marian Metrics

The main script is `compare.sh`; however, it needs to be run in an environment where all three -- marian, unbabel-comet (pytorch), and bleurt (tensorflow) -- are available.
-Hence, 1) we create a docker container with all the necessary libs,
-and 2) run compare.sh inside the docker environment.
+Hence we create a new python environment using conda to run the comparisons.

-## Setup: build docker image
+## Setup

```bash
./setup.sh
-./run.sh
```
+This sets up a conda environment named `metrics` with all the necessary requirements, except pymarian-eval, which you will have to install based on your CMAKE settings:
+```bash
+# from the root dir of this repository
+conda activate metrics
+mkdir build; cd build
+cmake .. -DPYMARIAN=on  #.. other flags
+pip install pymarian-*.whl
+```

-## Run compare.sh in docker container
+## Run Compare.sh

```bash
-./docker-run.sh
+# option 1:
+./run.sh
+
+# option 2:
+conda activate metrics
+bash compare.sh
```
-The `docker-run.sh` script mounts the cache directory from the host into the container.
-The necessary files (weights and vocabularies) will be automatically downloaded and cached for the unbabel-comet and Bleurt metrics.
-However, `marian-score.sh` expects the cache to be prepared under `$HOME/.cache/marian/metrics`.
-The structure/format of the cache directory for marian-score.sh looks as follows:
+
+This script produces reports at `workspace/*.report.txt`, which show the average difference in segment-level scores between the original implementations and `pymarian-eval`.
+
+## Convert Metrics Weights to Marian format

```bash
-/home/$USER/.cache/marian/metrics/
-├── bleurt20-ref
-│   ├── bleurt-20.model.npz
-│   ├── bleurt.vocab.spm
-├── comet20-da-src
-│   ├── comet20-qe-da.model.npz
-│   └── roberta.vocab.spm
-└── comet20-da-src+ref
-    ├── comet20-da.model.npz
-    └── roberta.vocab.spm
+conda activate metrics
+MARIAN=../build/marian ./convert-all-models.sh
```
-Each metric subdir should have a `*model.npz` and a `*vocab.spm` file, and the name of the metric directory should end with a `-src|-qe|-ref|-src+ref` suffix to indicate the category of the metric.

+> TODO: Upload Marian-compatible comet and bleurt models to public blob storage and modify the script to download them automatically
+
+To add a new model ID, edit the `known-models.txt` file in the same directory as this README.
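The README above says the `compare.sh` reports aggregate the average difference between segment-level scores from the original metric implementations and `pymarian-eval`. As a rough illustration of that comparison only (not the actual report logic; it assumes plain-text files with one score per line, and the file names are hypothetical):

```python
# Rough illustration of the quantity the report describes: mean absolute
# difference between segment-level scores from two systems.
# Not the actual compare.sh / report implementation.
from pathlib import Path


def mean_abs_diff(original_scores: str, pymarian_scores: str) -> float:
    a = [float(x) for x in Path(original_scores).read_text().split()]
    b = [float(x) for x in Path(pymarian_scores).read_text().split()]
    assert len(a) == len(b), "both files must score the same segments"
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Example (hypothetical file names):
# print(mean_abs_diff("comet22.orig.scores", "comet22.pymarian.scores"))
```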