Conversion from Eole to CTranslate2 #72
Hi! Any news on the ctranslate2 converter? I'd be happy to help if needed, though I'd need some guidance. Would the opennmt-py converter differ much for eole?
No, it should be similar, but reading from the safetensors file. Also, if you're willing, we'll need to add the estimator too, but I'm not sure whether @minhthuc2502 already did the layer part.
Hi! Sorry for taking so long to answer. I've been trying to implement this for the past few days. Code here: https://github.com/isanvicente/CTranslate2/blob/master/python/ctranslate2/converters/eole.py So far, I've mapped the config and layers of the old ONMT models to the new eole format (starting from _get_model_spec_seq2seq). Conversion runs without errors, but when translating with the model all I get is gibberish. My guess is that I either messed up the layer mapping at some point (most probably the decoder layers) or some config parameter is not parsed properly. Could you take a look and see if you can spot what I missed? You surely know better what changed from onmt to eole. Thanks!
Many options have changed, for instance all of these: https://github.com/isanvicente/CTranslate2/blob/master/python/ctranslate2/converters/eole.py#L24-L29
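The converter essentially has to rename those before building the CTranslate2 spec. A minimal sketch of the pattern (the table entries below are made-up examples; the real pairs are the ones linked above):

```python
# Hypothetical rename table: old OpenNMT-py option name -> eole option name.
# The entries are illustrative only; see eole.py#L24-L29 for the real pairs.
_ONMT_TO_EOLE = {
    "rnn_size": "hidden_size",       # hypothetical example
    "enc_layers": "encoder_layers",  # hypothetical example
}

def remap_options(old_config: dict) -> dict:
    """Map old OpenNMT-py option names to their eole equivalents."""
    return {_ONMT_TO_EOLE.get(key, key): value for key, value in old_config.items()}
```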
@isanvicente I just pushed a PR here: OpenNMT/CTranslate2#1832 I tested it with EuroLLM-9B-instruct, and it seems to run fine.
Hi @vince62s, thanks! I tried one of the seq2seq models, but I got errors similar to those from my first attempts. This is what the layer names look like in my eole model:
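For reference, the tensor names can be dumped straight from the checkpoint (a minimal sketch; the shard file name is a placeholder, adjust it to your checkpoint layout):

```python
from safetensors import safe_open

# Print every tensor name (and shape) stored in the eole checkpoint shard.
# The file name below is a placeholder for the actual shard name.
with safe_open("model.00.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        print(name, f.get_slice(name).get_shape())
```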
What I've tried so far:
I had to make a few more changes to the encoder layer names, but I'm still not there. The last error is:
I'll document it and try to get back to you as soon as possible.
I'll fix those. I'd be interested in the speed you get, because with a 9B LLM (EuroLLM-9B-instruct) I am getting the same speed with eole vs CT2.
Yes! Got it working! I had to make two minor changes:
The whole diff looks like this:
I'll come back later today with speed results; I need to tweak the inference parameters. Cheers!
Hi again @vince62s, I tested the speed of a Basque → French model on an NVIDIA RTX A5000 GPU. The compute type is float16 (for training, conversion to CT2, and all inference runs). Inference parameters:
CT2 is giving me 2x the speed. The table below shows results in seconds for a 5,000-sentence file.
I'm checking the actual outputs, because the translations are not 100% identical (very close, but I'm observing a few differences here and there).
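A quick way to count the diverging lines (a minimal sketch; the CT2 output file name is made up to mirror the eole one):

```python
# Count how many of the 5000 translations differ between the two engines.
# The CT2 output file name is an assumption.
with open("test_5000.eufr-eole-beam3.out", encoding="utf-8") as eole_out, \
     open("test_5000.eufr-ct2-beam3.out", encoding="utf-8") as ct2_out:
    diffs = sum(a.strip() != b.strip() for a, b in zip(eole_out, ct2_out))
print(f"{diffs} differing lines out of 5000")
```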
Can you fix your stats table? Can you post your inference config?
My inference config only contains transforms (eole_inference_config.yaml):
I execute inference with the following command:

$ eole predict -model_path ~/NMT/eu-fr/EXP_eufr_back41bicln09_fulldev/models/step_670000 -src test_5000.eu.txt -output test_5000.eufr-eole-beam3.out -world_size 1 -gpu_ranks 0 --max_length 300 --batch_size 64 --beam_size 3 --config ~/NMT/eu-fr/test.yaml --report_time

CT2 model inference is done with a Python script. I can't attach the script, so here is the relevant code:
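The relevant part boils down to something like this (a minimal sketch with the same parameters as above; the model directory, SentencePiece model path, and tokenization-via-SentencePiece itself are assumptions based on the transforms-only config):

```python
import ctranslate2
import sentencepiece as spm

# Same settings as the eole run: float16 on GPU, beam 3, batch 64, max length 300.
translator = ctranslate2.Translator("ct2_model_dir", device="cuda",
                                    compute_type="float16")

# Tokenization is an assumption here: a SentencePiece transform with a
# placeholder model path, mirroring the transforms-only eole config.
sp = spm.SentencePieceProcessor(model_file="eufr.spm.model")

with open("test_5000.eu.txt", encoding="utf-8") as f:
    sources = [sp.encode(line.strip(), out_type=str) for line in f]

results = translator.translate_batch(
    sources,
    beam_size=3,
    max_batch_size=64,
    max_decoding_length=300,
)

with open("test_5000.eufr-ct2-beam3.out", "w", encoding="utf-8") as out:
    for result in results:
        out.write(sp.decode(result.hypotheses[0]) + "\n")
```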
With my latest commit you should be able to run the same "predict" command line, but with an extra flag, -engine, which defaults to "eole" and can also take "ct2". BUT: the ctranslate2 model must sit in a "ctranslate2" subfolder of the model_path given in the config file / command line.
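So, with the converted model (its model.bin etc.) placed under step_670000/ctranslate2/, the earlier command would become something like this (a sketch based on the description above; the output file name is made up):

```
$ eole predict -model_path ~/NMT/eu-fr/EXP_eufr_back41bicln09_fulldev/models/step_670000 \
    -src test_5000.eu.txt -output test_5000.eufr-ct2-beam3.out \
    -world_size 1 -gpu_ranks 0 --max_length 300 --batch_size 64 --beam_size 3 \
    --config ~/NMT/eu-fr/test.yaml --report_time -engine ct2
```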
Hi @vince62s, this took longer than expected, because the last commit requires pytorch >2.5 (I had 2.3.1), and I had to put together a new virtualenv. So far I could not make the "predict" command run with ct2. First, predict with ct2 is hardcoded to "decoder"-type models, so I got:

When I tried to set the model type to "encoder_decoder", it complained about not finding model.bin, so I gave it the ct2 path directly, but got stuck with a pydantic config error:
I can't dig deeper right now. I'll check it later.
My fault, will push a fix.
A lot of the OpenNMT-py ecosystem encourages the use of CTranslate2 downstream for efficient inference. I would really love this to be added to the new eole.
I'm retraining some custom multilingual NMT models and am using Eole to keep everything as up to date as possible.