Releases: OpenNMT/CTranslate2
CTranslate2 1.10.0
New features
- Coverage penalty as in Wu et al. 2016 with the option coverage_penalty
- Batch size can be expressed in number of tokens with the option batch_type
- Translation scores can be disabled with the option return_scores (if disabled, the final SoftMax is skipped during greedy decoding)
- Support compilation without TensorRT by setting -DWITH_TENSORRT=OFF during CMake configuration (in this case, beam search is no longer supported)
- Experimental integration of Intel MKL's packed GEMM which can be enabled by setting the environment variable CT2_USE_EXPERIMENTAL_PACKED_GEMM=1
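For reference, the coverage penalty defined in Wu et al. 2016 can be sketched in a few lines of Python. This only illustrates the formula from the paper, cp(X;Y) = beta * sum_i log(min(sum_j p_ij, 1.0)); it is not the CTranslate2 implementation. The function name, the attention layout (one row per target step), and the small floor guarding against log(0) are assumptions made for this example.

```python
import math


def coverage_penalty(attention, beta):
    """Coverage penalty from Wu et al. 2016 (illustrative sketch).

    attention[j][i] is the attention probability that target step j
    assigns to source token i (layout assumed for this example).
    """
    if not attention:
        return 0.0
    source_length = len(attention[0])
    penalty = 0.0
    for i in range(source_length):
        # Total attention received by source token i over all target steps.
        coverage = sum(step[i] for step in attention)
        # Capping at 1.0 means fully covered tokens contribute log(1) = 0.
        # The tiny floor is an addition of this sketch to avoid log(0).
        penalty += math.log(max(min(coverage, 1.0), 1e-20))
    return beta * penalty
```

With beta greater than 0, the penalty is zero when every source token has received a total attention of at least 1 and negative otherwise, so hypotheses that leave part of the source unattended are ranked lower.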
Fixes and improvements
- Remove direct dependency on cuDNN (still an indirect dependency via TensorRT)
- Static AVX optimization for the ReLU operator
- Remove unnecessary memory initialization when creating temporary buffers
- Dissociate SoftMax and LogSoftMax in profiling report
CTranslate2 1.9.1
Fixes and improvements
- Fix parallel translations when calling Translator.translate_batch from multiple Python threads
- Fix crash on invalid num_hypotheses value
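The multi-threaded usage covered by the first fix might look like the sketch below. To keep the example runnable without a converted model, it uses UppercaseTranslator, a hypothetical stand-in that is not part of CTranslate2; in real code the object would be a ctranslate2.Translator, whose translate_batch return format depends on the library version. The helper name translate_in_threads is also an assumption of this example.

```python
from concurrent.futures import ThreadPoolExecutor


class UppercaseTranslator:
    """Hypothetical stand-in for a translator object (not a CTranslate2 class)."""

    def translate_batch(self, batch):
        # "Translate" each tokenized example by upper-casing its tokens.
        return [[token.upper() for token in tokens] for tokens in batch]


def translate_in_threads(translator, batches, num_threads=2):
    """Call translator.translate_batch on each batch from a pool of threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Executor.map returns results in the order of the submitted batches.
        return list(pool.map(translator.translate_batch, batches))


batches = [[["hello", "world"]], [["good", "morning"]], [["thank", "you"]]]
results = translate_in_threads(UppercaseTranslator(), batches)
```

A single translator object is shared by all threads here, which is the pattern the fix refers to.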
CTranslate2 1.9.0
New features
- Return 2 additional statistics from file translation APIs:
  - the number of translated examples
  - the total translation time in milliseconds
Fixes and improvements
- Fix exceptions that were not caught by the Python wrapper
- Fix an invalid insertion in the variables collection while iterating over it
- Optimize filling operation of float storages
- Internal refactoring of decoding functions to make them reusable for other tasks (e.g. generative language models)
CTranslate2 1.8.0
New features
- [Python] Add methods Translator.unload_model and Translator.load_model to manually manage memory
- [Docker] Move all images to Python 3 only
- Expose options that enable an internal sorting by length to increase the translation efficiency:
  - for file translation: read_batch_size contiguous examples will be loaded, sorted by length, and batched with size max_batch_size
  - for batch translation: if the batch is larger than max_batch_size, examples will be sorted by length and batched with size max_batch_size
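The idea behind the length sorting can be sketched as follows: examples of similar length end up in the same batch, which reduces padding, and the original positions are kept so that results can be returned in input order. This is a minimal illustration, not the library's internal code. The function names and the ascending sort direction are assumptions of this example, and the batch size is counted in examples.

```python
def batch_by_length(examples, max_batch_size):
    """Sort tokenized examples by length and split them into batches.

    Returns a list of (original_indices, batch) pairs.
    """
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    # Indices of the examples, ordered from shortest to longest.
    order = sorted(range(len(examples)), key=lambda i: len(examples[i]))
    batches = []
    for start in range(0, len(order), max_batch_size):
        indices = order[start:start + max_batch_size]
        batches.append((indices, [examples[i] for i in indices]))
    return batches


def restore_order(batches, results_per_batch, total):
    """Put per-batch results back into the original example order."""
    restored = [None] * total
    for (indices, _), results in zip(batches, results_per_batch):
        for index, result in zip(indices, results):
            restored[index] = result
    return restored
```

For file translation, the same routine would be applied to each chunk of read_batch_size contiguous examples, so a larger read_batch_size gives the sort more examples to group by length.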
Fixes and improvements
- Fix another error when releasing a translator that is placed on a GPU that is not GPU 0
- Fix possible memory corruption when creating GPU translators in parallel
- Fix memory that is briefly allocated on GPU 0 when destroying a translator that is placed on another GPU
- Reduce latency of model loading, especially on GPU
CTranslate2 1.7.1
Fixes and improvements
- Revert "Parallelize some low level transformations on CPU" which caused incorrect computation
- Avoid unnecessary TensorFlow runtime initialization when converting checkpoints
- Fix compilation without MKL
CTranslate2 1.7.0
New features
- Translation option return_alternatives to return multiple choices at the first unconstrained decoding position: combined with a target prefix, this could be used to provide alternative words and translations at a specific location in the target
- Support Transformers with different number of encoder/decoder layers
- Allow compilation without OpenMP with -DOPENMP_RUNTIME=NONE
Fixes and improvements
- Fix SavedModel conversion when TensorFlow Addons 0.8 is installed
- Fix error when releasing a translator/model that is placed on a GPU that is not GPU 0
- Fix memory that was allocated on GPU 0 even when the translator/model was placed on another GPU
- Query GPU int8 support on the first model load, and then cache the result for future loads
- Avoid creating an empty model directory on conversion errors
- Parallelize some low level transformations on CPU
- Reduce memory usage when translating large files by limiting the work queue size
CTranslate2 1.6.3
Fixes and improvements
- Fix incorrectness in relative representation computation
CTranslate2 1.6.2
Fixes and improvements
- Fix conversion of models with shared embeddings
CTranslate2 1.6.1
Fixes and improvements
- [Docker] Remove translation client in CentOS 7 images as it can cause compatibility issues with downstream images
CTranslate2 1.6.0
New features
- Support Transformers with relative position representations (as in Shaw et al. 2018)
- Accept target prefix in batch request
- Support return_attention with prefixed translation