Releases: OpenNMT/CTranslate2

CTranslate2 1.10.0

17 Apr 08:59

New features

  • Coverage penalty as in Wu et al. 2016 with the option coverage_penalty
  • Batch size can be expressed in number of tokens with the option batch_type
  • Translation scores can be disabled with the option return_scores (if disabled, the final SoftMax is skipped during greedy decoding)
  • Support compilation without TensorRT by setting -DWITH_TENSORRT=OFF during CMake configuration (in this case, beam search is no longer supported)
  • Experimental integration of Intel MKL's packed GEMM which can be enabled by setting the environment variable CT2_USE_EXPERIMENTAL_PACKED_GEMM=1
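
As an illustration of the coverage penalty term, here is a minimal Python sketch of the formula from Wu et al. 2016, cp(X; Y) = beta * sum_i log(min(sum_j p_ij, 1.0)), where p_ij is the attention probability of target step j on source position i. The function name and the nested-list layout are assumptions made for this example and do not reflect the CTranslate2 internals.

```python
import math


def coverage_penalty(attention, beta):
    """Coverage penalty as defined in Wu et al. 2016.

    attention[j][i] is the attention probability of target step j on
    source position i (hypothetical layout chosen for this sketch).
    Assumes every source position receives some attention mass, which
    holds when the attention comes from a softmax.
    """
    num_source = len(attention[0])
    penalty = 0.0
    for i in range(num_source):
        # Total attention received by source position i, capped at 1.0.
        coverage = sum(step[i] for step in attention)
        penalty += math.log(min(coverage, 1.0))
    return beta * penalty


# Source position 0 is fully covered, position 1 only partially,
# so only position 1 contributes a negative term.
score = coverage_penalty([[0.9, 0.1], [0.8, 0.2]], beta=0.2)
```

The result is added to the hypothesis score, so hypotheses that leave source tokens unattended are ranked lower.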

Fixes and improvements

  • Remove direct dependency on cuDNN (still an indirect dependency via TensorRT)
  • Static AVX optimization for the ReLU operator
  • Remove unnecessary memory initialization when creating temporary buffers
  • Dissociate SoftMax and LogSoftMax in profiling report

CTranslate2 1.9.1

08 Apr 13:56

Fixes and improvements

  • Fix parallel translations when calling Translator.translate_batch from multiple Python threads
  • Fix crash on invalid num_hypotheses value

CTranslate2 1.9.0

24 Mar 16:50

New features

  • Return 2 additional statistics from file translation APIs:
    • the number of translated examples
    • the total translation time in milliseconds

Fixes and improvements

  • Fix exceptions that were not caught by the Python wrapper
  • Fix an invalid insertion in the variables collection while iterating over it
  • Optimize filling operation of float storages
  • Internal refactoring of decoding functions to make them reusable for other tasks (e.g. generative language models)

CTranslate2 1.8.0

10 Mar 17:42

New features

  • [Python] Add methods Translator.unload_model and Translator.load_model to manually manage memory
  • [Docker] Move all images to Python 3 only
  • Expose options that enable an internal sorting by length to increase the translation efficiency:
    • for file translation: read_batch_size contiguous examples will be loaded, sorted by length, and batched with size max_batch_size
    • for batch translation: if the batch is larger than max_batch_size, examples will be sorted by length and batched with size max_batch_size
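
The length-sorting behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name is hypothetical, and the ascending sort order is an assumption, since the release notes do not state the direction.

```python
def batch_sorted_by_length(examples, max_batch_size):
    """Group examples of similar length to reduce padding.

    Returns a list of (indices, batch) pairs, where indices are the
    positions of the batch items in the original input, so that the
    results can be put back in input order afterwards.
    """
    # Sort positions by example length (ascending order is an assumption).
    order = sorted(range(len(examples)), key=lambda i: len(examples[i]))
    batches = []
    for start in range(0, len(order), max_batch_size):
        indices = order[start:start + max_batch_size]
        batches.append((indices, [examples[i] for i in indices]))
    return batches


examples = [["a", "b", "c"], ["a"], ["a", "b"], ["a", "b", "c", "d"]]
batches = batch_sorted_by_length(examples, max_batch_size=2)
```

Batching examples of similar length means fewer padding positions per batch, which is where the efficiency gain comes from.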

Fixes and improvements

  • Fix another error when releasing a translator that is placed on a GPU that is not GPU 0
  • Fix possible memory corruption when creating GPU translators in parallel
  • Fix memory that is briefly allocated on GPU 0 when destroying a translator that is placed on another GPU
  • Reduce latency of model loading, especially on GPU

CTranslate2 1.7.1

03 Mar 10:16

Fixes and improvements

  • Revert "Parallelize some low level transformations on CPU" which caused incorrect computation
  • Avoid unnecessary TensorFlow runtime initialization when converting checkpoints
  • Fix compilation without MKL

CTranslate2 1.7.0

28 Feb 13:48

New features

  • Translation option return_alternatives to return multiple choices at the first unconstrained decoding position: combined with a target prefix, this could be used to provide alternative words and translations at a specific location in the target
  • Support Transformers with a different number of encoder and decoder layers
  • Allow compilation without OpenMP with -DOPENMP_RUNTIME=NONE
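
To make the idea behind return_alternatives concrete, the following toy Python sketch forces a target prefix, expands the top candidates at the first unconstrained position, and then completes each one. Everything here is hypothetical: the next_scores callback, the bigram table, and the choice of greedy decoding for the continuation are assumptions for illustration, not the CTranslate2 implementation.

```python
def decode_with_alternatives(next_scores, prefix, num_alternatives,
                             max_length, eos="</s>"):
    """Return one hypothesis per alternative at the first free position.

    next_scores(tokens) is a hypothetical model interface that maps the
    tokens decoded so far to a dict {next_token: log_probability}.
    """
    first = next_scores(prefix)
    # Best candidates at the first position not constrained by the prefix.
    top = sorted(first, key=first.get, reverse=True)[:num_alternatives]
    results = []
    for token in top:
        hypothesis = list(prefix) + [token]
        # Complete each alternative greedily (an assumption of this sketch).
        while hypothesis[-1] != eos and len(hypothesis) < max_length:
            scores = next_scores(hypothesis)
            hypothesis.append(max(scores, key=scores.get))
        results.append(hypothesis)
    return results


# Toy bigram "model" used only to exercise the function.
BIGRAMS = {
    "Hello": {"world": -0.4, "there": -1.2, "everyone": -2.0},
    "world": {"</s>": -0.1},
    "there": {"</s>": -0.1},
    "everyone": {"</s>": -0.1},
}


def toy_next_scores(tokens):
    return BIGRAMS[tokens[-1]]


alternatives = decode_with_alternatives(
    toy_next_scores, ["Hello"], num_alternatives=2, max_length=10)
```

With the prefix "Hello", the sketch returns one completed hypothesis for each of the two most likely next words.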

Fixes and improvements

  • Fix SavedModel conversion when TensorFlow Addons 0.8 is installed
  • Fix error when releasing a translator/model that is placed on a GPU that is not GPU 0
  • Fix memory that was allocated on GPU 0 even when the translator/model was placed on another GPU
  • Query GPU int8 support on the first model load, and then cache the result for future loads
  • Avoid creating an empty model directory on conversion errors
  • Parallelize some low level transformations on CPU
  • Reduce memory usage when translating large files by limiting the work queue size
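
The last item relies on a general technique: a bounded work queue makes the reader block when the consumer falls behind, so only a few batches are held in memory at once. Below is a minimal Python sketch of that pattern using the standard library. The function and parameter names are hypothetical, and the actual CTranslate2 change lives in its C++ code.

```python
import queue
import threading


def translate_stream(batches, translate_fn, max_queue_size=2):
    """Process batches through a bounded queue with one worker thread."""
    work = queue.Queue(maxsize=max_queue_size)
    results = []

    def worker():
        while True:
            batch = work.get()
            if batch is None:  # Sentinel: no more work.
                break
            results.append(translate_fn(batch))

    thread = threading.Thread(target=worker)
    thread.start()
    for batch in batches:
        # put() blocks while the queue is full, which throttles the
        # producer and bounds the number of pending batches in memory.
        work.put(batch)
    work.put(None)
    thread.join()
    return results


# A stand-in "translation" function that doubles each value.
outputs = translate_stream(iter([[1], [2], [3]]),
                           lambda batch: [x * 2 for x in batch],
                           max_queue_size=1)
```

Because a single worker consumes the queue in order, the results come back in input order.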

CTranslate2 1.6.3

24 Feb 10:01

Fixes and improvements

  • Fix incorrect computation of relative position representations

CTranslate2 1.6.2

21 Feb 14:30

Fixes and improvements

  • Fix conversion of models with shared embeddings

CTranslate2 1.6.1

17 Feb 10:17

Fixes and improvements

  • [Docker] Remove translation client in CentOS 7 images as it can cause compatibility issues with downstream images

CTranslate2 1.6.0

14 Feb 12:41

New features

  • Support Transformers with relative position representations (as in Shaw et al. 2018)
  • Accept target prefix in batch request
  • Support return_attention with prefixed translation
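
For readers unfamiliar with relative position representations, the core indexing step from Shaw et al. 2018 can be sketched as follows: the distance j - i between a query position i and a key position j is clipped to [-k, k] and shifted to index a table of 2k + 1 learned embeddings. The function name is hypothetical, and the sketch covers only the index computation, not the attention itself.

```python
def relative_position_indices(length, max_distance):
    """Return a length x length matrix of embedding indices.

    Entry [i][j] is clip(j - i, -max_distance, max_distance) shifted by
    max_distance, so that it falls in [0, 2 * max_distance].
    """
    return [
        [min(max(j - i, -max_distance), max_distance) + max_distance
         for j in range(length)]
        for i in range(length)
    ]


indices = relative_position_indices(4, max_distance=2)
# indices[0] == [2, 3, 4, 4]: distances 0, 1, 2, 3 with 3 clipped to 2
# indices[3] == [0, 0, 1, 2]: distances -3, -2, -1, 0 with -3 clipped to -2
```

Clipping means all positions farther apart than max_distance share the same embedding, which keeps the table size independent of the sequence length.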