The release of the machine translation system OpenNMT-tf 2.30.0 (Open Neural Machine Translation), which uses machine learning methods, has been published. The code of the modules developed by the OpenNMT-tf project is written in Python and uses the TensorFlow library. A version of OpenNMT based on the PyTorch library is also being developed, and it differs in the set of supported features. In addition, the PyTorch-based OpenNMT is presented as easier to use and multimodal, while the TensorFlow-based variant is described as modular and stable, and it can use GPUs to accelerate neural network training. To simplify distribution of the product, the project also develops CTranslate2, a self-contained C++ version of the translator that uses pre-trained models without requiring additional dependencies.
Ready-made models are available for English, German and Catalan. For other languages you can train a model yourself using data from the OPUS project (training takes two files: one with sentences in the source language and a second with high-quality translations of those sentences into the target language).
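The two-file training format described above relies on the files being aligned line by line: sentence N in the source file must correspond to translation N in the target file. The following is a minimal sketch of a sanity check for such a pair of files. The function names and file names (`check_parallel_corpus`, `write_demo_corpus`, `train.en`, `train.de`) are hypothetical and are not part of OpenNMT-tf.

```python
import os
import tempfile


def check_parallel_corpus(source_path, target_path):
    """Return the number of sentence pairs, or raise ValueError if misaligned."""
    with open(source_path, encoding="utf-8") as src:
        source_lines = [line.rstrip("\n") for line in src]
    with open(target_path, encoding="utf-8") as tgt:
        target_lines = [line.rstrip("\n") for line in tgt]

    # Both files must contain the same number of sentences.
    if len(source_lines) != len(target_lines):
        raise ValueError(
            f"line count mismatch: {len(source_lines)} source lines, "
            f"{len(target_lines)} target lines"
        )

    # An empty line on either side usually indicates a broken alignment.
    for number, (source, target) in enumerate(zip(source_lines, target_lines), 1):
        if not source.strip() or not target.strip():
            raise ValueError(f"empty sentence on line {number}")

    return len(source_lines)


def write_demo_corpus(directory, source_sentences, target_sentences):
    """Write a tiny illustrative corpus (hypothetical file names) and return its paths."""
    source_path = os.path.join(directory, "train.en")
    target_path = os.path.join(directory, "train.de")
    with open(source_path, "w", encoding="utf-8") as src:
        src.write("\n".join(source_sentences) + "\n")
    with open(target_path, "w", encoding="utf-8") as tgt:
        tgt.write("\n".join(target_sentences) + "\n")
    return source_path, target_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        paths = write_demo_corpus(
            directory,
            ["Hello world.", "How are you?"],
            ["Hallo Welt.", "Wie geht es dir?"],
        )
        print(check_parallel_corpus(*paths))
```

Running a check like this before training catches the most common corpus problem, a shifted or truncated file, which would otherwise silently pair sentences with the wrong translations.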
The project is developed with the participation of SYSTRAN, a company specializing in machine translation tools, and a group of Harvard researchers developing human language models for machine learning systems. The user interface is as simple as possible: it only requires specifying the input file with the text and the file in which to save the translation. The extension system makes it possible to implement additional functionality on top of OpenNMT, such as summarization, text classification and subtitle generation.
In the new version:
- Added support for the TensorFlow 2.11 library, but the new Keras optimizers are not yet supported (the optimizers from tf.keras.optimizers.legacy must be used).
- Added support for the new CTranslate2 3.x engine branch, designed to efficiently run models with the architecture “
- Added the pad_to_bucket_boundary parameter for models, which enables additional padding that aligns the batch size to a multiple of length_bucket_width.
- Integrated support for the chrF and chrF++ metrics from the SacreBLEU project.
- Removed the ctranslate2_spec model attribute, which is no longer used in CTranslate2.
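
The padding behavior mentioned in the list above can be illustrated with a little arithmetic. The sketch below assumes that padding rounds the longest sequence in a batch up to the next multiple of the bucket width; the exact rule used inside OpenNMT-tf may differ, and the helper names (`padded_length`, `pad_batch`) and the `<pad>` token are hypothetical.

```python
def padded_length(max_length, length_bucket_width):
    """Round max_length up to the next multiple of length_bucket_width."""
    if length_bucket_width < 1:
        raise ValueError("length_bucket_width must be at least 1")
    # Ceiling division using integer arithmetic only.
    buckets = -(-max_length // length_bucket_width)
    return buckets * length_bucket_width


def pad_batch(batch, length_bucket_width, pad_token="<pad>"):
    """Pad every token list in a non-empty batch to the same bucket-aligned length."""
    if not batch:
        raise ValueError("batch must not be empty")
    target = padded_length(max(len(tokens) for tokens in batch), length_bucket_width)
    return [tokens + [pad_token] * (target - len(tokens)) for tokens in batch]


if __name__ == "__main__":
    print(padded_length(7, 5))   # 10
    print(padded_length(10, 5))  # 10
    print(pad_batch([["a", "b", "c"], ["d"]], 2))
```

Aligning batch shapes to a small set of fixed lengths trades a little extra padding for fewer distinct tensor shapes, which can help when the runtime compiles or caches work per shape.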