Meta builds new language translation model
2022-07-08

Meta is marking a milestone with the open-source release of NLLB-200, a language model able to translate between 200 languages. The model is the first to translate between so many languages, with the goal of improving translation for languages overlooked by similar projects.
The model can translate 55 African languages with high-quality results. Prior to NLLB-200's creation, Meta said fewer than 25 African languages were covered by widely used translation tools.
When evaluated with the BLEU metric, Meta said NLLB-200 showed an average improvement of 44 percent over other state-of-the-art translation models. For some African and Indian languages, the improvement reportedly went as high as 70 percent.
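BLEU scores a machine translation by how much its n-grams overlap with a human reference translation. As a rough illustration of the idea (a simplified, smoothed sentence-level sketch, not the exact corpus-level formula or tokenization used in published benchmarks):

```python
from collections import Counter
import math

def ngram_counts(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Toy sentence-level BLEU: geometric mean of modified n-gram
    precisions (n = 1..max_n) times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngram_counts(cand, n)
        ref_ngrams = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference,
        # so repeating a correct word cannot inflate the score.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        # Tiny floor so one missing n-gram order doesn't zero everything.
        log_prec_sum += math.log(max(overlap, 1e-9) / total)
    # Brevity penalty discourages very short candidate translations.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec_sum / max_n)
```

A perfect match scores 1.0 and unrelated output scores near 0; a "44 percent improvement" in this framing means the model's candidate translations share substantially more n-grams with human references than competing systems' output does.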
Meta has been working with the Wikimedia Foundation to use NLLB-200 as the back end of Wikipedia’s Content Translation Tool. The model and other results from the NLLB program will support more than 25 billion translations served every day on Facebook News Feed, Instagram, and other platforms.
As part of its open sourcing of NLLB-200, Meta is also releasing:

- the new Flores-200 evaluation dataset it built for the project
- seed training data and its 200-language toxicity list
- its new LASER3 sentence encoder and the stopes data mining library
- 3.3 billion and 1.3 billion parameter dense transformer models
- 1.3 billion and 600 million parameter models distilled from NLLB-200
- NLLB-200 itself, which contains 54.5 billion parameters