More than 7,000 languages are currently spoken on this planet, and Meta seemingly wants to understand them all. Six months ago, the company launched its ambitious No Language Left Behind (NLLB) project, training AI to translate seamlessly between numerous languages without having to go through English first. On Wednesday, the company announced its first big success, dubbed NLLB-200. It's an AI model that can speak in 200 tongues, including a number of less widely spoken languages from across Asia and Africa, like Lao and Kamba.
According to a Wednesday blog post from the company, NLLB-200 can translate 55 African languages with “high-quality results.” Meta boasts that the model’s performance on the FLORES-101 benchmark surpassed that of existing state-of-the-art models by 44 percent on average, and by as much as 70 percent for select African and Indian languages.
Translating between any two given languages, especially if neither of them is English, has proven a significant challenge for AI language models, in part because many of these translation systems are trained on written data scraped from the internet. Such data is easy to find if you speak the language this sentence is written in, but much harder to come by if you’re looking for quality content in Fan or Kikuyu.
As with most of its other publicly promoted AI programs, Meta has decided to open-source NLLB-200, as well as provide $200,000 in grants to nonprofits to develop real-world applications for the technology. Think applications like Facebook News Feed or Instagram, for example. “Imagine visiting a favorite Facebook group, coming across a post in Igbo or Luganda, and being able to understand it in your own language with just a click of a button,” the Meta post hypothesized. You can get a sense of how the new model works on Meta’s demo site.