Meta’s New AI Model Can Translate and Transcribe 100 Languages

Meta Platforms Inc. announced an AI model that can transcribe and translate up to 100 languages. SeamlessM4T supports translations between speech and text, making it easier for people to interact regardless of their native language.

In a blog post published on Aug. 22, the Facebook parent company said the model also supports full speech-to-speech translations in 36 languages, an improvement over previous models, which could only translate speech or text in one language at a time.

SeamlessM4T can perform various other functions, including speech recognition, speech-to-text translation, text-to-text translation, and text-to-speech translation, the company said.

Meta claims SeamlessM4T reduces errors

According to the blog post, the company is making its SeamlessM4T model available to the public for non-commercial use – meaning researchers and developers can use the model to build their own applications and improve the state of AI translation.

Researchers say the model was trained on four million hours of “raw audio originating from a publicly available repository of crawled web data,” Reuters reports. Text data was taken from datasets created in 2022 by scraping content from Wikipedia and other related websites.

Meta admits the data is not copyrighted – something that has led to lawsuits against AI firms using publicly available data to train their models. The company described SeamlessM4T as a “significant breakthrough” in the field of speech-to-speech and speech-to-text technology.

“Compared to approaches using separate models, SeamlessM4T’s single system approach reduces errors and delays, increasing the efficiency and quality of the translation process,” Meta said.

SeamlessM4T builds on Meta’s No Language Left Behind, a text-to-text machine translation model released last year, and Universal Speech Translator, which supports Hokkien, a variety of the Chinese language.

It also builds on the company’s Massively Multilingual Speech framework, which provides speech recognition, language identification, and speech synthesis technology for more than 1,100 languages.

AI – metaverse nexus

Meta CEO Mark Zuckerberg previously said he expects tools like SeamlessM4T to facilitate interactions between users from around the world in the metaverse.

By making it easier for people to communicate across languages, AI could help to make the metaverse a more inclusive and accessible space for everyone.

According to the Reuters report, Zuckerberg believes Meta benefits from an open AI ecosystem because it lets the company crowdsource the creation of consumer-facing tools for its social platforms rather than charge for access to the models.

In its blog post, the company wrote:

“Our single model provides on-demand translations that enable people who speak different languages to communicate more effectively.”

Meta is not the only company building AI-based translation models. Amazon, Microsoft, OpenAI, and Google are all working on commercial or open source AI translation services. Mozilla created Common Voice, a large multilingual database of voices that can be used to train automatic speech recognition algorithms.

Meta warned that SeamlessM4T might be prone to some biases. It tends to “overgeneralize to masculine forms when translating from neutral terms” and, for most languages, performs better when translating from the masculine reference (e.g. pronouns like “he” in English).
