Google has announced an ambitious new project to develop a single artificial intelligence language model that will support the "1,000 most spoken languages" in the world.
As a first step toward that goal, the company is unveiling an AI model trained on more than 400 languages, which it describes as "the largest language coverage seen in a speech model today."
Google's "1,000 Languages Initiative" will not focus on any specific functionality, but on the creation of a single system with a vast range of knowledge across the languages of the world.
Speaking to The Verge, Zoubin Ghahramani, vice president of research at Google AI, said the company believes that building a model of this size will make it easier to bring various AI functions to languages that are underrepresented in online spaces and AI training datasets (also known as "low-resource languages").
"By having a single model that is exposed to and trained on many different languages, we'll get much better performance on low-resource languages," Ghahramani said.
"The way we're going to get to 1,000 languages is not by building 1,000 different models. Languages are like organisms, they have evolved from each other and have certain similarities. We can have quite spectacular advances in what we call zero-shot learning when we incorporate data from a new language into our 1,000-language model, and we'll be able to transfer [what it learned] from a high-resource language to a low-resource language."
The company says it has no immediate plans for where to deploy this model's functionality, only that it expects it to have a range of uses across Google products, from Google Translate to YouTube subtitles and more.
"One of the really interesting things about large language models and language research in general is that they can do many different jobs," says Ghahramani.
"The same language model can turn commands for a robot into code, it can solve math problems, it can do translation. The really interesting thing about language models is that they become repositories of a lot of knowledge, and by probing them in different ways you can get to different functionalities."