A word of warning to start: even though it has hit plenty of home runs (it is, after all, where Android and Google Photos were introduced), I/O is also the birthplace of Wave and Glass, products that were discontinued before they were ever properly launched to the general public.
Not everything you read here will necessarily see the light of day. The Google I/O conference, where the Californian company has been unveiling its upcoming products and technologies for 15 years, also has something of a reputation as an event where tomorrow’s failures are sometimes presented.
These failures can, however, herald other developments. Wave, a sort of Google Docs before its time, paved the way for the online collaborative work that kept us functioning professionally during the pandemic, and Glass was, for many, their first contact with augmented reality.
Trends don’t lie, and computer systems’ understanding of language is certainly one to keep an eye on.
Synthesize documents and conversations at work
Summarizing a long working document to give you an idea of what it contains, highlighting the main points discussed in a chat service while you were away and even summarizing a team video meeting: these features coming to the Workspace suite illustrate how far language understanding has come since the arrival of artificial intelligence techniques like deep learning.
Summarizing the main topics of a video meeting requires, for example, automatic transcription, understanding of long text passages and automatic text generation, to create the summary. Each of these steps would have been impossible just a few years ago.
The feature will roll out over the next few months to the popular Docs software, followed by Chat and Meet. Note that the company did not say which languages the summaries would be offered in, but everything indicates that the launch will initially be in English.
Generate text automatically
Text generation is another area that Google is exploring, with the launch of its LaMDA 2 algorithm, which allows an artificial intelligence to converse in English on any subject.
The risks associated with such a technology are numerous, especially if it is misused to create misinformation, for example. The algorithm will therefore be accessible only to small groups of researchers at first (other invitations will follow over the next few months), and only in the context of three experiments, in which they can ask LaMDA 2 to describe a scene, carry on a conversation about dogs or generate a to-do list on any topic.
Communicate more naturally with assistants
Voice assistants like the Google Assistant are fairly good at recognizing our everyday speech, but they demand speech that fits a certain mold. If you take too long to finish saying the title of a song, the assistant may run a search based on your partial request (and thus play “L’amour” by Karim Ouellet rather than “L’amour est sans pitié” by Jean Leloup, which you meant to ask for).
The voice assistant will soon be able to grasp the subtleties of human speech a little better. It should thus understand that a “hmm” is not part of the request, and use intonation to tell a pause from the end of a request.
Examples of these improved algorithms were shown at I/O, but their release date has not been announced.
Translate the 7,000 languages forgotten by machine translation
Even though the languages spoken by most people on earth are covered by machine translation tools, around 7,000 languages around the world still cannot take advantage of these technologies, including many of North America’s Indigenous languages.
Until now, machine translation models were trained with bilingual datasets. By comparing enough already translated texts between two languages, the artificial intelligence tools were able to “learn” to translate them.
For the majority of languages that have not benefited from such treatment, there are not enough accessible bilingual texts. Google researchers have therefore developed a “monolingual” model, capable of being trained without previously translated texts.
The approach is still in its infancy, but the company believes it is already good enough to generate imperfect but practical and usable translations. Google Translate also added 24 languages this week thanks to this approach, including Mizo, a Tibeto-Burman language spoken by only 800,000 people worldwide. This brings the total number of languages that can be translated with the tool to 133.
Chat with augmented reality glasses
The big tech companies have all confirmed their interest in augmented reality, where digital information is overlaid on the real world in the user’s field of vision, in particular through connected glasses. Google is no exception.
At Google I/O, the company presented a prototype of augmented reality glasses that have the shape and size of normal glasses. The improvement over the rather conspicuous Google Glass is obvious.
The new glasses will be used, among other things, to display a real-time translation of what the person speaking to the wearer is saying, like subtitles on TV. The technology should be especially useful for chatting with someone in another language, but also for people who are hearing impaired.
No launch date has been announced for these glasses, but it will probably be a few more years before they go from prototype to finished product, ready for the general public.