A set of new features for Android could alleviate some of the difficulties of living with hearing impairment and other conditions. Live transcription, captioning, and relay use speech recognition and synthesis to make content on your phone more accessible — in real time.
Announced today at Google’s I/O event in a surprisingly long segment on accessibility, the features all rely on improved speech-to-text and text-to-speech algorithms, some of which now run on-device rather than sending audio to a datacenter to be decoded.
The first feature to be highlighted, live transcription, had already been mentioned by Google. It’s a simple but very useful tool: open the app and the device will listen to its surroundings and simply display any speech it recognizes as text on the screen.
We’ve seen this in translator apps and devices, like the One Mini, and in the meeting transcription highlighted yesterday at Microsoft Build. One would think that such a straightforward tool is long overdue, but in fact everyday circumstances, like talking to a couple of friends at a cafe, can be remarkably difficult for natural language systems trained on perfectly recorded single-speaker audio. Improving the system to the point where it can track multiple speakers and display accurate transcripts quickly has no doubt been a challenge.
Another feature enabled by this improved speech recognition ability is live captioning, which essentially does the same thing as above, but for video. Now when you watch a YouTube video, listen to a voice message, or even take a video call, you’ll be able to see what the person in it is saying, in real time.
That should prove incredibly useful not just for the millions of people who can’t hear what’s being said, but also for those who don’t speak the language well and could use text support, for anyone watching a show on mute when they’re supposed to be going to sleep, and in any number of other circumstances where hearing and understanding speech just isn’t the best option.
Captioning phone calls is something CEO Sundar Pichai said is still under development, but the “live relay” feature they demoed on stage showed how it might work. A person who is hearing-impaired or can’t speak will certainly find an ordinary phone call to be pretty worthless. But live relay immediately turns the call into text, and just as quickly turns text responses into speech the person on the line can hear.
Live captioning should be available on Android Q when it releases, with some device restrictions. Live transcribe is available now, but a warning states that it is currently in development. Live relay is yet to come, but showing it on stage in such a complete form suggests it won’t be long before it appears.