The University of Illinois Urbana-Champaign (UIUC) is working with Apple and other tech giants on the Speech Accessibility Project, which aims to improve voice recognition systems for people whose speech patterns and disabilities current versions have trouble understanding.
While often derided for mishearing a user's request, voice recognition systems for digital assistants like Siri have become more accurate over the years, aided by developments such as on-device recognition. A new project aims to push that accuracy further by focusing on people with speech impediments and disabilities.
Partnering with Apple, Amazon, Google, Meta, and Microsoft, as well as non-profits, UIUC's Speech Accessibility Project will try to expand the range of speech patterns that voice recognition systems can understand. This includes a focus on speech affected by diseases and disabilities, including Lou Gehrig's disease (amyotrophic lateral sclerosis, or ALS), Parkinson's disease, cerebral palsy, and Down syndrome.
In some cases, speech recognition could provide quality-of-life improvements to users whose conditions inhibit movement, but when those conditions also affect the voice, they can limit the technology's effectiveness.
Under the Speech Accessibility Project, samples will be collected from individuals "representing a diversity of speech patterns" to create a private, de-identified dataset. That dataset, which will focus on American English at first, could then be used to train machine learning models to better recognize these speech patterns.
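The project has not said which models or toolchains its partners will use. Purely as a rough illustration of how a pooled, de-identified dataset could be applied, here is a minimal sketch of adapting an existing open-source recognizer (torchaudio's pretrained wav2vec 2.0 English model, chosen here as an assumption, not as the project's method) on individual audio/transcript pairs:

```python
import torch
import torchaudio

# Pretrained English wav2vec 2.0 ASR model from torchaudio (an assumed stand-in;
# the Speech Accessibility Project has not specified its models or frameworks).
bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model()
labels = bundle.get_labels()               # CTC vocabulary: blank, '|', letters...
char_to_idx = {c: i for i, c in enumerate(labels)}

ctc_loss = torch.nn.CTCLoss(blank=0, zero_infinity=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def fine_tune_step(waveform: torch.Tensor, transcript: str) -> float:
    """One gradient step on a single de-identified (audio, transcript) sample.

    waveform: shape (1, num_samples), already resampled to bundle.sample_rate.
    transcript: plain-text transcription of the utterance.
    """
    model.train()
    emissions, out_lengths = model(waveform)                          # (1, frames, vocab)
    log_probs = torch.log_softmax(emissions, dim=-1).transpose(0, 1)  # (frames, 1, vocab)

    # Map the transcript into the model's character vocabulary ('|' marks spaces).
    chars = [char_to_idx[c] for c in transcript.upper().replace(" ", "|") if c in char_to_idx]
    targets = torch.tensor([chars], dtype=torch.long)

    input_lengths = out_lengths if out_lengths is not None else torch.tensor([log_probs.size(0)])
    target_lengths = torch.tensor([len(chars)])

    loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice the partner companies would train at far larger scale with their own pipelines; the point is only that a shared, de-identified dataset is what makes this kind of adaptation possible.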
The involvement of a wide array of tech companies with virtual assistants or speech recognition features in their products could help speed up progress. Rather than working separately and duplicating one another's results, the teams can collaborate directly through the project.
"Speech interfaces should be available to everybody, and that includes people with disabilities," said Mark Hasegawa-Johnson, a professor at UIUC. "This task has been difficult because it requires a lot of infrastructure, ideally the kind that can be supported by leading technology companies, so we've created a uniquely interdisciplinary team with expertise in linguistics, speech, AI, security, and privacy."
Comments
Tough problem to solve. But the CS work at UIUC has long had a reputation for quality, so I'm confident good progress will be made.
This is one of those initiatives that has clear benefits on the face of it, but it will benefit all of us as well. Most everyone has an accent that gives Siri trouble on occasion.
I work in the field, and it has often occurred to me that some speech adaptations could be made by training Siri (or any other speech-to-text application) on an individual basis. Someone's skillful and experienced communication partner could work phrase by phrase, e.g. "When John says something that sounds like this [….] it means this […….]".
Working on [disabled] accents-in-general is likely to be a very fruitful tack, but John’s unique [Down Syndrome] accent may call for a significant amount of fine-tuning. A hybrid approach could be useful here.
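As a toy illustration of the hybrid approach this commenter describes (nothing here comes from the project or any vendor; the function and phrase data are hypothetical), a per-user table of corrections learned phrase by phrase could be applied on top of whatever a generic recognizer outputs:

```python
from difflib import SequenceMatcher

def build_user_lexicon(pairs):
    """pairs: list of (what_the_recognizer_heard, what_the_speaker_meant)."""
    return dict(pairs)

def adapt_transcript(raw_text: str, lexicon: dict, threshold: float = 0.8) -> str:
    """Replace the generic recognizer's output with the speaker's intended phrase
    when a learned correction is a close enough fuzzy match."""
    for heard, meant in lexicon.items():
        if SequenceMatcher(None, raw_text.lower(), heard.lower()).ratio() >= threshold:
            return meant
    return raw_text

# Corrections supplied phrase by phraseature by a communication partner (made-up examples).
lexicon = build_user_lexicon([
    ("one more gain", "once more again"),
    ("turn on the lie", "turn on the light"),
])

print(adapt_transcript("turn on the lie", lexicon))  # -> "turn on the light"
```

The general model handles the broad [disabled] accent, while the per-user lexicon catches the phrases that still come out wrong for one specific speaker.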