Dysarthria is a neurological speech disorder that affects the clarity and intelligibility of spoken language, often resulting from conditions such as stroke, ALS, Parkinson’s disease, or cerebral palsy. Individuals with dysarthria experience slurred or slowed speech, making verbal communication difficult and impacting quality of life.
While augmentative and alternative communication (AAC) devices can assist, they often require fine motor control and offer much slower communication rates compared to natural speech. As digital voice interfaces become more central to everyday interactions, there is a growing need for speech recognition systems that can accurately interpret dysarthric speech and bridge this accessibility gap.
This technology introduces a personalized automatic speech recognition (ASR) system specifically designed for individuals with severe dysarthria. The core model is built using a uniquely large dataset comprising over 50 hours of speech from a single speaker with dysarthria, totaling more than 40,000 words and 187,000 phonemes.
The system employs hidden Markov models (HMMs) with extended state durations to capture the distinct acoustic characteristics of dysarthric speech, and can be further enhanced with Gaussian mixture model (GMM), deep neural network (DNN), or long short-term memory (LSTM) architectures.
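The summary above does not specify how the state durations are extended. A common way to lengthen the durations an HMM can model (useful for the slower, prolonged phones typical of dysarthric speech) is to chain several tied copies of a state: one state with self-loop probability p gives a geometric duration with mean 1/(1 − p), while n chained copies enforce a minimum of n frames and yield a negative-binomial duration with mean n/(1 − p). The sketch below illustrates that mechanism only; it is not taken from the patented implementation.

```python
import random

def sample_duration(self_loop_p, n_tied_states=1, rng=random.Random(0)):
    """Sample a state-occupancy duration in frames.

    One state: P(d) = p^(d-1) * (1 - p), a geometric distribution.
    Chaining n tied copies of the state (a standard way to extend HMM
    durations) enforces a minimum of n frames and gives a
    negative-binomial duration with mean n / (1 - p).
    """
    frames = 0
    for _ in range(n_tied_states):
        frames += 1                       # entry frame for this sub-state
        while rng.random() < self_loop_p:
            frames += 1                   # self-loop: stay another frame
    return frames

# With p = 0.5 and 3 tied states, the expected duration is 3 / (1 - 0.5) = 6.
p, n = 0.5, 3
mean = sum(sample_duration(p, n) for _ in range(20000)) / 20000
```

In practice such topologies are expressed declaratively (e.g., in a toolkit's HMM topology configuration) rather than sampled by hand; the simulation just makes the duration statistics concrete.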
It is implemented in C++ using the Kaldi ASR toolkit and is deployable as a standalone application, web service, or embedded component across platforms. The system can be personalized with minimal user data, achieving over 85 percent word recognition accuracy, even in cases of low intelligibility.
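Word recognition accuracy figures like the "over 85 percent" quoted above are conventionally reported as 1 − WER, where the word error rate is the Levenshtein (edit) distance between the reference and hypothesis word sequences divided by the reference length. A minimal, self-contained sketch of that standard metric (not code from the patented system):

```python
def word_accuracy(reference, hypothesis):
    """Word recognition accuracy as 1 - WER, via word-level edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return 1.0 - dp[len(ref)][len(hyp)] / len(ref)

# One inserted word ("the") plus one substitution ("kitten" for "kitchen")
# against a 5-word reference: accuracy = 1 - 2/5 = 0.6.
acc = word_accuracy("turn the kitchen light on", "turn the the kitten light on")
```

Because insertions count against the hypothesis, this accuracy can be negative for very poor transcripts, which is why low-intelligibility speech makes the quoted figure notable.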
U.S. Provisional Application Serial No. 63/658,764, filed 06/11/2024.
PCT Application Serial No. PCT/US2025/032777, filed 06/06/2025; published as WO 2025/259567 A1: https://patents.google.com/patent/WO2025259567A1/en?oq=WO+2025%2f259567+A1