AI-Powered Choker Aids Communication for Stroke Survivors with Dysarthria

Introduction

A new study details the development of an artificial intelligence (AI)-powered intelligent throat (IT) system. This wearable device aims to help stroke survivors suffering from dysarthria communicate more effectively by translating subtle throat vibrations and pulse signals into clearer speech.

Addressing Communication Challenges

Neurological conditions such as stroke, amyotrophic lateral sclerosis (ALS), and Parkinson's disease frequently cause dysarthria, a motor-speech disorder that impairs vocal tract control. This condition significantly hinders communication, impacting patients' quality of life and rehabilitation.

Existing augmentative and alternative communication (AAC) technologies, such as head- or eye-tracking systems, are often slow. Neuroprosthetics, while promising for severe cases, require invasive procedures. There is demand for simpler, portable solutions for patients who retain some muscle control.

Current wearable silent-speech devices also have limitations: most have been tested primarily on healthy individuals, and they rely on word-level decoding or one-to-one mapping, which disrupts communication flow and strains fatigued patients. Systems that can expand short expressions into coherent sentences are needed for more natural interaction.

The AI-Powered IT System

The IT system developed in this study is an AI-driven wearable silent speech device for dysarthria patients. Key features include:

  • Smart Choker: Equipped with textile strain sensors and a wireless circuit board to capture laryngeal muscle vibrations and carotid pulse signals.
  • AI Integration: Utilizes machine learning models and large language model (LLM) agents for real-time analysis.
  • Functionality: Generates either direct text output or expanded, contextually appropriate sentences reflecting the patient's intended meaning and emotional state.
  • Emotion Recognition: Pulse signals are processed by an emotion-decoding network to identify emotional states (neutral, relieved, frustrated).
  • Sentence Expansion: When activated by the user, an agent expands the decoded output into fuller sentences by incorporating the emotion label and contextual data (e.g., time of day, weather); a minimal sketch of this pipeline follows the list.
  • Power: The circuit board consumes 76.5 mW; a 1,800 mWh battery therefore supports roughly 23.5 hours of continuous use (1,800 mWh ÷ 76.5 mW ≈ 23.5 h), covering all-day operation.
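
The study does not release its implementation, so the following is only a minimal Python sketch of how the decoded tokens, the emotion label, and the contextual data could be combined into an expansion request for the LLM agent. Every name in it (decode_tokens, decode_emotion, build_expansion_prompt, the weather parameter) is hypothetical, and the two decoder functions return canned values in place of the trained models.

```python
from dataclasses import dataclass
from datetime import datetime

# Emotion labels reported in the study for the pulse-based decoder.
EMOTIONS = ("neutral", "relieved", "frustrated")

@dataclass
class DecodedUtterance:
    tokens: list[str]  # word-level output of the silent-speech decoder
    emotion: str       # label from the emotion-decoding network

def decode_tokens(vibration_signal) -> list[str]:
    """Stand-in for the token-level silent-speech decoder (a trained model)."""
    return ["want", "water"]

def decode_emotion(pulse_signal) -> str:
    """Stand-in for the emotion-decoding network on carotid pulse data."""
    return "frustrated"

def build_expansion_prompt(utt: DecodedUtterance, weather: str = "rainy") -> str:
    """Assemble an LLM prompt that expands the terse decoded fragment into a
    natural sentence, conditioned on emotion and context (time of day, weather)."""
    assert utt.emotion in EMOTIONS
    now = datetime.now().strftime("%H:%M")
    return (
        "Rewrite the user's fragment as one natural first-person sentence.\n"
        f"Fragment: {' '.join(utt.tokens)}\n"
        f"Emotion: {utt.emotion}\n"
        f"Context: time {now}, weather {weather}\n"
    )

if __name__ == "__main__":
    utt = DecodedUtterance(decode_tokens(None), decode_emotion(None))
    print(build_expansion_prompt(utt))
```

Conditioning the prompt on the emotion label and context is what lets the expanded sentence reflect the patient's state rather than only the literal words.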

System Training and Assessment

  • Participants: The study included 10 healthy subjects and 5 stroke patients with dysarthria.
  • Data Collection: A corpus of 47 Chinese words and 20 sentences, based on common daily communication needs, was used. Healthy subjects completed 100 repetitions per word and 50 per sentence, while patients completed 50 repetitions for both. Carotid pulse signals were recorded for the patient group.
  • Signal Processing: Silent speech signals were recorded at 10 kHz, downsampled to 1 kHz, and segmented into 144 ms tokens (see the preprocessing sketch after this list). A stress-isolation treatment was applied to prevent mechanical crosstalk between the silent-speech vibration and carotid pulse signals.
  • Performance:
    • The system analyzed speech signals at the token level, enabling near real-time continuous expression.
    • Achieved an average per-word accuracy of 96.3% and an emotion-recognition accuracy of 83.2%.
    • Recorded a 4.2% word error rate and a 2.9% sentence error rate under optimized synthesis conditions.
    • Patient satisfaction increased by 55% when using the sentence expansion mode.
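
To make those numbers concrete, here is a minimal NumPy/SciPy sketch of the preprocessing chain. Only the sampling rates (10 kHz recording, 1 kHz after downsampling) and the 144 ms token length come from the study; the function name and the choice to drop trailing samples that do not fill a whole token are assumptions.

```python
import numpy as np
from scipy.signal import decimate

FS_RAW = 10_000   # recording rate from the paper (10 kHz)
FS_PROC = 1_000   # rate after downsampling (1 kHz)
TOKEN_MS = 144    # token length from the paper

def preprocess(raw: np.ndarray) -> np.ndarray:
    """Downsample a raw throat-vibration trace and slice it into
    fixed-length tokens.

    Returns an array of shape (n_tokens, samples_per_token);
    144 ms at 1 kHz is 144 samples per token.
    """
    # Anti-aliased 10x decimation from 10 kHz to 1 kHz.
    x = decimate(raw, FS_RAW // FS_PROC, zero_phase=True)
    samples_per_token = TOKEN_MS * FS_PROC // 1000  # 144 samples
    n_tokens = len(x) // samples_per_token
    # Drop any trailing samples that do not fill a whole token.
    return x[: n_tokens * samples_per_token].reshape(n_tokens, samples_per_token)

if __name__ == "__main__":
    one_second = np.random.randn(FS_RAW)  # stand-in for a 1 s sensor trace
    tokens = preprocess(one_second)
    print(tokens.shape)  # (6, 144): six full tokens, remainder dropped
```

Token-level segmentation like this is what allows classification to run on short windows as they arrive, which underpins the near real-time continuous expression noted above.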

Conclusion

The IT system offers a comprehensive solution for dysarthria patients, aiming for more natural communication through token-based decoding, emotion recognition, and user-selectable intelligent sentence expansion. The system has shown potential to reduce social isolation and support rehabilitation by decreasing the physical and cognitive effort required for communication. Future research will involve larger patient cohorts and broader vocabularies.