Poster in Workshop: Workshop on AI for Children: Healthcare, Psychology, Education
Evaluating Speech To Text For Children With Speech Impairment
Ifeoluwa Adeyeye-Kayode · Anthony Soronnadi · Ife Adebara · Olubayo Adekanmbi
Keywords: [ Speech defects ] [ Speech ]
Speech-to-text (STT) technology has advanced significantly, yet it remains inadequate for children with speech impairments such as dysarthria, dysphonia, and stuttering, with neurological disorders such as cerebral palsy, or with hearing loss–induced articulation disorders. This study evaluates the performance of three widely used STT models (Whisper, wav2vec 2.0, and LibriSpeech-based systems) on these speech conditions using standard speech recognition metrics: Word Error Rate (WER), Match Error Rate (MER), Word Information Lost (WIL), and Word Information Preserved (WIP). Results indicate that all models exhibit substantial transcription errors when processing disordered pediatric speech. Whisper showed moderate success but frequently misclassified disordered speech as noise. wav2vec 2.0 demonstrated improved adaptability but struggled with irregular rhythm and prosody. LibriSpeech-based models, trained on fluent adult speech, performed the worst, rendering pediatric speech nearly unintelligible. These findings underscore the systemic exclusion of speech impairments from mainstream STT development. Future models must incorporate diverse pediatric datasets, improve adaptability to disordered speech, and prioritize accessibility in their design.
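The four metrics named in the abstract can all be derived from a single word-level alignment between a reference transcript and a model's output. The abstract does not say how the authors computed them, so the following is only a minimal sketch under the standard definitions: with H hits, S substitutions, D deletions, and I insertions, WER = (S+D+I)/(H+S+D), MER = (S+D+I)/(H+S+D+I), WIP = (H/N_ref)(H/N_hyp), and WIL = 1 - WIP. The function names `align_counts` and `error_metrics` and the sample sentences are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    hits: int = 0
    subs: int = 0
    dels: int = 0
    ins: int = 0


def align_counts(reference: str, hypothesis: str) -> Counts:
    """Count hits/substitutions/deletions/insertions from one
    minimum-edit-distance word alignment (ties broken in a fixed order)."""
    ref, hyp = reference.split(), hypothesis.split()
    n, m = len(ref), len(hyp)
    # dist[i][j] = word-level edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # hit or substitution
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
            )
    # Walk back from the bottom-right corner to recover the operations.
    counts = Counts()
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                if cost == 0:
                    counts.hits += 1
                else:
                    counts.subs += 1
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts.dels += 1
            i -= 1
        else:
            counts.ins += 1
            j -= 1
    return counts


def error_metrics(reference: str, hypothesis: str) -> dict:
    """Return WER, MER, WIL, and WIP for one reference/hypothesis pair."""
    c = align_counts(reference, hypothesis)
    n_ref = c.hits + c.subs + c.dels
    n_hyp = c.hits + c.subs + c.ins
    if n_ref == 0:
        raise ValueError("reference transcript must contain at least one word")
    errors = c.subs + c.dels + c.ins
    wip = (c.hits / n_ref) * (c.hits / n_hyp) if n_hyp else 0.0
    return {
        "wer": errors / n_ref,
        "mer": errors / (c.hits + errors),
        "wip": wip,
        "wil": 1.0 - wip,
    }


if __name__ == "__main__":
    # Hypothetical sentences, not data from the study.
    print(error_metrics("the cat sat on the mat", "the cat sit on mat"))
```

WER can exceed 1 when a model inserts many spurious words, whereas MER, WIL, and WIP stay within [0, 1], which is one reason to report them together. Established libraries such as `jiwer` provide the same measures and would normally be preferable to a hand-rolled alignment.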