Integration of Metamodel and Acoustic Model for Dysarthric Speech Recognition
|
Title | Integration of Metamodel and Acoustic Model for Dysarthric Speech Recognition |
Authors | |
Abstract | We investigated the speech recognition of a person with articulation disorders resulting from athetoid cerebral palsy. The articulation of the first words spoken tends to be unstable due to the strain placed on the speech-related muscles, and this causes degradation of speech recognition. Therefore, we proposed a robust feature extraction method based on PCA (Principal Component Analysis) instead of MFCC, where the main stable utterance element is projected onto low-order features and fluctuation elements of speech style are projected onto high-order features. Therefore, the PCA-based filter will be able to extract stable utterance features only. The fluctuation of speaking style may invoke phone fluctuations, such as substitutions, deletions and insertions. In this paper, we discuss our effort to integrate a Metamodel and an Acoustic model approach. Metamodels have a technique for incorporating a model of a speaker’s confusion matrix into the ASR process in such a way as to increase recognition accuracy. The integration of metamodels and acoustic models enables fluctuation suppression not only in feature extraction but also in recognition. The proposed method resulted in an improvement of 9.9% (from 79.1% to 89%) in the recognition rate compared to the conventional method. |
Publisher | ACADEMY PUBLISHER |
Date | 2009-08-01 |
Source | Journal of Multimedia Vol 4, No 4 (2009) |
Rights | Copyright © ACADEMY PUBLISHER - All Rights Reserved.To request permission, please check out URL: http://www.academypublisher.com/copyrightpermission.html. |