SCS Undergraduate Thesis Topics
|Vinay Vemuri||Alan W Black||Reconstructing Dysarthric Speech from Cross-Speaker Articulatory Position Data using Speech Synthesis and Voice Conversion Techniques|
Dysarthria is a motor speech disorder that results from serious injury to a major component of the human speech system. Traditional speech synthesis techniques have often proven insufficient for constructing clear synthetic speech from source recordings of speakers affected by dysarthria. In this project, we propose an alternative approach to constructing a synthetic voice for a dysarthric speaker, BF (a tongue cancer patient whose tongue was surgically removed during cancer treatment), with the goal of producing synthetic speech that both sounds clear and preserves distinctive acoustic features of BF's original voice. Our approach centers on constructing an "artificial tongue" for BF and using it, along with information about the positions of BF's other major articulators for any given sentence, to build a voice. Since no information about the positions of BF's articulators is available, we use recordings and corresponding articulator position data (APD) from another speaker, referred to as MSAK, to construct an articulatory speech synthesizer that predicts APD from acoustics. Using the articulatory speech synthesizer and post-surgery recordings of BF, we determine the positions of all of BF's articulators except the tongue. Next, we synthesize recordings of MSAK speaking the same sentences as BF and run the articulatory speech synthesizer on these newly synthesized MSAK recordings to determine the positions of MSAK's articulators for BF's sentences. We then use the predicted position of MSAK's tongue as an approximation of BF's tongue (the artificial tongue) and combine it with the predicted positions of BF's other articulators to construct a voice for BF.
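The final combination step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the channel names, their ordering, and the function `merge_artificial_tongue` are hypothetical, and the sketch assumes the two predicted APD sequences have already been time-aligned frame for frame (the abstract does not say how alignment is done).

```python
from typing import List, Sequence

# Hypothetical channel layout for one APD frame. The names and ordering
# are illustrative assumptions, not taken from the thesis.
CHANNELS = [
    "tongue_tip_x", "tongue_tip_y",
    "tongue_body_x", "tongue_body_y",
    "lower_lip_x", "lower_lip_y",
    "jaw_x", "jaw_y",
]
# Indices of the channels that describe the tongue.
TONGUE_IDX = [i for i, name in enumerate(CHANNELS) if name.startswith("tongue")]

Frame = List[float]


def merge_artificial_tongue(
    bf_apd: Sequence[Frame], msak_apd: Sequence[Frame]
) -> List[Frame]:
    """Keep BF's predicted non-tongue channels and substitute MSAK's
    predicted tongue channels (the "artificial tongue").

    Assumes both sequences are time-aligned to the same number of frames.
    """
    if len(bf_apd) != len(msak_apd):
        raise ValueError("APD sequences must have the same number of frames")
    merged: List[Frame] = []
    for bf_frame, msak_frame in zip(bf_apd, msak_apd):
        frame = list(bf_frame)  # copy so the inputs are not modified
        for i in TONGUE_IDX:
            frame[i] = msak_frame[i]
        merged.append(frame)
    return merged
```

In this sketch, the merged frames would then serve as the articulatory input from which BF's voice is built; that synthesis stage is outside the scope of the example.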