Robert Mac Auslan, Joel Mac Auslan, and Linda J. Ferrier-Reid of Phonologics chart the evolution of intelligibility assessments for non-native speakers of English. With the increased use of global ...
Bipolar Disorder, Digital Phenotyping, Multimodal Learning, Face/Voice/Phone, Mood Classification, Relapse Prediction, t-SNE, Ablation. Share and Cite: de Filippis, R. and Al Foysal, A. (2025) ...
Abstract: Are end-to-end text-to-speech (TTS) models over-parametrized? To what extent can these models be pruned, and what happens to their synthesis capabilities? This work serves as a starting ...
Abstract: Expressive text-to-speech (TTS) has become a hot research topic recently, mainly focusing on modeling prosody in speech. Prosody modeling has several challenges: 1) the extracted pitch used ...
IndexTTS2 has evolved from a traditional text-to-speech system into an intelligent, self-learning voice synthesis platform. With comprehensive AI enhancements, it provides unprecedented audio quality, ...
This library can be used to resynthesize pitch in natural speech using pitch contours taken from other speech samples, generated pitch contours, or through algorithmic manipulations of the source ...