IRCAM 2021 – ViaDialog – David Guennec
Title of the talk:
Towards helpful, customer-specific Text-To-Speech synthesis
Abstract of the talk:
Automatic speech synthesis began to reach the general public in the 1990s. All of us have dealt with the automated answering-machine voices that, at first, were painful to listen to. Today, however, advances in both language understanding and the acoustic quality of speech synthesis have enabled giant leaps forward, and new voice services are rapidly improving in quality and capability, with voices that are ever more human and expressive.
In this presentation, we will briefly review recent advances in speech synthesis. We will then address the customization of synthetic voices to meet client needs at several levels: first, the main components of spoken expression, such as language, speaking style, register, and gender; then, the utterance itself, where the issues are primarily prosodic (manipulation of pitch and speaking rate). Finally, we will conclude by discussing secondary elements to consider in order to best meet the needs of clients and end users of synthetic voices in our constantly evolving world.
Information about the speaker:
Name: David GUENNEC
Mini bio: A computer scientist with a passion for the history of sound reproduction, David Guennec specializes in new voice technologies. After a doctorate focused on speech synthesis, he moved towards building voice assistants that integrate the entire voice processing chain, from speech recognition to synthesis by way of natural language understanding. Currently employed at ViaDialog, he focuses mainly on speech synthesis and recognition.