The artificial production of human speech is known as speech synthesis. This machine learning-based technique is used in text-to-speech, music generation, speech generation, speech-enabled devices, navigation systems, and accessibility tools for visually impaired people.
In this article, we’ll look at research papers and model architectures that use deep learning to do just that.
But before we jump in, there are a couple of specific, traditional strategies for speech synthesis that we need to briefly outline: concatenative and parametric.