Good open source models
PyPI TTS library: includes a number of notable models
XTTS: from Coqui (now defunct), available in the TTS library
Tortoise TTS: inspired by DALL-E, pairing an autoregressive decoder with a diffusion decoder
https://github.com/neonbjb/tortoise-tts
Tortoise-TTS Fully Explained | Part 1 | Architecture Design - YouTube
[D] What is the best open source text to speech model? : r/MachineLearning
Bibliography
WaveNet
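The layer structure looks like a convolution (this is not strictly necessary; it was a design choice)
It uses “dilated” connections, which skip over inputs at a stride that grows with depth, so each output can look further back in the sequence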
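To make the "look further back" point concrete, here is a minimal sketch of a dilated causal convolution in plain Python. It is an illustration only, not WaveNet's actual implementation: the function names `causal_dilated_conv1d` and `receptive_field` are hypothetical, the filters are fixed to ones, and the gated activations and residual/skip connections of the real model are left out. It assumes a kernel size of 2 and dilations that double at each layer.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Inputs before t = 0 are treated as zero, so y[t] never depends
    on any x[s] with s > t (that is what makes it causal).
    """
    y = []
    for t in range(len(x)):
        total = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                total += w * x[idx]
        y.append(total)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence a single output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Assumed setup: four layers, kernel size 2, dilation doubling per layer.
dilations = [1, 2, 4, 8]

# Unit impulse at t = 0: wherever the output is nonzero, the input at
# t = 0 was "visible" to that output position.
signal = [0.0] * 16
signal[0] = 1.0

out = signal
for d in dilations:
    out = causal_dilated_conv1d(out, [1.0, 1.0], d)

print(receptive_field(2, dilations))  # 16
print(out)                            # sixteen 1.0 values
```

With only four layers, the last output position still sees the impulse 15 steps in the past. Without dilation (all dilations equal to 1), the same four layers would cover only 5 samples, which is why doubling the dilation is an inexpensive way to widen the context.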
Architecture
Building makemore Part 5: Building a WaveNet - YouTube
L2 Autoregressive Models -- CS294-158 SP24 Deep Unsupervised Learning -- UC Berkeley Spring 2024 - YouTube