- Model the acoustic parameters of speech
- Synthesize audio from those parameters
- Hidden Markov Model (HMM) - precursor to Tacotron
HMM-based TTS pipeline
- Text
- Phonemes
- Duration Model (HMM)
- Acoustic Model (HMM)
- Vocoder - Parametric
- Waveform
graph TD
    A[Text] -->|G2P| B[Phonemes]
    B -->|State durations| C[Duration Model - HMM]
    C -->|Timed states| D[Acoustic Model - HMM]
    D -->|Spectral features| E[Vocoder]
    E -->|Audio signal| F[Waveform]
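
The stages above can be sketched as a chain of small functions. This is only a toy illustration, not how HTS or any real system is implemented: the lexicon, the per-phoneme durations, and the F0 values are made-up numbers standing in for what trained HMMs would provide, all function names are hypothetical, and the "vocoder" is a plain sine generator that leaves unvoiced frames silent.

```python
import math

SAMPLE_RATE = 16000        # Hz (assumed)
SAMPLES_PER_FRAME = 80     # 5 ms frame shift at 16 kHz (assumed)

# Hypothetical toy lexicon standing in for a G2P step.
LEXICON = {"hi": ["HH", "AY"]}

# Hypothetical model parameters. In an HMM system these would be the
# means of trained state-duration and emission distributions.
MEAN_DURATION_FRAMES = {"HH": 3, "AY": 6}
MEAN_F0_HZ = {"HH": 0.0, "AY": 120.0}   # 0.0 marks an unvoiced phoneme


def text_to_phonemes(text):
    """Text -> Phonemes (dictionary lookup as a stand-in for G2P)."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(LEXICON[word])
    return phonemes


def predict_durations(phonemes):
    """Duration model: attach a frame count to each phoneme."""
    return [(p, MEAN_DURATION_FRAMES[p]) for p in phonemes]


def predict_acoustics(timed_phonemes):
    """Acoustic model: emit one parameter (here only F0) per frame."""
    f0_frames = []
    for phoneme, n_frames in timed_phonemes:
        f0_frames.extend([MEAN_F0_HZ[phoneme]] * n_frames)
    return f0_frames


def vocode(f0_frames):
    """Parametric vocoder stand-in: turn per-frame F0 into samples."""
    samples = []
    phase = 0.0
    for f0 in f0_frames:
        for _ in range(SAMPLES_PER_FRAME):
            if f0 > 0.0:
                phase += 2.0 * math.pi * f0 / SAMPLE_RATE
                samples.append(0.5 * math.sin(phase))
            else:
                samples.append(0.0)   # unvoiced frame: silence in this toy
    return samples


def synthesize(text):
    """Run the full Text -> Waveform chain."""
    phonemes = text_to_phonemes(text)
    timed = predict_durations(phonemes)
    f0_frames = predict_acoustics(timed)
    return vocode(f0_frames)


waveform = synthesize("hi")
```

Each function corresponds to one arrow in the diagram, which is the main point: the acoustic model never sees raw text, and the vocoder never sees phonemes, only per-frame parameters.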
Example
- HTS (HMM-based Speech Synthesis System)