Text to Speech

TTS

1. Kokoro

2. Zonos

Supports Mandarin (cmn) and Cantonese (yue)

3. Reviews

A growing number of high-quality neural TTS models are now available:


source: AIPrintify

Notes:

  • Spark-TTS license changed to non-commercial

Language Code

Many TTS engines use language codes based on espeak-ng, such as cmn (Mandarin) and yue (Cantonese).
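
As a minimal sketch of how such codes might be looked up before a TTS engine is called, the helper below maps a language name to an espeak-ng style code. The function name `to_language_code` and the table `ESPEAK_NG_CODES` are hypothetical and cover only a handful of languages; the authoritative list for a given installation comes from running `espeak-ng --voices`.

```python
# Hypothetical lookup table: a small subset of espeak-ng style language codes.
# Check `espeak-ng --voices` for the full list on your system.
ESPEAK_NG_CODES = {
    "english (us)": "en-us",
    "english (uk)": "en-gb",
    "mandarin": "cmn",
    "cantonese": "yue",
}


def to_language_code(name: str) -> str:
    """Return the language code for a human-readable language name."""
    key = name.strip().lower()
    try:
        return ESPEAK_NG_CODES[key]
    except KeyError:
        # Fail loudly instead of silently falling back to a default language.
        raise ValueError(f"no language code known for {name!r}") from None


if __name__ == "__main__":
    print(to_language_code("Cantonese"))  # prints: yue
```

Raising an error for unknown names, rather than defaulting to English, keeps text from being sent to a voice that cannot pronounce it.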

Cantonese

Alternatives

Many open-source text-to-speech engines are available:

  1. GitHub - suno-ai/bark: 🔊 Text-Prompted Generative Audio Model
  2. GitHub - metavoiceio/metavoice-src: Foundational model for human-like, expressive TTS
  3. GitHub - coqui-ai/TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
  4. https://github.com/myshell-ai/MeloTTS
  5. GitHub - jishengpeng/ControlSpeech: [ACL 2025 Main] ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec
  6. GitHub - fishaudio/fish-speech: SOTA Open Source TTS
  7. GitHub - jaywalnut310/vits: VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
  8. GitHub - 2noise/ChatTTS: A generative speech model for daily dialogue.
  9. GitHub - huggingface/parler-tts: Inference and training library for high-quality TTS models.
  10. GitHub - yl4579/StyleTTS2: StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
  11. https://github.com/jasonppy/VoiceCraft
  12. GitHub - neonbjb/tortoise-tts: A multi-voice TTS system trained with an emphasis on quality
  13. balacoon/tts · Hugging Face
  14. GitHub - snakers4/silero-models: Silero Models: pre-trained text-to-speech models made embarrassingly simple
  15. https://community.openconversational.ai/
  16. https://marytts.github.io/
  17. GitHub - ihuguet/picotts: Pico TTS: text to speech voice synthesizer from SVox, included in Android AOSP

VITS

The default TTS engine Piper is based on VITS:

https://medium.com/@vansh_/using-vits-for-text-to-speech-tts-with-code-6e4e2c25e57d

Cantonese

StyleTTS2

Demo

Server

Piper
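
Piper is typically run as a command-line program that reads text from standard input and writes a WAV file. The sketch below wraps that call from Python, assuming a `piper` executable is on the PATH and a voice model has already been downloaded. The helper names `piper_command` and `synthesize`, and the file names `voice.onnx` and `out.wav`, are placeholders rather than anything defined by these notes.

```python
import shutil
import subprocess


def piper_command(model_path: str, output_path: str) -> list[str]:
    """Build the argument list for a Piper CLI call (text is passed on stdin)."""
    return ["piper", "--model", model_path, "--output_file", output_path]


def synthesize(text: str, model_path: str, output_path: str) -> bool:
    """Run Piper if it is installed; return False when the executable is missing."""
    if shutil.which("piper") is None:
        return False
    subprocess.run(
        piper_command(model_path, output_path),
        input=text.encode("utf-8"),  # Piper reads the text to speak from stdin
        check=True,                  # raise if Piper exits with an error
    )
    return True


if __name__ == "__main__":
    # Placeholder paths: replace with a real downloaded voice model.
    ok = synthesize("Hello from Piper.", "voice.onnx", "out.wav")
    print("synthesized" if ok else "piper not found on PATH")
```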

Under Windows - Team AI in Windows