We offer three high-quality models with different strengths:
Kokoro (Recommended)
- Faster processing and smaller model size
- Better voice quality and naturalness
- Multiple English accents (US and British)
- Support for casual speaking styles
- Better handling of numbers and special characters
- Optimized for browser-based usage
OuteTTS
- Multilingual support (English, Chinese, Japanese, Korean)
- Multiple voice profiles available
- Adjustable voice characteristics
- Efficient model size (500M parameters)
- Good balance of speed and quality
- Browser and Node.js compatible
SpeechT5
- More diverse speaker accents
- Good for formal content
- Broader language support
- Based on the T5 architecture
- Larger model size with longer processing time
For most use cases, we recommend starting with Kokoro, which combines faster processing and a smaller model size with more natural-sounding speech.
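
The guidance above can be sketched as a small selection helper. This is only an illustration: `TtsModel`, `OUTETTS_LANGUAGES`, and `chooseModel` are hypothetical names, not part of any library API, and the language codes are assumed to be lowercase two-letter tags. The fallback to SpeechT5 for other languages relies on the "broader language support" bullet and is an assumption rather than a verified capability.

```typescript
// Hypothetical identifiers for the three models described above.
type TtsModel = "kokoro" | "outetts" | "speecht5";

// Languages the list above attributes to OuteTTS
// (English, Chinese, Japanese, Korean), as assumed two-letter codes.
const OUTETTS_LANGUAGES: string[] = ["en", "zh", "ja", "ko"];

// Pick a model from a language code, following the recommendations above.
function chooseModel(language: string): TtsModel {
  const lang = language.trim().toLowerCase();

  // Kokoro is the recommended default and covers US and British English.
  if (lang === "en") {
    return "kokoro";
  }

  // OuteTTS is listed as supporting Chinese, Japanese, and Korean.
  if (OUTETTS_LANGUAGES.indexOf(lang) >= 0) {
    return "outetts";
  }

  // Assumption: fall back to SpeechT5, listed as having broader language support.
  return "speecht5";
}
```

For example, `chooseModel("ja")` selects OuteTTS, while `chooseModel("en")` keeps the recommended Kokoro default.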