zen3-tts

High-Quality Text-to-Speech

An 82M-parameter text-to-speech model delivering natural prosody and expressive speech across 40+ voices and 8 languages. Suited to voice assistants, audiobook generation, accessibility tools, and interactive voice applications.

Specifications

Property	Value
Model ID	`zen3-tts`
Parameters	82M
Architecture	TTS
Voices	40+
Languages	8
Output	Audio (MP3, WAV, FLAC, OPUS)
Tier	pro max
Status	Available
Deployment	API only

Capabilities

Natural prosody with human-like intonation
40+ built-in voice presets across styles and genders
Support for 8 languages with native-quality output
Adjustable speaking rate and pitch
Streaming audio output for real-time playback
Architecture compatible with voice cloning

API Usage

curl https://api.hanzo.ai/v1/audio/speech \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen3-tts",
    "input": "Welcome to Zen AI. How can I help you today?",
    "voice": "nova",
    "response_format": "mp3"
  }' \
  --output speech.mp3

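import os

from hanzoai import Hanzo

# Read the key from the environment, matching $HANZO_API_KEY in the curl example.
client = Hanzo(api_key=os.environ["HANZO_API_KEY"])

# Request MP3 audio for the input text using the "nova" voice preset.
response = client.audio.speech.create(
    model="zen3-tts",
    input="Welcome to Zen AI. How can I help you today?",
    voice="nova",
    response_format="mp3",
)

# Write the returned audio to disk.
response.stream_to_file("speech.mp3")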
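
If the SDK is not available, the same request can be assembled with the Python standard library. The following is a minimal sketch, not an official client: the helper name `build_speech_request` and the constants `SUPPORTED_FORMATS` and `SPEECH_URL` are hypothetical names introduced here, and the client-side format check simply mirrors the output formats listed in the specifications (MP3, WAV, FLAC, OPUS). It builds the request without sending it.

```python
import json
import urllib.request

# Formats listed under "Output" in the specifications table.
SUPPORTED_FORMATS = {"mp3", "wav", "flac", "opus"}
# Endpoint taken from the curl example above.
SPEECH_URL = "https://api.hanzo.ai/v1/audio/speech"


def build_speech_request(text, voice="nova", response_format="mp3",
                         api_key="hk-your-api-key"):
    """Return an unsent urllib Request mirroring the curl example.

    Hypothetical helper: the validation here is a local convenience,
    not a statement about how the server validates input.
    """
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not text.strip():
        raise ValueError("input text must not be empty")

    body = {
        "model": "zen3-tts",
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Build (but do not send) a request for WAV output.
request = build_speech_request("Welcome to Zen AI.", response_format="WAV")
payload = json.loads(request.data)
```

To send the request, pass it to `urllib.request.urlopen(request)` and write the bytes from `.read()` to a file whose extension matches `payload["response_format"]`.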

Try It

Open in Hanzo Chat

Resources

Audio API -- Endpoint documentation
Technical Report
