zen3-tts-fast

Low-latency text-to-speech for real-time voice agents and interactive applications.

Low-Latency Voice Synthesis

An 82M-parameter text-to-speech model built for real-time voice agents and interactive applications. It streams audio as it is generated, keeping first-byte latency low enough for fluid conversational exchanges.

Specifications

Property      Value
Model ID      zen3-tts-fast
Parameters    82M
Architecture  TTS
Output        Audio (MP3, WAV, OPUS)
Tier          pro
Status        Available
Deployment    API only

Capabilities

  • Low first-byte latency for conversational responsiveness
  • Streaming audio output for real-time playback
  • Voice agent and chatbot integration
  • Interactive voice response (IVR) systems
  • Real-time narration and announcements
  • High-throughput TTS for cost-efficient scale
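The first-byte latency and streaming claims above can be checked empirically on the client side. The following is a minimal sketch, assuming the audio arrives as an iterator of byte chunks (the shape most HTTP clients expose for streamed bodies). The names `consume_stream` and `fake_stream` are hypothetical helpers for illustration and are not part of any Hanzo SDK.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def consume_stream(
    chunks: Iterable[bytes],
    on_chunk: Callable[[bytes], None],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[Optional[float], int]:
    """Pass each audio chunk to on_chunk as soon as it arrives.

    Returns (seconds until the first non-empty chunk, total bytes received).
    The latency is None if the stream produced no data at all.
    """
    start = clock()
    first_chunk_latency = None
    total_bytes = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty keep-alive chunks
        if first_chunk_latency is None:
            first_chunk_latency = clock() - start
        total_bytes += len(chunk)
        on_chunk(chunk)  # e.g. write to an audio device or buffer
    return first_chunk_latency, total_bytes


def fake_stream():
    """Hypothetical stand-in for a streamed HTTP response body."""
    yield b""
    yield b"OggS"
    yield b"\x00" * 12


if __name__ == "__main__":
    playback_buffer = bytearray()
    latency, size = consume_stream(fake_stream(), playback_buffer.extend)
    print(f"first chunk after {latency:.6f}s, {size} bytes total")
```

In a real integration, `chunks` would be the streamed response body and `on_chunk` would feed an audio player, so playback can begin before synthesis finishes.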

API Usage

curl https://api.hanzo.ai/v1/audio/speech \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "zen3-tts-fast",
    "input": "Your order has been confirmed. It will arrive in 3 to 5 business days.",
    "voice": "alloy",
    "response_format": "opus"
  }' \
  --output response.opus

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Streaming for real-time playback
with client.audio.speech.with_streaming_response.create(
    model="zen3-tts-fast",
    input="Your order has been confirmed.",
    voice="alloy",
    response_format="opus",
) as response:
    response.stream_to_file("response.opus")
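Where installing the SDK is not an option, the same request can be sent with the Python standard library alone. This is a sketch under stated assumptions: the endpoint, headers, and JSON fields are copied from the curl example above, while `build_speech_request` and `synthesize_to_file` are hypothetical helper names, and error handling (non-200 responses, timeouts) is left out.

```python
import json
import urllib.request

# Endpoint taken from the curl example; not verified beyond that.
API_URL = "https://api.hanzo.ai/v1/audio/speech"


def build_speech_request(text, api_key, voice="alloy", response_format="opus"):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": "zen3-tts-fast",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, path, api_key, chunk_size=4096):
    """Send the request and write the audio to disk chunk by chunk."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break  # end of the response body
            out.write(chunk)
```

A call such as `synthesize_to_file("Your order has been confirmed.", "response.opus", os.environ["HANZO_API_KEY"])` would then reproduce the curl example, reading the key from the same environment variable.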

See Also

  • zen3-tts -- High-quality TTS with 40+ voices
  • zen3-tts-hd -- Broadcast-quality audio production
  • zen3-asr -- Real-time streaming speech recognition