zen-dub-live
Real-time voice synthesis with ultra-low latency for live applications.
Real-Time Voice
zen-dub-live is a real-time voice synthesis model designed for ultra-low-latency live applications. It provides streaming text-to-speech (TTS) with sub-200 ms first-byte latency for conversational AI, live dubbing, and interactive voice experiences.
This model is coming soon. Join the waitlist at hanzo.chat.
Specifications
| Property | Value |
|---|---|
| Model ID | zen-dub-live |
| Architecture | Streaming Speech Transformer |
| Latency | < 200ms first-byte |
| Languages | 50+ |
| Sample Rate | 24 kHz |
| Status | Coming Soon |
| HuggingFace | -- |
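To make the sample rate and latency figures concrete, the sketch below converts a duration into a raw audio buffer size. The table states only the 24 kHz sample rate; the 16-bit mono PCM format, and the helper name `pcm_bytes`, are assumptions for illustration and are not confirmed for zen-dub-live.

```python
def pcm_bytes(
    duration_ms: int,
    sample_rate: int = 24_000,  # from the Specifications table
    sample_width: int = 2,      # ASSUMPTION: 16-bit samples (2 bytes)
    channels: int = 1,          # ASSUMPTION: mono output
) -> int:
    """Return the size in bytes of raw PCM audio lasting duration_ms."""
    # Integer arithmetic avoids floating-point rounding in the sample count.
    samples = sample_rate * duration_ms // 1000
    return samples * sample_width * channels


if __name__ == "__main__":
    # Audio covering the 200 ms latency window, and one full second.
    print(pcm_bytes(200))
    print(pcm_bytes(1000))
```

Under these assumptions, a player that buffers less than one latency window of audio may underrun if chunks arrive late, so the buffer size is worth tuning against the real output format once it is published.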
Capabilities
- Ultra-low latency streaming TTS
- Real-time conversational voice
- Live dubbing and translation
- Emotion-adaptive speech
- Voice activity detection
- Interruption handling for natural dialogue
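Interruption handling also has a client-side half: playback should stop when the user starts speaking. The following is a minimal sketch of that idea, not the model's own mechanism. The function `play_until_interrupted`, the `write` callback, and the `interrupted` event are hypothetical names; in a real application the event might be set by a voice activity detector.

```python
import threading
from typing import Callable, Iterable


def play_until_interrupted(
    chunks: Iterable[bytes],
    write: Callable[[bytes], None],
    interrupted: threading.Event,
) -> int:
    """Write audio chunks to a sink until the stream ends or the event is set.

    Returns the number of chunks actually written.
    """
    written = 0
    for chunk in chunks:
        # Check before each write so playback stops at a chunk boundary.
        if interrupted.is_set():
            break
        write(chunk)
        written += 1
    return written


if __name__ == "__main__":
    stop = threading.Event()
    played = []

    def sink(chunk: bytes) -> None:
        # Hypothetical sink: collect chunks, then simulate the user
        # barging in after the second chunk has been played.
        played.append(chunk)
        if len(played) == 2:
            stop.set()

    count = play_until_interrupted([b"a", b"b", b"c", b"d"], sink, stop)
    print(count, played)
```

Checking the event once per chunk keeps the loop simple; the smaller the chunks, the faster playback reacts to an interruption.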
Usage
from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Coming soon -- streaming API
# audio_player is a placeholder for your application's audio output sink.
with client.audio.speech.create(
    model="zen-dub-live",
    input="This is streamed in real time with minimal latency.",
    voice="nova",
    stream=True,
) as stream:
    for chunk in stream:
        audio_player.write(chunk)
See Also
- zen-dub -- Batch voice synthesis
- zen-live -- Real-time bidirectional translation
- zen-scribe -- Speech-to-text transcription