zen-dub-live

Real-time voice synthesis with ultra-low latency for live applications.

Real-Time Voice

zen-dub-live is a real-time voice synthesis model designed for low-latency live applications. It provides streaming text-to-speech (TTS) with a first-byte latency under 200 ms, targeting conversational AI, live dubbing, and interactive voice experiences.

This model is coming soon. Join the waitlist at hanzo.chat.

Specifications

Property       Value
Model ID       zen-dub-live
Architecture   Streaming Speech Transformer
Latency        < 200 ms first-byte
Languages      50+
Sample Rate    24 kHz
Status         Coming Soon
HuggingFace    --
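
The 24 kHz sample rate makes it possible to relate streamed byte counts to playback time, which helps when sizing buffers against the 200 ms latency target. The sketch below assumes the stream carries raw 16-bit mono PCM, which this page does not confirm; `pcm_duration_ms` and `bytes_for_ms` are hypothetical helper names, not part of any SDK.

```python
SAMPLE_RATE_HZ = 24_000   # from the specifications table
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM
CHANNELS = 1              # assumption: mono


def pcm_duration_ms(num_bytes: int) -> float:
    """Playback time in milliseconds for a raw PCM buffer of num_bytes."""
    frames = num_bytes / (BYTES_PER_SAMPLE * CHANNELS)
    return frames / SAMPLE_RATE_HZ * 1000.0


def bytes_for_ms(duration_ms: float) -> int:
    """Raw PCM bytes needed to hold duration_ms of audio."""
    frames = int(SAMPLE_RATE_HZ * duration_ms / 1000.0)
    return frames * BYTES_PER_SAMPLE * CHANNELS
```

Under these assumptions, 200 ms of audio occupies 9600 bytes, so a playback buffer smaller than that holds less audio than the first-byte latency budget. If the real output is compressed or uses a different sample width, the constants need to change accordingly.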

Capabilities

  • Ultra-low latency streaming TTS
  • Real-time conversational voice
  • Live dubbing and translation
  • Emotion-adaptive speech
  • Voice activity detection
  • Interruption handling for natural dialogue
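
Interruption handling also has a client-side half: when the user starts talking, playback of already-streamed audio should stop promptly. The following is a minimal sketch of that idea, assuming the audio arrives as an iterable of byte chunks and that some other component (for example a voice activity detector) sets a `threading.Event` when the user interrupts. The function name `play_until_interrupted` is hypothetical and does not describe how the model itself implements interruption.

```python
import threading
from typing import Callable, Iterable


def play_until_interrupted(
    chunks: Iterable[bytes],
    write: Callable[[bytes], None],
    interrupted: threading.Event,
) -> int:
    """Forward chunks to write() until `interrupted` is set.

    Returns the number of bytes that were written before stopping.
    """
    written = 0
    for chunk in chunks:
        # Check before every write so playback stops at a chunk boundary.
        if interrupted.is_set():
            break
        write(chunk)
        written += len(chunk)
    return written
```

Smaller chunks give a faster reaction to the interruption at the cost of more loop iterations; the right chunk size depends on the audio sink being used.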

Usage

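from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Coming soon -- streaming API (the interface shown here may change before release)
# A binary file stands in for the audio sink; replace it with a real
# playback device for live use. The output container format is not documented yet.
with open("speech_output.bin", "wb") as audio_player:
    with client.audio.speech.create(
        model="zen-dub-live",
        input="This is streamed in real time with minimal latency.",
        voice="nova",
        stream=True,
    ) as stream:
        # Write each chunk as soon as it arrives instead of buffering the full clip.
        for chunk in stream:
            audio_player.write(chunk)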
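
To check the sub-200 ms first-byte figure from the client side, it can help to time how long the first chunk takes to arrive. The sketch below works on any iterable of byte chunks and uses a stand-in generator, `fake_stream`, instead of a real API call, since the streaming API is not yet available; both function names are hypothetical.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def measure_first_chunk(chunks: Iterable[bytes]) -> Tuple[Optional[float], int]:
    """Consume a chunk iterator.

    Returns (seconds until the first chunk arrived, total bytes received).
    The first value is None if the iterator produced no chunks.
    """
    start = time.perf_counter()
    first_chunk_s: Optional[float] = None
    total = 0
    for chunk in chunks:
        if first_chunk_s is None:
            first_chunk_s = time.perf_counter() - start
        total += len(chunk)
    return first_chunk_s, total


def fake_stream(delay_s: float = 0.05, n: int = 3) -> Iterator[bytes]:
    """Stand-in for a real stream: waits delay_s, then yields n dummy chunks."""
    time.sleep(delay_s)
    for _ in range(n):
        yield b"\x00" * 480


if __name__ == "__main__":
    latency_s, total_bytes = measure_first_chunk(fake_stream())
    print(f"first chunk after {latency_s * 1000:.1f} ms, {total_bytes} bytes total")
```

A measurement taken this way starts when iteration begins, so it does not include request setup unless the timer is started before the request is issued. It does include network round-trip time, so it will generally be higher than a server-side latency figure.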

See Also

  • zen-dub -- Batch voice synthesis
  • zen-live -- Real-time bidirectional translation
  • zen-scribe -- Speech-to-text transcription