zen3-asr

Real-Time Streaming ASR

zen3-asr is a real-time streaming automatic speech recognition (ASR) model for live transcription and voice agents. It delivers word-by-word output over a WebSocket connection with sub-300ms latency, enabling responsive voice-driven applications.

Specifications

Property	Value
Model ID	`zen3-asr`
Architecture	Streaming ASR
Tier	pro max
Status	Available
Deployment	API only (WebSocket)

Capabilities

Real-time word-by-word streaming transcription
Sub-300ms latency for voice agent responsiveness
Live captioning and transcription
Voice-driven UI interaction
Speaker change detection
Interim and final result streaming
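
Interim and final results are typically combined on the client into one running transcript. The following is a minimal sketch of that bookkeeping, not part of any official SDK. It assumes each message is a JSON object with the `transcript` and `is_final` fields used in the API Usage example, and that a new interim result replaces the previous one until a final result commits the segment. The `TranscriptBuffer` class is a hypothetical helper.

```python
import json


class TranscriptBuffer:
    """Keeps committed final segments plus the latest interim hypothesis."""

    def __init__(self):
        self.final_segments = []
        self.interim = ""

    def update(self, raw_message):
        # Field names "transcript" and "is_final" follow the API Usage example;
        # the rest of the message schema is not assumed here.
        result = json.loads(raw_message)
        text = result.get("transcript", "")
        if result.get("is_final"):
            # A final result commits the segment and clears the interim text.
            self.final_segments.append(text)
            self.interim = ""
        else:
            # Assumption: each interim result supersedes the previous one.
            self.interim = text
        return self.text()

    def text(self):
        # Committed segments first, then the current interim text, if any.
        parts = self.final_segments + ([self.interim] if self.interim else [])
        return " ".join(parts)
```

A caption display could call `update()` on every incoming message and redraw with the returned string, so interim words appear immediately and are overwritten once the segment is finalized.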

API Usage

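import asyncio
import json
import os

import websockets

# Read the API key from the environment rather than hard-coding it.
HANZO_API_KEY = os.environ["HANZO_API_KEY"]

async def stream_transcription(audio_source):
    """Stream audio chunks to zen3-asr and print transcripts as they arrive."""
    uri = "wss://api.hanzo.ai/v1/audio/stream"
    headers = {"Authorization": f"Bearer {HANZO_API_KEY}"}

    # websockets >= 14 takes additional_headers; older releases use extra_headers.
    async with websockets.connect(uri, additional_headers=headers) as ws:
        # The first message configures the session.
        await ws.send(json.dumps({
            "model": "zen3-asr",
            "language": "en",
            "interim_results": True,
        }))

        async def send_audio():
            # audio_source is an async iterator of binary audio chunks.
            async for audio_chunk in audio_source:
                await ws.send(audio_chunk)
            # Signal that no more audio will follow.
            await ws.send(json.dumps({"type": "end"}))

        # Send in the background so results print while audio is still streaming.
        sender = asyncio.create_task(send_audio())

        async for message in ws:
            result = json.loads(message)
            if result.get("is_final"):
                print(f"[FINAL] {result['transcript']}")
            else:
                print(f"[INTERIM] {result['transcript']}", end="\r")

        await sender

# audio_source is a placeholder: supply an async iterator of audio chunks.
asyncio.run(stream_transcription(audio_source))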
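
The example above leaves `audio_source` undefined. One way to supply it is an async generator that reads a file in fixed-size chunks and paces them like live audio. This is a sketch under stated assumptions: the function name `file_audio_source`, the chunk size, and the pacing are illustrative, and the audio encoding the endpoint accepts is not stated on this page. The default of 3200 bytes corresponds to 100 ms of 16 kHz, 16-bit mono PCM, which is an assumed format.

```python
import asyncio


async def file_audio_source(path, chunk_bytes=3200, pace_seconds=0.1):
    """Yield fixed-size binary chunks from a file, paced like live audio."""
    # Assumption: 3200 bytes = 100 ms of 16 kHz, 16-bit mono PCM.
    # The blocking read is acceptable for a sketch with small chunks.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            yield chunk
            # Pause so chunks are sent at roughly real-time speed.
            await asyncio.sleep(pace_seconds)
```

With this in place, the call becomes `asyncio.run(stream_transcription(file_audio_source("speech.raw")))`, where `speech.raw` is a hypothetical file name.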

Try It

Open in Hanzo Chat

Resources

Audio API -- Endpoint documentation
Technical Report
