zen3-audio-fast
Fastest speech-to-text transcription for high-throughput workloads.
High-Throughput Transcription
An 809M-parameter speech-to-text model optimized for throughput. It delivers fast, accurate transcription for high-volume workloads where processing speed and cost efficiency are the primary requirements.
Specifications
| Property | Value |
|---|---|
| Model ID | zen3-audio-fast |
| Parameters | 809M |
| Architecture | ASR |
| Languages | 100+ |
| Input | Audio (WAV, MP3, FLAC, M4A, OGG) |
| Tier | pro |
| Status | Available |
| Deployment | API only |
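Because the model accepts only the audio formats listed in the Input row, a pipeline may want to filter files before uploading them. The following is a minimal sketch, assuming that a file's extension reflects its actual format; the helper names `is_supported_audio` and `partition_by_support` are hypothetical and not part of any Hanzo SDK, and the API itself may apply different or stricter checks.

```python
from pathlib import Path

# Extensions taken from the Input row of the specifications table.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def is_supported_audio(path: str) -> bool:
    """Return True if the file extension matches a listed input format.

    This is an extension check only (an assumption of this sketch);
    it does not inspect the file contents.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def partition_by_support(paths):
    """Split paths into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for p in paths:
        (supported if is_supported_audio(p) else unsupported).append(p)
    return supported, unsupported
```

Calling `partition_by_support` on a directory listing lets a job skip or convert unsupported files up front rather than spending a request on each one.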
Capabilities
- High-throughput batch audio transcription
- Faster-than-realtime processing
- Multi-language support (100+)
- Timestamp generation (word and segment level)
- Punctuation and capitalization restoration
- Cost-optimized for large-scale transcription pipelines
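"Faster-than-realtime" is commonly quantified with the real-time factor (RTF): processing time divided by audio duration, where a value below 1.0 means the audio was transcribed faster than it would take to play. The sketch below shows how a pipeline might compute it from its own measurements; the function names are illustrative, and no figures here are benchmarks of zen3-audio-fast.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 indicate faster-than-realtime processing.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_faster_than_realtime(processing_seconds: float, audio_seconds: float) -> bool:
    """Return True when the measured RTF is strictly below 1.0."""
    return real_time_factor(processing_seconds, audio_seconds) < 1.0
```

For example, a hypothetical 60-second clip that takes 30 seconds to process has an RTF of 0.5. Measuring wall-clock time around each request includes upload and network latency, so the result describes end-to-end throughput rather than model speed alone.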
API Usage
curl https://api.hanzo.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -F model=zen3-audio-fast \
  -F file=@interview.mp3 \
  -F language=en \
  -F response_format=json

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Open the audio in binary mode and send it for transcription.
with open("interview.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="zen3-audio-fast",
        file=audio_file,
        language="en",
        response_format="json",
    )

print(response.text)
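For high-volume workloads, single-file requests like the one above can be fanned out across a small thread pool. The following is a rough sketch rather than an official pattern: `transcribe_batch` and `transcribe_one` are hypothetical names, the default of four workers is arbitrary, and it assumes the client can safely be used from multiple threads and that your account's rate limits permit concurrent requests.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_batch(paths, transcribe_one, max_workers=4):
    """Transcribe many files concurrently.

    `transcribe_one` is any callable that takes a file path and returns
    the transcript text. Returns a dict mapping each path to either
    ("ok", text) or ("error", message), so one failed file does not
    abort the whole batch.
    """
    def safe(path):
        try:
            return path, ("ok", transcribe_one(path))
        except Exception as exc:  # record the failure and keep going
            return path, ("error", str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe, paths))


# Possible wiring to the client shown above (thread safety is assumed):
#
# def transcribe_one(path):
#     with open(path, "rb") as audio_file:
#         response = client.audio.transcriptions.create(
#             model="zen3-audio-fast",
#             file=audio_file,
#         )
#     return response.text
#
# results = transcribe_batch(["a.mp3", "b.mp3"], transcribe_one)
```

Passing the transcription call in as a function keeps the batching logic independent of the SDK, which makes it easy to test with a stub and to add retries or rate limiting later.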
Resources
- Audio API -- Endpoint documentation
- Technical Report
See Also
- zen3-audio -- Best quality transcription (1.5B)
- zen3-asr -- Real-time streaming speech recognition
- Pricing -- Full pricing table