
zen-video

Video understanding model for frame analysis, captioning, and temporal reasoning.

Video Understanding

zen-video is a video understanding model that analyzes video content frame by frame. It answers questions about a video, generates descriptions, detects actions, and reasons about temporal sequences.

This model is coming soon. Join the waitlist at hanzo.chat.

Specifications

Property        Value
Model ID        zen-video
Architecture    Multimodal Transformer
Input           Video (up to 10 minutes)
Status          Coming Soon
HuggingFace     --
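
The 10-minute input limit listed above can be checked on the client before a request is sent. The following is a minimal sketch, assuming the limit means a duration of at most 600 seconds and that the caller already knows the video's length. `MAX_VIDEO_SECONDS` and `fits_duration_limit` are hypothetical names used for illustration and are not part of the Hanzo SDK.

```python
# Assumption: "up to 10 minutes" is interpreted as a duration of <= 600 seconds.
MAX_VIDEO_SECONDS = 10 * 60


def fits_duration_limit(duration_seconds: float) -> bool:
    """Return True if a video of this length is within the assumed limit."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds <= MAX_VIDEO_SECONDS
```

A longer video would need to be trimmed or split before submission. How the service itself handles over-length input is not stated here.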

Capabilities

  • Video question answering
  • Scene description and captioning
  • Action detection and classification
  • Temporal reasoning across frames
  • Key moment extraction
  • Content moderation for video

Usage

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

# Coming soon: zen-video is not yet available, so this request is illustrative.
response = client.chat.completions.create(
    model="zen-video",
    messages=[{
        "role": "user",
        "content": [
            {"type": "video_url", "video_url": {"url": "https://example.com/video.mp4"}},
            {"type": "text", "text": "Summarize what happens in this video."},
        ],
    }],
)
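
The message payload in the request above pairs a `video_url` part with a `text` part. As a sketch of how that payload could be assembled and validated before calling the API, the helper below builds the same structure from a URL and a prompt. `build_video_message` is a hypothetical helper, not part of the `hanzoai` package, and the validation rules (http/https URLs only, non-empty prompt) are assumptions.

```python
from urllib.parse import urlparse


def build_video_message(video_url: str, prompt: str) -> dict:
    """Build a user message with one video part and one text part.

    Mirrors the message shape shown in the usage example. The checks
    below are assumptions, not documented API requirements.
    """
    parsed = urlparse(video_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("video_url must be an absolute http(s) URL")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "role": "user",
        "content": [
            {"type": "video_url", "video_url": {"url": video_url}},
            {"type": "text", "text": prompt},
        ],
    }


message = build_video_message(
    "https://example.com/video.mp4",
    "Summarize what happens in this video.",
)
```

The resulting `message` can be passed as the single element of `messages` in the request, which keeps the payload shape in one place if several prompts are run against the same video.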
