⚡ Zen LM
Models

zen-reranker

568M dense cross-encoder model for search result reranking.

zen-reranker

Search Reranking

A 568M dense cross-encoder model that reranks search results for improved relevance. Takes query-document pairs and produces relevance scores, dramatically improving retrieval quality in RAG pipelines and search systems.

Specifications

PropertyValue
Model IDzen-reranker
Parameters568M
ArchitectureDense Cross-Encoder
Context Window8K tokens
StatusAvailable
HuggingFacezenlm/zen-reranker

Capabilities

  • Cross-encoder search result reranking
  • Query-document relevance scoring
  • RAG pipeline retrieval improvement
  • Multi-stage search refinement
  • Passage and document-level ranking
  • Lightweight enough for high-throughput production

Usage

HuggingFace

pip install transformers torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-reranker")
model = AutoModelForSequenceClassification.from_pretrained("zenlm/zen-reranker")

query = "What is retrieval augmented generation?"
documents = [
    "RAG combines retrieval with language model generation for grounded responses.",
    "The weather in Tokyo is sunny today.",
    "Vector databases store embeddings for semantic search.",
]

pairs = [[query, doc] for doc in documents]
inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    scores = model(**inputs).logits.squeeze()

ranked = sorted(zip(documents, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"[{score:.3f}] {doc}")

API

from hanzoai import Hanzo

client = Hanzo(api_key="hk-your-api-key")

response = client.rerank.create(
    model="zen-reranker",
    query="What is retrieval augmented generation?",
    documents=[
        "RAG combines retrieval with language model generation.",
        "The weather in Tokyo is sunny today.",
        "Vector databases store embeddings for semantic search.",
    ],
)

for result in response.results:
    print(f"[{result.relevance_score:.3f}] {result.document.text}")

See Also

On this page