Research

Proof of AI: Verifiable Machine Learning on Chain

When an AI system makes a prediction, how do you know it actually ran the model it claims? In centralized systems, you trust the operator. Decentralized AI needs cryptographic proof. Today we introduce Proof of AI (PoAI), a framework for verifiable machine learning inference.

The Trust Problem

Consider a decentralized AI service:

1. User submits input and payment
2. Compute provider runs inference
3. Provider returns output
4. User receives result

What prevents the provider from:...
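To make the trust gap concrete, here is a minimal hash-commitment sketch (all names are illustrative, not the PoAI protocol): the provider commits to a (model, input, output) triple, and the user can later check the reveal. Note what this does *not* do: a hash commitment binds the provider to an answer but does not prove the claimed model was actually run, which is exactly the gap a verifiable-inference framework targets.

```python
import hashlib

def commit(model_id: str, input_data: str, output: str, nonce: str) -> str:
    """Hash-commit to a (model, input, output) triple with a nonce."""
    payload = f"{model_id}|{input_data}|{output}|{nonce}".encode()
    return hashlib.sha256(payload).hexdigest()

def verify(commitment: str, model_id: str, input_data: str,
           output: str, nonce: str) -> bool:
    """Check a revealed triple against an earlier commitment."""
    return commit(model_id, input_data, output, nonce) == commitment

# Provider publishes the commitment up front, reveals the triple later.
c = commit("model-v1", "hello", "prediction-42", "nonce-abc")
assert verify(c, "model-v1", "hello", "prediction-42", "nonce-abc")
# A provider that swaps in a different model is caught at reveal time.
assert not verify(c, "model-v2", "hello", "prediction-42", "nonce-abc")
```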

June 26, 2023 · 3 min · 614 words · Zach Kelling

Zen Reranker: Two-Stage Retrieval Done Right

Embedding-based retrieval is fast but imprecise. Cross-encoder reranking is precise but slow. The combination unlocks the best of both. Today we release the Zen Reranker, purpose-built for two-stage retrieval.

Two-Stage Retrieval

Modern retrieval pipelines typically operate in two stages:

Query -> [Embedding Retrieval] -> Top-K Candidates -> [Reranker] -> Final Results
         (fast, approximate)                          (slow, precise)

Stage 1: Bi-encoder embeddings enable fast approximate search over millions of documents. Retrieve top-100 to top-1000 candidates....
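The two-stage pipeline can be sketched with toy stand-ins for both models: bag-of-words dot products for the stage-1 bi-encoder and Jaccard token overlap for the stage-2 cross-encoder. All names here are illustrative, not the Zen Reranker API; the point is the shape of the pipeline, not the scoring functions.

```python
from collections import Counter

DOCS = [
    "the cat sat on the mat",
    "dogs chase cats in the yard",
    "stock prices rose sharply today",
    "a cat chased a mouse",
]

def embed(text: str) -> Counter:
    # Toy stand-in for a bi-encoder embedding: bag-of-words term counts.
    return Counter(text.split())

def dot(q: Counter, d: Counter) -> int:
    return sum(q[t] * d[t] for t in q)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Stage 1: fast, approximate -- cheap score over the whole corpus.
    q = embed(query)
    return sorted(docs, key=lambda d: dot(q, embed(d)), reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    # Stage 2: slow, precise -- stand-in for a cross-encoder that sees
    # query and document together (here: Jaccard overlap of token sets).
    qt = set(query.split())
    def score(d: str) -> float:
        dt = set(d.split())
        return len(qt & dt) / len(qt | dt)
    return sorted(candidates, key=score, reverse=True)

print(rerank("cat on the mat", retrieve("cat on the mat", DOCS)))
```

In a real pipeline, stage 1 would be an approximate nearest-neighbor search over precomputed embeddings and stage 2 a learned cross-encoder scoring each (query, candidate) pair.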

March 13, 2023 · 3 min · 573 words · Zach Kelling

7680-Dimensional Embeddings: More Dimensions, Better Retrieval

Embedding dimensions have standardized around a handful of sizes: 768, 1536, occasionally 4096. We asked a simple question: what happens if we go bigger? The answer surprised us.

Background: Why Dimensions Matter

Text embeddings map variable-length sequences to fixed-dimensional vectors. These vectors enable semantic similarity search, clustering, and retrieval. The dimension count determines the vector space's capacity.

Lower dimensions mean:

- Smaller storage requirements
- Faster similarity computations
- Potential information loss

Higher dimensions mean:...
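The storage and compute tradeoff described above can be quantified with back-of-the-envelope arithmetic. This sketch assumes float32 vectors and brute-force dot-product search; `index_cost` is a hypothetical helper, not anything from the post.

```python
def index_cost(num_docs: int, dim: int, bytes_per_float: int = 4):
    """Raw storage and per-query cost for a dense float vector index.

    Assumes brute-force search: one multiply and one add per dimension
    per document (illustrative arithmetic only).
    """
    storage_gb = num_docs * dim * bytes_per_float / 1e9
    flops_per_query = 2 * num_docs * dim
    return storage_gb, flops_per_query

# Cost of indexing one million documents at common dimensions.
for dim in (768, 1536, 4096, 7680):
    gb, flops = index_cost(1_000_000, dim)
    print(f"d={dim}: {gb:.2f} GB storage, {flops / 1e9:.1f} GFLOPs/query")
```

At one million documents, going from 768 to 7680 dimensions scales both storage and brute-force query cost by exactly 10x, which is why wider embeddings usually demand approximate search or quantization.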

December 5, 2022 · 3 min · 507 words · Zach Kelling

Embedding Spaces at 7680 Dimensions

The Dimension Question

How many dimensions does a text embedding need? The field has settled on conventions: 768 for BERT-scale models, 1536 for OpenAI's ada-002, 4096 for some recent models. But these choices reflect architectural constraints, not fundamental requirements. We investigate what happens when we scale embedding dimensions to 7680, ten times the BERT baseline.

Why Higher Dimensions?

Capacity Arguments

A $d$-dimensional embedding space can represent $\mathcal{O}(e^d)$ nearly-orthogonal vectors. For semantic search, we want documents with different meanings to map to different regions....
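The capacity argument rests on a standard fact: random high-dimensional vectors are nearly orthogonal, with typical $|\cos|$ shrinking like $1/\sqrt{d}$. That is easy to check empirically; this is a quick sketch using random Gaussian vectors, with an illustrative function name.

```python
import math
import random

def mean_abs_cosine(dim: int, trials: int = 20, seed: int = 0) -> float:
    """Average |cosine similarity| between random Gaussian vector pairs.

    As dim grows, random vectors concentrate near orthogonality
    (|cos| on the order of 1/sqrt(dim)), so a wider space has room
    for many more mutually distinguishable directions.
    """
    rng = random.Random(seed)

    def vec() -> list[float]:
        return [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def cos(u: list[float], v: list[float]) -> float:
        num = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return num / norm

    return sum(abs(cos(vec(), vec())) for _ in range(trials)) / trials

for d in (64, 768, 7680):
    print(f"d={d}: mean |cos| = {mean_abs_cosine(d):.3f}")
```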

December 5, 2022 · 4 min · 647 words · Zach Kelling

GRPO: Group Relative Policy Optimization

Reinforcement learning from human feedback (RLHF) has become central to aligning language models with human preferences. But current methods like PPO are sample-inefficient and unstable. Today we introduce Group Relative Policy Optimization (GRPO), a new approach that addresses these limitations.

The RLHF Challenge

Standard RLHF follows three steps:

1. Train a reward model on human preference data
2. Use the reward model to provide training signal
3. Optimize the policy with reinforcement learning (typically PPO)

Step 3 is problematic....
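One way to read the name: instead of a learned value baseline, each sampled completion's reward is normalized against the other completions in its group. A minimal sketch of that group-relative baseline, not necessarily the post's exact formulation, and with a hypothetical helper name:

```python
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Advantages for a group of completions sampled from one prompt.

    Sketch of the group-relative idea: score each completion with the
    reward model, then use the group's own mean and std as the baseline
    rather than training a separate value network.
    """
    mu = mean(rewards)
    sigma = stdev(rewards) or 1.0  # guard against zero-variance groups
    return [(r - mu) / sigma for r in rewards]

# Four completions for the same prompt, scored by the reward model.
print(group_relative_advantages([1.0, 2.0, 3.0, 4.0]))
```

The advantages sum to zero by construction, so above-average completions in a group are reinforced and below-average ones suppressed without any critic.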

September 19, 2022 · 3 min · 522 words · Zach Kelling