Meridian
Retrieval at fleet scale.

A retrieval pipeline that holds up against twelve million documents and a tribunal-grade audit.

Meridian · RAG · Search

✺ — The problem

Meridian's first-gen RAG was answering legal questions with the right vibe and the wrong citations. Their largest customer threatened to walk after a partner relied on a fabricated paragraph during a deposition. The team needed retrieval that was correct, attributable, and fast — not impressive.

Sector

Legal tech

Year

2025

Duration

9 weeks

Team

1 Principal · 2 Engineers

Stack

Python · pgvector · OpenSearch · Cohere · Modal · Datadog

✺ — Approach

The same arc as every engagement — tuned to this problem.

01

Define · What 'correct' means

We co-wrote the evaluation rubric with two senior associates. Forty test queries with gold-standard passages. Anything not grounded in a real citation was scored zero, no partial credit.
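A minimal sketch of what an all-or-nothing rubric like this could look like in code. The names here (`GoldQuery`, `score_answer`, `citation_accuracy`) and the use of passage IDs are illustrative assumptions, not the rubric Meridian's associates actually signed off on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldQuery:
    """One rubric entry (hypothetical shape): a query plus the passage IDs
    that reviewers accept as valid sources for it."""
    query: str
    gold_passage_ids: frozenset[str]


def score_answer(cited_ids: list[str], gold: GoldQuery) -> int:
    """Return 1 only if the answer cites something and every citation is a
    gold passage. One bad citation zeroes the answer: no partial credit."""
    if not cited_ids:
        return 0  # an uncited answer counts as ungrounded
    return int(all(cid in gold.gold_passage_ids for cid in cited_ids))


def citation_accuracy(results: dict[str, list[str]], rubric: list[GoldQuery]) -> float:
    """Fraction of rubric queries whose answer scored 1.
    A query with no recorded answer scores 0."""
    total = sum(score_answer(results.get(g.query, []), g) for g in rubric)
    return total / len(rubric)
```

The design choice that matters is the `all(...)`: an answer with two real citations and one invented one scores the same as an answer with none.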

02

Build · Hybrid retrieval, reranked

BM25, dense embeddings, and structured filters, with the fused candidates reranked by a cross-encoder. Citations carried through every layer, so the model could not write a sentence without a source it could point to.
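The shape of that pipeline can be sketched in a few lines. This is an illustration under stated assumptions: the write-up does not say how the lexical and dense rankings were fused, so reciprocal rank fusion is swapped in as a common choice, and `rerank_score` is a hypothetical stand-in for the cross-encoder. The passage dictionaries and the `keep` filter are likewise invented for the example.

```python
from typing import Callable

# Hypothetical passage shape: {"id": str, "text": str, "jurisdiction": str}
Passage = dict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per ID.
    Ties are broken by ID so the output is deterministic."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            scores[pid] = scores.get(pid, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


def hybrid_retrieve(
    query: str,
    bm25_ids: list[str],
    dense_ids: list[str],
    corpus: dict[str, Passage],
    keep: Callable[[Passage], bool],
    rerank_score: Callable[[str, str], float],
    top_n: int = 3,
) -> list[Passage]:
    """Fuse the two rankings, apply the structured filter, then rerank.
    Whole passages (including their IDs) are returned, so every downstream
    sentence can be tied back to a source."""
    fused = reciprocal_rank_fusion([bm25_ids, dense_ids])
    candidates = [corpus[pid] for pid in fused if pid in corpus and keep(corpus[pid])]
    # Stable sort: equal rerank scores keep their fused order.
    candidates.sort(key=lambda p: rerank_score(query, p["text"]), reverse=True)
    return candidates[:top_n]
```

In a real deployment `rerank_score` would call a cross-encoder model. A toy word-overlap scorer is enough to exercise the control flow.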

03

Operate · Drift evals in CI

The evaluation suite runs on every PR. Silent regressions in retrieval recall now block merges instead of reaching customers two months later.
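A gate of this kind reduces to a metric and a threshold. The sketch below assumes recall@k as the metric and a stored baseline with a small tolerance. The function names, the tolerance value, and the baseline mechanism are illustrative assumptions, not Meridian's CI configuration.

```python
def recall_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """Share of gold passages that appear in the top-k retrieved IDs."""
    if not gold:
        raise ValueError("gold set must not be empty")
    return len(set(retrieved[:k]) & gold) / len(gold)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved, gold) pairs, one pair per test query."""
    return sum(recall_at_k(retrieved, gold, k) for retrieved, gold in runs) / len(runs)


def gate(current: float, baseline: float, tolerance: float = 0.01) -> bool:
    """True when recall has not dropped more than `tolerance` below the baseline.
    The tolerance is an assumed allowance for run-to-run noise."""
    return current >= baseline - tolerance
```

A CI job would compute `mean_recall_at_k` over the test queries and exit non-zero when `gate(...)` returns `False`, which is what turns a quiet recall drop into a blocked merge.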

✺ — Outcome

Three numbers we’d defend in public.

94%

citation accuracy on the held-out audit set

320ms

median end-to-end retrieval latency

0

fabricated citations in 6 months of production

They didn't try to sell us a vector database. They asked us what 'correct' meant in our world, wrote a test suite around it, and then built the smallest thing that passed.

CTO, Meridian