Transcription Benchmark
An open-source tool for benchmarking transcription APIs against meeting audio. The same methodology behind every accuracy claim on this site, published for anyone to verify or run themselves.
Why meeting audio needs its own benchmark
Standard speech benchmarks (LibriSpeech, Common Voice) test clean audio: audiobooks, read sentences, podcast-quality recordings. Meetings are different. They have crosstalk, laptop microphones, screen-share audio bleeding through, engineers reading out variable names, and people switching languages mid-sentence.
If you're choosing a transcription API for meeting recordings, you need a benchmark that tests meeting conditions. Word error rate (WER) scores measured on clean, read speech don't transfer to meeting audio.
What it measures
Word error rate (WER). The share of words the service got wrong compared with a human-verified transcript. Lower is better. Normalized for punctuation and casing.
Diarization. Did it correctly identify who said what? Measured as time-weighted accuracy with automatic label mapping.
Latency. Wall-clock time from upload to transcript. Includes API overhead, processing, and any polling.
Cost. What you'll actually pay per hour of audio at current API pricing. Updated quarterly.
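To make the two accuracy metrics concrete, here is a minimal sketch of how they could be computed. It is not the benchmark's own code. The function names, the `(start, end, speaker)` segment tuples, and the brute-force label mapping are all assumptions made for illustration. It also assumes segments from the same source do not overlap in time.

```python
import re
from itertools import permutations


def normalize(text):
    """Lowercase and drop punctuation so only word choice affects the score."""
    return re.sub(r"[^\w\s']", " ", text.lower()).split()


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def diarization_accuracy(ref_segments, hyp_segments):
    """Fraction of reference speech time attributed to the right speaker.

    Segments are (start_s, end_s, speaker) tuples (an assumed format).
    Service labels are arbitrary, so every one-to-one mapping onto the
    reference speakers is tried and the best one is kept.
    """
    ref_speakers = sorted({spk for _, _, spk in ref_segments})
    hyp_labels = sorted({spk for _, _, spk in hyp_segments})
    total = sum(end - start for start, end, _ in ref_segments)
    if total <= 0:
        raise ValueError("reference contains no speech time")

    # Seconds shared by each (reference speaker, service label) pair.
    shared = {}
    for r_start, r_end, r_spk in ref_segments:
        for h_start, h_end, h_spk in hyp_segments:
            overlap = max(0.0, min(r_end, h_end) - max(r_start, h_start))
            shared[(r_spk, h_spk)] = shared.get((r_spk, h_spk), 0.0) + overlap

    # Pad with None so a mapping exists even if the service found fewer speakers.
    padded = hyp_labels + [None] * max(0, len(ref_speakers) - len(hyp_labels))
    best = 0.0
    for mapping in permutations(padded, len(ref_speakers)):
        matched = sum(shared.get(pair, 0.0) for pair in zip(ref_speakers, mapping))
        best = max(best, matched)
    return best / total
```

Trying every permutation is fine for a handful of speakers. For larger meetings, an assignment solver would be the more practical choice.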
Latest results
Sample output from the benchmark CLI. Results will vary by audio sample and API version.
| Service | WER | Diarization | Latency | Cost/hr |
|---|---|---|---|---|
| Deepgram Nova-2 | 4.2% | 91.3% | 1.2s | $0.22 |
| Rev AI | 4.9% | 90.1% | 5.1s | $0.28 |
| AssemblyAI | 5.1% | 88.7% | 4.8s | $0.30 |
| OpenAI Whisper | 6.8% | N/A | 3.4s | $0.36 |
The Whisper API does not support speaker diarization. Results are from the two-speaker standup sample (26 s of audio).
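The latency column is wall-clock time from upload to finished transcript, polling included. The sketch below shows one way such a measurement could be taken. `submit` and `fetch` are hypothetical callables standing in for a service's upload and status calls, not part of any real SDK, and the polling interval and timeout are arbitrary defaults.

```python
import time


def timed_transcription(submit, fetch, poll_interval=0.5, timeout=300.0):
    """Return (transcript, elapsed_seconds) for one transcription job.

    submit() uploads the audio and returns a job id (hypothetical).
    fetch(job_id) returns the transcript, or None while the job is pending
    (hypothetical).
    """
    start = time.perf_counter()  # monotonic clock, suited to measuring intervals
    job_id = submit()
    while True:
        transcript = fetch(job_id)
        if transcript is not None:
            return transcript, time.perf_counter() - start
        if time.perf_counter() - start > timeout:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)
```

Starting the clock before the upload keeps API overhead inside the measurement, and a coarser polling interval can inflate the reported figure by up to one interval.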
Run it yourself
Test samples
The benchmark ships with scripted meeting recordings paired with human-verified transcripts. Each sample is tagged with speaker count, noise level, and meeting type.
Engineering standup, 2 speakers, clean audio. Covers technical vocabulary, short turns.
Sales demo call, 3 speakers, mixed audio. Covers product terminology, compliance questions.
You can bring your own audio. Place a WAV file and a ground-truth transcript JSON in a directory and point the CLI at it.
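Before pointing the CLI at your own directory, a quick pre-flight check can catch a missing or unreadable file. The sketch below is not part of the tool. It assumes the WAV and JSON files share a base name (for example `standup.wav` and `standup.json`), which the benchmark may not require. It also makes no assumption about the transcript's JSON schema, so it only confirms that the file parses.

```python
import json
import wave
from pathlib import Path


def load_sample(directory):
    """Check that a directory holds one WAV file and a matching JSON transcript."""
    directory = Path(directory)
    wav_files = sorted(directory.glob("*.wav"))
    if len(wav_files) != 1:
        raise ValueError(f"expected exactly one WAV file, found {len(wav_files)}")
    wav_path = wav_files[0]

    # Assumed naming convention: same base name, .json extension.
    json_path = wav_path.with_suffix(".json")
    if not json_path.exists():
        raise FileNotFoundError(f"no ground-truth transcript at {json_path}")

    # Duration comes from the WAV header: frame count divided by sample rate.
    with wave.open(str(wav_path), "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()

    with json_path.open(encoding="utf-8") as handle:
        ground_truth = json.load(handle)

    return {"audio": wav_path, "duration_s": duration_s, "ground_truth": ground_truth}
```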
Supported services
Four adapters ship with v0.1. Adding a new one is a single Python file that normalizes API output into a common transcript format.
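As a rough idea of what such an adapter involves, the sketch below maps a made-up API response into a list of segments. Everything in it is hypothetical: the `Segment` fields, the `normalize_response` name, and the `utterances`, `start_ms`, `end_ms`, and `speaker` keys. None of it reflects the project's actual common format or any specific vendor's payload.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One stretch of speech in an assumed common transcript format."""
    start: float            # seconds from the start of the audio
    end: float              # seconds from the start of the audio
    speaker: Optional[str]  # None when the service returns no speaker labels
    text: str


def normalize_response(raw: dict) -> List[Segment]:
    """Convert a hypothetical vendor response into time-ordered Segments."""
    segments = []
    for utterance in raw.get("utterances", []):
        speaker = utterance.get("speaker")
        segments.append(
            Segment(
                start=utterance["start_ms"] / 1000,  # milliseconds to seconds
                end=utterance["end_ms"] / 1000,
                speaker=None if speaker is None else str(speaker),
                text=utterance["text"].strip(),
            )
        )
    # Sort by start time so scoring code can rely on the order.
    return sorted(segments, key=lambda segment: segment.start)
```

Keeping `speaker` optional lets a service without diarization use the same format and simply be skipped for that metric.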