Whisper Large V3 (best performance)

Access our most performant Whisper implementations for high-throughput production workloads.
Example usage
Transcribe audio files at up to a 1000x real-time factor: that's 1 hour of audio transcribed in under 4 seconds. This setup requires meaningful production traffic to be cost-effective, but at scale it's at least 80% cheaper than OpenAI.
Get in touch with us and we'll work with you to deploy a transcription pipeline that's customized to match your needs. For quick deployments of Whisper suitable for shorter audio files and lower traffic volume, you can deploy Whisper V3 and Whisper V3 Turbo directly from the model library.
For more details about the inference API, please refer to our documentation.
Input
import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "whisper_input": {
            "audio": {
                "url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"
            }
        }
    },
)

print(resp.json())
JSON output
{
  "segments": [
    {
      "start_time": 0.768,
      "end_time": 11.520000000000001,
      "text": "four score and seven years ago our fathers brought forth upon this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal",
      "log_prob": -1.7316513061523438,
      "word_timestamps": []
    }
  ],
  "language_code": "en",
  "language_prob": null
}
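For longer audio, the response contains multiple segments that you can stitch into a single transcript. A minimal sketch, assuming only the response shape shown above (a `segments` list with `text`, `start_time`, and `end_time` fields); the sample response dict here is hypothetical, standing in for `resp.json()`:

```python
# Hypothetical parsed response, mirroring the JSON output shown above
resp_json = {
    "segments": [
        {
            "start_time": 0.768,
            "end_time": 11.52,
            "text": "four score and seven years ago our fathers brought forth upon this continent a new nation",
            "log_prob": -1.73,
            "word_timestamps": [],
        }
    ],
    "language_code": "en",
    "language_prob": None,
}

# Join segment texts into one transcript string
transcript = " ".join(seg["text"].strip() for seg in resp_json["segments"])

# Seconds of audio covered by the transcribed segments
audio_seconds = max(seg["end_time"] for seg in resp_json["segments"])

print(f"[{resp_json['language_code']}] {audio_seconds:.1f}s: {transcript}")
```

In a real pipeline you would replace the sample dict with `resp.json()` from the request above.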