Whisper Large V3 (best performance)

Access our most performant Whisper implementations for high-throughput production workloads.

Transcribe audio files at up to a 1000x real-time factor, which works out to 1 hour of audio in under 4 seconds. This setup requires meaningful production traffic to be cost-effective, but at scale, it's at least 80% cheaper than OpenAI.
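
As a rough illustration of what a real-time factor means, the sketch below converts an audio duration and a real-time factor into processing time. The function name `processing_seconds` is hypothetical, and 1000x is the upper bound quoted above, not a guaranteed rate for every workload.

```python
def processing_seconds(audio_seconds: float, real_time_factor: float) -> float:
    """Return the wall-clock time needed to transcribe audio at a given real-time factor."""
    if real_time_factor <= 0:
        raise ValueError("real_time_factor must be positive")
    return audio_seconds / real_time_factor


one_hour = 60 * 60  # seconds of audio
print(processing_seconds(one_hour, 1000))  # 3.6
```

At lower real-time factors the same formula applies; only the divisor changes.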

Get in touch with us and we'll work with you to deploy a transcription pipeline that's customized to match your needs.

For quick deployments of Whisper suitable for shorter audio files and lower traffic volume, you can deploy Whisper V3 and Whisper V3 Turbo directly from the model library.

For more details about the inference API, please refer to our documentation.

Input

import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "whisper_input": {
            "audio": {
                "url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"
            }
        }
    },
)

print(resp.json())

JSON output

{
    "segments": [
        {
            "start_time": 0.768,
            "end_time": 11.520000000000001,
            "text": "four score and seven years ago our fathers brought forth upon this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal",
            "log_prob": -1.7316513061523438,
            "word_timestamps": []
        }
    ],
    "language_code": "en",
    "language_prob": null
}
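
One way to work with this response is to join the segment texts into a transcript and label each segment with its timestamps. The sketch below is a minimal example that assumes the response has the shape shown above. The helper names `format_segment` and `full_transcript` are hypothetical and not part of the inference API, and the sample text is abbreviated.

```python
# Sample response, abbreviated from the JSON output above.
response = {
    "segments": [
        {
            "start_time": 0.768,
            "end_time": 11.520000000000001,
            "text": "four score and seven years ago",
            "log_prob": -1.7316513061523438,
            "word_timestamps": [],
        }
    ],
    "language_code": "en",
    "language_prob": None,
}


def format_segment(segment: dict) -> str:
    """Render one segment as '[start - end] text', with times shown to 2 decimals."""
    start = segment["start_time"]
    end = segment["end_time"]
    return f"[{start:.2f}s - {end:.2f}s] {segment['text'].strip()}"


def full_transcript(resp: dict) -> str:
    """Join the text of all segments, tolerating a missing 'segments' key."""
    return " ".join(seg["text"].strip() for seg in resp.get("segments", []))


for seg in response["segments"]:
    print(format_segment(seg))
print(full_transcript(response))
```

In a real call, `response` would be the result of `resp.json()` from the request in the Input example.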

Transcription models

Whisper Large V3 (best performance): Transcription, V3, H100 MIG 40GB
Ultravox v0.6 70B: Transcription, v0.6, H100
Whisper Streaming Large v3: Transcription, H100 MIG 40GB

OpenAI models

Whisper Large V3 (best performance): Transcription, V3, H100 MIG 40GB
Whisper Streaming Large v3: Transcription, H100 MIG 40GB
WhisperX: Transcription, L4