AssemblyAI vs Subtitles King: cost, latency, and lines of code
If you are evaluating transcription APIs, AssemblyAI is probably on the shortlist. It is a serious product with a deep feature set. We are not going to pretend otherwise. This post is about where the two tools actually overlap, where they do not, and which one you should reach for based on the job in front of you.
The deep side-by-side lives at /vs/assemblyai. This post is the developer-eye summary.
What AssemblyAI is good at
AssemblyAI is the right call if your workload includes any of these:
- Real-time streaming transcription. Live captions, voice agents, real-time meeting tools. AssemblyAI ships a streaming endpoint with word-by-word output. Subtitles King does batch only.
- Speaker diarization. Multi-speaker recordings where you need to know who said what. AssemblyAI labels speakers; we do not, by design.
- Intelligence features. Sentiment, topic detection, content safety, PII redaction. AssemblyAI runs an LLM-style layer on top of the transcript. If you need that bundled, it is there.
- Enterprise compliance. SOC 2 Type II, BAA on request, dedicated capacity. The procurement story is mature.
If those are your requirements, the conversation is over. Use AssemblyAI.
What Subtitles King is good at
We optimized for one thing: subtitled video, end to end, callable by an agent or a script. The product fits a different shape:
- Free tier with no credit card. 100 MB videos, 24 hour retention, full pipeline. AssemblyAI gives you a free credit balance, not a free tier — when it runs out you put a card down.
- MCP support out of the box. A hosted MCP server at https://brains.subtitlesking.com/mcp and an open-source stdio binary. AssemblyAI does not ship an MCP server. You can wrap their REST API in one, but you are writing it.
- Self-hostable. The full pipeline (ffmpeg + Whisper + Go upload server + MCP shim) runs on a VPS you control. AssemblyAI is hosted only.
- Burned-in video out of the box. We hand back a video file with subtitles already rendered into the frame. AssemblyAI returns SRT, VTT, and JSON; you then run ffmpeg yourself to burn them in. That is not hard, but it is one more step in your pipeline.
The pipeline behind Subtitles King is exactly: upload → ffmpeg compress → OpenAI Whisper large → ffmpeg burn subtitles → return file. There is no proprietary model. We are honest about that — Whisper is the floor of quality, and for most subtitle work it is plenty.
Code comparison
Subtitling a video on AssemblyAI requires three steps you wire up yourself:
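import assemblyai as aai
aai.settings.api_key = "..."
# Step 1: transcribe the video from a public URL (blocks until the transcript is ready)
transcript = aai.Transcriber().transcribe("https://example.com/video.mp4")
# Step 2: export the transcript as SRT captions
with open("captions.srt", "w") as f:
    f.write(transcript.export_subtitles_srt())
# Step 3: now run ffmpeg yourself to burn captions.srt into video.mp4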
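For completeness, here is a minimal sketch of that last ffmpeg step. It assumes ffmpeg is on your PATH and was built with libass (which the `subtitles` filter needs), and that you have a local copy of the video. The function names and the output filename are ours, chosen for illustration only.

```python
import subprocess
from pathlib import Path


def build_burn_command(video: str, srt: str, output: str) -> list[str]:
    """Return the ffmpeg argv that renders `srt` into the frames of `video`."""
    return [
        "ffmpeg",
        "-y",                       # overwrite the output file if it exists
        "-i", video,
        "-vf", f"subtitles={srt}",  # draw the captions onto every frame
        "-c:a", "copy",             # leave audio untouched; video is re-encoded
        output,
    ]


def burn_subtitles(video: str, srt: str, output: str) -> Path:
    """Run ffmpeg and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_burn_command(video, srt, output), check=True)
    return Path(output)


if __name__ == "__main__":
    burn_subtitles("video.mp4", "captions.srt", "video_subtitled.mp4")
```

Paths containing colons, quotes, or backslashes need extra escaping inside the filter string, which this sketch does not handle.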
The same task on Subtitles King over the REST API:
curl -X POST https://brains.subtitlesking.com/api/jobs \
-H "Content-Type: application/json" \
-d '{"video_url":"https://example.com/video.mp4","language":"en"}'
# poll /api/jobs/:id until status == "done", then download the file
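The polling step in that last comment might look like the sketch below. Only the `/api/jobs/:id` path and the `"done"` status come from the example above. We assume here that the endpoint returns JSON with a top-level `status` field; failure states and the field carrying the download link are not modeled, so check the API docs before relying on this. `fetch_job` and `wait_for_job` are illustrative names.

```python
import json
import time
import urllib.request

BASE_URL = "https://brains.subtitlesking.com"


def fetch_job(job_id: str) -> dict:
    """GET /api/jobs/:id and decode the JSON body (response shape is assumed)."""
    with urllib.request.urlopen(f"{BASE_URL}/api/jobs/{job_id}") as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job, interval=5.0, timeout=600.0,
                 sleep=time.sleep) -> dict:
    """Poll until the job reports status == "done" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        if job.get("status") == "done":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not done after {timeout} seconds")
        sleep(interval)  # back off before asking again
```

`fetch` and `sleep` are parameters so the loop can be exercised without network access or real waiting.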
Or through MCP, with no code at all: you ask Claude to subtitle the video and the model handles the upload, status, and download tools itself. A walkthrough of that flow is on /mcp.
Rough pricing math
Always confirm current pricing on each vendor's site. As of writing:
- AssemblyAI: roughly $0.12 per hour of audio for the standard model, more for Universal-2 or with intelligence features turned on. No free tier; a small starting credit.
- Subtitles King free tier: videos up to 100 MB, 24 hour retention, unlimited within reason. No card required.
- Subtitles King paid: flat-rate plans rather than per-minute usage, details on /pricing.
- Self-hosted: your VPS bill plus the OpenAI Whisper API cost (or zero if you run Whisper locally on a GPU you already own).
If you are doing a few hours of video a month, AssemblyAI's pay-per-use math is fine. If you are doing tens of hours a month and only need captions on top of video, the flat plan or a self-hosted instance gets cheaper fast.
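To find where the crossover sits for your own volume, a back-of-the-envelope calculation is enough. The sketch below uses the rough $0.12 per hour figure quoted above. The flat price is a made-up placeholder, not one of our real plans, so substitute the number from /pricing.

```python
ASSEMBLYAI_RATE_USD_PER_HOUR = 0.12  # rough figure quoted above; confirm before relying on it
HYPOTHETICAL_FLAT_PRICE_USD = 9.00   # placeholder only, NOT a real plan price


def per_use_cost(hours: float, rate_per_hour: float) -> float:
    """Monthly cost under pay-per-use pricing."""
    return hours * rate_per_hour


def break_even_hours(flat_monthly_price: float, rate_per_hour: float) -> float:
    """Hours per month above which a flat plan costs less than pay-per-use."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return flat_monthly_price / rate_per_hour


if __name__ == "__main__":
    threshold = break_even_hours(HYPOTHETICAL_FLAT_PRICE_USD,
                                 ASSEMBLYAI_RATE_USD_PER_HOUR)
    print(f"Flat plan wins above about {threshold:.0f} hours per month")
```

This compares transcription spend only. The compute you pay for when running the ffmpeg burn-in step yourself would come on top of the pay-per-use side.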
Which one fits which job
A quick decision matrix, opinionated:
| Job | Pick |
|---|---|
| Live captions on a stream | AssemblyAI |
| Multi-speaker meeting transcripts | AssemblyAI |
| YouTube uploads, podcast clips, course videos | Subtitles King |
| Agentic workflow ("Claude, subtitle this") | Subtitles King |
| Privacy-sensitive video that cannot leave your network | Subtitles King self-host |
| Compliance-heavy enterprise with a procurement team | AssemblyAI |
| Side project or hobby app | Subtitles King free tier |
The honest pitch: AssemblyAI is a deep transcription platform; we are a focused subtitle pipeline with an MCP front door. If your problem is "give me captioned video, ideally callable by an agent, ideally with a self-host escape hatch," start with us. If your problem is anything broader, AssemblyAI is probably the right answer.
The full feature-by-feature comparison lives at /vs/assemblyai. To run a video through our pipeline right now, /try takes about thirty seconds.