Generate SRT and VTT subtitles locally on Windows in 96 languages. StarWhisper transcribes the audio track on your PC and gives you a subtitle file you can drop into Premiere, DaVinci, CapCut, or any editor. No upload, no per-minute pricing.
Local-only workflow, works with any editor.
To extract the audio track, use FFmpeg, which is free and the de facto Windows tool for the job. The one-liner below pulls audio out of any common video format and saves it as MP3:
ffmpeg -i input.mp4 -vn -acodec libmp3lame audio.mp3
If you do not want to use FFmpeg, you can also export an audio-only version directly from your video editor (Premiere, DaVinci, CapCut all support this).
Open StarWhisper and drop the extracted audio file onto the app window. The desktop app auto-detects the format, picks a Whisper model and backend that suit your hardware (CUDA on NVIDIA, Vulkan as a fallback, CPU as the baseline), and starts transcription. A 10-minute audio file finishes in 1 to 2 minutes on a modern GPU.
Use the Export menu to save as .srt (SubRip, the universal default) or .vtt (WebVTT, used by browsers and YouTube). Both formats include phrase-level timestamps that any video editor or video player can read. Keep both around if you are not sure which one your target tool wants.
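To make the difference between the two formats concrete, here is a minimal Python sketch that turns SRT text into basic WebVTT. StarWhisper exports both formats itself, so this is only an illustration; `srt_to_vtt` is a hypothetical helper name, and the sketch assumes a plain, well-formed SRT file with no styling tags.

```python
import re

# SRT timestamps look like 00:01:02,345 (comma before the milliseconds).
SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SubRip text to a basic WebVTT document (illustrative helper)."""
    # Drop a UTF-8 byte-order mark if present and normalize line endings.
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    lines = []
    for line in body.split("\n"):
        # Only timing lines contain "-->"; WebVTT wants a dot before the milliseconds.
        if "-->" in line:
            line = SRT_TIME.sub(r"\1.\2", line)
        lines.append(line)
    # A WebVTT file starts with the WEBVTT header followed by a blank line.
    return "WEBVTT\n\n" + "\n".join(lines) + "\n"


sample = "1\n00:00:01,000 --> 00:00:03,500\nHello, world.\n"
print(srt_to_vtt(sample))
```

The cue numbers from the SRT are kept as-is, since WebVTT allows an optional identifier line before each timing line.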
Drag the .srt onto the timeline in Premiere Pro, DaVinci Resolve, Final Cut Pro, CapCut, Kdenlive, Shotcut, or any editor that accepts subtitles. The editor places the lines on a captions track aligned with the timeline. Most editors let you adjust timing, font, size, color, and position from there.
Skim the timeline and fix any mis-heard proper nouns (Whisper is good at words, less good at brand names it has not seen). Adjust styling to match your video's look. Export the video either with subtitles burned in (good for TikTok, Reels, Shorts) or with the SRT as a sidecar file (good for YouTube, Vimeo, and accessibility).
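If the same names get mis-heard in every episode, the cleanup can be scripted before import. The sketch below is one possible approach, not a StarWhisper feature: `fix_terms` and the glossary entries are hypothetical, and it assumes a standard SRT layout (index line, timing line, text lines).

```python
import re


def fix_terms(srt_text: str, glossary: dict[str, str]) -> str:
    """Apply whole-word, case-insensitive replacements to subtitle text lines only."""
    fixed = []
    for line in srt_text.split("\n"):
        # Skip cue numbers and timing lines so timestamps are never altered.
        # Caveat: a text line made up only of digits is also skipped.
        is_index = line.strip().isdigit()
        is_timing = "-->" in line
        if not (is_index or is_timing):
            for wrong, right in glossary.items():
                pattern = r"\b" + re.escape(wrong) + r"\b"
                # A function replacement avoids backslash handling in `right`.
                line = re.sub(pattern, lambda m, r=right: r, line, flags=re.IGNORECASE)
        fixed.append(line)
    return "\n".join(fixed)


# Hypothetical glossary: mis-heard form -> correct spelling.
glossary = {"star whisper": "StarWhisper", "da vinci": "DaVinci"}
srt = "1\n00:00:01,000 --> 00:00:03,000\nI edit in Da Vinci and caption with Star Whisper.\n"
print(fix_terms(srt, glossary))
```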
What you actually get versus online subtitle generators and paid editor features.
Online generators upload your video to their servers. For client work under NDA, unreleased content, internal corporate videos, or anything sensitive, that is a non-starter. StarWhisper runs locally and the video never leaves your PC.
Both major subtitle formats are supported. SRT works in every editor and player on the planet. VTT works for HTML5 video and YouTube uploads. Pick the one that fits, or export both.
Whisper supports far more languages than Premiere's built-in transcription. If your content is in Spanish, French, German, Japanese, Korean, Hindi, Arabic, Turkish, Polish, Vietnamese, or any of dozens of others, the same workflow applies.
Online subtitle services charge by the minute. A 1-hour video can cost $5 to $10 per pass. StarWhisper has a free plan for short videos and a flat $10/month or $80/year Pro for unlimited use.
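As a rough worked example using only the figures quoted above ($5 to $10 per hour of video online, $10/month or $80/year for Pro), the break-even point can be computed as follows. The function name is illustrative, and real per-minute prices vary by service.

```python
def breakeven_hours(flat_monthly: float, cost_per_hour: float) -> float:
    """Hours of video per month at which a flat plan costs the same as per-hour pricing."""
    return flat_monthly / cost_per_hour


PRO_MONTHLY = 10.0   # flat monthly price quoted above
PRO_YEARLY = 80.0    # flat yearly price quoted above

for per_hour in (5.0, 10.0):
    monthly = breakeven_hours(PRO_MONTHLY, per_hour)
    yearly = breakeven_hours(PRO_YEARLY / 12, per_hour)
    print(
        f"${per_hour:.0f}/hour online: break-even at {monthly:.2f} h/month "
        f"on the monthly plan, {yearly:.2f} h/month on the yearly plan"
    )
```

Under these assumptions the flat plan pays for itself at roughly one to two hours of video per month.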
StarWhisper produces a standard subtitle file. From there you can use Premiere, DaVinci Resolve, Final Cut, CapCut, Kdenlive, Shotcut, OBS, or even free subtitle editors like Subtitle Edit and Aegisub.
On NVIDIA hardware with a CUDA pack installed, StarWhisper transcribes at many times real-time speed. A 30-minute video is ready for editing in 2 to 5 minutes instead of waiting overnight on a free online tool.
There are three real options for adding subtitles to a video on Windows. The first is to use a paid subtitle feature inside an editor like Adobe Premiere Pro, DaVinci Resolve Studio, or Final Cut Pro on Mac. Quality is good and timing is precise, but the price tag is steep (Premiere is a Creative Cloud subscription, DaVinci Studio is a one-time $295, Final Cut is Mac-only at $300). The second is to use a free online subtitle generator that uploads your video. Quality varies, language coverage is uneven, and the privacy story is bad for any non-public content. The third is to run transcription locally with a tool like StarWhisper and import the resulting SRT into whatever editor you already use, free or paid.
The third path is the one that has gotten significantly more viable in the last two years because OpenAI Whisper, the underlying model, is genuinely good. Whisper's large variant matches or exceeds the accuracy of paid commercial systems on a wide range of content, and it runs on a normal PC. StarWhisper wraps Whisper in a Windows desktop app with file drop, language detection, GPU support, and subtitle export, so you do not need to invoke Python or build anything to get a usable .srt.
Comparing real options honestly.
| Tool | Cost | Languages | Privacy |
|---|---|---|---|
| Premiere Pro auto-captions | Creative Cloud subscription | ~15 strong | Adobe cloud processing |
| DaVinci Resolve Studio captions | $295 one-time | ~13 | Local in Studio |
| Rev AI subtitles | $0.25/min | ~36 | Uploaded to Rev |
| Online free subtitle generators | Free with caps | Varies | Uploaded |
| YouTube auto-captions | Free | ~13 strong | Uploaded to YouTube |
| StarWhisper SRT export | Free up to 500 words/day, $10/mo Pro | 96 | Local, no upload |
The honest case for paid tools: integrated editing, broadcast-grade per-word alignment, and styling baked in. The case for StarWhisper: privacy, language coverage, and price. For most independent creators, YouTubers, podcasters, and small studios, the trade favors the local approach.
FFmpeg is the standard tool for pulling audio out of video. It is free, runs from the command line, and handles virtually every container format you will encounter. Install it from ffmpeg.org and add it to your PATH so you can call ffmpeg from any folder. Common one-liners:
# MP4 video to MP3 audio
ffmpeg -i input.mp4 -vn -acodec libmp3lame audio.mp3
# Any video to WAV (best quality for transcription)
ffmpeg -i input.mov -vn -acodec pcm_s16le -ar 16000 -ac 1 audio.wav
# Trim to a specific section, useful for testing
ffmpeg -ss 00:01:00 -to 00:02:00 -i input.mp4 -vn -acodec libmp3lame clip.mp3
# Batch process every .mp4 in a folder (PowerShell)
Get-ChildItem *.mp4 | ForEach-Object { ffmpeg -i $_.Name -vn -acodec libmp3lame ($_.BaseName + ".mp3") }
For transcription, mono 16 kHz WAV is the gold standard input, because Whisper resamples everything to 16 kHz mono internally. MP3 at ordinary bitrates is fine for general use; Whisper does not need high-bitrate stereo audio.
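To confirm that an extracted WAV really is mono, 16 kHz, 16-bit PCM, Python's standard `wave` module can read the header. This is a small sketch: `is_whisper_ready` and `write_silence` are hypothetical helper names, and the demo writes one second of silence to temporary files instead of assuming that an `audio.wav` exists.

```python
import os
import tempfile
import wave


def is_whisper_ready(path: str) -> bool:
    """Return True if the WAV file is mono, 16 kHz, 16-bit PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1
            and wav.getframerate() == 16000
            and wav.getsampwidth() == 2  # 2 bytes per sample = 16-bit
        )


def write_silence(path: str, channels: int, rate: int, seconds: int = 1) -> None:
    """Write a silent 16-bit PCM WAV file, used here only for the demo."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * channels * rate * seconds)


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "mono16k.wav")
    bad = os.path.join(tmp, "stereo44k.wav")
    write_silence(good, channels=1, rate=16000)
    write_silence(bad, channels=2, rate=44100)
    print(is_whisper_ready(good), is_whisper_ready(bad))
```

A file that fails the check is still usable; it just means the transcription tool has to resample it first.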
Open Window > Captions, then drag the .srt onto the Captions panel. Premiere Pro creates a captions track on the timeline that you can style and position. Export with "Burn Captions Into Video" to embed.
In DaVinci Resolve's Edit page, right-click in the media pool and choose Import Subtitle. Drop the resulting subtitle track onto the timeline. Customize style under Inspector > Captions.
Click Text > Captions > Import. Select the .srt file. CapCut adds a caption layer matched to your timeline. Useful for vertical edits and social formats.
Kdenlive and Shotcut are both free editors that accept SRT imports natively. Kdenlive: Project > Subtitles > Import. Shotcut: drop the SRT into the timeline like a clip.
If you want to refine subtitles before importing into a video editor, Subtitle Edit is the best free tool. Open the .srt alongside the audio waveform and adjust lines visually, in sync with playback.
For YouTube uploads, open the video in Studio, choose Subtitles, then Add Language, then Upload File, and pick the SRT. The Whisper-generated captions usually beat YouTube's auto-captions on accuracy and language coverage, especially for non-English audio. See the YouTube transcription guide for related workflows.
Two ways to ship a subtitled video. Each has its place.
Sidecar: A separate .srt file that travels alongside the video. The viewer's player (or YouTube, Vimeo, etc.) loads it on top of the video at playback time. Advantages: viewers can toggle subtitles on and off, multiple languages can ship as separate files, and the video itself is unmodified. Use this for YouTube, Vimeo, accessibility on websites, and anywhere you want viewers to control captions.
Burned in: The subtitles are rendered as part of the video frames. They cannot be turned off. Advantages: works on platforms that strip sidecar files (TikTok, Instagram Reels, Twitter video, many email clients), and the styling is exactly what you designed. Use this for short-form social, ads, and anything where viewer settings cannot be trusted to display captions correctly.
You can burn subtitles directly with FFmpeg if your editor's export is overkill for the task:
ffmpeg -i input.mp4 -vf subtitles=subs.srt output.mp4
For styled subtitles (font, color, position), most editors offer better control than the FFmpeg one-liner.
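If basic styling is all you need, FFmpeg's `subtitles` filter accepts a `force_style` option with ASS-style overrides. The sketch below only builds the argument list; `build_burn_command` is a hypothetical helper, the font and size are example values, and it assumes the subtitle file sits in the working directory (full Windows paths containing `:` or `\` need extra escaping inside the filter string).

```python
def build_burn_command(video: str, subs: str, output: str, style: str = "") -> list[str]:
    """Build an FFmpeg argument list that burns `subs` into `video`."""
    vf = f"subtitles={subs}"
    if style:
        # The single quotes keep the commas in the style string inside the option value.
        vf += f":force_style='{style}'"
    return ["ffmpeg", "-i", video, "-vf", vf, output]


cmd = build_burn_command(
    "input.mp4", "subs.srt", "output.mp4", style="FontName=Arial,FontSize=24"
)
print(" ".join(cmd))
# To run it for real (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.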
Whisper handles 96 languages, and the export workflow is identical for each. If you produce content in multiple languages, the standard pattern is:
- Name the original-language file video.en.srt (or whatever language code).
- Name each additional language video.es.srt, video.de.srt, etc.

The same workflow extends to podcasters publishing video versions of their shows, content creators producing for international audiences, and educators making lecture content accessible.
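The naming pattern above is easy to generate in a script when you handle many files. This is a minimal sketch with a hypothetical `sidecar_name` helper; the language codes are examples.

```python
from pathlib import Path


def sidecar_name(video: str, lang: str, ext: str = "srt") -> str:
    """Return a sidecar subtitle file name such as video.en.srt for a video file."""
    # Path.stem drops only the final extension, so "talk.final.mov" keeps "talk.final".
    stem = Path(video).stem
    return f"{stem}.{lang}.{ext}"


for code in ("en", "es", "de"):
    print(sidecar_name("video.mp4", code))
```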
Other ways to get the words out of your media.