Every cloud transcription tool needs internet and uploads your audio. For confidential audio (legal, medical, R&D, journalism source material), that is unacceptable. StarWhisper runs Whisper locally on your Windows machine. Audio never leaves the device.
Install once with internet, work offline forever after.
Download the free installer from the StarWhisper homepage and run it. Setup takes about two minutes and includes a one-time download of the Whisper model files (a few hundred megabytes to several gigabytes depending on which model size you pick). This is the only time internet is required.
Turn on airplane mode, unplug the ethernet cable, disable the wifi adapter, or simply walk into a no-signal environment. StarWhisper does not need a connection to function. If you want hard proof, run Wireshark or any network monitor: zero packets will go out from the StarWhisper process during transcription.
Open StarWhisper. Drag any audio file (MP3, WAV, M4A, OGG, OPUS, FLAC, WMA, AAC, plus audio extracted from MP4, MOV, AVI, MKV videos) onto the window. The Whisper model on your hard drive decodes and transcribes the audio. The language is auto-detected from the 96 supported languages.
StarWhisper processes the audio locally on your CPU or GPU. Speed: roughly 5 to 10 times faster than real time on a modern laptop CPU and around 50 times faster on an NVIDIA GPU with CUDA enabled. A one-hour audio file transcribes in about 6 to 12 minutes on CPU or 1 to 2 minutes on a mid-range GPU.
The transcript appears in the StarWhisper window when processing finishes. Copy to clipboard, save as .txt, or export with optional timestamps. The file is written to your local hard drive. No cloud sync, no upload, no telemetry. The text is yours and yours only.
Real situations where cloud transcription fails or is forbidden.
Long-haul flights, train rides, car trips in tunnels. Cloud tools die when the signal does. Local transcription keeps working on a fully-disconnected laptop.
Conference travel, business trips, hotel rooms with broken or throttled internet. The audio file you recorded earlier still transcribes locally, no network needed.
Recording studios, medical imaging rooms, hospitals with deliberate signal blocking, and similar environments where cloud uploads simply cannot reach the server.
Sensitive Compartmented Information Facilities (SCIFs) prohibit network connectivity. StarWhisper's local architecture is compatible with airgapped requirements; deployment depends on facility approval.
Legal depositions, journalist source interviews, R&D audio, therapy session recordings. Uploading these to a cloud service breaches confidentiality. See transcription privacy concerns.
Cloud tools charge by the minute or by the month. The local install is free up to the daily word cap, or 10 dollars per month for unlimited use. See the pricing model.
Most people who type "how to transcribe audio" into a search engine end up at Otter, Rev, Trint, Happy Scribe, Sonix, Descript, or one of a dozen similar services. All of them work the same way: you upload your audio file to their servers, their backend (often the same OpenAI Whisper model that runs locally on your own machine) processes the file, and the transcript comes back. For a casual user with non-sensitive audio and good internet, this is fine.
It is not fine for a large fraction of real-world transcription needs.
Lawyers cannot upload privileged client conversations to a third-party server without violating their ethics rules and the attorney-client privilege itself. Doctors and therapists cannot upload patient audio without breaching HIPAA in the US or equivalent regulations elsewhere. Journalists working with confidential sources cannot upload source recordings without compromising the source. Corporate R&D teams cannot upload pre-release product audio without exposing trade secrets. Government and defense work obviously cannot upload anything classified. Even for non-regulated personal recordings, a lot of people simply do not want their voice memos sitting on someone else's server forever.
StarWhisper is built for this category of user. The Whisper model that powers it runs entirely on your Windows machine. After a one-time install (the only step that requires internet), the app does not need to phone home. You can transcribe in airplane mode, in a basement, in a hotel with broken wifi, in a SCIF, or on a permanently airgapped workstation. Audio never leaves the device.
"Privacy" is easy to claim in marketing copy. Worth verifying for yourself when the stakes matter.
Install StarWhisper, complete the model download, then turn on airplane mode (or unplug the ethernet cable and disable wifi; a hardware wifi switch, if your laptop has one, is even better). Open StarWhisper and transcribe a file. It works. This single test confirms that everything in the transcription pipeline happens locally, because there is literally no network for it to use.
If you want absolute confirmation that StarWhisper is not uploading anything when the network is available, install Wireshark (free, open-source) or any equivalent packet capture tool. Wireshark does not attribute packets to a process on its own, so pair the capture with a per-process view such as Windows Resource Monitor or Sysinternals TCPView to identify any connections owned by StarWhisper. Run a transcription. Observe zero outbound packets from the StarWhisper process during the transcription itself. Telemetry from the OS (Windows itself sends some background telemetry by default) will show up in the capture but is separate from anything StarWhisper does.
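For a scriptable spot check alongside a packet capture, the built-in Windows command netstat -ano lists each connection together with the PID that owns it. The sketch below is one hedged way to filter that output for a single process. The function name external_connections is hypothetical, the PID has to be looked up separately (for example in Task Manager), and netstat only shows a snapshot in time, so it complements a capture rather than replacing it.

```python
import ipaddress


def external_connections(netstat_output: str, pid: int) -> list[tuple[str, str, str]]:
    """Return (protocol, local, remote) rows owned by `pid` whose remote end is not local.

    Expects the text printed by `netstat -ano` on Windows. Listening sockets,
    wildcard UDP bindings, and loopback connections are ignored.
    """
    rows = []
    for line in netstat_output.splitlines():
        parts = line.split()
        # Data rows start with the protocol; the PID is always the last column.
        if len(parts) < 4 or parts[0] not in ("TCP", "UDP"):
            continue
        if not parts[-1].isdigit() or int(parts[-1]) != pid:
            continue
        proto, local, remote = parts[0], parts[1], parts[2]
        # Strip the port, and the brackets netstat puts around IPv6 addresses.
        host = remote.rsplit(":", 1)[0].strip("[]")
        if host == "*":
            continue  # unconnected UDP socket
        try:
            address = ipaddress.ip_address(host)
        except ValueError:
            rows.append((proto, local, remote))  # unparseable: report it to be safe
            continue
        if address.is_unspecified or address.is_loopback:
            continue  # listening socket or local-only traffic
        rows.append((proto, local, remote))
    return rows
```

To use it, capture the output of netstat -ano while a transcription is running (for example with subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout) and pass it in with the StarWhisper PID. An empty list is consistent with the offline claim for that moment; repeating the check several times during a long file gives more confidence.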
For SCIF or airgapped deployments where you need stronger guarantees, install StarWhisper on a connected machine, complete the model download, copy the entire installed folder (Program Files plus the cached models directory) to a USB stick using approved removable media procedures, transfer to the airgapped target machine, and run there. The app will function with no network at all because all the model weights and code dependencies sit on disk. This is also useful for the rare case where you want to deploy StarWhisper on many endpoints inside an enterprise network that does not allow outbound internet from workstations.
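When a folder is moved by removable media, it can be worth confirming that every file arrived intact before relying on the copy. The sketch below is a generic SHA-256 manifest comparison, not a StarWhisper feature. The function names are hypothetical, and the folder paths you pass in would be wherever the installed folder and cached models actually live on your machines.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its SHA-256 hex digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so multi-gigabyte model files do not fill RAM.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(source: dict[str, str], target: dict[str, str]) -> list[str]:
    """Return relative paths that are missing on either side or differ in content."""
    all_paths = source.keys() | target.keys()
    return sorted(p for p in all_paths if source.get(p) != target.get(p))
```

A typical flow would be to run build_manifest on the connected machine, carry the resulting dictionary across as a JSON file on the same approved media, run build_manifest again on the airgapped target, and check that diff_manifests returns an empty list.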
A paralegal records a deposition or witness interview for internal use. The recording contains privileged conversations and confidential client matters. Uploading to a cloud service like Rev or Otter would put the attorney-client privilege at risk (or at minimum require a vendor confidentiality agreement and ethical review). Drag the file into StarWhisper on a laptop, get the transcript locally, no upload. The same applies to recorded settlement negotiations, expert witness preparation calls, and any other audio that falls under the firm's confidentiality obligations. See voice-to-text for lawyers for the full legal workflow.
A clinician records a patient consultation for note-taking purposes (with consent). The audio contains protected health information (PHI). HIPAA permits processing PHI on the clinician's own equipment but generally restricts uploading it to third-party cloud services without a signed Business Associate Agreement (BAA). Local Whisper transcription on the clinician's Windows workstation keeps the audio under HIPAA-appropriate control. See HIPAA compliance FAQ for more detail and voice-to-text for therapists for the mental health context.
An investigative reporter records a source interview. The source agreed to talk on condition that the recording would not be shared with anyone else, including a transcription vendor. Uploading the file to Otter, Rev, or any cloud service would technically share it with the vendor's infrastructure. Local transcription keeps the promise. See voice-to-text for journalists for the reporter-specific workflow.
Field researchers, conference attendees, and traveling professionals routinely have audio to transcribe but no reliable internet. Hotel wifi cuts out. Conference networks throttle large uploads. Train rides through dead zones. The cloud-tool workflow stalls; the local workflow does not. Drop the file, get the transcript, move on. See offline voice dictation Windows FAQ for related dictation scenarios.
A short explanation of why this is possible at all, since until a few years ago cloud transcription was the only practical option.
OpenAI released the Whisper model as open source under the MIT license in 2022. Whisper is a single neural network trained on roughly 680,000 hours of multilingual audio. The largest version is around 1.5 billion parameters and runs in a few gigabytes of RAM. Smaller versions are a few hundred megabytes and run on a typical laptop CPU in real time or faster. This means the same model that powers most modern cloud transcription services can run locally on consumer hardware.
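The size figures above follow from simple arithmetic: parameter count times bytes per parameter. The sketch below works that out using the parameter counts published in the openai/whisper README. It covers model weights only, not runtime overhead, and which numeric precision StarWhisper actually ships is not stated here, so the 16-bit default is an assumption.

```python
# Parameter counts from the model table in the openai/whisper README.
WHISPER_PARAMS = {
    "tiny": 39e6,
    "base": 74e6,
    "small": 244e6,
    "medium": 769e6,
    "large": 1550e6,
}


def weights_size_gb(params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the weights in gigabytes.

    bytes_per_param is 2 for 16-bit floats (assumed default) and 4 for 32-bit.
    """
    return params * bytes_per_param / 1e9


for name, params in WHISPER_PARAMS.items():
    print(f"{name:>6}: {weights_size_gb(params):.2f} GB at 16-bit, "
          f"{weights_size_gb(params, 4):.2f} GB at 32-bit")
```

At 16-bit precision the largest model comes to about 3.1 GB of weights and the small model to about 0.49 GB, which matches the "few gigabytes" and "few hundred megabytes" ranges described above.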
StarWhisper bundles the Whisper model with an audio decoding pipeline, a user interface, and the Windows-specific glue that lets you drag files in. After install, the entire pipeline sits on your hard drive: model weights, codec libraries, parsing code, output formatting. None of these need network access at runtime. The transcript is computed on your CPU or GPU and written to a local file. For a deeper architectural explanation, see privacy and offline architecture.
The trade-off compared to cloud services is hardware. Cloud transcription runs on whatever beefy server the vendor wants to provide. Local transcription runs on your laptop, which is slower than a dedicated GPU server. In practice, the speed difference matters less than you might expect: a typical modern laptop CPU transcribes audio at roughly 5 to 10 times real time, meaning a one-hour file takes 6 to 12 minutes. An NVIDIA GPU with CUDA brings that down to 1 to 2 minutes. For most non-realtime workflows (transcribing an existing audio file, not live captioning), local speed is more than sufficient.
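The timing figures are just audio length divided by a speed factor. As a rough planning aid, the sketch below shows that arithmetic. The function name is hypothetical and the speed factors are the approximate ones quoted in this guide, not measurements of any particular machine.

```python
def estimated_minutes(audio_minutes: float, speed_factor: float) -> float:
    """Estimate processing time for audio transcribed at `speed_factor` times real time."""
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_minutes / speed_factor


one_hour = 60
print(estimated_minutes(one_hour, 10))  # laptop CPU at 10x real time
print(estimated_minutes(one_hour, 5))   # slower CPU at 5x real time
print(estimated_minutes(one_hour, 50))  # NVIDIA GPU with CUDA at 50x real time
```

A one-hour file works out to 6 minutes at 10x, 12 minutes at 5x, and 1.2 minutes at 50x, so the quoted 6 to 12 minute CPU range corresponds to a speed factor between 5 and 10.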
StarWhisper's free tier provides 500 words per day and 3,500 words per week of transcribed output. The free tier works offline exactly the same as the Pro tier; there are no online-only features locked behind a paywall. The only difference is the word cap.
For short, occasional use (a brief voice note, a single podcast clip, a paragraph of dictated text), 500 words per day is enough. At a typical speaking pace that is about three and a half minutes of speech; a 5-minute voice memo is roughly 700 words and already exceeds the daily cap. For high-volume professional use (full legal depositions, multi-hour medical session transcripts, daily journalism work), the Pro plan removes the cap at 10 dollars per month or 80 dollars per year. See Pro plan details. A 7-day free trial of Pro is available if you want to verify the workflow on a long file.
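To judge whether a recording fits the free tier before you start, you can estimate its word count from its length. The sketch below assumes a speaking rate of 140 words per minute, which is simply the rate implied by the 700 words per 5 minutes figure above; real recordings vary. The function names are hypothetical, and how the app behaves when a file crosses the cap is not covered here.

```python
WORDS_PER_MINUTE = 140    # assumption: 700 words / 5 minutes, as quoted above
FREE_DAILY_WORDS = 500    # free-tier daily cap
FREE_WEEKLY_WORDS = 3500  # free-tier weekly cap


def estimated_words(audio_minutes: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Rough word count for a recording of the given length."""
    return round(audio_minutes * wpm)


def fits_free_tier(audio_minutes: float, words_used_today: int = 0) -> bool:
    """True if the estimated transcript stays within today's remaining free words."""
    return words_used_today + estimated_words(audio_minutes) <= FREE_DAILY_WORDS


def minutes_within_cap(cap_words: int = FREE_DAILY_WORDS, wpm: int = WORDS_PER_MINUTE) -> float:
    """Minutes of speech that fit inside a word cap at the assumed rate."""
    return cap_words / wpm


print(estimated_words(5))              # 5-minute memo
print(fits_free_tier(5))               # does it fit the daily cap?
print(round(minutes_within_cap(), 1))  # minutes of speech per day on the free tier
```

Under these assumptions the daily cap covers about 3.6 minutes of speech and the weekly cap about 25 minutes, which is a quick way to decide whether the free tier or the Pro trial suits a given file.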
The free tier is genuinely free with no credit card, no signup wall, and no trial-timer-converts-to-paid trap. It is not a teaser version of the paid app; it is the same app with a word cap. Commercial use is permitted on both tiers. For an in-depth look at the no-subscription approach, see the no-subscription pricing model.
If the offline angle is what brought you here, several related guides expand on adjacent topics. For the specific question of how local Whisper transcription compares to cloud Whisper, see Whisper local vs cloud. For the broader category of privacy concerns in transcription, see transcription privacy concerns. For format-specific offline workflows, see how to convert MP3 to text, how to convert M4A to text, and how to convert WAV to text, each of which uses the same local-only pipeline described above. For real-time offline dictation (not transcribing existing files), see the offline voice dictation Windows FAQ.
The full offline workflow for the most common audio format.
Deep dive on how local processing works and why it matters.
The problem cloud transcription creates and how to avoid it.
Real-time dictation offline, not just file transcription.