Privacy First

Transcription Privacy: Keep Your Audio Off the Cloud

Many cloud transcription tools process audio on vendor infrastructure. That can be inappropriate for confidential client calls, medical conversations, legal work, HR meetings, or R&D recordings unless the vendor and workflow have been reviewed. StarWhisper can run OpenAI Whisper locally on your Windows PC. In Local Mode, audio is not uploaded for transcription.

The Cloud Transcription Problem

Where most transcription tools send your audio, and the alternative.

The status quo

Cloud-based transcription

Cloud transcription services process audio on vendor infrastructure and may retain files, logs, or derived content according to their own terms. The convenience is real. The review burden is also real.

For non-sensitive content this is fine. For NDA-bound calls, medical or legal conversations, HR matters, or anything covered by GDPR or HIPAA, it forces a procurement conversation and a paper trail before you can even use it.

Local alternative

Whisper running on your device

StarWhisper bundles the OpenAI Whisper model with the installer. When you transcribe, the model loads into your machine's memory, runs the audio through its neural network using your CPU or GPU, and produces text. There is no upload, no server, no log to subpoena, no retention period to ask about.

If you unplug the network, transcription still works. This is structural privacy, not a policy promise.

Six Privacy Properties

What "local processing" actually buys you

Zero upload by default

In Local Mode, no part of the audio leaves the device. This is checkable with any network monitor. You can verify it before you trust it.

Works offline

Disconnect from the internet and transcription still works. Cloud tools simply fail under the same conditions. Offline operation is the cleanest possible proof of local processing.

No retention window to manage

Cloud services handle retention according to their own policies. With Local Mode, transcription audio is not uploaded to a transcription vendor, but users should still review local files, application settings, and their own retention obligations.
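
Local retention is something you can script yourself. The sketch below assumes transcripts or recordings are saved as ordinary files in a folder you choose; the folder and the 30-day window are illustrative, not StarWhisper settings. It defaults to a dry run and only reports what it would delete.

```python
import time
from pathlib import Path
from typing import List, Optional


def expired_files(folder: Path, max_age_days: float, now: Optional[float] = None) -> List[Path]:
    """Return files under `folder` whose modification time is older than the window."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(p for p in folder.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff)


def purge(folder: Path, max_age_days: float = 30, dry_run: bool = True) -> List[Path]:
    """List expired files; delete them only when dry_run is False."""
    doomed = expired_files(folder, max_age_days)
    if not dry_run:
        for path in doomed:
            path.unlink()
    return doomed
```

Run it with the default dry run first, read the returned list, and only then call it again with dry_run=False.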

No third-party LLM hop

Some cloud transcription services pass your audio through additional AI models for cleanup or summarization, multiplying the parties that have access. Local processing keeps the data path to one machine.

No subpoena surface

Reducing third-party copies can reduce one category of external request risk. This matters for journalists, lawyers, and anyone whose source material is sensitive enough that legal process is a real consideration.

No vendor lock-in

Whisper is open source. The audio you process today is not trapped in a vendor's account. If StarWhisper ever ceased to exist, the underlying model would still work.

What every cloud transcription service does with your audio

Open the privacy policy of a cloud transcription service and you need to review where audio is processed, who the subprocessors are, how long content is retained, whether model-training options exist, and what controls are available on your tier.

For competitor tools, verify current vendor documentation before use. The review checklist is:

  • Where audio is processed and stored.
  • Which subprocessors or AI providers can access audio or transcripts.
  • Retention windows for audio, transcripts, logs, and backups.
  • Whether model-training or product-improvement use is enabled by default.
  • Whether BAA, DPA, export, deletion, and audit controls are available on the plan you use.
  • Whether your policy allows third-party processing for the specific recording type.

For most use cases, this trade-off is fine: the cloud handles the heavy compute, you get a polished product, the audio is encrypted in transit, and the vendor holds a SOC 2 report. For some use cases, no amount of policy is enough, because the audio still leaves the trusted environment, and the trusted environment is the only one whose security you actually control.

What local-only transcription actually means

"Local" gets used loosely in marketing. Here is what it means in StarWhisper specifically.

The model lives on your disk

The OpenAI Whisper model files are bundled with the installer. They sit in the StarWhisper installation directory on your Windows drive. You can see them, you can checksum them, you can copy them to another machine. They are not loaded from the internet at runtime. After you have installed the app, you do not need a network connection to dictate.
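
The checksum step can be scripted. This is a minimal sketch, assuming you point it at the folder that holds the model files; the "*.bin" pattern is a placeholder rather than StarWhisper's documented file layout, so adjust it to whatever you find on disk.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB model files never sit fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(folder: Path, pattern: str = "*.bin") -> Dict[str, str]:
    """Map each matching file name in `folder` to its SHA-256 digest."""
    return {p.name: sha256_of(p) for p in sorted(folder.glob(pattern)) if p.is_file()}
```

Running checksum_manifest on two machines and comparing the dictionaries confirms that a copied model is byte-for-byte identical.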

Inference runs on your CPU or GPU

When you press the dictation hotkey, microphone audio is captured into a memory buffer, fed into the loaded Whisper model, and the model produces text using your machine's compute. No data is sent over the network. If your machine has an NVIDIA GPU, the inference runs on CUDA cores and is faster. If it does not, the CPU path works too, just slower.
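
To make "captured into a memory buffer" concrete, here is a minimal sketch of that in-memory step using only the Python standard library. It assumes 16-bit mono PCM input and scales it to floats in the range -1 to 1 by dividing by 32768, the same normalization the open-source Whisper reference code applies before inference. The function name is illustrative, and this is not StarWhisper's actual code.

```python
import array
import io
import sys
import wave
from typing import List, Tuple


def pcm16_to_floats(wav_bytes: bytes) -> Tuple[int, List[float]]:
    """Decode an in-memory 16-bit mono WAV into (sample_rate, float samples)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return rate, [s / 32768.0 for s in samples]
```

Nothing in this path touches a socket or a file on disk: the bytes arrive in memory, are converted in memory, and would be handed to the model in memory.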

There is no remote API call

This is the cleanest distinction between local and cloud transcription. A cloud product makes an HTTPS request to its API. A local product does not. You can confirm this by running a network monitor while you dictate. For the audio itself, the result is the same as if the app had no internet permission at all.

What about updates and license checks

The app does talk to the network for a few things that are separate from transcription: checking for new versions (only when you click the button, per StarWhisper's strict no-auto-update policy), verifying your license if you are on the paid tier, and sending usage telemetry and crash reports, which can be disabled in Settings. None of these touches your audio, and each can be inspected separately. If you want to use StarWhisper on an air-gapped machine, the free tier requires no license check at all.

Use cases where local transcription is the right call

Healthcare and medical scribing

HIPAA-sensitive conversations should not be routed through any transcription workflow until the vendor, data path, settings, and retention policy have been reviewed. Local Mode can reduce transcription-vendor exposure because audio is not uploaded for transcription, but each practice still needs its own HIPAA review, workstation safeguards, retention policy, and Cloud Mode controls. We cover related considerations in voice to text for therapists.

Legal work and attorney-client privilege

Drafting privileged content into a cloud transcription tool is, depending on jurisdiction, either explicitly problematic or a gray area that most legal ethics opinions advise avoiding. The reasoning is that storing privileged communications on a third party's servers may waive privilege under some bar interpretations. Local Mode can reduce third-party processing exposure, but lawyers should follow their jurisdiction, client instructions, firm policy, and retention rules.

HR and personnel matters

Performance reviews, termination conversations, complaint investigations, and compensation discussions are exactly the type of content that should not appear in a third party's transcription database. Even if the SaaS vendor's posture is excellent, the surface area is unnecessary. Local transcription removes the question.

Journalism and source protection

If your source agreed to talk on background, "the audio is in our cloud, deleted after 30 days" is a different story than "the audio never left my laptop." Reputable journalists default to the second story when they can. Local transcription supports that default.

R&D, trade secrets, NDA-bound work

If your employer's data policy says "no customer data in third-party SaaS without security review," that same policy almost certainly applies to voice recordings of internal conversations about that data. Local processing keeps the conversation inside the trusted environment.

Government, defense, classified-adjacent work

For anything approaching SBU, CUI, or classified handling, cloud SaaS is generally off the table. Local processing is the only option that fits the threat model.

Comparison: local vs cloud audio handling

Property                      | Cloud transcription         | StarWhisper Local Mode
------------------------------+-----------------------------+-----------------------
Audio leaves device           | Yes                         | No
Retention window              | 30 days typical, varies     | None (not stored)
Third-party LLM processing    | Sometimes                   | No
Works offline                 | No                          | Yes
Subpoena-able server log      | Yes                         | No
BAA required for HIPAA        | Yes                         | Customer review
Used to train vendor models   | Sometimes (opt-out varies)  | Never
Works behind air gap          | No                          | Yes
Verifiable by network capture | Audio visible in transit    | Zero outbound

How to verify the privacy claim yourself

The reason "local" matters more than "private" is that local is checkable. You do not have to trust a policy statement. You can verify the property directly.

Test 1: Network capture

Install a network monitor on Windows. GlassWire is the easiest GUI option; Wireshark is the comprehensive one; the built-in Resource Monitor (run resmon, or open it from Task Manager's Performance tab, then select the Network tab) is enough for a quick check. Start dictating in Local Mode and watch the StarWhisper process. You should see zero outbound bytes to any transcription endpoint during the dictation itself. Any outbound traffic associated with the app should be unrelated control-plane traffic such as license verification or user-initiated update checks.
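
For a quick scripted version of this check, the sketch below parses the text output of the Windows command netstat -ano and lists the non-loopback remote endpoints held by one process ID. It assumes the usual five-column TCP row layout (Proto, Local Address, Foreign Address, State, PID). The function name is illustrative, and a packet capture remains the more thorough test.

```python
import ipaddress
from typing import List


def remote_endpoints(netstat_text: str, pid: int) -> List[str]:
    """Return non-loopback foreign addresses of TCP connections owned by `pid`."""
    found = []
    for line in netstat_text.splitlines():
        parts = line.split()
        # Keep only TCP rows: Proto, Local Address, Foreign Address, State, PID
        if len(parts) != 5 or parts[0] != "TCP" or parts[4] != str(pid):
            continue
        host = parts[2].rsplit(":", 1)[0].strip("[]")  # drop port and IPv6 brackets
        host = host.split("%", 1)[0]  # drop an IPv6 zone id if present
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            continue  # not a literal IP address
        if addr.is_loopback or addr.is_unspecified:
            continue  # local-only traffic or a listening socket
        found.append(parts[2])
    return found
```

To use it live, capture the text with subprocess.run(["netstat", "-ano"], capture_output=True, text=True).stdout and look up the StarWhisper PID in Task Manager's Details tab. netstat reports a point-in-time snapshot, so run it several times during dictation; short-lived connections can be missed.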

Test 2: Air gap

Disconnect from the network entirely. Disable Wi-Fi, unplug Ethernet, turn on airplane mode. Open StarWhisper and dictate. It still works. This is the cleanest proof because it is impossible to fake. Cloud transcription tools simply error out under air-gap conditions because they have nowhere to send the audio.
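
Before running the air-gap test, it helps to confirm the machine really is offline. A minimal sketch: try to open TCP connections to a few well-known public hosts and treat the machine as offline only if every attempt fails. The probe list is an assumption you can change, and the connect parameter exists only so the check can be exercised without a network.

```python
import socket
from typing import Callable, Iterable, Tuple

# Assumed probe targets: two public DNS resolvers that accept TCP connections.
DEFAULT_PROBES = (("1.1.1.1", 443), ("8.8.8.8", 53))


def looks_offline(
    probes: Iterable[Tuple[str, int]] = DEFAULT_PROBES,
    timeout: float = 2.0,
    connect: Callable[..., object] = socket.create_connection,
) -> bool:
    """Return True only if no probe target accepts a TCP connection."""
    for target in probes:
        try:
            conn = connect(target, timeout=timeout)
        except OSError:
            continue  # unreachable, refused, or timed out
        close = getattr(conn, "close", None)
        if close:
            close()
        return False  # at least one host answered, so the machine is online
    return True
```

If looks_offline() returns True and dictation still produces text, the transcription cannot have depended on a remote service.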

Test 3: Inspect the install

Open the StarWhisper installation folder. You will see the Whisper model files (the GGML or GGUF formats, depending on backend). These are large binary files (several hundred MB to a few GB depending on model size). Their presence on disk is what makes local processing possible. They are the model. They are the entire pipeline. Nothing about transcription has to leave the folder they live in.
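
If you are not sure which files in the folder are the model, size is a reliable hint. The sketch below walks a folder and lists files at or above a size threshold, largest first. The folder you pass in is your own install path, and the 100 MB default is an arbitrary cut-off chosen to catch files of several hundred MB.

```python
from pathlib import Path
from typing import List, Tuple


def large_files(folder: Path, min_bytes: int = 100 * 1024 * 1024) -> List[Tuple[str, int]]:
    """Return (relative path, size in bytes) for files >= min_bytes, largest first."""
    hits = []
    for path in folder.rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                hits.append((path.relative_to(folder).as_posix(), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```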

What you cannot fully verify

You cannot verify that the app does not buffer audio to disk before discarding it. (It does not, but this is a code-level assertion.) You cannot verify Microsoft Windows itself is not capturing microphone audio independently. Those are separate concerns. For the OS layer, the standard Windows hardening guides apply.

Where cloud transcription wins, honestly

This is not a one-sided argument

For a lot of users, cloud transcription is genuinely the right tool. Multi-speaker meeting transcription with speaker labels is much better in Otter or Fireflies than in any single-microphone local tool. Cross-device sync works because the cloud is the storage layer. Automatic AI summarization runs faster on dedicated GPU servers than on a laptop. Customer support and integrations are stronger from a venture-backed product than a small Windows app.

If your content is not particularly sensitive, you are working across multiple devices, and you want the polished AI-summary-and-share workflow, a cloud tool is probably the better answer. StarWhisper is specifically the answer for users where the audio path matters, and the bar for adoption is whether you trust that path.

Specifically, cloud transcription is better when

  • You need speaker labels and multi-party transcription. StarWhisper is built for one speaker (you).
  • You need cross-device sync. StarWhisper is desktop Windows only, no mobile or cloud sync.
  • You want post-meeting AI summarization with action item extraction. This is a cloud-tool strength.
  • Your team has standardized on a particular tool. The integration cost may outweigh the privacy upside.

What about StarWhisper's optional Cloud Mode

StarWhisper does ship with an optional Cloud Mode that sends audio to the OpenAI Whisper API. This exists because some users on low-spec machines want faster transcription and do not have a privacy concern with cloud processing. Cloud Mode is:

  • Off by default. The app ships in Local Mode out of the box.
  • Opt-in. You enable it in Settings; the toggle is clearly labeled.
  • Reversible. You can turn it off at any time and the app returns to local-only behavior.
  • Disclosed. The settings UI explains what changes when you enable it.

If your reason for considering StarWhisper is privacy, keep Cloud Mode off. The full Local Mode experience does not require it. The deeper local vs cloud reference is on the Whisper local vs cloud FAQ page.

Pricing and how to start

StarWhisper is free to download. The free plan covers 500 words per day, which is enough for most users to evaluate the workflow on real content for a week or two before deciding. Pro is $10 per month or $80 per year and removes the daily limit. There is no per-seat pricing, no tier upsell, and no usage meter beyond the daily word count. Full details are in the pricing section of the homepage.

System requirements are Windows 10 or 11. Any modern CPU works for the local Whisper path; an NVIDIA GPU makes it faster but is not required. The installer is a few hundred megabytes including the bundled model. Once installed, no network connection is needed for transcription. For more on the offline behavior, the dedicated privacy and offline features page goes into the architectural detail.

Frequently Asked Questions

Does StarWhisper ever send audio anywhere?
Not in Local Mode, which is the default. Audio is captured by your microphone, fed straight into the local Whisper model, turned into text, and discarded. There is no upload step, no third-party processor, no transcript stored on a remote server. The only way audio leaves the device is if you explicitly enable Cloud Mode in settings, which is opt-in and disclosed at the moment you turn it on.
What about Cloud Mode, when does that send audio?
Cloud Mode sends audio to the OpenAI Whisper API only after you explicitly enable it in Settings. It is off by default. You can disable it at any time. The toggle exists for users who want slightly faster transcription on low-end hardware and do not need local-only processing. The Local Mode default never touches the network for transcription.
Can I prove that the audio does not leave my device?
Yes. Open a network monitor like Wireshark, Resource Monitor, or GlassWire on Windows. Start a dictation session in Local Mode. You will see zero outbound traffic from StarWhisper to any transcription endpoint during transcription. The only network traffic associated with the app is occasional license verification, update checks, and telemetry (unless you have disabled it in Settings), none of which carries your audio.
What about telemetry or analytics, does that include audio?
No. StarWhisper's telemetry covers usage events (e.g., dictation started, app version, OS version) and crash reports. It does not include audio, transcribed text content, or any payload that could identify what you said. Telemetry can also be disabled in Settings if you prefer to send nothing at all. The full data inventory is documented in the privacy policy.
Is the transcript stored anywhere?
StarWhisper does not store a transcript history server-side. The transcribed text is pasted into the application you have focused (Word, Notion, Outlook, etc.) and that application handles storage on your own machine. If you use the optional local history feature, transcripts are saved to a folder on your PC that you control and can delete at any time. Nothing is uploaded.
What does local processing actually mean technically?
The OpenAI Whisper model is bundled with the installer and stored on your disk. When you dictate, the app loads the model into memory, captures microphone audio, runs the audio through the model's neural network using your CPU or GPU, and produces text. There is no remote API call. The same architecture would work on a fully air-gapped machine. This is fundamentally different from a SaaS transcription product where the model lives on the vendor's servers.
What about Windows itself or other apps spying on me?
That is a separate concern and outside the scope of any single application. Windows has its own telemetry, which you can configure in Settings. Other apps on your machine may have microphone access. StarWhisper cannot speak to what those do; it can only speak to what it does itself, which is process audio locally. If your threat model includes the OS, you should harden the OS independently.
How do I verify all of this for myself?
Three steps. First, run a network capture during dictation and confirm no upload. Second, check the StarWhisper installation folder to confirm the Whisper model files are present locally. Third, disconnect from the internet entirely and confirm dictation still works in Local Mode. The third test is the cleanest proof because cloud services would simply fail if the network were unavailable.

Try Local Transcription Free

500 words per day on the free plan. In Local Mode, audio is not uploaded for transcription.

Download StarWhisper