Windows Speech Recognition (WSR) is the legacy dictation engine in Control Panel. It runs offline, supports a handful of languages, and reaches roughly 85 to 90 percent accuracy on clean American English. StarWhisper is a free Whisper-based desktop app that pushes that to around 98 percent on clear English, supports 96 languages, and handles accents that WSR cannot.
Both are free and both work on Windows. The accuracy gap is the reason most users switch.
Choose StarWhisper if any of the following applies. You dictate notes, emails, documents, or chat messages and the legacy WSR engine has been frustrating you. You speak a language other than English, or your English has an accent. You want a tool that works without training a voice profile.
Stay with WSR if your main need is voice control of Windows itself (opening apps, scrolling pages, clicking buttons by voice) for accessibility reasons. WSR's command grammar is built around that use case and StarWhisper does not try to replace it.
Six concrete differences matter if your main need is fast, accurate text dictation.
Whisper-class models routinely benchmark at around 98 percent word accuracy on clean English. WSR's legacy engine sits at roughly 85 to 90 percent. Over a day of dictation, the gap shows up as fewer corrections, fewer re-reads, and less typing afterward.
Non-native English speakers, British and Indian accents, regional US accents, and noisy environments all hit WSR's weak points fast. Whisper was trained on a much broader speech distribution, so accuracy holds up under conditions where WSR collapses.
WSR officially supports about seven languages with varying quality. StarWhisper covers 96 languages including German, Spanish, French, Italian, Portuguese, Dutch, Polish, Swedish, Russian, Japanese, Chinese, Korean, Hindi, Arabic, Turkish, Vietnamese, Thai, Indonesian, and more.
WSR uses an old-style voice profile that you train by reading sample paragraphs. Skipping this is technically possible but gives worse results. Whisper does not need or use per-user training. You install StarWhisper, press the hotkey, and dictate.
StarWhisper ships CUDA 11 and CUDA 12 GPU packs and a Vulkan fallback for non-NVIDIA hardware. On a modern NVIDIA GPU, transcription is effectively real-time. WSR runs CPU-only with no GPU acceleration path.
StarWhisper types into any Windows text field: Word, Outlook, Chrome, Slack, VS Code, your EHR, your CRM. WSR is similar in scope for dictation but its accuracy gap makes it impractical for serious work in apps that require precision.
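To make the accuracy difference above concrete, the figures quoted on this page can be turned into a rough count of corrections per day. This is a minimal sketch, not a benchmark: the 2,000-word daily workload and the `expected_errors` helper are assumptions for illustration, and it treats every misrecognized word as exactly one correction.

```python
def expected_errors(words_dictated: int, accuracy_percent: float) -> int:
    """Estimate how many words need correcting at a given word accuracy."""
    if not 0 <= accuracy_percent <= 100:
        raise ValueError("accuracy_percent must be between 0 and 100")
    return round(words_dictated * (100 - accuracy_percent) / 100)


# Hypothetical workload: 2,000 dictated words in a day (an assumption).
WORDS_PER_DAY = 2000

whisper_errors = expected_errors(WORDS_PER_DAY, 98)  # ~98% figure quoted above
wsr_best = expected_errors(WORDS_PER_DAY, 90)        # top of the 85 to 90% range
wsr_worst = expected_errors(WORDS_PER_DAY, 85)       # bottom of that range

print(f"Whisper-class engine: about {whisper_errors} corrections")
# prints "Whisper-class engine: about 40 corrections"
print(f"WSR: about {wsr_best} to {wsr_worst} corrections")
# prints "WSR: about 200 to 300 corrections"
```

Under these assumptions the legacy engine needs five to seven and a half times as many fixes for the same amount of dictation.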
Windows Speech Recognition (WSR) is the dictation and voice-control engine that has shipped with Windows since Windows Vista, and is still present in Windows 10 and Windows 11 under Control Panel, Ease of Access, Speech Recognition. It was Microsoft's primary speech engine for desktop dictation throughout the late 2000s and early 2010s. It has two main capabilities: dictating text into supported applications, and using voice commands to control the operating system, such as opening apps, switching windows, clicking buttons, and scrolling. The voice-command side is what most accessibility users remember it for.
Two things to know up front. First, WSR is genuinely useful for OS-level voice commands and remains a meaningful tool for some accessibility scenarios. Second, the dictation engine itself is old. It predates the deep-learning wave that produced models like OpenAI's Whisper, and it shows. On clean American English in a quiet room, with a trained voice profile, WSR can be decent. Outside those conditions, it falls apart fast.
Microsoft has effectively replaced WSR for everyday dictation on Windows 11 with the Win+H voice typing tool, which uses a different and more modern engine. Win+H is a separate product from WSR and we cover that comparison on the StarWhisper vs Windows voice typing page. WSR is still present, still works offline, and is still the right answer for users who want voice control of the operating system.
WSR is a Hidden Markov Model-style speech recognizer with N-gram language models, trained on Microsoft's pre-deep-learning speech corpora. StarWhisper uses OpenAI Whisper, a transformer-based encoder-decoder model trained on around 680,000 hours of multilingual and multi-task supervised data. The two systems are not in the same generation of speech recognition technology.
In practical terms, that means three things. On clean, clear English audio, Whisper holds about an 8 to 13 percentage-point accuracy advantage (around 98 percent against roughly 85 to 90 percent). On accented English, that gap widens considerably; Whisper holds up where WSR's accuracy can drop below 70 percent. On languages other than English, WSR is either unsupported or noticeably weaker than its English performance, while Whisper's multilingual training gives strong coverage across 96 languages.
The gap is most visible when you have any of: a non-native English accent, a slight head cold, background noise from a fan or HVAC, a less-than-perfect microphone, fast speech, or technical vocabulary. Each of those conditions degrades WSR fast and Whisper much less.
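Accuracy figures like the ones in this section are conventionally reported as word accuracy, which is one minus the word error rate (WER). This page does not say how its own numbers were measured, so the following is only a sketch of the standard metric: WER is the word-level edit distance between a reference transcript and the engine's output, divided by the number of reference words. The sample sentences are invented, and the text normalization (lowercasing and splitting on whitespace) is a simplifying assumption; real benchmarks normalize punctuation and numerals as well.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra word in the output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substituted word in a ten-word sentence.
reference = "please send the quarterly report to the finance team today"
hypothesis = "please send the quarterly rapport to the finance team today"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2f}, word accuracy: {1 - wer:.0%}")
# prints "WER: 0.10, word accuracy: 90%"
```

Because inserted words count as errors, WER can exceed 1.0 on very poor output, which is why the metric is usually quoted as an error rate rather than as an accuracy percentage.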
| Feature | StarWhisper | Windows Speech Recognition |
|---|---|---|
| Speech engine | OpenAI Whisper | Microsoft HMM (legacy) |
| Accuracy on clean English | ~98% | ~85 to 90% |
| Accuracy on accented English | Strong | Often poor |
| Languages supported | 96 | ~7 |
| Voice training required | No | Recommended |
| Works offline | Yes (Local Mode) | Yes |
| NVIDIA GPU acceleration | Yes (CUDA 11, 12) | No |
| OS voice commands | No | Yes |
| Types into any text field | Yes | Yes |
| Price | 500 words/day free; $10/mo Pro | Included with Windows |
| Audio stays on device | Yes (Local Mode default) | Yes |
| Active development | Active | Maintenance only |
The basic loop is straightforward. You install StarWhisper from the download page or the Microsoft Store. You configure a push-to-talk hotkey (the default works for most people). You position your cursor in any text field, whether that is a Word document, a Gmail compose window, a Slack message, a Notepad scratch buffer, or the address bar in Chrome. You hold the hotkey, speak, and release. The transcribed text is typed into the field.
There is no separate dictation window, no transcription preview, no edit step. The text appears where your cursor is, the same way as if you had typed it. If you make a mistake, you correct it with the keyboard the same way you would correct a typo. This is the same input model as WSR's dictation mode, with two differences: the accuracy is much higher, and you do not have to train a voice profile first.
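The hold, speak, release loop described above can be modeled as a small state machine. This is a hypothetical sketch and not StarWhisper's implementation: the `PushToTalkSession` class, its method names, and the `transcribe` and `type_text` callbacks are all assumptions standing in for the real hotkey hook, speech engine, and keystroke injection.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalkSession:
    """Hypothetical model of a hold-to-dictate loop (illustration only)."""

    transcribe: Callable[[List[bytes]], str]  # stand-in for the speech engine
    type_text: Callable[[str], None]          # stand-in for typing at the cursor
    recording: bool = False
    chunks: List[bytes] = field(default_factory=list, init=False)

    def on_hotkey_down(self) -> None:
        # Holding the hotkey starts a fresh recording.
        self.recording = True
        self.chunks = []

    def on_audio_chunk(self, chunk: bytes) -> None:
        # Audio is only buffered while the hotkey is held.
        if self.recording:
            self.chunks.append(chunk)

    def on_hotkey_up(self) -> None:
        # Releasing the hotkey transcribes the buffer and types the result
        # straight into the focused field, with no preview or edit step.
        if not self.recording:
            return
        self.recording = False
        text = self.transcribe(self.chunks)
        if text:
            self.type_text(text)


# Usage with fake callbacks in place of a real engine and text field.
typed: List[str] = []
session = PushToTalkSession(
    transcribe=lambda chunks: f"<{len(chunks)} chunks transcribed>",
    type_text=typed.append,
)
session.on_audio_chunk(b"dropped")  # hotkey not held, so this is ignored
session.on_hotkey_down()
session.on_audio_chunk(b"\x00\x01")
session.on_audio_chunk(b"\x02\x03")
session.on_hotkey_up()
print(typed)  # prints ['<2 chunks transcribed>']
```

Passing the engine and the typing step in as callbacks keeps the sketch testable without a microphone, and mirrors the point made above: the only visible output is text arriving where the cursor already is.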
For more depth on how the accuracy and the workflow translate to specific roles and apps, see the professional accuracy feature page and the broader works everywhere overview.
Use Windows Speech Recognition when your main need is voice-driven control of Windows itself. WSR's voice command grammar is mature, and combined with the on-screen reference card it gives you keyboard-free OS navigation. For some accessibility users this is the deciding factor.
Use StarWhisper when your main need is fast, accurate dictation of actual text into your applications. If you spend any meaningful part of your day typing into documents, emails, notes, chat, or web forms, the accuracy gap will pay for itself almost immediately. The free tier covers 500 words per day, which is enough to validate the workflow before you decide whether to upgrade.
You can run both. They do not conflict. WSR can be enabled for voice commands when you want OS control, and StarWhisper can be activated by hotkey for any actual writing. Several users do exactly this and the combination works.
WSR has three genuine strengths that StarWhisper does not match. First, it has built-in voice commands for navigating Windows: "open Notepad", "scroll down", "click File", "switch to Chrome". For accessibility users who rely on voice for OS-level control, this is a real advantage, and StarWhisper does not try to replace it. Second, it ships with Windows and requires no separate install, account, or download; for a one-off task on a borrowed machine, that matters. Third, it has been around for nearly two decades, so there is a large catalog of community-built voice macros, documentation, and tutorials.
If voice control of Windows is your primary need, or if you need a built-in zero-install option for a specific machine, WSR is the right tool. StarWhisper is positioned as the modern upgrade for text dictation specifically. The two roles overlap but are not the same. If the legacy dictation engine is what frustrates you, see our page on why Windows dictation is so bad for additional context, and look at StarWhisper vs Dragon if you are evaluating against the other legacy player in this space.