Dictate target-language drafts faster than you type them. Works inside memoQ, Trados, Smartcat, Phrase, and DeepL. Local processing keeps client documents off the cloud.
Faster drafts in any CAT tool, in 96 languages, without sending text to the cloud.
Whisper handles the common translator pairs: EN-DE, EN-ES, EN-FR, EN-IT, EN-PT, EN-NL, EN-JA, EN-ZH, EN-KO, EN-RU, EN-AR, and 85 more. Strong on Romance, Germanic, and East Asian languages.
memoQ, SDL Trados Studio, Phrase TMS, Smartcat, Wordfast, OmegaT, MateCat, CafeTran, Across. Keystroke-based input means no plugin required and no broken integrations after CAT updates.
Bind one hotkey per target language: F7 for English, F8 for German, F9 for French, and so on. Switching languages mid-document is one keystroke, not a menu hunt.
Audio stays on your PC in Local Mode. Confidential legal, patent, M&A, and pre-launch product translations never touch a cloud server. Compliant with most standard translator NDAs.
Works inside DeepL's editor, Google Translate web, and any browser-based MT post-editing interface. Useful for the final human-review pass on machine translation output.
Transcribe source audio to timestamped SRT, then translate the file in your subtitling tool. Combined with dictation for the target text, subtitling projects typically take 30 to 50 percent less time than typing every subtitle.
Professional translators have used voice dictation for decades. The reason is simple arithmetic. Typing speed in your second or third language is usually slower than in your native language, often dramatically so. A translator who types 80 words per minute in English may only type 50 to 60 in German or French because the muscle memory for the keyboard layout, special characters, and language-specific punctuation has been built up over far fewer years. Speaking speed, by contrast, is roughly the same across all languages you are fluent in. You can speak German at the same pace you speak English: around 150 words per minute.
That gap (50 wpm typed vs 150 wpm spoken in the target language) is why translator dictation tools like Dragon Professional with the language pack have been a staple of the profession since the late 1990s. The new variable is that StarWhisper brings the same productivity gain at one-tenth the price, with the bonus that local processing means no audio ever leaves the laptop. For translators who handle legal, patent, medical, or corporate confidential work, the local-only constraint is not a nice-to-have; it is contractually required.
The accuracy story has also shifted. Whisper's multilingual training corpus is dramatically larger than the data behind the per-language models that powered Dragon's pre-2020 architecture. For German, Spanish, French, Italian, Portuguese, Dutch, Japanese, Mandarin, Korean, Russian, and Arabic, Whisper typically transcribes professional-grade source material at 95 to 98 percent word accuracy. That is comparable to a fast bilingual typist, and the corrections needed are usually one or two words per segment, not the constant fixing required by earlier-generation tools.
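To put the speed and accuracy figures above side by side, here is a rough arithmetic sketch. The 50 and 150 wpm rates and the 95 to 98 percent accuracy range come from the text; the job size and average segment length are illustrative assumptions, and the calculation covers raw text entry only, not thinking or research time.

```python
# Rough arithmetic for the typing-versus-dictation gap described above.
# JOB_WORDS and SEGMENT_WORDS are illustrative assumptions, not source figures.

def drafting_minutes(words: int, words_per_minute: float) -> float:
    """Minutes needed to enter `words` at a steady pace."""
    return words / words_per_minute


def expected_corrections(words: int, word_accuracy: float) -> float:
    """Expected number of misrecognized words at a given word accuracy."""
    return words * (1.0 - word_accuracy)


JOB_WORDS = 3000     # assumed job size
SEGMENT_WORDS = 25   # assumed average segment length

typed = drafting_minutes(JOB_WORDS, 50)      # typing in the target language
spoken = drafting_minutes(JOB_WORDS, 150)    # dictating in the target language
print(f"typed: {typed:.0f} min, dictated: {spoken:.0f} min")

for accuracy in (0.95, 0.98):
    fixes = expected_corrections(SEGMENT_WORDS, accuracy)
    print(f"{accuracy:.0%} accuracy: about {fixes:.2f} fixes per segment")
```

Under these assumptions the entry time drops from about an hour to about twenty minutes, and the expected correction load stays near one word per segment, which is in line with the "one or two words per segment" estimate.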
The standard StarWhisper setup for a working translator is one global hotkey per target language. For an EN-DE-FR translator, that means three hotkeys (English, German, French), each tied to the corresponding Whisper language mode. Open the Settings panel, find the Hotkeys section, and bind each language to a free function key. F7 for English, F8 for German, F9 for French is a common setup. F10 onward stay open for other languages added later.
Once bound, switching mid-document is one keystroke. Place the cursor in the target segment, press the hotkey for that language, speak, release. The text appears in the correct language. There is no menu to navigate and no confirmation dialog to dismiss. The hotkeys are global, meaning they work whether the foreground window is memoQ, Trados, a browser tab open to Smartcat, or a Word document for an unstructured translation.
For translators who work in more than three or four languages regularly, an alternative setup is to use a single hotkey for the dominant target language and use the menu picker for occasional secondaries. This keeps the keyboard real estate manageable. The trade-off is one extra click per language switch, which adds up if you actually work in five languages in the same day. Most multilingual translators settle on three hotkeys for their three highest-volume languages and the menu for everything else.
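The hotkey layout can be thought of as a small lookup table. The sketch below is purely illustrative: StarWhisper is configured through its Settings panel, and the dictionary, the `conflicts` helper, and the `rebind` helper are hypothetical names rather than the product's real configuration format. It shows one way to reason about a layout before binding it, including the clash with memoQ's F8 tag shortcut mentioned in the CAT walkthrough.

```python
# Illustrative only: this is NOT StarWhisper's real configuration format.

# Function key -> language code (two-letter codes as used by Whisper).
HOTKEYS = {"F7": "en", "F8": "de", "F9": "fr"}

# Keys a CAT tool already uses. The memoQ F8 entry comes from the text;
# the table structure itself is an assumption.
CAT_RESERVED = {"memoQ": {"F8"}}


def conflicts(hotkeys: dict[str, str], cat_tool: str) -> list[str]:
    """Return dictation hotkeys that collide with the CAT tool's shortcuts."""
    reserved = CAT_RESERVED.get(cat_tool, set())
    return sorted(key for key in hotkeys if key in reserved)


def rebind(hotkeys: dict[str, str], old_key: str, new_key: str) -> dict[str, str]:
    """Return a copy of the map with one language moved to a free key."""
    if new_key in hotkeys:
        raise ValueError(f"{new_key} is already bound")
    moved = dict(hotkeys)
    moved[new_key] = moved.pop(old_key)
    return moved


print(conflicts(HOTKEYS, "memoQ"))      # ['F8']
fixed = rebind(HOTKEYS, "F8", "F10")    # move German to the next free key
print(conflicts(fixed, "memoQ"))        # []
```

Keeping the map small is the point: three bound keys are easy to remember, and anything beyond that can stay in the menu picker.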
The basic flow in memoQ, Trados, Smartcat, or Phrase is identical because StarWhisper does not integrate with the CAT tool; it simply sends keystrokes to wherever the cursor is. Open your CAT tool, navigate to the segment you want to translate, click into the target field, press the language hotkey, dictate the translation, and release the hotkey. The text appears in the target field. Move to the next segment and repeat.
For tag-heavy segments (most marketing copy, software UI, technical documentation), the recommended approach is to dictate the prose first, then insert the source tags using the CAT tool's tag shortcut. Trados uses Ctrl-Alt-down for "Copy source to target tags". memoQ uses F8 (so pick a different StarWhisper hotkey if you use memoQ). Smartcat has a tag-insert panel. Once the prose is in, place the cursor where each tag belongs and insert it. Some translators reverse the order, inserting tags first and then positioning the cursor between them to dictate the prose; either way works.
Concordance lookups and TM matches are unaffected because StarWhisper does not change anything about the CAT tool's behavior. You still see your TM matches, terminology suggestions, and QA flags as normal. Dictation is purely an input method for the target-language text. Some translators find it useful to dictate the gist of the translation first, then use the CAT tool's edit-distance display to refine toward the TM match. Others dictate cold and ignore the TM panel until a polish pass.
Subtitling is one of the strongest use cases for translator dictation, because subtitle segments are short, time-constrained, and need to read naturally in the target language. The combined audio-transcription and dictation workflow looks like this: extract audio from the source video using any free tool (FFmpeg, Audacity, VLC export). Open StarWhisper, drag the audio file in, set the source language, export as SRT. You now have a timestamped source transcript.
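As a rough picture of the extract-and-export step above, the sketch below builds an FFmpeg command for audio extraction and shows what a timestamped SRT file looks like. StarWhisper performs the SRT export itself; this code only illustrates the file format. The file names are placeholders, and the segment shape (`start` and `end` in seconds plus `text`) is an assumption modeled on the output of the open-source Whisper package.

```python
# Sketch of the extract-then-export step. File names are placeholders and
# the segment dictionaries are an assumed shape, not StarWhisper's internals.

def ffmpeg_extract_command(video_path: str, audio_path: str) -> list[str]:
    """Build an FFmpeg command that drops the video stream (-vn) and
    writes mono (-ac 1), 16 kHz (-ar 16000) audio."""
    return ["ffmpeg", "-i", video_path, "-vn", "-ac", "1", "-ar", "16000", audio_path]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


demo = [
    {"start": 0.0, "end": 2.5, "text": " Guten Morgen."},
    {"start": 2.5, "end": 5.04, "text": " Wie geht es Ihnen?"},
]
print(" ".join(ffmpeg_extract_command("source.mp4", "source.wav")))
print(segments_to_srt(demo))
```

The command is only printed here, not executed; run it in a terminal with FFmpeg installed. The resulting SRT keeps the source timing, so the translation step only has to replace the text of each cue.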
Open your subtitling tool of choice (Aegisub for hobbyists, Subtitle Edit for free professional use, EZTitles or OOONA Studio for studio work) and load both the video and the SRT. Translate each subtitle by clicking into the target field and dictating the translation using StarWhisper's per-language hotkey. The subtitling tool handles timing, line breaks, character counts, and reading-speed warnings; StarWhisper handles the typing.
For a 30-minute video at typical subtitling density (around 400 to 500 subtitles), this combined workflow typically cuts the project time by 30 to 50 percent versus typing every subtitle. The bigger savings come on dialogue-heavy content where each subtitle is full sentences; the savings are smaller on technical or visual content where most subtitles are short identifiers. The subtitling-from-scratch how-to covers the full pipeline including the audio-extraction step.
Translator NDAs are typically stricter than NDAs in most other professions. A pre-release product manual, a draft of a merger announcement, an unfiled patent specification, or a sealed court document carries serious consequences for any leak. The standard NDA language for high-end translation work prohibits using any cloud service for processing the source or target text, with some agencies maintaining explicit blacklists of providers (Google Translate, DeepL's free tier, Otter, anything that uploads text to a third party).
Local-only dictation sidesteps the issue cleanly. StarWhisper in Local Mode runs OpenAI Whisper on your PC. The audio is processed by code running on your CPU or GPU. No audio is uploaded, no transcript is sent anywhere, no telemetry includes the dictation content. From the agency's perspective, this is the same as if you were typing the translation yourself. The dictation step is invisible to the NDA. The privacy and offline architecture page goes into the technical details of what "local-only" actually means.
Cloud Mode exists as an opt-in feature for non-NDA work. It sends audio to OpenAI's Whisper API for processing, which is faster on machines without a GPU but obviously cannot be used for confidential content. The setting is clearly labeled and stays off by default. For translators who work across both NDA and non-NDA projects, the safe default is to leave Cloud Mode disabled and only flip it on for explicitly non-sensitive work.
| Option | Cost | Language coverage | Local processing | Works in CAT tools |
|---|---|---|---|---|
| StarWhisper | Free / $10/mo | 96 languages | Yes (Local Mode) | Yes (keystroke) |
| Dragon Professional + language pack | $699 + ~$300/language | 1 language per pack | Yes | Yes |
| Dragon Anywhere | $15/mo | 6 languages, mobile only | No (cloud) | Limited |
| Windows Speech Recognition | Free | ~10 languages | Yes | Yes, low accuracy |
| Apple Dictation | Free, Mac/iOS only | ~40 languages | Mixed (depends on mode) | Mac CAT tools only |
The cost gap is most dramatic against Dragon. A translator working in three target languages historically needed Dragon Professional plus two additional language packs, which adds up to roughly $1,300 in upfront cost. StarWhisper covers all three languages (and the other 93) on the $10 per month Pro plan. For a working translator the Pro plan pays for itself the first day, since one hour saved per week at typical translator rates of $40 to $80 per hour already exceeds the monthly cost. The multi-language feature page details which languages have which accuracy class.
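The cost comparison reduces to a few lines of arithmetic. The prices below are the ones quoted in the table (with the language pack price being approximate), and the four-weeks-per-month simplification is an assumption made for this sketch.

```python
# Cost arithmetic using the prices quoted in the comparison table.
# WEEKS_PER_MONTH is a simplifying assumption.

DRAGON_BASE = 699              # Dragon Professional, USD
DRAGON_LANGUAGE_PACK = 300     # approximate price per additional language
STARWHISPER_PRO_MONTHLY = 10   # USD per month
WEEKS_PER_MONTH = 4


def dragon_upfront(target_languages: int) -> int:
    """Base licence plus one pack for each language beyond the first."""
    return DRAGON_BASE + DRAGON_LANGUAGE_PACK * (target_languages - 1)


def months_to_match(upfront: int, monthly: int) -> float:
    """Months of subscription that add up to the upfront price."""
    return upfront / monthly


def monthly_value_of_time(hours_saved_per_week: float, hourly_rate: float) -> float:
    """Value of the time saved in one month at a given hourly rate."""
    return hours_saved_per_week * hourly_rate * WEEKS_PER_MONTH


upfront = dragon_upfront(3)
print(f"Dragon, three languages: ${upfront}")
print(f"Months of Pro for the same money: {months_to_match(upfront, STARWHISPER_PRO_MONTHLY):.1f}")
for rate in (40, 80):
    print(f"1 h/week saved at ${rate}/h: ${monthly_value_of_time(1, rate):.0f} per month")
```

With these inputs the three-language Dragon setup comes to $1,299, which matches the "roughly $1,300" figure, and one saved hour per week is worth $160 to $320 per month against a $10 subscription.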