Claude Code Workflow

Voice Typing for Claude Code: Dictate Long Prompts on Windows

Claude Code rewards detailed prompts. "Plan a refactor of the auth flow, replace JWT with sessions, keep backward compatibility for 30 days" works better than "fix auth." Dictating long prompts is the difference between one and ten iterations per hour.


The CLI agent renaissance needs a voice layer

Claude Code, Codex CLI, Aider. The terminal is back as the highest-leverage coding interface. Voice keeps up with it.

For terminal-native coders

Dictate at 150 WPM into Claude Code

Claude Code does its best work when you describe context, constraints, files in scope, and acceptance criteria. That is a paragraph. Talk it through instead of typing it.

  • Works in Claude Code, Codex CLI, Aider, Continue CLI
  • Any Windows shell: PowerShell, cmd, Windows Terminal, WSL
  • Whisper handles tech vocabulary cleanly
  • Local model, prompts never leave your machine
  • Free plan, or Pro at $10/month or $80/year

When to type instead

Short commands and shell pipes

For a five-word git command or a fast cd, your hands are faster than your voice. The terminal-with-voice win is in the long agent prompts, not the regular shell.

  • git, ls, cd, cat, grep: just type
  • Multi-line agent prompts: dictate
  • SSH sessions and REPL chats: dictate
  • Database CLI tools like psql: dictate
  • Quick one-line fixes: type

Six things to dictate into Claude Code

The prompts that move work forward fastest are also the ones most painful to type

Multi-paragraph task descriptions

"Refactor the auth module to use sessions instead of JWT, with a 30-day compat window for existing tokens, update the middleware, write migration tests, leave a TODO at every callsite that needs review." Talk that out in 20 seconds.

Bug repro and context dumps

Claude Code can debug, but only with context. Describe the symptom, what you tried, the relevant files, and the suspected cause. A four-paragraph context dump dictated is a one-shot fix; the terse version is three round trips.

Plan-then-execute prompts

The "plan a refactor of X, do not write code yet, just outline the steps" pattern is high-value but only if you can express the constraints. Dictate the plan request, review the plan, dictate the green light to execute.

Code review and explanation requests

"Read the diff between main and feature-foo, tell me what behavior changed, flag anything that looks risky, suggest the next refactor that would make this safer." Long, specific prompts get long, specific answers.

Documentation and migration guides

Asking Claude Code to write README sections, ADRs, or migration notes is a long prompt because the requirements list is long. Dictation makes it cheap to actually list every section, audience, and tone constraint.

Iterative follow-ups during a session

The second and third prompts in a Claude Code session are usually short clarifications. Dictation is fastest for the long initial prompt, but it also smooths out the follow-up rhythm without changing the input mode mid-task.

Why Claude Code rewards long prompts

Claude Code is Anthropic's CLI agent for coding. It runs as a Node command-line tool, takes prompts from your terminal, and uses a Claude model with tool access to read, write, and edit files in your project. It is one of a wave of terminal-native AI coding tools that arrived in 2025 and 2026, along with Codex CLI from OpenAI, Aider, Continue's CLI mode, and others.

Every one of these agents shares a property: the quality of its output is strongly correlated with the length and specificity of your prompt. "Fix the auth bug" produces guesswork. "The session cookie is being cleared when the user has the remember-me checkbox set; the relevant code is in auth/middleware.ts and the login form is in components/LoginForm.tsx; please trace why this might happen, propose the fix, and add a regression test" produces a one-shot fix.

That difference is the central reason developers want voice typing for these tools. The first prompt is four words. The second is closer to 50. Typing 50 words takes about 50 seconds at 60 words per minute. Dictating them at 150 words per minute takes about 20 seconds. Compound that across dozens of prompts a day and the time savings are real, but the bigger win is that you actually write the long version more often because the cost dropped. StarWhisper is the Windows desktop dictation layer that makes this practical.

How voice typing fits the terminal AI stack on Windows

The architecture of StarWhisper makes terminal use trivial. The app sits in your system tray, captures audio when you hold a hotkey, runs the audio through OpenAI Whisper locally, and pastes the resulting text into whatever Windows field has focus. Terminal panes are Windows text input surfaces, so they accept the pasted text exactly as if you had typed.

Concretely, that means voice typing works in:

  • Claude Code running in any Windows shell: PowerShell, cmd, Windows Terminal, Git Bash, WSL.
  • Codex CLI, OpenAI's terminal coding agent, in any of the same shells.
  • Aider, the git-native AI pair programmer that operates as a CLI tool.
  • Continue CLI mode, Cody at the command line, and any other terminal-based coding agent.
  • Integrated terminals inside Cursor, VS Code, Windsurf, Zed, and JetBrains IDEs.
  • SSH sessions to remote dev boxes where you run any of the above against a remote project.

There is no per-tool integration. There is no plugin to install in each agent. There is no API key to manage in the dictation tool. The same hotkey works everywhere because the dictation happens before the tool sees the input.
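
The hold-to-talk flow described above can be sketched as a small state machine. This is an illustration only, not StarWhisper's actual implementation: the class name `PushToTalk` is hypothetical, and `transcribe` and `paste` are stand-in callables for a local Whisper call and a paste into the focused window.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PushToTalk:
    """Hypothetical hold-to-talk loop: buffer audio while the key is held,
    then transcribe and paste once on release."""

    transcribe: Callable[[bytes], str]  # stand-in for a local speech-to-text call
    paste: Callable[[str], None]        # stand-in for pasting into the focused window
    _chunks: List[bytes] = field(default_factory=list, init=False)
    _recording: bool = field(default=False, init=False)

    def on_hotkey_down(self) -> None:
        # Start a fresh recording each time the key goes down.
        self._chunks.clear()
        self._recording = True

    def on_audio_chunk(self, chunk: bytes) -> None:
        # Audio that arrives while the key is up is ignored.
        if self._recording:
            self._chunks.append(chunk)

    def on_hotkey_up(self) -> None:
        if not self._recording:
            return
        self._recording = False
        text = self.transcribe(b"".join(self._chunks)).strip()
        if text:
            # The target application only ever sees plain text.
            self.paste(text)


# Usage with fake audio and a fake transcriber, so the sketch runs anywhere.
pasted: List[str] = []
ptt = PushToTalk(
    transcribe=lambda audio: f" {len(audio)} bytes heard ",
    paste=pasted.append,
)
ptt.on_audio_chunk(b"ignored")  # key not held yet
ptt.on_hotkey_down()
ptt.on_audio_chunk(b"abc")
ptt.on_audio_chunk(b"de")
ptt.on_hotkey_up()
print(pasted)  # ['5 bytes heard']
```

Because the only output is text handed to `paste`, nothing in the loop needs to know whether the focused window is Claude Code, another CLI agent, or Notepad.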

Whisper accuracy on technical vocabulary

The standard concern with voice dictation for coders is that older speech recognition systems mangled framework names, library names, and product names. This was largely a function of the training data: systems trained on news and general English transcripts had no exposure to tech vocabulary. OpenAI's Whisper was trained on 680,000 hours of audio collected from the web, a corpus broad enough to include technical podcasts, conference talks, and tutorial videos, so the vocabulary baseline is much higher.

In practice, the names that come up in everyday backend, frontend, and ML work land correctly: React, Vue, Svelte, Next.js, Vite, Astro, Express, FastAPI, Django, Flask, Spring Boot, Rails, Phoenix, Postgres, MySQL, SQLite, MongoDB, Redis, Kafka, RabbitMQ, Docker, Kubernetes, Helm, Terraform, Ansible, Pulumi, TensorFlow, PyTorch, NumPy, Pandas, scikit-learn, Hugging Face, Anthropic, OpenAI, GitHub Actions, GitLab CI. Pro users on NVIDIA GPU paths get the medium or large Whisper model, which handles edge cases noticeably better than smaller models.

Names that are very new or very niche sometimes need a one-word correction. Newer LLM-related projects, internal company codenames, and unusual CLI tool names occasionally come out phonetically rather than with the exact spelling. The good news is that within a couple of days you learn to pronounce the words you say most often in a way Whisper transcribes cleanly. For everything else, fixing one word takes a second.
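
If the same few names keep coming out phonetically, one option is a small correction pass over text you have dictated into a scratch file. The sketch below is a hypothetical helper, not a StarWhisper feature, and the mis-hearings in the example map are invented for illustration.

```python
import re
from typing import Mapping


def fix_terms(text: str, corrections: Mapping[str, str]) -> str:
    """Replace whole-phrase, case-insensitive matches with their exact spelling."""
    if not corrections:
        return text
    # Longest phrases first so a longer entry wins over a shorter prefix.
    keys = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in corrections.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], text)


# Invented examples of phonetic output mapped to the intended tool names.
CORRECTIONS = {
    "p sequel": "psql",
    "mongo s h": "mongosh",
}

print(fix_terms("Connect with p sequel, then check Mongo S H.", CORRECTIONS))
# Connect with psql, then check mongosh.
```

The lookbehind and lookahead keep the pass from rewriting the middle of a longer word, so "help sequel" is left alone.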

The pure CLI workflow, end to end

Here is what a typical Claude Code session looks like with voice typing on Windows:

  • Open Windows Terminal, navigate to your project directory.
  • Launch Claude Code with the CLI command.
  • Click into the terminal pane so it has focus.
  • Hold the StarWhisper hotkey, dictate your prompt: "Plan a refactor of the user model to add soft delete support. Cover the migration, the active record callbacks, the queries that need to filter out deleted records, and any background jobs that touch users."
  • Release the hotkey. The text auto-pastes into the Claude Code prompt.
  • Press Enter to send.
  • Read the plan. Approve, push back, or refine with another dictated message.

The interaction model is the same as typing a prompt, except faster and with less hand strain. Long prompts that you would not have typed because they were too tedious become trivial to send. The cumulative effect across a day is that your interactions with Claude Code become more deliberate and more specific.

Why local Whisper matters when dictating about code

Prompts you send to Claude Code describe your codebase. If the codebase is proprietary, the prose describing it is proprietary too. Sending that prose to a third-party cloud transcription service raises the same security review questions that come up around pasting code into a public LLM: where does the audio go, who has access, how long is it retained, can the vendor train on it.

StarWhisper runs Whisper locally on your CPU or GPU. The audio never leaves the machine. There is no transcription cloud, no audio retention period, no third-party vendor to audit on this dimension. If your laptop is on a plane with the WiFi off, dictation still works. That is structurally easier to defend in a security review than the standard cloud-dictation "we delete after 30 days" posture, because there is nothing to delete.

Cloud Mode, which sends audio to the OpenAI Whisper API for faster results on weaker hardware, is opt-in and disabled by default. For dictation about proprietary code or sensitive prompts, leave it off. On any modern NVIDIA GPU the local model is fast enough that there is rarely a performance reason to enable Cloud Mode. For more context on the privacy model, see the local vs cloud Whisper FAQ.

The speed math for typical Claude Code prompts

Prompt type              | Words | Typing (60 WPM) | Voice (150 WPM)
Short clarification      | 25    | 25 sec          | 10 sec
Bug report with context  | 150   | 2 min 30 sec    | 1 min
Refactor plan request    | 200   | 3 min 20 sec    | 1 min 20 sec
Multi-file change spec   | 400   | 6 min 40 sec    | 2 min 40 sec
Migration design brief   | 600   | 10 min          | 4 min

The voice column is raw speaking time at 150 WPM and assumes the dictated output is roughly 90% usable; the quick edit pass on the remainder adds a little on top. The point is not the precise minutes saved but the cumulative shift across a day. If you send a dozen long prompts to Claude Code in a working day, voice typing recovers around 30 to 45 minutes of focus time you would otherwise have spent at the keyboard.
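
The arithmetic behind the table can be reproduced in a few lines. This is a minimal sketch that assumes steady rates of 60 WPM typing and 150 WPM speaking, as the table does, and ignores any edit pass; the function names are illustrative.

```python
def seconds_to_enter(words: int, words_per_minute: int) -> float:
    """Seconds needed to produce `words` at a steady rate."""
    return words * 60 / words_per_minute


def format_duration(seconds: float) -> str:
    """Render seconds as 'M min S sec', dropping a zero part."""
    minutes, secs = divmod(round(seconds), 60)
    parts = []
    if minutes:
        parts.append(f"{minutes} min")
    if secs or not minutes:
        parts.append(f"{secs} sec")
    return " ".join(parts)


# Word counts taken from the table; the rates are the table's assumptions.
PROMPTS = {
    "Short clarification": 25,
    "Bug report with context": 150,
    "Refactor plan request": 200,
    "Multi-file change spec": 400,
    "Migration design brief": 600,
}

for name, words in PROMPTS.items():
    typed = seconds_to_enter(words, 60)
    spoken = seconds_to_enter(words, 150)
    print(
        f"{name}: type {format_duration(typed)}, "
        f"dictate {format_duration(spoken)}, "
        f"save {format_duration(typed - spoken)}"
    )
```

Swapping in your own typing and speaking rates gives a more honest estimate than the round numbers used here.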

Setup and first day with Claude Code plus voice

Install StarWhisper from the download page or the Microsoft Store. The installer auto-detects whether you have an NVIDIA GPU and picks the right pack: CPU, CUDA 11, or CUDA 12. First launch downloads the model files. After that the app lives in your system tray and listens for a hotkey.

Pick a hotkey that does not collide with anything in your terminal or shell. Right-side modifier keys (Right Ctrl, Right Alt) are good defaults. Mouse side buttons and USB foot pedals are popular for developers who already use programmable peripherals. Test by opening a Notepad window first, then move to your Claude Code session once you have confirmed the dictation flow.

For the first week, use voice for the long prompts only: the bug reports with context, the refactor plans, the multi-file change specs. Build the habit there because the wins are largest. Once that feels natural, extend to the medium-length prompts: clarifications, code review requests, documentation prompts. Most developers settle into a stable pattern within two weeks, after which voice typing becomes one of those tools you only notice when it is not available. For a broader overview across the whole AI coding stack, see voice typing for coding.

Frequently Asked Questions

Does Claude Code have a built-in voice mode?
No. Claude Code is Anthropic's CLI agent for coding, distributed as a Node command-line tool. It accepts text prompts from stdin and runs them through a Claude model with tool access for file editing and shell commands. There is no built-in microphone capture or voice input feature. The standard way to get voice into Claude Code on Windows is a desktop dictation layer that types into the terminal, the same way you would type a prompt by hand. StarWhisper fills that role.
Does it work in PowerShell, cmd, and Windows Terminal?
Yes. PowerShell, cmd, and Windows Terminal are all standard Windows text input surfaces. StarWhisper pastes the transcribed text when you release the dictation hotkey, and the focused terminal receives it exactly as if you had typed it. This means Claude Code running inside any of those shells gets the dictated prompt at its input. Windows Terminal users with multiple panes can dictate into whichever pane has focus, and switching panes does not require any reconfiguration of the dictation tool.
What about WSL? Does Claude Code in WSL work?
Yes. Whether you run Claude Code in a native Windows shell or in WSL through Windows Terminal, the WSL pane is still a Windows text input from the operating system's perspective. StarWhisper types into the pane and WSL receives the keystrokes through the standard PTY. If you have Claude Code installed inside Ubuntu, Debian, or another WSL distro, dictating prompts at it works the same as for the Windows-native version. The same applies to WSL panes inside VS Code or Cursor's integrated terminal.
Can I dictate multi-line prompts to Claude Code?
Yes, with a small workflow note. Claude Code accepts multi-line input through its built-in editor mode, usually triggered with a slash command or Shift+Enter depending on your terminal. Dictate the whole prompt with the StarWhisper hotkey held down, release to let the text paste in, then submit. For very long prompts, many developers dictate first into a scratch file in their editor, review and edit, then paste into Claude Code. This gives you a chance to fix any small Whisper transcription artifacts before sending.
How accurate is Whisper on technical terms?
Whisper was trained on a large web corpus including a lot of technical content, so common framework and library names land cleanly. React, Vue, Next.js, FastAPI, Django, Postgres, Redis, Kubernetes, Terraform, PyTorch, TensorFlow, scikit-learn all come out correctly in normal use. Pro users on GPU get the medium or large Whisper model, which handles tech vocabulary noticeably better than smaller models. Newer or obscure project names sometimes need a one-word correction, but the overall accuracy is well above what older speech recognition systems achieved.
Can I dictate actual code with this?
Yes, for the most part. Whisper transcribes spoken English, so dictating prose works well. Dictating literal code with punctuation like brackets, colons, and underscores also works, but the output is less clean. For code blocks, most developers find it more efficient to describe what they want in prose and let Claude Code write the actual code. For example, dictate "add a Python function that takes a list of file paths, returns a dict mapping each path to its sha256 hash, and skips paths that do not exist" and let Claude Code produce the function. The voice handles the intent; Claude Code handles the syntax.
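
For a sense of scale, here is one plausible shape for the function that dictated sentence describes. It is an illustrative sketch, not output captured from Claude Code, and the name `hash_files` is an assumption.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable


def hash_files(paths: Iterable[str]) -> Dict[str, str]:
    """Map each existing file path to its SHA-256 hex digest."""
    digests: Dict[str, str] = {}
    for raw in paths:
        path = Path(raw)
        # Skip paths that do not exist (directories are skipped as well).
        if not path.is_file():
            continue
        hasher = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in blocks so large files are not loaded into memory at once.
            for block in iter(lambda: handle.read(65536), b""):
                hasher.update(block)
        digests[raw] = hasher.hexdigest()
    return digests
```

The spoken request is one sentence; the typed equivalent is about fifteen lines of punctuation-heavy code, which is the part worth leaving to the agent.
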
Does this work for Codex CLI, Aider, and other terminal agents?
Yes. Codex CLI (OpenAI's terminal coding agent), Aider, Continue's CLI mode, Cody at the command line, and any other terminal-based coding agent all accept the same dictated-then-pasted input. StarWhisper does not know or care which CLI tool you have running. If the terminal accepts pasted text at the prompt, it accepts dictated text. This is the same architectural reason the tool works equally well for SSH sessions, REPL prompts in Python or Node, and database CLI tools like psql or mongosh.
What is the cost?
StarWhisper has a free tier of 500 words per day and 3,500 per week, which is enough to dictate roughly one or two long prompts a day. The Pro plan is $10 per month or $80 per year for unlimited dictation, with no daily or weekly cap. There is a 7-day free trial of the Pro plan to test heavier use without paying. The Whisper engine runs locally on your machine; there is no per-minute transcription cost from a cloud vendor. Pricing is documented in full in the pricing section of the StarWhisper homepage.
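
To check the "one or two long prompts a day" estimate against your own habits, a rough budget calculation helps. The sketch below assumes the cap is a simple running word count; how StarWhisper actually counts words or enforces the limit is not described here, and the function name is hypothetical.

```python
from typing import Iterable, List

FREE_DAILY_WORDS = 500  # free-tier daily cap quoted in the answer above


def prompts_within_cap(
    word_counts: Iterable[int], cap: int = FREE_DAILY_WORDS
) -> List[int]:
    """Return the leading prompts, in order, whose running total fits the cap."""
    accepted: List[int] = []
    used = 0
    for words in word_counts:
        if used + words > cap:
            break
        accepted.append(words)
        used += words
    return accepted


# A refactor plan, a bug report, then a multi-file spec (sizes from the speed table).
print(prompts_within_cap([200, 150, 400]))  # [200, 150]
```
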

Dictate Long Prompts to Claude Code Today

Free 500 words per day. No credit card. Audio never leaves your device.

Download StarWhisper for Windows