Cursor IDE Workflow

Voice Typing for Cursor IDE: Dictate Prompts and Code on Windows

Cursor is the AI-native IDE. Power users want voice input for Composer, the chat sidebar, and inline edits. Can't use the Mac-only voice tools? No problem. StarWhisper sits in the Windows tray and types into any focused field, including Cursor.

"Refactor LoginForm to use the new AuthContext hook and drop the legacy localStorage logic..."

The Cursor power user's missing voice layer on Windows

Cursor changed how developers write code. Voice changes how fast you talk to it.

For the Cursor power user

Talk to Composer, type with your IDE

Composer needs long, specific prompts to do its best work. Typing a 400-word task description slows the iteration cycle. Dictating one keeps you in the loop while Cursor produces the diff.

  • Works in the chat sidebar, Ctrl+K inline edit, and Composer
  • Also works in Cursor's integrated terminal and any agent CLI
  • Local Whisper, audio never leaves your machine
  • NVIDIA GPU acceleration, with sub-second turnaround on high-end cards
  • $10/month or $80/year, or 500 words/day free

What this is not

Not a Cursor plugin, by design

StarWhisper is an OS-level dictation layer, not a Cursor extension. That sounds like a limitation; it is actually the point. Cursor ships breaking changes often. A plugin breaks. A keyboard layer does not.

  • No Cursor extension to install or maintain
  • Survives every Cursor release without re-testing
  • Also works in VS Code, Windsurf, JetBrains, Zed
  • Works in chat boxes outside the IDE too
  • Voice features are not gated by Cursor tier

Six places voice typing helps inside Cursor

The Cursor workflow has more prose surfaces than most developers realize

Composer task descriptions

Composer is at its best when you tell it the why, the constraints, and the files in scope. That is a paragraph or two. Dictating that paragraph keeps your hands free while you read the code you are about to refactor.

Chat sidebar conversations

Asking Cursor's chat to explain a function, suggest a refactor, or debug a stack trace is a conversation, not a single shot. Voice keeps the back-and-forth fast, especially when each follow-up is two or three sentences of context.

Ctrl+K inline edits

The inline edit popup at Ctrl+K is the fastest path from intent to diff. Click into the popup, hold the StarWhisper hotkey, dictate the change, release, hit Enter. The whole loop takes seconds for a one-sentence instruction.

Terminal CLI agents inside Cursor

Cursor's integrated terminal is where many developers now run Claude Code, codex, or Aider. The prompt input for each agent is just stdin. StarWhisper dictates straight into the terminal pane, same as any other Windows text field.

Comments and docstrings inside files

Cursor accepts dictation into the editor surface as well. Position your cursor on the line above a function, dictate the explanation, edit a few words, save. Comments stop being the part of the codebase that nobody writes.

Git commit messages and PR descriptions

The Cursor source control panel, the integrated terminal git commit prompt, and the GitHub web PR form all accept dictation. A two-paragraph commit message that takes 90 seconds to type at 60 words per minute takes roughly 35 to 40 seconds to speak at 130 to 160 words per minute.

Why Cursor and voice typing fit together so well

Cursor is structured around a few text input surfaces that you interact with all day: the chat sidebar, the Ctrl+K inline edit popup, the Composer pane, and the integrated terminal. Every one of those surfaces accepts prose, and every one of them rewards longer, more specific prose. StarWhisper is a Windows desktop app that adds dictation to any of them by listening for a hotkey and pasting transcribed text into the focused field.

The combination matters because Cursor moves the developer bottleneck. In a traditional editor, typing speed barely mattered, because the constraint was thinking, designing, and debugging. In Cursor, you spend more time describing what you want and less time hand-writing every line. The proportion of your day spent typing prose goes up. Dictation cuts that prose time by more than half, since the average developer types around 60 words per minute and speaks around 130 to 160 words per minute.
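As a rough illustration of that arithmetic, the sketch below compares typing and speaking time for a prompt of a given length. The 60 and 130 to 160 words-per-minute figures are the ones quoted above; the function and constant names are hypothetical, and the model ignores pauses, corrections, and transcription time.

```python
# Rough model only: steady rates, no pauses, no corrections, no transcription delay.
TYPING_WPM = 60             # typing rate quoted in the text
SPEAKING_WPM = (130, 160)   # speaking range quoted in the text (slow, fast)


def minutes_for(words: int, words_per_minute: float) -> float:
    """Minutes needed to produce `words` at a steady rate."""
    return words / words_per_minute


def compare(words: int) -> dict:
    """Typing time versus the slow and fast ends of the speaking range."""
    slow, fast = SPEAKING_WPM
    return {
        "typed": minutes_for(words, TYPING_WPM),
        "spoken_slow": minutes_for(words, slow),
        "spoken_fast": minutes_for(words, fast),
    }


if __name__ == "__main__":
    for words in (100, 400):
        r = compare(words)
        print(
            f"{words} words: typed {r['typed']:.1f} min, "
            f"spoken {r['spoken_fast']:.1f} to {r['spoken_slow']:.1f} min"
        )
```

Under these assumptions a 400-word prompt comes out at about 6.7 minutes typed versus 2.5 to 3.1 minutes spoken.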

There is no Cursor extension to install, no API key to manage, and no settings inside Cursor to configure. The integration happens at the Windows keyboard layer, which means it survives every Cursor update without any maintenance on your side. If a future Cursor release rearranges the sidebar or renames Composer, the same hotkey still dictates into whatever the new field is called.

What Cursor does and does not ship for voice on Windows

As of mid-2026, Cursor does not include a first-party voice input feature on Windows. The Cursor team has discussed voice in community channels and on Twitter, and there are sporadic experiments on the Mac side that wire up the system Dictation key or third-party menubar Whisper apps. None of that exists as a built-in Cursor feature on the Windows build.

The Windows path that actually works today is a desktop dictation layer. StarWhisper does exactly this: it captures audio when you hold a hotkey, runs it through OpenAI Whisper locally, and pastes the resulting text wherever your cursor is. Cursor sees the pasted text the same way it would see text from your keyboard. There is no plugin to install, no extension marketplace to navigate, and no Cursor API to integrate with.

This architecture is also why the same setup works in Claude Code, codex, Aider, Windsurf, VS Code, Zed, JetBrains IDEs, and any chat box in your browser. The layer that does the dictation is below the application layer, so every application gets it for free.
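The capture, transcribe, paste flow described above can be sketched as three pluggable steps. This is a minimal illustration of the pattern, not StarWhisper's actual code: the class name, the field names, and the stub functions are all hypothetical, and a real tool would plug in microphone capture, a local Whisper model, and an OS-level paste.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DictationPipeline:
    """Hypothetical three-step dictation layer: record, transcribe, paste."""

    record: Callable[[], bytes]         # audio captured while the hotkey is held
    transcribe: Callable[[bytes], str]  # local speech-to-text step
    paste: Callable[[str], None]        # inserts text into the focused field

    def run_once(self) -> str:
        """Run one dictation cycle and return the text that was pasted."""
        audio = self.record()
        if not audio:
            return ""  # nothing captured, so nothing to paste
        text = self.transcribe(audio).strip()
        if text:
            self.paste(text)
        return text


if __name__ == "__main__":
    pasted = []
    pipeline = DictationPipeline(
        record=lambda: b"\x00\x01",                       # stand-in for audio
        transcribe=lambda audio: " Refactor LoginForm ",  # stand-in for Whisper
        paste=pasted.append,                              # stand-in for OS paste
    )
    pipeline.run_once()
    print(pasted)
```

Nothing in the sketch refers to the target application. That is the property the text relies on: the editor only ever sees the pasted text.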

How the prompt loop actually feels with voice in Cursor

The most common Cursor workflow looks like this: you have a feature to build or a bug to fix, you open Composer or chat, you describe what you want, Cursor produces a diff, you review the diff, you ask for an adjustment, repeat. The typing parts of that loop are the describe step and each adjustment.

With voice, the describe step changes character. Instead of writing a terse one-line prompt because the long one is annoying to type, you naturally talk through the context: which files are in scope, what the current behavior is, what the desired behavior is, which edge cases matter. The prompt gets longer. The quality of the diff Cursor produces gets better. The number of adjustment rounds goes down.

Many developers report that the second-order effect, fewer adjustment rounds per task, is larger than the first-order effect of faster typing. Specifying the problem clearly the first time is more efficient than three rounds of underspecification followed by clarification. Voice removes the typing tax that pushes developers toward underspecification in the first place.

Setup for a Cursor developer on Windows

The full setup takes about three minutes. Install StarWhisper from the download page or the Microsoft Store listing. The installer auto-detects whether you have an NVIDIA GPU and picks the matching pack: CPU, CUDA 11, or CUDA 12. First launch downloads the Whisper model files. After that the app lives in your system tray and listens for the hotkey.

For Cursor specifically, choose a hotkey that does not collide with anything Cursor maps. The right-side modifier keys, especially Right Ctrl or Right Alt, are good defaults because Cursor's own shortcuts almost always use the left-side modifiers. Mouse side buttons are popular for developers who already use a programmable mouse. Some developers wire up a USB foot pedal because the hands-on-keyboard ergonomics are nicer than any keyboard chord.

Once the hotkey is set, the interaction is: click into a Cursor text field (sidebar, popup, Composer, terminal), hold the hotkey, dictate the prompt, release. The text auto-pastes into the focused field. Press Enter or click the submit button in Cursor exactly as you would for typed input. There is no special voice command grammar to learn and nothing in Cursor's interface that changes.
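The hold-to-talk interaction above amounts to a two-state machine: key down starts recording, key up stops it and hands off to transcription. The sketch below is one way to model that, assuming hypothetical `on_start` and `on_stop` callbacks. It is not StarWhisper's implementation and it does not hook real keyboard events.

```python
from enum import Enum, auto
from typing import Callable


class State(Enum):
    IDLE = auto()
    RECORDING = auto()


class HoldToTalk:
    """Hypothetical hold-to-talk controller driven by key down/up events."""

    def __init__(self, on_start: Callable[[], None], on_stop: Callable[[], None]):
        self.state = State.IDLE
        self._on_start = on_start
        self._on_stop = on_stop

    def key_down(self) -> None:
        # A held key usually produces repeated key-down events; only the
        # first one should start a recording.
        if self.state is State.IDLE:
            self.state = State.RECORDING
            self._on_start()

    def key_up(self) -> None:
        # Releasing the key ends the recording exactly once.
        if self.state is State.RECORDING:
            self.state = State.IDLE
            self._on_stop()


if __name__ == "__main__":
    events = []
    hotkey = HoldToTalk(lambda: events.append("start"), lambda: events.append("stop"))
    hotkey.key_down()
    hotkey.key_down()  # auto-repeat while held, ignored
    hotkey.key_up()
    print(events)
```

Ignoring repeated key-down events matters for the right-side modifier keys suggested above, since a held key would otherwise restart the recording on every repeat.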

Privacy: code-adjacent dictation should stay local

The prompts you type into Cursor describe your codebase. If the codebase is proprietary, then the prose describing it is proprietary too. Sending that prose to a third-party cloud transcription service raises the same security review questions that come up around pasting code into a public LLM: where does the audio go, who has access, how long is it retained, can the vendor train on it.

StarWhisper runs Whisper locally on your CPU or GPU. In the default local mode, your audio never leaves the machine. There is no transcription cloud, no audio retention period, no vendor relationship to audit on this dimension. If your laptop is offline, dictation still works. That is structurally easier to defend in a security review than any "we delete after N days" cloud posture, because there is nothing to delete.

Cloud Mode, which sends audio to the OpenAI Whisper API for faster turnaround on weaker hardware, is opt-in and disabled by default. For dictation about proprietary code or sensitive prompts, leave it off. On a modern NVIDIA GPU the local model is fast enough that there is rarely a performance reason to enable it.

Voice typing in Cursor versus other editors

The same dictation layer works equally well across the Windows AI editor ecosystem. Switching between Cursor and VS Code with Copilot Chat, or experimenting with Windsurf, or trying Zed on Windows, does not change the dictation workflow. The hotkey works the same way in every editor because the dictation happens before the editor sees the input.

This matters because the AI editor market is moving fast. Cursor is dominant today; Windsurf has serious momentum; Zed is building toward similar territory; JetBrains has its own AI Assistant; Visual Studio added Copilot Chat. Betting on each editor shipping its own voice feature is a slow path. Using a dictation layer that works across all of them is a fast path. For a broader take on the developer use case across editors, see the voice typing for coding overview, which covers the cross-editor pattern in more depth.

When voice is the wrong tool, even in Cursor

Voice typing is for prose. Single-line edits to a function body, quick syntax fixes, variable renames, and most refactor mechanics are still faster typed or done via Cursor's own refactor tools. The wins are real but they are in the prose surfaces: prompts, comments, commit messages, chat, PR descriptions, Slack threads explaining your design choice. If your Cursor workflow today is mostly hand-typing small edits and rarely using Composer or the chat sidebar, the upside is small.

If your Cursor workflow leans heavily on Composer for multi-file edits, on chat for design discussions, on Ctrl+K for inline transforms, and on terminal CLI agents for repo-wide tasks, voice typing pays back its setup time the first day. The richer the prose surface you live in, the more voice helps. For another related workflow, see how to use voice to text with ChatGPT, which covers the same pattern for browser-based chat tools.

Frequently Asked Questions

Does Cursor IDE have built-in voice input?
Cursor has experimented with voice features and there are community discussions about adding voice mode, but as of mid-2026 there is no first-party voice typing inside the Cursor desktop app on Windows. Some macOS users wire up the system Dictation key or third-party tools such as Whisper-based menubar utilities, but those do not exist as a Cursor feature on Windows. The standard workaround is a desktop dictation layer that types into any focused text field, which is the role StarWhisper plays.
Does this work in Cursor Pro and free tier alike?
Yes. StarWhisper is independent of which Cursor plan you have. It types into whatever text field is focused on screen, so the chat sidebar, Composer pane, and inline edit popup all accept dictation regardless of whether you are on Cursor Hobby, Pro, or Business. There is no API call to Cursor and no plugin installed on the Cursor side, so the app has no way to detect your tier and no reason to. The only plan that matters for the dictation itself is the StarWhisper plan, and the free tier of 500 words per day is enough for several prompts a day before upgrading.
Does it work with Cursor's inline AI features like Ctrl+K?
Yes. Press Ctrl+K (Cmd+K on macOS) to open the inline edit popup, click into the prompt input, then hold the StarWhisper hotkey and dictate your instruction. Release the hotkey and the text auto-pastes into the popup. Hit Enter and Cursor runs the edit. The same flow works for the chat sidebar and Composer. Because the integration is at the keyboard layer, there is no UI to break when Cursor releases a new version; if the popup accepts typed text, it accepts dictated text.
Can I dictate variable names, file paths, or literal code?
You can, but it is rarely the right tool. Whisper transcribes spoken English, so dictating snake_case_variable_names or src/components/Auth/Login.tsx produces uneven results that need cleanup. The pattern that works is letting Cursor write the code while you describe what you want. For example, dictate 'Refactor the LoginForm component to use the new AuthContext hook and remove the legacy localStorage logic' and let Composer produce the actual diff. Voice handles the prose around the code; Cursor handles the code itself.
What is the latency from speaking to text in the Cursor prompt box?
Latency depends on the audio length and the Whisper model running on your machine. With an NVIDIA GPU and the medium model, a 30-second prompt usually takes about 2 to 4 seconds to transcribe after you release the hotkey. On CPU, the same prompt takes around 8 to 15 seconds. On an RTX 4090 with the small model, latency can drop under a second. For Cursor prompts, which are often 100 to 400 words, the dictation plus transcription plus auto-paste cycle is faster than typing the same prompt from scratch even on CPU.
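To sanity-check the "faster even on CPU" claim, the sketch below adds transcription time to speaking time and compares the total with typing the same words. It uses only the figures quoted on this page (60 words per minute typing, 130 to 160 speaking, 2 to 4 seconds on GPU, 8 to 15 seconds on CPU for a 30-second prompt). The function names are hypothetical and auto-paste time is assumed to be negligible.

```python
TYPING_WPM = 60  # typing rate quoted elsewhere on this page


def words_spoken(speech_seconds: float, speaking_wpm: float) -> float:
    """Approximate word count for a stretch of continuous speech."""
    return speech_seconds * speaking_wpm / 60


def typing_seconds(words: float) -> float:
    """Seconds needed to type `words` at the quoted typing rate."""
    return words * 60 / TYPING_WPM


def dictation_seconds(speech_seconds: float, transcribe_seconds: float) -> float:
    """Speaking time plus transcription time; paste time assumed negligible."""
    return speech_seconds + transcribe_seconds


if __name__ == "__main__":
    speech = 30  # the 30-second prompt from the answer above
    for label, latency in (("GPU, medium model", 4), ("CPU", 15)):
        words = words_spoken(speech, 130)  # slow end of the speaking range
        print(
            f"{label}: dictation {dictation_seconds(speech, latency):.0f} s "
            f"vs typing {typing_seconds(words):.0f} s for about {words:.0f} words"
        )
```

Even with the slowest figures in each range (130 words per minute and 15 seconds of CPU transcription), the dictated cycle takes 45 seconds against 65 seconds of typing.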
Does it work in Cursor's built-in terminal?
Yes. Cursor's integrated terminal is a standard Windows terminal pane that accepts pasted input. Open the terminal, click into it, hold the hotkey, dictate the command or message, release. The text appears at the cursor. This is useful when you are running a CLI agent like Claude Code, codex, or Aider from inside Cursor's terminal, since the agent's prompt input is just stdin. The same pattern works in PowerShell, cmd, Git Bash, and WSL panes opened inside Cursor.
What about VS Code, Windsurf, Zed, or other editors?
StarWhisper does not care which editor you use because it operates at the Windows input layer rather than as a plugin. VS Code with GitHub Copilot Chat, Windsurf, Zed on Windows, JetBrains IDEs with AI Assistant, and Visual Studio with Copilot all accept dictation into any text field including chat panes, inline prompts, comment boxes, and terminals. If you switch between Cursor and another editor during the day, the same hotkey works in both. This is one of the main reasons developers use a global dictation tool rather than betting on each editor shipping its own voice feature.
Is voice typing actually useful for vibe coding?
Yes, and this is the use case that converts most developers. Vibe coding means describing what you want in natural language and letting the LLM produce the code. The bottleneck moves from typing skill to specification quality. Long, specific prompts produce better Composer or chat output than short vague ones, but long prompts are slow to type. Dictating at 130 to 160 words per minute removes the typing tax on writing detailed prompts. Most developers who try voice dictation for Cursor find they iterate more often per hour, with better prompts each time.

Try Voice Typing in Cursor Today

Free 500 words per day. No credit card. Audio never leaves your device.

Download StarWhisper for Windows