Vibe coding in Cursor goes faster when you dictate long, structured prompts to the AI instead of typing them. StarWhisper is a free local voice-to-text hotkey that types into Cursor's chat panel, inline edit, terminal, and any other text field on Windows.
From install to dictating directly into Cursor's chat panel.
Download StarWhisper from starwhisper.ai or the Microsoft Store. Run the installer and allow microphone access on first launch. The free plan covers 500 words per day, enough for several long coding prompts. Pro is $10 a month if you dictate all day.
Launch Cursor IDE and open the project folder you are working in. Pull up the chat panel with Ctrl+L, or open inline edit on the current selection with Ctrl+K. Either context accepts dictated prompts because both are normal Windows text inputs.
Place your cursor in Cursor's chat input or in the inline edit popup. StarWhisper types into whichever Windows text control currently has focus, so the cursor needs to be in the right spot before you press the hotkey. The same applies to the integrated terminal if you want to dictate a shell command.
Press and hold the global dictation hotkey. The mic indicator confirms recording is live. The default works for most setups and you can rebind it in StarWhisper Settings if it conflicts with a Cursor or VS Code shortcut you already use.
Dictate the full request naturally. Example: build a drag-drop file uploader, PNG and JPG only, ten megabyte max, with a progress bar and accessible aria labels. Release the hotkey, scan the transcribed text, fix any misheard identifiers, then press Enter to send the prompt to Cursor's model.
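As a rough sketch of how far the 500-word free allowance goes, the snippet below counts the words in the example prompt above and divides the daily allowance by that count. It assumes words are counted by splitting on whitespace; how StarWhisper actually counts words is not stated here, and the helper names are hypothetical.

```python
# Assumption: a "word" is any whitespace-separated token. StarWhisper's
# real counting rule may differ; this is only a budgeting estimate.
FREE_WORDS_PER_DAY = 500  # free-plan allowance stated in this guide


def word_count(prompt: str) -> int:
    """Count whitespace-separated tokens in a dictated prompt."""
    return len(prompt.split())


def prompts_per_day(prompt: str, daily_words: int = FREE_WORDS_PER_DAY) -> int:
    """How many prompts of this length fit in the daily allowance."""
    words = word_count(prompt)
    if words == 0:
        raise ValueError("prompt is empty")
    return daily_words // words


example = (
    "build a drag-drop file uploader, PNG and JPG only, ten megabyte max, "
    "with a progress bar and accessible aria labels"
)
long_prompt = " ".join(["word"] * 100)  # stand-in for a 100-word brief

print(word_count(example))           # 20
print(prompts_per_day(example))      # 25
print(prompts_per_day(long_prompt))  # 5
```

Under that assumption, short prompts like the example fit about 25 times a day, while 100-word briefs fit about five times.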
Specific advantages for vibe coding and AI-assisted development.
Dictate the role, the context, the existing pattern to match, the constraints, the edge cases, and the requested output in one pass, faster than you could type them. Cursor's models work better with fuller briefs.
React, Tailwind, Zustand, FastAPI, kubectl, useEffect, queryClient, dotenv, ECMAScript, Postgres, and dependency injection all come through correctly. Whisper was trained on enough developer content to recognize mainstream identifiers and patterns.
Cursor, VS Code, Windsurf, Zed (when on Windows), JetBrains IDEs, GitHub Desktop, terminals, Notion docs, Linear tickets, GitHub PR descriptions. One hotkey, every text input on the OS.
The audio is transcribed by Whisper running on your CPU or GPU. Prompts stay local until you decide to hit Send to Claude, GPT, or whatever model Cursor is routing to. No third-party transcription service is involved.
500 words per day on the free plan covers many coding prompts. Pro is $10 per month or $80 per year for unlimited dictation; most developers hit the free limit once voice-driven prompting becomes the default workflow.
NVIDIA GPU owners can install the CUDA pack for faster transcription on long dictations. Vulkan is the cross-vendor fallback for AMD and Intel GPUs. The CPU path also works on any modern machine.
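The choice between the three transcription paths described above (CUDA, Vulkan, CPU) can be pictured as a simple fallback order. This is an illustrative sketch only, not StarWhisper's actual selection logic: the function, the enum, and the vendor strings are hypothetical, and routing an NVIDIA GPU without the CUDA pack to Vulkan is an assumption based on Vulkan being described as cross-vendor.

```python
from enum import Enum
from typing import Optional


class Backend(Enum):
    CUDA = "cuda"
    VULKAN = "vulkan"
    CPU = "cpu"


def pick_backend(gpu_vendor: Optional[str], cuda_pack_installed: bool = False) -> Backend:
    """Hypothetical fallback order: CUDA, then Vulkan, then CPU."""
    vendor = (gpu_vendor or "").strip().lower()
    # NVIDIA with the optional CUDA pack installed uses the CUDA path.
    if vendor == "nvidia" and cuda_pack_installed:
        return Backend.CUDA
    # AMD and Intel GPUs use Vulkan. Assumption: NVIDIA without the
    # CUDA pack also lands here, since Vulkan is cross-vendor.
    if vendor in {"nvidia", "amd", "intel"}:
        return Backend.VULKAN
    # No recognized GPU: the CPU path works on any modern machine.
    return Backend.CPU


print(pick_backend("NVIDIA", cuda_pack_installed=True))
print(pick_backend("AMD"))
print(pick_backend(None))
```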
Vibe coding is the recent shift where developers spend less time typing literal code and more time describing what they want the AI to write. Cursor pushed this pattern hard, and Claude Code, GitHub Copilot Workspace, and Windsurf followed. The bottleneck is no longer compilation or syntax knowledge; it is the speed at which you can describe intent clearly enough for the model to do the right thing on the first try.
That kind of description is conversational. It sounds like: "I want a React component that takes a list of products and renders them as cards with hover states, the hover should reveal a buy button with the price next to it, and clicking buy should call a function I pass in as a prop." Typing that takes maybe forty seconds. Dictating it takes ten. Across a day of forty or fifty prompts, that is roughly 20 to 25 minutes saved on input alone, and the prompts that arrive are usually richer because there is less friction to keep adding context.
StarWhisper exists because the system-wide dictation built into Windows is too narrow for this workflow: it does not handle technical vocabulary well and has no global hotkey that works inside Cursor, and the older Windows Speech Recognition is even worse on tech terms. Whisper handles them. The model has seen enough developer-adjacent text in training that names like Zustand, swr, drizzle, prisma, vitest, and tailwind transcribe correctly out of the box.
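A quick check of the per-day saving, taking the forty-second and ten-second figures above as rough inputs rather than measurements:

```python
TYPING_SECONDS = 40     # rough time to type the example prompt
DICTATING_SECONDS = 10  # rough time to dictate the same prompt


def minutes_saved(prompts_per_day: int) -> float:
    """Minutes saved per day when every prompt is dictated instead of typed."""
    saved_per_prompt = TYPING_SECONDS - DICTATING_SECONDS
    return prompts_per_day * saved_per_prompt / 60


for n in (40, 50):
    print(n, "prompts:", minutes_saved(n), "minutes saved")
# 40 prompts: 20.0 minutes saved
# 50 prompts: 25.0 minutes saved
```

Longer prompts widen the gap, since the saving scales with the difference between typing and speaking time.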
Here are real shapes of prompts that benefit from dictation in Cursor. Each takes a few seconds to speak and would take a minute or more to type.
Each of those is the kind of long instruction Cursor's chat or inline edit handles well. Dictating gets the full thought into the input box in one pass, where typing tempts you to drop the trailing details.
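One way to notice dropped details before sending is to check a prompt against the parts of a full brief named earlier: role, context, the pattern to match, constraints, edge cases, and requested output. The sketch below is a hypothetical checklist helper, not a feature of StarWhisper or Cursor; the class, field names, and sample text are made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class PromptBrief:
    """Hypothetical container for the six parts of a full prompt brief."""
    role: str = ""
    context: str = ""
    pattern_to_match: str = ""
    constraints: str = ""
    edge_cases: str = ""
    requested_output: str = ""

    def missing(self) -> List[str]:
        """Names of the parts that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Join the filled-in parts into one prompt, in field order."""
        parts = [getattr(self, f.name).strip() for f in fields(self)]
        return " ".join(p for p in parts if p)


brief = PromptBrief(
    role="Act as a senior React developer.",
    context="The app lists products as cards.",
    requested_output="Return a single component file.",
)
print(brief.missing())  # ['pattern_to_match', 'constraints', 'edge_cases']
print(brief.render())
```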
Cursor exposes a few different surfaces for prompting the AI, and they all behave the same way for dictation purposes.
| Cursor surface | How to open | Dictation behavior |
|---|---|---|
| Chat panel | Ctrl+L | Dictate full conversational prompts, multi-paragraph instructions |
| Inline edit | Ctrl+K with code selected | Dictate short surgical prompts targeting the highlighted block |
| Composer | Ctrl+I | Dictate multi-file refactor prompts and feature briefs |
| Integrated terminal | Ctrl+` | Dictate commit messages, search queries, long flags |
| File rename / search | Any text input | Dictate file names, search strings, regex descriptions |
From StarWhisper's side, all of these are Windows text inputs and the dictation flow is identical for each.
Cursor sends your prompts to whichever model you have selected (Claude, GPT-4, GPT-5, Gemini, or Cursor's own), the same as any other AI coding assistant does. The new question dictation adds is: does the audio also get uploaded?
In StarWhisper Local Mode (the default), the answer is no. Audio is captured by the microphone, processed in memory by Whisper on your CPU or GPU, and converted to text on-device. The audio never reaches any third-party transcription service. The transcribed text is the same text Cursor would receive if you had typed it, and only then does it leave your machine through Cursor's normal API call to the underlying model.
If you are working under an NDA or with sensitive code where you cannot accept any model exposure, you would not use Cursor's AI features at all, and the same applies regardless of whether you dictate or type. For everyone else, dictation does not change the privacy posture; it only changes how the text arrives in the prompt box. See the local vs cloud Whisper FAQ for the full breakdown.
The same hotkey covers every other Windows code editor and AI tool. Voice typing for coding walks through the broader pattern, but as a quick reference, dictation works in:
One install, one hotkey, every text field on Windows. That is why developers who start using it for Cursor end up using it for everything else within a week.