AI Dictation Modes for Mac: Voice to Clean Text, Your Way
Raw transcription is a solved problem. What you actually want is voice that lands as clean, finished text in whatever app you're in, formatted the way that moment needs it. Verba does that with dictation modes: pick Flow for verbatim, Polish to clean up your speech, Intent to write to an instruction, or Coding for code, then layer a style on top. Switch with Fn+1..9, and the AI cleanup runs on a model you control, never a billed call we make for you.
What an AI dictation mode actually does
A mode is a recipe for turning your voice into text. It decides whether Verba transcribes you verbatim or sends the transcript through an AI cleanup pass, and it tells that AI what kind of output you want: a tidy sentence, a Slack reply, a commit message, a code snippet. Transcription happens on-device first (WhisperKit or NVIDIA Parakeet), so your audio never has to leave your Mac, and only the cleanup step touches the AI engine you've chosen. The result pastes straight into the app where your cursor is, system-wide. You're never copy-pasting out of a separate window.
- Transcribe on-device, then optionally clean up with AI you control
- Output is tailored per mode: verbatim, polished prose, an instruction's answer, or code
- Pastes at your cursor in any app, no separate window
- Cleanup runs on your Claude plan, your key, OpenRouter, or local Ollama
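As a rough illustration of the flow above, here is a minimal Python sketch of how a mode could decide between the verbatim path and the cleanup path. It is not Verba's code: `Mode`, `run_mode` and `fake_cleanup` are hypothetical names, the transcript is assumed to have already been produced on-device, and the cleanup function is a stand-in for whichever engine you chose.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical types for illustration only; not Verba's actual API.
# A cleanup engine takes (instructions, transcript) and returns text.
Cleanup = Callable[[str, str], str]


@dataclass(frozen=True)
class Mode:
    name: str
    instructions: Optional[str] = None  # None means verbatim: no AI pass


def run_mode(mode: Mode, transcript: str, cleanup: Cleanup) -> str:
    """Return the text to paste for an already-transcribed utterance."""
    if mode.instructions is None:
        return transcript  # verbatim path: the engine is never called
    return cleanup(mode.instructions, transcript)


def fake_cleanup(instructions: str, transcript: str) -> str:
    # Stand-in for the user's chosen engine; a real one would call a model.
    return transcript.replace("um ", "").capitalize()


flow = Mode("Flow")
polish = Mode("Polish", "Resolve filler and self-corrections into clean prose.")
```

With these stand-ins, `run_mode(flow, "um hello there", fake_cleanup)` returns the transcript untouched, while the same call with `polish` returns `"Hello there"`.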
The four core modes: Flow, Polish, Intent, Coding
Verba ships six built-in modes; these four are the ones you'll live in. Flow is pure speech-to-text with no AI, the fastest path when you just want your exact words down. Polish reads your transcript, resolves the way people really talk (the self-corrections, the false starts, the 'no wait, make that'), and writes the version you meant. Intent treats your speech as a command and writes the thing you asked for, so you can say 'reply to this saying I'll be ten minutes late, keep it casual' and get the message, not a transcript of the request. Coding is tuned for code and technical text, preserving symbols, identifiers and structure instead of turning them into prose.
- Flow: verbatim, no AI, instant
- Polish: resolves filler and self-corrections into clean prose
- Intent: write to an instruction, not a transcript of it
- Coding: keeps symbols, identifiers and structure intact
- (Plus Translate and Context, covered on their own pages)
Custom modes and styles you build by talking
The six built-ins are a starting point, not a ceiling. You can have Verba build a custom mode for any recurring job (a standup-update mode, a customer-reply mode, a code-review-comment mode), each with its own instructions and AI engine. On top of any mode you can layer a style: a lightweight modifier that shifts tone, length or format without changing the mode itself. Want the same Polish output but more concise, or more formal? Toggle a style. Modes change what gets written; styles change how it reads, and the two compose.
- Build custom modes for the jobs you repeat
- Styles layer on top of any mode to shift tone, length or format
- Modes and styles compose, so one mode covers many situations
- Each mode can route to a different AI engine you choose
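One way to picture how modes and styles compose is as a prompt built from two independent parts. The Python sketch below is an assumption about the shape of the idea, not Verba's implementation: `CustomMode`, `Style`, `compose_prompt`, the `engine` label and the example instructions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Style:
    name: str
    modifier: str  # how the output should read (tone, length, format)


@dataclass(frozen=True)
class CustomMode:
    name: str
    instructions: str  # what gets written
    engine: str        # hypothetical label for the engine the user picked


def compose_prompt(mode: CustomMode, style: Optional[Style] = None) -> str:
    """Combine a mode's instructions with an optional style modifier."""
    parts = [mode.instructions]
    if style is not None:
        # The style is appended; the mode's instructions stay unchanged.
        parts.append(f"Style: {style.modifier}")
    return "\n".join(parts)


standup = CustomMode(
    name="Standup update",
    instructions="Turn my notes into a standup update.",
    engine="local-ollama",
)
concise = Style("Concise", "Keep it under three sentences.")
```

Because the style only adds a line, the same `standup` mode can be reused with `concise`, with a different style, or with none at all.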
Switching modes and styles without breaking flow
The whole point is to never leave the keyboard. Fn+1 through Fn+9 jump straight to a mode, so going from dictating an email to dictating code is one keystroke. Fn+] and Fn+[ cycle styles on top of whatever mode you're in. Because switching is instant and lives under your fingers, you end up using the right mode for each task instead of forcing one generic setting to do everything, which is what makes the output feel finished rather than 'transcribed and then fixed by hand.'
- Fn+1..9 jumps directly to a mode
- Fn+] / Fn+[ cycles styles on top
- Instant, keyboard-only, system-wide
- Right mode per task instead of one generic setting
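The shortcut scheme above can be modeled as a small piece of state: a digit jumps to a mode slot, and the bracket keys step through styles. This Python sketch is illustrative only. The `Switcher` class is hypothetical, and three behaviors are assumptions the page does not state: style cycling wraps around, "no style" is one stop in the cycle, and the active style is kept when the mode changes.

```python
class Switcher:
    """Toy model of Fn+1..9 (modes) and Fn+] / Fn+[ (styles)."""

    def __init__(self, modes, styles):
        self.modes = list(modes)[:9]         # at most nine direct slots
        self.styles = [None] + list(styles)  # None stands for "no style"
        self.mode_index = 0
        self.style_index = 0

    def press(self, key: str) -> None:
        """Handle the key pressed together with Fn."""
        if len(key) == 1 and key in "123456789":
            slot = int(key) - 1
            if slot < len(self.modes):       # ignore unassigned slots
                self.mode_index = slot
        elif key == "]":
            self.style_index = (self.style_index + 1) % len(self.styles)
        elif key == "[":
            self.style_index = (self.style_index - 1) % len(self.styles)

    @property
    def current(self):
        return self.modes[self.mode_index], self.styles[self.style_index]


switcher = Switcher(["Flow", "Polish", "Intent", "Coding"], ["Concise", "Formal"])
switcher.press("4")  # jump straight to the fourth mode
switcher.press("]")  # step to the first style
```

After those two presses, `switcher.current` is `("Coding", "Concise")`.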
You control the AI behind every mode
Verba is strictly bring-your-own-AI: it never makes a billed AI call on your behalf. Every mode that uses cleanup runs on an engine you pick: your Claude subscription through Claude Code with no key at all, your own Anthropic or OpenRouter key, or a fully local Ollama model so nothing leaves your Mac. Keys live in the macOS Keychain. That means a mode isn't just a prompt; it's a prompt plus the engine you trust to run it, and you can mix them: Polish on local Ollama, Intent on your Claude plan, whatever fits the work and your privacy bar.
- No markup, no billed-by-us calls, ever
- Your Claude plan (no key), Anthropic, OpenRouter, or local Ollama
- Keys stored in the macOS Keychain
- Run cleanup fully offline with a local model
Questions, answered
What are AI dictation modes in Verba?
Dictation modes are presets that turn your voice into the kind of text you want. Verba ships six built-in modes, including Flow (verbatim, no AI), Polish (cleans up filler and self-corrections), Intent (writes to a spoken instruction), and Coding (preserves code and symbols). You can also build custom modes and layer styles on top.
What's the difference between Flow and Polish mode?
Flow is pure speech-to-text with no AI, giving you your exact words instantly. Polish runs your transcript through an AI cleanup pass that resolves false starts, filler, and self-corrections, writing the clean version you actually meant to say.
How do I switch between dictation modes on Mac?
Press Fn+1 through Fn+9 to jump straight to a mode, and Fn+] or Fn+[ to cycle styles on top of the current mode. Switching is instant and keyboard-only, so you can move from dictating an email to dictating code without leaving the keyboard.
Can I create my own custom dictation modes?
Yes. Verba can build a custom mode for any recurring task, with its own instructions and a choice of which AI engine runs it. You can also apply styles (lightweight modifiers that change tone, length, or format) on top of any built-in or custom mode.
Which AI runs the dictation cleanup, and is it private?
You control it. Verba is bring-your-own-AI and never makes a billed call for you. Cleanup runs on your Claude subscription via Claude Code (no key), your Anthropic or OpenRouter key, or a fully local Ollama model. Transcription is on-device by default, so with a local model your voice and text never leave your Mac.