Real-Time Transcription

Real-time transcription displays your words in a floating window as you speak, so you get visual feedback immediately instead of waiting until you stop recording.

How It Works

When enabled, StarWhisper processes your speech in small chunks while you're still talking. A transparent overlay window appears showing the live transcription. When you stop recording, the final transcription uses the complete audio for maximum accuracy.

Key points:

Live preview uses a fast model, capped at Small, to keep processing quick
Final transcription uses your full model selection for best accuracy
Requires Local Whisper mode (not available with OpenAI API)
GPU acceleration is strongly recommended for smooth real-time performance
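
The two-pass flow described above can be sketched in Python. This is an illustration only, not StarWhisper's actual code: `preview_transcribe` and `final_transcribe` are hypothetical stand-ins for the fast preview model and the full model, the 16 kHz sample rate is an assumption, and treating each chunk independently is a simplification.

```python
from typing import Callable, List, Sequence, Tuple

SAMPLE_RATE = 16_000  # assumption: 16 kHz mono audio


def run_session(
    samples: Sequence[float],
    chunk_ms: int,
    preview_transcribe: Callable[[Sequence[float]], str],  # hypothetical fast model
    final_transcribe: Callable[[Sequence[float]], str],    # hypothetical full model
) -> Tuple[List[str], str]:
    """Return (live preview updates, final text) for one recording."""
    chunk_len = SAMPLE_RATE * chunk_ms // 1000
    previews: List[str] = []
    # While recording: transcribe each chunk as soon as it is complete.
    for start in range(0, len(samples), chunk_len):
        chunk = samples[start:start + chunk_len]
        previews.append(preview_transcribe(chunk))
    # After recording stops: one pass over the complete audio.
    return previews, final_transcribe(samples)
```

The preview results feed the overlay window, while only the final pass over the full recording produces the text you keep.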

Setting Up

Right-click the StarWhisper circle → Settings
Go to the Transcription tab
Toggle "Real-Time Transcription" on
Adjust the Chunk Size slider if needed (see below)

Real-time transcription options in the Transcription tab

Chunk Size

The chunk size controls how frequently StarWhisper processes audio during recording. It's measured in milliseconds.

Setting	Update Speed	CPU/GPU Load	Best For
300ms (Fast)	Words appear very quickly	Higher	Powerful GPUs, presentations
500ms (Default)	Good balance	Moderate	Most users
750ms	Slightly delayed	Lower	Older hardware
1000ms (Slow)	Noticeable lag	Lowest	CPU-only mode

If you notice stuttering or dropped words in real-time mode, try increasing the chunk size. If your GPU handles it well, a lower chunk size gives a more responsive experience.
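
To get a feel for what each setting means, the short Python sketch below converts a chunk size into samples per chunk and preview updates per second. It assumes 16 kHz mono audio; the function name `chunk_stats` is made up for this example and is not part of StarWhisper.

```python
SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono input


def chunk_stats(chunk_ms: int, sample_rate_hz: int = SAMPLE_RATE_HZ):
    """Return (samples per chunk, preview updates per second)."""
    samples_per_chunk = sample_rate_hz * chunk_ms // 1000
    updates_per_second = 1000 / chunk_ms
    return samples_per_chunk, updates_per_second


# Print the figures for the four slider positions listed in the table.
for ms in (300, 500, 750, 1000):
    samples, rate = chunk_stats(ms)
    print(f"{ms:>4} ms -> {samples:>5} samples per chunk, {rate:.2f} updates/s")
```

Under that assumption, the 300ms setting triggers roughly 3.3 model runs per second against 1 per second at 1000ms, which is why smaller chunks put more load on the hardware.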

Guidance Prompt

The guidance prompt provides context hints to the Whisper model, helping it recognize specialized vocabulary more accurately. This is especially useful for domain-specific dictation.

How to Use

In Settings → Transcription, enable "Use Guidance Prompt"
Enter context text in the text area (up to 244 characters)
Describe the topic or list key terms the model should expect

Examples

Context	Guidance Prompt Text
Medical dictation	"Medical consultation discussing cardiovascular symptoms, ECG results, and statin therapy"
Legal dictation	"Contract review for intellectual property licensing, discussing indemnification clauses"
Software development	"Code review discussing React components, TypeScript interfaces, and API endpoints"
Company meeting	"Team meeting at Acme Corp discussing Q4 targets, Project Phoenix, and the Berlin office"

Keep It Short

The guidance prompt is limited to 244 characters. Focus on the most important terms and context rather than writing full sentences. Even a short prompt like "Cardiology consultation" can noticeably improve accuracy on domain-specific vocabulary.
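
If you prepare prompts outside the app (for example, from a glossary), a small helper can keep them within the limit. The Python sketch below is a hypothetical convenience function, not a StarWhisper feature: it joins a topic with key terms and stops before the 244-character limit is exceeded.

```python
from typing import List

MAX_PROMPT_CHARS = 244  # limit stated in this guide


def build_guidance_prompt(topic: str, terms: List[str], limit: int = MAX_PROMPT_CHARS) -> str:
    """Join a topic and key terms, stopping before the result exceeds the limit."""
    prompt = topic.strip()[:limit]
    for term in terms:
        term = term.strip()
        candidate = f"{prompt}, {term}" if prompt else term
        if len(candidate) > limit:
            break  # stop at the first term that does not fit
        prompt = candidate
    return prompt


print(build_guidance_prompt("Cardiology consultation", ["ECG", "statin therapy"]))
```

List the most important terms first, since later terms are the ones dropped when space runs out.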

Requirements

Requirement	Details
Transcription method	Local Whisper (not OpenAI API)
GPU acceleration	Strongly recommended
Minimum model	Tiny (but Small recommended for accuracy)
CPU-only mode	Works but may be slow; increase chunk size to 750ms or higher
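
The requirements above can be read as a simple checklist. The Python sketch below encodes them as a pre-flight check; the `Settings` structure and its field names are hypothetical and do not reflect StarWhisper's real configuration format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Settings:  # hypothetical structure, for illustration only
    mode: str      # "local" (Local Whisper) or "openai_api"
    has_gpu: bool  # whether GPU acceleration is available
    chunk_ms: int  # chunk size in milliseconds


def check_realtime(settings: Settings) -> List[str]:
    """Return a list of warnings based on the requirements table."""
    problems: List[str] = []
    if settings.mode != "local":
        problems.append("Real-time transcription requires Local Whisper mode.")
    if not settings.has_gpu:
        problems.append("GPU acceleration is strongly recommended.")
        if settings.chunk_ms < 750:
            problems.append("CPU-only: increase chunk size to 750ms or higher.")
    return problems


for warning in check_realtime(Settings(mode="local", has_gpu=False, chunk_ms=500)):
    print(warning)
```

An empty list means the setup meets every requirement in the table.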

Real-time transcription puts continuous load on your GPU while recording. If you're also running games, video editing, or AI tools, you may experience degraded performance. See Quality & Performance for more details.