File Transcription PRO
Transcribe audio and video files without using your microphone. Import recordings, podcasts, interviews, or any other media file in a supported format and get a text transcription.
Pro Feature
File transcription is available exclusively to Pro subscribers. Upgrade at starwhisper.ai/upgrade to unlock this feature.
Supported Formats
| Format | Type | Common Use |
|---|---|---|
| MP3 | Audio | Music, podcasts, voice memos |
| WAV | Audio | Professional recordings, lossless audio |
| FLAC | Audio | Lossless compressed audio |
| OGG | Audio | Open container format (usually Vorbis or Opus audio) |
| MP4 | Video | Most common video format |
| MKV | Video | High-quality video container |
| MOV | Video | Apple QuickTime video |
| AVI | Video | Legacy Windows video |
| WebM | Video | Web-optimized video |
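If you want to check a folder of files before importing them, the table above can be expressed as a small lookup. This is only an illustrative sketch: the function name `classify_media` is hypothetical, and it assumes StarWhisper recognizes files by the usual extension for each format, which the documentation does not state.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the Supported Formats table (assumed to be the usual ones).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}


def classify_media(path: str) -> Optional[str]:
    """Return "audio", "video", or None if the extension is not in the table."""
    ext = Path(path).suffix.lower()  # case-insensitive: ".MP3" matches ".mp3"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```

For example, `classify_media("Interview.MP3")` returns `"audio"`, while a `.txt` file returns `None`.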
Video Files
For video files, StarWhisper extracts the audio track automatically and transcribes only that. The picture itself is not analyzed, so on-screen text and visuals do not appear in the transcription.
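You never need to do this step yourself, but to make "extracting the audio track" concrete, here is roughly what it looks like with the separate FFmpeg command-line tool. This is a sketch under assumptions: the documentation does not say which tool StarWhisper uses internally, and `build_extract_command` is a hypothetical helper name. The 16 kHz mono output is chosen because that is the input format Whisper models work with.

```python
from typing import List


def build_extract_command(video_path: str, audio_path: str) -> List[str]:
    """Build an ffmpeg command that writes only the audio track of a video."""
    return [
        "ffmpeg",
        "-i", video_path,  # input video file
        "-vn",             # drop the video stream
        "-ac", "1",        # downmix to one channel (mono)
        "-ar", "16000",    # resample to 16 kHz
        audio_path,        # output file, e.g. a .wav
    ]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) if ffmpeg is installed and you want to execute it.
    print(" ".join(build_extract_command("talk.mp4", "talk.wav")))
```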
Transcribing a Single File
- Right-click the StarWhisper circle → Settings
- Go to the Files tab
- Click "Select Audio/Video Files"
- Choose your file from the file picker
- Wait for the progress bar to complete
- Your transcription appears below
Figure: The Files tab, where you select files, track progress, and view results.
Batch Processing
You can select multiple files at once to transcribe them in sequence.
- Click "Select Audio/Video Files"
- Select multiple files (hold Ctrl or Shift to select several)
- Files are added to a queue and processed one by one
- Each file shows individual progress
- An overall progress bar tracks the entire batch
Queue Management
The queue processes files in the order they were selected. You can clear the queue at any time with the "Clear Queue" button.
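The queue behavior described above (first selected, first processed, with an overall progress value and a way to clear pending files) can be pictured with a short sketch. This is not StarWhisper's actual code: the class `TranscriptionQueue` and the `transcribe` callback are hypothetical, and the sketch assumes that clearing the queue only removes files that have not started yet.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class TranscriptionQueue:
    """First-in, first-out queue that transcribes one file at a time."""

    def __init__(self, transcribe: Callable[[str], str]) -> None:
        self._transcribe = transcribe          # hypothetical: path -> text
        self._pending: Deque[str] = deque()
        self._total = 0                        # files accepted into the batch
        self.results: List[Tuple[str, str]] = []

    def add(self, paths: List[str]) -> None:
        """Append files in the order they were selected."""
        self._pending.extend(paths)
        self._total += len(paths)

    def clear(self) -> None:
        """Remove files that have not been processed yet."""
        self._total -= len(self._pending)
        self._pending.clear()

    def overall_progress(self) -> float:
        """Fraction of the batch that is finished, from 0.0 to 1.0."""
        return len(self.results) / self._total if self._total else 0.0

    def run(self) -> None:
        """Process the queue one file at a time, oldest first."""
        while self._pending:
            path = self._pending.popleft()
            self.results.append((path, self._transcribe(path)))
```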
Viewing Results
After transcription completes, results appear in the Files tab under "Transcribed Files". Each entry shows:
- The original filename
- Timestamp of when it was transcribed
- The full transcription text
- Option to copy the text
Tips for Best Results
- Use high-quality source audio — clearer recordings produce better transcriptions
- Enable GPU acceleration — dramatically speeds up file transcription, especially for long files
- Choose the right model — use Medium or Large for important files where accuracy matters
- Quiet recordings work best — files with heavy background noise or music will have lower accuracy
- Split very long files — files over 60 minutes may take considerable time; consider splitting them
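StarWhisper does not split files for you, so the last tip needs an external tool. One option is FFmpeg's segment muxer; the sketch below builds such a command. The helper name `build_split_command` and the 30-minute default are assumptions for illustration. Because the streams are copied rather than re-encoded, cut points are approximate, and a cut can fall in the middle of a sentence.

```python
from pathlib import Path
from typing import List


def build_split_command(source: str, segment_minutes: int = 30) -> List[str]:
    """Build an ffmpeg command that cuts a file into fixed-length parts."""
    src = Path(source)
    # Output pattern next to the source, e.g. lecture_part000.mp3, lecture_part001.mp3
    pattern = str(src.with_name(f"{src.stem}_part%03d{src.suffix}"))
    return [
        "ffmpeg",
        "-i", str(src),
        "-f", "segment",                            # use the segment muxer
        "-segment_time", str(segment_minutes * 60),  # part length in seconds
        "-c", "copy",                               # copy streams, no re-encoding
        pattern,
    ]


if __name__ == "__main__":
    # Print the command; run it with subprocess.run(...) if ffmpeg is installed.
    print(" ".join(build_split_command("lecture.mp3")))
```

You can then select all of the resulting parts at once and let batch processing transcribe them in order.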
Processing Time
File transcription uses the same Whisper model as live dictation. Processing time depends on file length, model size, and whether GPU acceleration is enabled. A 10-minute file with the Small model on GPU typically takes 30–60 seconds.
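As a rough guide, the figure above works out to 3 to 6 seconds of processing per minute of audio. The sketch below extrapolates from it, assuming that time grows in proportion to file length. That proportionality is an assumption, not a documented guarantee, and the numbers only apply to the Small model with GPU acceleration; the function name `estimate_seconds` is hypothetical.

```python
from typing import Tuple

# From the documented example: a 10-minute file takes 30 to 60 seconds
# (Small model, GPU enabled).
REFERENCE_MINUTES = 10
REFERENCE_SECONDS_LOW = 30
REFERENCE_SECONDS_HIGH = 60


def estimate_seconds(audio_minutes: float) -> Tuple[float, float]:
    """Return a (low, high) estimate in seconds, assuming linear scaling."""
    low = audio_minutes * (REFERENCE_SECONDS_LOW / REFERENCE_MINUTES)
    high = audio_minutes * (REFERENCE_SECONDS_HIGH / REFERENCE_MINUTES)
    return low, high
```

Under these assumptions, a 60-minute file would take about 180 to 360 seconds, that is, 3 to 6 minutes. Expect longer times with larger models or without GPU acceleration.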