Dictation Service Setup Guide
This guide sets up the dictation service as a systemd user service with a global keybinding for voice-to-text input.
Prerequisites
- Ubuntu/GNOME desktop environment
- Python 3.12+ (already specified in project)
- uv package manager
- Microphone access
- Audio system (PulseAudio)
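Before installing, it can help to confirm that the tools this guide relies on are actually on your PATH. The following is a minimal sketch; the command list (python3, uv, arecord, pactl) is an assumption based on the commands used in this guide, not something the project ships.

```shell
#!/bin/sh
# Sketch: report which prerequisite commands are on PATH.
# The command list at the bottom is an assumption, not taken from the project.

have_cmd() {
    # Succeeds when the named command is available on PATH.
    command -v "$1" >/dev/null 2>&1
}

check_prereqs() {
    # Prints one line per command; the return status is the number missing.
    missing=0
    for cmd in "$@"; do
        if have_cmd "$cmd"; then
            echo "ok       $cmd"
        else
            echo "missing  $cmd"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_prereqs python3 uv arecord pactl || echo "Install the missing tools before continuing."
```

Microphone access itself cannot be verified this way; the recording test under Troubleshooting covers that.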
Installation Steps
1. Install Dependencies
# Install system dependencies
sudo apt update
sudo apt install python3.12 python3.12-venv portaudio19-dev
# Install Python dependencies with uv
uv sync
2. Set Up System Service
# Copy service file to the systemd user unit directory
sudo cp dictation.service /etc/systemd/user/
# Reload the user systemd manager
systemctl --user daemon-reload
# Enable and start the service
systemctl --user enable dictation.service
systemctl --user start dictation.service
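`systemctl --user` only loads units from user unit directories, so a unit file placed in a system directory will not be found by the commands above. As a rough sketch, the helper below (a hypothetical function, not part of the project) classifies a path by the manager that would load it, using the standard systemd search locations.

```shell
#!/bin/sh
# Sketch: classify a unit file path by the systemd manager that loads it.
# unit_scope is a hypothetical helper for illustration only.
unit_scope() {
    case "$1" in
        /etc/systemd/user/*|/usr/lib/systemd/user/*|"${HOME:-}"/.config/systemd/user/*)
            echo "user" ;;
        /etc/systemd/system/*|/usr/lib/systemd/system/*|/lib/systemd/system/*)
            echo "system" ;;
        *)
            echo "unknown" ;;
    esac
}

unit_scope /etc/systemd/user/dictation.service     # prints "user"
unit_scope /etc/systemd/system/dictation.service   # prints "system"
```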
3. Configure Global Keybinding
# Run the keybinding setup script
./setup-keybindings.sh
This will configure Alt+D as the global shortcut to toggle dictation.
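For reference, GNOME custom shortcuts are stored under the `org.gnome.settings-daemon.plugins.media-keys` schema. The sketch below prints the `gsettings` commands that would register such a shortcut; it does not run them. It is an assumption about what a script like setup-keybindings.sh might do, not a copy of it, and the `custom0` slot and the display name are hypothetical choices.

```shell
#!/bin/sh
# Sketch: print the gsettings commands that would register a custom GNOME shortcut.
# Nothing is executed here; review the output before applying it.
SCHEMA="org.gnome.settings-daemon.plugins.media-keys"
# The slot name "custom0" is a hypothetical choice.
KEY_PATH="/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/custom0/"

keybinding_commands() {
    # $1 = display name, $2 = command to run, $3 = key combination
    echo "gsettings set $SCHEMA custom-keybindings \"['$KEY_PATH']\""
    for pair in "name=$1" "command=$2" "binding=$3"; do
        echo "gsettings set $SCHEMA.custom-keybinding:$KEY_PATH ${pair%%=*} '${pair#*=}'"
    done
}

keybinding_commands "Toggle Dictation" "$PWD/toggle-dictation.sh" "<Alt>d"
```

Note that the first printed command replaces the whole custom-keybindings list, so on a machine with existing custom shortcuts the list would need to be merged rather than overwritten.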
4. Verify Installation
# Check service status
systemctl --user status dictation.service
# Test the toggle script
./toggle-dictation.sh
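If you want to script the status check, `systemctl --user is-active` prints a single state word. A minimal sketch, with a hypothetical `service_state` wrapper that falls back to "unknown" when the user manager cannot be queried:

```shell
#!/bin/sh
# Sketch: print the unit's state, or "unknown" when it cannot be queried.
# service_state is a hypothetical helper, not part of the project.
service_state() {
    state=""
    if command -v systemctl >/dev/null 2>&1; then
        state=$(systemctl --user is-active "$1" 2>/dev/null) || true
    fi
    echo "${state:-unknown}"
}

if [ "$(service_state dictation.service)" = "active" ]; then
    echo "dictation.service is running"
else
    echo "dictation.service is not running; check the logs"
fi
```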
Usage
- Start Dictation: Press Alt+D (or run ./toggle-dictation.sh)
- Wait for notification: You'll see "Dictation Started"
- Speak clearly: The service will transcribe your voice to text
- Text appears: Transcribed text will be typed wherever your cursor is
- Stop Dictation: Press Alt+D again
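The start/stop cycle above behaves like a toggle. One common way to implement a toggle in a shell script is a lock file whose presence marks the active state. The sketch below illustrates that idea only; the lock file path and the "Dictation Stopped" message are assumptions, and the real toggle-dictation.sh may work differently.

```shell
#!/bin/sh
# Sketch: toggle a lock file and report the new state.
toggle_lock() {
    # $1 = lock file path (hypothetical; the real script may differ)
    if [ -e "$1" ]; then
        rm -f "$1"
        echo "Dictation Stopped"
    else
        : > "$1"
        echo "Dictation Started"
    fi
}

lock="${TMPDIR:-/tmp}/dictation-demo.$$.lock"
toggle_lock "$lock"   # first press: creates the lock file
toggle_lock "$lock"   # second press: removes it again
```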
Troubleshooting
Service Issues
# Check service logs
journalctl --user -u dictation.service -f
# Restart service
systemctl --user restart dictation.service
Audio Issues
# Test microphone
arecord -D pulse -f cd -d 5 test.wav
aplay test.wav
# Check PulseAudio
pulseaudio --check -v
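A quick sanity check on the recording is whether the file holds anything beyond the WAV header. The sketch below assumes the canonical 44-byte PCM header; `wav_has_audio` is a hypothetical helper. A file that passes may still contain only silence, so playing it back remains the real test.

```shell
#!/bin/sh
# Sketch: succeed when a WAV file holds data beyond the canonical 44-byte PCM header.
# wav_has_audio is a hypothetical helper for illustration only.
wav_has_audio() {
    [ -f "$1" ] || return 1
    # Arithmetic expansion strips any padding that wc adds on some systems.
    size=$(($(wc -c < "$1")))
    [ "$size" -gt 44 ]
}

if wav_has_audio test.wav; then
    echo "test.wav contains recorded samples"
else
    echo "test.wav is missing or empty; check the input device"
fi
```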
Keybinding Issues
# Check current keybindings
gsettings list-recursively org.gnome.settings-daemon.plugins.media-keys
# Reset keybindings if needed (this clears all custom shortcuts, not only Alt+D)
gsettings reset org.gnome.settings-daemon.plugins.media-keys custom-keybindings
Permission Issues
# Add user to audio group (log out and back in for the change to take effect)
sudo usermod -a -G audio $USER
# List audio input sources
pacmd list-sources | grep -A 10 index
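Group changes only apply to new login sessions, so it is worth confirming that the current session already sees the audio group. A minimal sketch, using a hypothetical `has_group` helper over the output of `id -nG`:

```shell
#!/bin/sh
# Sketch: test whether a space-separated group list contains a given group.
# has_group is a hypothetical helper for illustration only.
has_group() {
    # $1 = group list (for example the output of `id -nG`), $2 = group name
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_group "$(id -nG 2>/dev/null)" audio; then
    echo "current session is in the audio group"
else
    echo "not in the audio group yet; log out and back in after usermod"
fi
```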
Configuration
Service Configuration
Edit /etc/systemd/user/dictation.service to modify:
- User account
- Working directory
- Environment variables
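As an alternative to editing the unit file directly, systemd also reads drop-in files from a `dictation.service.d/` directory next to the unit, which keeps local changes separate. The sketch below writes such a drop-in into a scratch directory for demonstration; `write_override` is a hypothetical helper, and `DICTATION_MODEL` is a hypothetical variable name that the service is not known to read.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that sets one environment variable.
write_override() {
    # $1 = drop-in directory, $2 = variable name, $3 = value
    mkdir -p "$1" || return 1
    printf '[Service]\nEnvironment="%s=%s"\n' "$2" "$3" > "$1/override.conf"
}

# Demo in a scratch directory; for a per-user override the real location
# would be ~/.config/systemd/user/dictation.service.d/
demo_dir=$(mktemp -d)
# DICTATION_MODEL is a hypothetical variable name.
write_override "$demo_dir" DICTATION_MODEL vosk-model-en-us-0.22-lgraph
cat "$demo_dir/override.conf"
rm -rf "$demo_dir"
```

After changing a unit or a drop-in, run `systemctl --user daemon-reload` and restart the service for the change to take effect.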
Keybinding Configuration
Run ./setup-keybindings.sh again to change the keybinding, or edit the script to use a different shortcut.
Dictation Behavior
The dictation service can be configured by modifying:
- src/dictation_service/vosk_dictation.py - Main dictation logic
- Model files for different languages
- Audio settings and formatting
Files Created
- dictation.service - Systemd service file
- toggle-dictation.sh - Dictation control script
- setup-keybindings.sh - Keybinding configuration script
Removing the Service
# Stop and disable service
systemctl --user stop dictation.service
systemctl --user disable dictation.service
# Remove service file
sudo rm /etc/systemd/user/dictation.service
systemctl --user daemon-reload
# Remove keybinding (this resets all custom shortcuts, not only Alt+D)
gsettings reset org.gnome.settings-daemon.plugins.media-keys custom-keybindings