# Dictation Service Setup Guide

This guide walks through setting up the dictation service as a systemd user service with a global keybinding for voice-to-text input.

## Prerequisites

- Ubuntu/GNOME desktop environment
- Python 3.12+ (already specified in project)
- uv package manager
- Microphone access
- Audio system (PulseAudio)
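
The prerequisites above can be checked before installing. The following is a minimal sketch, not part of the project: the helper names `check_cmd` and `version_at_least` are made up for illustration, and it assumes GNU `sort -V` is available (as it is on Ubuntu). It inspects whichever `python3` is first on `PATH`, which may differ from the `python3.12` binary.

```bash
#!/usr/bin/env bash
# Report whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Succeed if version $1 (e.g. "3.12.3") is at least $2 (e.g. "3.12").
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

check_cmd uv
check_cmd arecord
check_cmd pactl

if command -v python3 >/dev/null 2>&1; then
  py_version="$(python3 -c 'import platform; print(platform.python_version())')"
  if version_at_least "$py_version" 3.12; then
    echo "ok: python3 $py_version"
  else
    echo "too old: python3 $py_version (need 3.12+)"
  fi
else
  echo "missing: python3"
fi
```

Any line starting with `missing:` or `too old:` points at a prerequisite to install before continuing.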

## Installation Steps

### 1. Install Dependencies

```bash
# Install system dependencies
sudo apt update
sudo apt install python3.12 python3.12-venv portaudio19-dev

# Install Python dependencies with uv
uv sync
```

### 2. Set Up System Service

```bash
# Copy service file to the systemd user unit directory
sudo cp dictation.service /etc/systemd/user/

# Reload the systemd user manager
systemctl --user daemon-reload

# Enable and start the service
systemctl --user enable dictation.service
systemctl --user start dictation.service
```

### 3. Configure Global Keybinding

```bash
# Run the keybinding setup script
./setup-keybindings.sh
```

This will configure Alt+D as the global shortcut to toggle dictation.
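
For readers who want to see what registering such a shortcut involves, here is a rough sketch of how a GNOME custom keybinding can be set with `gsettings`. It is not the contents of `setup-keybindings.sh`: the slot name `dictation/`, the display name `Toggle Dictation`, and the helpers `append_path` and `register_binding` are assumptions for illustration.

```bash
#!/usr/bin/env bash
SCHEMA="org.gnome.settings-daemon.plugins.media-keys"
# Hypothetical binding slot; any unique name under custom-keybindings/ works.
KEY_PATH="/org/gnome/settings-daemon/plugins/media-keys/custom-keybindings/dictation/"

# Append $2 to a GVariant string-array literal $1 unless it is already listed.
append_path() {
  local current="$1" new="$2"
  case "$current" in
    *"'$new'"*)       printf '%s\n' "$current" ;;                   # already registered
    "@as []"|"[]"|"") printf "['%s']\n" "$new" ;;                   # empty list
    *)                printf "%s, '%s']\n" "${current%\]}" "$new" ;; # append to list
  esac
}

# Register Alt+D to run the script given as $1 (use an absolute path).
register_binding() {
  local script="$1" current
  current="$(gsettings get "$SCHEMA" custom-keybindings)"
  gsettings set "$SCHEMA" custom-keybindings "$(append_path "$current" "$KEY_PATH")"
  gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" name "Toggle Dictation"
  gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" command "$script"
  gsettings set "$SCHEMA.custom-keybinding:$KEY_PATH" binding "<Alt>d"
}

# Only touch settings when gsettings exists and a script path was given.
if command -v gsettings >/dev/null 2>&1 && [ -n "${1:-}" ]; then
  register_binding "$1"
fi
```

Appending to the existing list, rather than overwriting it, keeps any custom shortcuts that are already defined.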

### 4. Verify Installation

```bash
# Check service status
systemctl --user status dictation.service

# Test the toggle script
./toggle-dictation.sh
```

## Usage

1. **Start Dictation**: Press Alt+D (or run `./toggle-dictation.sh`)
2. **Wait for notification**: You'll see "Dictation Started"
3. **Speak clearly**: The service will transcribe your voice to text
4. **Text appears**: Transcribed text will be typed wherever your cursor is
5. **Stop Dictation**: Press Alt+D again
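
The start/stop cycle above behaves like a simple toggle. As an illustration only, a toggle of this kind could be built around a lock file, as sketched below. The lock-file path, the `DICTATION_LOCK_FILE` variable, the "Dictation Stopped" message, and the function names are all assumptions; the actual `toggle-dictation.sh` may work differently.

```bash
#!/usr/bin/env bash
# Hypothetical lock-file location; the real script may use a different path.
LOCK_FILE="${DICTATION_LOCK_FILE:-${XDG_RUNTIME_DIR:-/tmp}/dictation.lock}"

# Show a desktop notification when notify-send is available, else print.
notify() {
  if command -v notify-send >/dev/null 2>&1; then
    notify-send "Dictation" "$1" || true
  else
    echo "$1"
  fi
}

# Flip the dictation state: a present lock file means dictation is active.
toggle_dictation() {
  if [ -e "$LOCK_FILE" ]; then
    rm -f "$LOCK_FILE"
    notify "Dictation Stopped"
  else
    touch "$LOCK_FILE"
    notify "Dictation Started"
  fi
}
```

Calling `toggle_dictation` once creates the lock file and a second call removes it, which mirrors pressing Alt+D twice.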

## Troubleshooting

### Service Issues

```bash
# Check service logs
journalctl --user -u dictation.service -f

# Restart service
systemctl --user restart dictation.service
```

### Audio Issues

```bash
# Test microphone
arecord -D pulse -f cd -d 5 test.wav
aplay test.wav

# Check PulseAudio
pulseaudio --check -v
```
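
If the recording test is silent, it can help to confirm that PulseAudio sees a real input device and not only monitor sources of output devices. The sketch below filters the output of `pactl list short sources`; the function name `real_input_sources` is made up for this example.

```bash
#!/usr/bin/env bash
# Read `pactl list short sources` output on stdin and print the names of
# sources that are not monitors of an output device.
real_input_sources() {
  awk -F '\t' '$2 !~ /\.monitor$/ { print $2 }'
}

if command -v pactl >/dev/null 2>&1; then
  pactl list short sources | real_input_sources \
    || echo "pactl could not reach the audio server"
fi
```

An empty result suggests that no microphone is currently exposed to PulseAudio.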

### Keybinding Issues

```bash
# Check current keybindings
gsettings list-recursively org.gnome.settings-daemon.plugins.media-keys

# Reset keybindings if needed (this clears all custom shortcuts, not only Alt+D)
gsettings reset org.gnome.settings-daemon.plugins.media-keys custom-keybindings
```

### Permission Issues

```bash
# Add user to audio group (log out and back in for the change to take effect)
sudo usermod -a -G audio "$USER"

# List audio input sources and their properties
pacmd list-sources | grep -A 10 index
```

## Configuration

### Service Configuration

Edit `/etc/systemd/user/dictation.service` to modify:
- User account
- Working directory
- Environment variables
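
As an alternative to editing the installed unit directly, per-user changes to settings such as the working directory and environment variables can go in a systemd drop-in file. The sketch below assumes the unit runs as a user service. The function name `write_override`, the `%h/dictation-service` directory, and the `DICTATION_LOG_LEVEL` variable are hypothetical placeholders, not values taken from the project.

```bash
#!/usr/bin/env bash
# Write a drop-in override for the user unit. $1 is the systemd user config
# directory (normally ~/.config/systemd/user).
write_override() {
  local dropin_dir="$1/dictation.service.d"
  mkdir -p "$dropin_dir"
  cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
# Hypothetical values; adjust to match your checkout and settings.
WorkingDirectory=%h/dictation-service
Environment=DICTATION_LOG_LEVEL=debug
EOF
}

# Usage:
#   write_override "$HOME/.config/systemd/user"
#   systemctl --user daemon-reload
#   systemctl --user restart dictation.service
```

A drop-in keeps local changes separate from the shipped `dictation.service`, so copying a newer version of the unit file does not overwrite them.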

### Keybinding Configuration

Run `./setup-keybindings.sh` again to change the keybinding, or edit the script to use a different shortcut.

### Dictation Behavior

The dictation service can be configured by modifying:
- `src/dictation_service/vosk_dictation.py` - Main dictation logic
- Model files for different languages
- Audio settings and formatting

## Files Created

- `dictation.service` - Systemd service file
- `toggle-dictation.sh` - Dictation control script
- `setup-keybindings.sh` - Keybinding configuration script

## Removing the Service

```bash
# Stop and disable service
systemctl --user stop dictation.service
systemctl --user disable dictation.service

# Remove service file
sudo rm /etc/systemd/user/dictation.service
systemctl --user daemon-reload

# Remove keybinding (this resets all custom keybindings, not only Alt+D)
gsettings reset org.gnome.settings-daemon.plugins.media-keys custom-keybindings
```