This is a comprehensive refactoring that transforms the dictation service from a complex multi-mode application into two clean, focused features:

1. Voice dictation with system tray icon
2. On-demand read-aloud via Ctrl+middle-click

## Key Changes

### Dictation Service Enhancements
- Add GTK/AppIndicator3 system tray icon for visual status
- Remove all notification spam (dictation start/stop/status)
- Icon states: microphone-muted (OFF) → microphone-high (ON)
- Click tray icon to toggle dictation (same as Alt+D)
- Simplify ai_dictation_simple.py by removing conversation mode

### Read-Aloud Service Redesign
- Replace automatic clipboard reader with on-demand Ctrl+middle-click
- New middle_click_reader.py service
- Works anywhere: highlight text, Ctrl+middle-click to read
- Uses Edge-TTS (Christopher voice) with mpv playback
- Lock file prevents feedback with dictation service

### Conversation Mode Removed
- Delete all VLLM/conversation code (VLLMClient, ConversationManager, TTS)
- Archive 5 old implementations to archive/old_implementations/
- Remove conversation-related scripts and services
- Clean separation of concerns for future reintegration if needed

### Dependencies Cleanup
- Remove: openai, aiohttp, pyttsx3, requests (conversation deps)
- Keep: PyGObject, pynput, sounddevice, vosk, numpy, edge-tts
- Net reduction: 4 packages removed, 6 core packages retained

### Testing Improvements
- Add test_dictation_service.py (8 tests) ✅
- Add test_middle_click.py (11 tests) ✅
- Fix test_run.py to use correct model path
- Total: 19 unit tests passing
- Delete obsolete test files (test_suite, test_vllm_integration, etc.)

### Documentation
- Add CHANGES.md with complete changelog
- Add docs/MIGRATION_GUIDE.md for upgrading
- Add README.md with quick start guide
- Update docs/README.md with current features only
- Add justfile for common tasks

### New Services & Scripts
- Add middle-click-reader.service (systemd)
- Add scripts/setup-middle-click-reader.sh
- Add desktop files for autostart
- Remove toggle-conversation.sh (obsolete)

## Impact

**Code Quality**
- Net change: -6,007 lines (596 added, 6,603 deleted)
- Simpler architecture, easier maintenance
- Better test coverage (19 tests vs mixed before)
- Cleaner separation of concerns

**User Experience**
- No notification spam during dictation
- Clean visual status via tray icon
- Full control over read-aloud (no unwanted readings)
- Better performance (fewer background processes)

**Privacy**
- No conversation data stored
- No VLLM connection needed
- All processing local except Edge-TTS text

## Migration Notes

Users upgrading should:

1. Run `uv sync` to update dependencies
2. Restart dictation.service to get tray icon
3. Run scripts/setup-middle-click-reader.sh for new read-aloud
4. Remove old read-aloud.service if present

See docs/MIGRATION_GUIDE.md for details.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

# Migration Guide - Updated Features

## Summary of Changes

This update introduces significant UX improvements based on user feedback:

### ✅ Changes Made

1. **Dictation Mode: System Tray Icon Instead of Notifications**
   - **Old:** System notifications for every dictation start/stop/status
   - **New:** Clean system tray icon that changes based on state
   - **Benefit:** No more notification spam, cleaner UX

2. **Read-Aloud: Middle-Click Instead of Automatic**
   - **Old:** Automatic reading of all highlighted text via system tray service
   - **New:** On-demand reading via middle-click on selected text
   - **Benefit:** More control, less annoying, works on-demand only

3. **Conversation Mode: Unchanged**
   - Still works with Super+Alt+D (Windows+Alt+D)
   - Still maintains persistent context across calls
   - Still sends notifications (intentionally kept for this feature)

## Migration Steps

### 1. Update the Dictation Service

The main dictation service now includes a system tray icon:

```bash
# Stop the old service
systemctl --user stop dictation.service

# Restart with new code (already updated)
systemctl --user restart dictation.service
```

**What to expect:**
- A microphone icon will appear in your system tray
- Icon changes from "muted" (OFF) to "high" (ON) when dictating
- Click the icon to toggle dictation, or continue using Alt+D
- No more notifications when dictating

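
The tray behavior described above boils down to a two-state toggle. The following is a minimal sketch of that logic only, not the service's actual code: the class name `DictationTrayState` is hypothetical, and the icon names are the shorthand used in this guide (the names in your icon theme may differ).

```python
from dataclasses import dataclass

# Icon names as written in this guide; real icon-theme names may differ.
ICON_OFF = "microphone-muted"
ICON_ON = "microphone-high"


@dataclass
class DictationTrayState:
    """Hypothetical helper: tracks whether dictation is on and which icon to show."""

    active: bool = False

    @property
    def icon(self) -> str:
        # The icon is derived from the state, so the two can never disagree.
        return ICON_ON if self.active else ICON_OFF

    def toggle(self) -> str:
        """Flip the state (tray click or Alt+D) and return the new icon name."""
        self.active = not self.active
        return self.icon


state = DictationTrayState()
print(state.icon)      # microphone-muted
print(state.toggle())  # microphone-high
print(state.toggle())  # microphone-muted
```

Because both the tray click and Alt+D would call the same `toggle()`, the icon stays in sync no matter which one you use.
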
### 2. Remove Old Read-Aloud Service

The automatic read-aloud service has been replaced:

```bash
# Stop and disable old service
systemctl --user stop read-aloud.service 2>/dev/null || true
systemctl --user disable read-aloud.service 2>/dev/null || true

# Remove old service file
rm -f ~/.config/systemd/user/read-aloud.service

# Reload systemd
systemctl --user daemon-reload
```

### 3. Install New Middle-Click Reader

Set up the new on-demand read-aloud service:

```bash
# Run setup script from the repository root (adjust the path to your checkout)
cd /mnt/storage/Development/dictation-service
./scripts/setup-middle-click-reader.sh
```

**What to expect:**
- No visible tray icon (runs in background)
- Highlight text anywhere
- Middle-click (press scroll wheel) to read it
- Only reads when you explicitly request it

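
To illustrate the on-demand flow, here is a rough sketch of how a click handler could fetch the highlighted text with the same `xclip -o -selection primary` call used in Troubleshooting below. The function names, the `speak` callback, and the injectable `run` parameter are assumptions made for this example; the real `middle_click_reader.py` may be structured differently.

```python
import subprocess
from typing import Callable, Optional


def read_primary_selection(run: Callable = subprocess.run) -> Optional[str]:
    """Return the highlighted (PRIMARY) text, or None if nothing usable is selected."""
    try:
        result = run(
            ["xclip", "-o", "-selection", "primary"],
            capture_output=True,
            text=True,
            timeout=2,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None  # xclip is missing or did not answer in time
    if result.returncode != 0:
        return None
    text = result.stdout.strip()
    return text or None


def on_middle_click(speak: Callable[[str], None], run: Callable = subprocess.run) -> bool:
    """Hypothetical click handler: speak the selection only if there is one."""
    text = read_primary_selection(run)
    if text is None:
        return False  # nothing highlighted, so stay silent
    speak(text)
    return True
```

Nothing is read unless the handler is called and a non-empty selection exists, which matches the "only reads when you explicitly request it" behavior. Passing `run` as a parameter is just a convenience so the logic can be exercised without xclip installed.
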
### 4. Test Everything

**Test Dictation:**
1. Look for microphone icon in system tray
2. Press Alt+D or click the icon
3. Icon should change to "microphone-high"
4. Speak - text should type
5. Press Alt+D or click icon again to stop
6. No notifications should appear

**Test Read-Aloud:**
1. Highlight some text in a browser or editor
2. Middle-click on the highlighted text
3. It should be read aloud
4. Try highlighting different text and middle-clicking again

**Test Conversation (unchanged):**
1. Press Super+Alt+D
2. Should see "Conversation Started" notification (this is kept)
3. Speak with AI
4. Press Super+Alt+D to end

## Deprecated Files

These files have been renamed with a `.deprecated` suffix and are no longer used:

- `read-aloud.service.deprecated` (old automatic service)
- `scripts/setup-read-aloud.sh.deprecated` (old setup script)
- `scripts/toggle-read-aloud.sh.deprecated` (old toggle script)
- `src/dictation_service/read_aloud_service.py.deprecated` (old implementation)

You can safely delete these files if desired.

## New Files

- `src/dictation_service/middle_click_reader.py` - New middle-click service
- `middle-click-reader.service` - Systemd service file
- `scripts/setup-middle-click-reader.sh` - Setup script

## Troubleshooting

### System Tray Icon Not Appearing

1. Make sure AppIndicator3 is installed:
   ```bash
   sudo apt-get install gir1.2-appindicator3-0.1
   ```

2. Check service logs:
   ```bash
   journalctl --user -u dictation.service -f
   ```

3. Some desktop environments need additional packages:
   ```bash
   # For GNOME Shell
   sudo apt-get install gnome-shell-extension-appindicator
   ```

### Middle-Click Not Working

1. Check if service is running:
   ```bash
   systemctl --user status middle-click-reader
   ```

2. Check logs:
   ```bash
   journalctl --user -u middle-click-reader -f
   ```

3. Test xclip manually:
   ```bash
   echo "test" | xclip -selection primary
   xclip -o -selection primary
   ```

4. Verify edge-tts is installed:
   ```bash
   edge-tts --list-voices | grep Christopher
   ```

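
The manual checks above can be complemented by a quick scan for missing command-line tools. This is a small sketch under assumptions: the tool list (`xclip`, `edge-tts`, `mpv`) is inferred from what this update mentions, and `missing_tools` is a hypothetical helper, not part of the project.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed tool list: selection access, text-to-speech, and audio playback.
REQUIRED_TOOLS = ("xclip", "edge-tts", "mpv")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("All tools found" if not missing else f"Missing: {', '.join(missing)}")
```

Note that a tool installed only inside a virtual environment may be reported as missing when this runs outside that environment, so treat the result as a hint rather than a verdict.
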
### Notifications Still Appearing for Dictation

This means you might be running an old version of the code:

```bash
# Force restart the service
systemctl --user restart dictation.service

# Verify the new code is running
journalctl --user -u dictation.service -n 20 | grep "system tray"
```

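
If you want to script that verification, the sketch below runs the same `journalctl` query and looks for the marker text. It assumes, as the grep above does, that the new code logs a line containing "system tray"; the function name and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable


def tray_code_is_running(run: Callable = subprocess.run, marker: str = "system tray") -> bool:
    """Return True if the last 20 dictation.service log lines mention the marker."""
    result = run(
        ["journalctl", "--user", "-u", "dictation.service", "-n", "20", "--no-pager"],
        capture_output=True,
        text=True,
    )
    # Unlike the plain grep above, this comparison ignores letter case.
    return result.returncode == 0 and marker.lower() in result.stdout.lower()
```

A `False` result only means the marker was not in the most recent lines. On a long-running service the startup message may have scrolled out of the 20-line window.
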
## Rollback Instructions

If you need to revert to the old behavior:

```bash
# Restore old files (if you didn't delete them)
mv read-aloud.service.deprecated read-aloud.service
mv scripts/setup-read-aloud.sh.deprecated scripts/setup-read-aloud.sh
mv scripts/toggle-read-aloud.sh.deprecated scripts/toggle-read-aloud.sh

# Use git to restore old dictation code
# (HEAD~1 assumes this update is the most recent commit)
git checkout HEAD~1 -- src/dictation_service/ai_dictation_simple.py

# Restart services
systemctl --user restart dictation.service
./scripts/setup-read-aloud.sh
```

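
The `mv` commands above follow one pattern: strip the `.deprecated` suffix. As an illustration, here is a hedged sketch that applies that pattern to every matching file under a directory. `restore_deprecated` is a hypothetical helper, and it defaults to a dry run so nothing is renamed until you ask for it.

```python
from pathlib import Path
from typing import List

SUFFIX = ".deprecated"


def restore_deprecated(root: Path, dry_run: bool = True) -> List[Path]:
    """Strip the .deprecated suffix from files under root; return the target paths."""
    restored = []
    for old in sorted(root.rglob(f"*{SUFFIX}")):
        if not old.is_file():
            continue
        new = old.with_name(old.name[: -len(SUFFIX)])
        if new.exists():
            continue  # never overwrite a file that is already in place
        if not dry_run:
            old.rename(new)
        restored.append(new)
    return restored
```

Calling `restore_deprecated(Path("."))` from the repository root lists what would be restored; passing `dry_run=False` performs the renames. You would still need the `git checkout` and service restart steps above.
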
## Benefits of New Approach

### Dictation
- ✅ No notification spam
- ✅ Visual status always visible in tray
- ✅ One-click toggle from tray menu
- ✅ Cleaner, less intrusive UX

### Read-Aloud
- ✅ Only reads when you want it to
- ✅ No background polling
- ✅ Lower resource usage
- ✅ Works everywhere (not just when service is "on")
- ✅ No accidental readings

## Questions?

Check the updated [AI_DICTATION_GUIDE.md](./AI_DICTATION_GUIDE.md) for complete usage instructions.