Major refactoring: v0.2.0 - Simplify to core dictation & read-aloud features

This is a comprehensive refactoring that transforms the dictation service from a
complex multi-mode application into two clean, focused features:
1. Voice dictation with system tray icon
2. On-demand read-aloud via Ctrl+middle-click

## Key Changes

### Dictation Service Enhancements
- Add GTK/AppIndicator3 system tray icon for visual status
- Remove all notification spam (dictation start/stop/status)
- Icon states: microphone-muted (OFF) → microphone-high (ON)
- Click tray icon to toggle dictation (same as Alt+D)
- Simplify ai_dictation_simple.py by removing conversation mode
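
The tray behaviour listed above comes down to a two-state toggle. The sketch below shows only that logic and leaves out the GTK/AppIndicator3 wiring. `DictationState`, `ICON_OFF`, and `ICON_ON` are hypothetical names, and the icon names are copied from the list above rather than checked against an icon theme.

```python
from dataclasses import dataclass

# Icon names as written in the change list; a real theme may use different names.
ICON_OFF = "microphone-muted"
ICON_ON = "microphone-high"


@dataclass
class DictationState:
    """Tracks whether dictation is active and which tray icon to show."""

    active: bool = False

    @property
    def icon_name(self) -> str:
        return ICON_ON if self.active else ICON_OFF

    def toggle(self) -> str:
        """Flip the state (tray click or Alt+D) and return the new icon name."""
        self.active = not self.active
        return self.icon_name


state = DictationState()
print(state.icon_name)  # icon while dictation is off
print(state.toggle())   # icon after the first toggle
```

Routing the tray click and the Alt+D hotkey through the same `toggle()` call is one way to keep the icon and the actual dictation state from drifting apart.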

### Read-Aloud Service Redesign
- Replace automatic clipboard reader with on-demand Ctrl+middle-click
- New middle_click_reader.py service
- Works anywhere: highlight text, Ctrl+middle-click to read
- Uses Edge-TTS (Christopher voice) with mpv playback
- Lock file prevents feedback with dictation service
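
The lock file idea can be sketched as follows: the reader creates a file while speech is playing, and the dictation side ignores microphone input for as long as that file exists. This is an illustration, not the project's code. `LOCK_PATH`, `speaking_lock`, and `should_ignore_audio` are hypothetical names, and the real service may use a different path or locking scheme.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path

# Hypothetical lock path; the real service may use a different location.
LOCK_PATH = Path(tempfile.gettempdir()) / "dictation_speaking.lock"


@contextmanager
def speaking_lock(path: Path = LOCK_PATH):
    """Create the lock file while audio plays and remove it afterwards."""
    path.write_text(str(os.getpid()))
    try:
        yield path
    finally:
        # Remove the lock even if playback raised an exception.
        path.unlink(missing_ok=True)


def should_ignore_audio(path: Path = LOCK_PATH) -> bool:
    """The dictation side skips microphone input while the lock exists."""
    return path.exists()


with speaking_lock():
    # The reader would run TTS playback here.
    print(should_ignore_audio())  # checked while the lock is held
print(should_ignore_audio())      # checked after the lock is released
```

The `finally` clause matters here: a lock left behind after a crash would silence dictation until someone deleted the file by hand.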

### Conversation Mode Removed
- Delete all VLLM/conversation code (VLLMClient, ConversationManager, TTS)
- Archive 5 old implementations to archive/old_implementations/
- Remove conversation-related scripts and services
- Clean separation of concerns for future reintegration if needed

### Dependencies Cleanup
- Remove: openai, aiohttp, pyttsx3, requests (conversation deps)
- Keep: PyGObject, pynput, sounddevice, vosk, numpy, edge-tts
- Net reduction: 4 packages removed, 6 core packages retained
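
After `uv sync`, a quick check like the one below can confirm which packages are importable in the environment. It is a sketch: the import names (`gi` for PyGObject, `edge_tts` for edge-tts) are the usual ones for these packages, and `installed` is a hypothetical helper. A removed package can still show up if another dependency pulls it in transitively (edge-tts, for instance, appears to depend on aiohttp).

```python
from importlib.util import find_spec

# Import names for the retained packages (PyGObject imports as `gi`,
# edge-tts as `edge_tts`).
RETAINED = ["gi", "pynput", "sounddevice", "vosk", "numpy", "edge_tts"]
REMOVED = ["openai", "aiohttp", "pyttsx3", "requests"]


def installed(modules):
    """Map each top-level module name to whether it can be found."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, ok in installed(RETAINED).items():
        print(f"{name:12} {'ok' if ok else 'MISSING'}")
    for name, ok in installed(REMOVED).items():
        print(f"{name:12} {'still importable' if ok else 'not installed'}")
```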

### Testing Improvements
- Add test_dictation_service.py (8 tests) ✅
- Add test_middle_click.py (11 tests) ✅
- Fix test_run.py to use correct model path
- Total: 19 unit tests passing
- Delete obsolete test files (test_suite, test_vllm_integration, etc.)

### Documentation
- Add CHANGES.md with complete changelog
- Add docs/MIGRATION_GUIDE.md for upgrading
- Add README.md with quick start guide
- Update docs/README.md with current features only
- Add justfile for common tasks

### New Services & Scripts
- Add middle-click-reader.service (systemd)
- Add scripts/setup-middle-click-reader.sh
- Add desktop files for autostart
- Remove toggle-conversation.sh (obsolete)

## Impact

**Code Quality**
- Net change: -6,007 lines (596 added, 6,603 deleted)
- Simpler architecture, easier maintenance
- Better test coverage (19 passing unit tests, replacing a mix of obsolete test files)
- Cleaner separation of concerns

**User Experience**
- No notification spam during dictation
- Clean visual status via tray icon
- Full control over read-aloud (no unwanted readings)
- Better performance (fewer background processes)

**Privacy**
- No conversation data stored
- No VLLM connection needed
- All processing is local, except that read-aloud text is sent to Edge-TTS

## Migration Notes

Users upgrading should:
1. Run `uv sync` to update dependencies
2. Restart dictation.service to get tray icon
3. Run scripts/setup-middle-click-reader.sh for new read-aloud
4. Remove old read-aloud.service if present

See docs/MIGRATION_GUIDE.md for details.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

2025-12-10 19:11:06 -07:00


Migration Guide - Updated Features

Summary of Changes

This update introduces significant UX improvements based on user feedback:

✅ Changes Made

Dictation Mode: System Tray Icon Instead of Notifications
- Old: System notifications for every dictation start/stop/status
- New: Clean system tray icon that changes based on state
- Benefit: No more notification spam, cleaner UX

Read-Aloud: Middle-Click Instead of Automatic
- Old: Automatic reading of all highlighted text via system tray service
- New: On-demand reading via middle-click on selected text
- Benefit: More control, less annoying, works on-demand only

Conversation Mode: Unchanged
- Still works with Super+Alt+D (Windows+Alt+D)
- Still maintains persistent context across calls
- Still sends notifications (intentionally kept for this feature)

Migration Steps

1. Update the Dictation Service

The main dictation service now includes a system tray icon:

# Stop the old service
systemctl --user stop dictation.service

# Restart with new code (already updated)
systemctl --user restart dictation.service

What to expect:

- A microphone icon will appear in your system tray
- Icon changes from "muted" (OFF) to "high" (ON) when dictating
- Click the icon to toggle dictation, or continue using Alt+D
- No more notifications when dictating

2. Remove Old Read-Aloud Service

The automatic read-aloud service has been replaced:

# Stop and disable old service
systemctl --user stop read-aloud.service 2>/dev/null || true
systemctl --user disable read-aloud.service 2>/dev/null || true

# Remove old service file
rm -f ~/.config/systemd/user/read-aloud.service

# Reload systemd
systemctl --user daemon-reload

3. Install New Middle-Click Reader

Set up the new on-demand read-aloud service:

# Run setup script
cd /mnt/storage/Development/dictation-service
./scripts/setup-middle-click-reader.sh

What to expect:

- No visible tray icon (runs in background)
- Highlight text anywhere
- Middle-click (press scroll wheel) to read it
- Only reads when you explicitly request it
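
On X11, highlighted text lives in the primary selection, which is what the `xclip` commands in the troubleshooting section query. A minimal sketch of fetching it from Python, assuming `xclip` is on the PATH. `read_primary_selection` is a hypothetical helper, not necessarily how `middle_click_reader.py` does it.

```python
import subprocess

# Same invocation as the manual xclip check in the troubleshooting section.
XCLIP_CMD = ("xclip", "-o", "-selection", "primary")


def read_primary_selection(cmd=XCLIP_CMD, timeout=2.0) -> str:
    """Return the highlighted text, or "" if the command fails or is missing."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing binary, a non-zero exit status, and a timeout.
        return ""
    return result.stdout.strip()


if __name__ == "__main__":
    text = read_primary_selection()
    print(repr(text) if text else "nothing selected (or xclip unavailable)")
```

Returning an empty string on failure lets the caller treat "nothing highlighted" and "xclip not installed" the same way: there is nothing to read, so nothing is spoken.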

4. Test Everything

Test Dictation:

1. Look for microphone icon in system tray
2. Press Alt+D or click the icon
3. Icon should change to "microphone-high"
4. Speak - text should type
5. Press Alt+D or click icon again to stop
6. No notifications should appear

Test Read-Aloud:

1. Highlight some text in a browser or editor
2. Middle-click on the highlighted text
3. It should be read aloud
4. Try highlighting different text and middle-clicking again
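
If nothing is read aloud, it can help to exercise the speech path by hand, outside the service. The sketch below only builds the two command lines (synthesis with the `edge-tts` CLI, playback with `mpv`) so they can be inspected or pasted into a terminal. The full voice name `en-US-ChristopherNeural`, the output path, and the `tts_commands` helper are assumptions for illustration. The service itself may call the edge-tts Python API or use different options.

```python
import shlex

VOICE = "en-US-ChristopherNeural"  # assumed full name of the "Christopher" voice


def tts_commands(text: str, out_path: str = "/tmp/read_aloud.mp3"):
    """Build the edge-tts and mpv command lines for one utterance."""
    synth = ["edge-tts", "--voice", VOICE, "--text", text, "--write-media", out_path]
    play = ["mpv", "--no-video", "--really-quiet", out_path]
    return synth, play


synth, play = tts_commands("Hello from the reader")
print(shlex.join(synth))  # shell-quoted synthesis command
print(shlex.join(play))   # shell-quoted playback command
```

Running the printed commands in order shows whether a failure is in synthesis (network, voice name) or in playback (mpv, audio output).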

Test Conversation (unchanged):

1. Press Super+Alt+D
2. Should see "Conversation Started" notification (this is kept)
3. Speak with AI
4. Press Super+Alt+D to end

Deprecated Files

These files have been renamed with .deprecated suffix and are no longer used:

- read-aloud.service.deprecated (old automatic service)
- scripts/setup-read-aloud.sh.deprecated (old setup script)
- scripts/toggle-read-aloud.sh.deprecated (old toggle script)
- src/dictation_service/read_aloud_service.py.deprecated (old implementation)

You can safely delete these files if desired.

New Files

- src/dictation_service/middle_click_reader.py - New middle-click service
- middle-click-reader.service - Systemd service file
- scripts/setup-middle-click-reader.sh - Setup script

Troubleshooting

System Tray Icon Not Appearing

Make sure AppIndicator3 is installed:

sudo apt-get install gir1.2-appindicator3-0.1

Check service logs:

journalctl --user -u dictation.service -f

Some desktop environments need additional packages:

# For GNOME Shell
sudo apt-get install gnome-shell-extension-appindicator

Middle-Click Not Working

Check if service is running:

systemctl --user status middle-click-reader

Check logs:

journalctl --user -u middle-click-reader -f

Test xclip manually:

echo "test" | xclip -selection primary
xclip -o -selection primary

Verify edge-tts is installed:

edge-tts --list-voices | grep Christopher

Notifications Still Appearing for Dictation

This usually means the service is still running the old code:

# Force restart the service
systemctl --user restart dictation.service

# Verify the new code is running
journalctl --user -u dictation.service -n 20 | grep "system tray"

Rollback Instructions

If you need to revert to the old behavior:

# Restore old files (if you didn't delete them)
mv read-aloud.service.deprecated read-aloud.service
mv scripts/setup-read-aloud.sh.deprecated scripts/setup-read-aloud.sh
mv scripts/toggle-read-aloud.sh.deprecated scripts/toggle-read-aloud.sh

# Use git to restore old dictation code (assumes the update is the most recent commit)
git checkout HEAD~1 -- src/dictation_service/ai_dictation_simple.py

# Restart services
systemctl --user restart dictation.service
./scripts/setup-read-aloud.sh

Benefits of New Approach

Dictation

✅ No notification spam
✅ Visual status always visible in tray
✅ One-click toggle from tray menu
✅ Cleaner, less intrusive UX

Read-Aloud

✅ Only reads when you want it to
✅ No background polling
✅ Lower resource usage
✅ Works everywhere (not just when service is "on")
✅ No accidental readings

Questions?

Check the updated AI_DICTATION_GUIDE.md for complete usage instructions.
