Media Ingestion and Processing Workflow

This document outlines the complete user journey for ingesting media content into the Advanced Second Brain PKM system, from initial file placement to actionable insights.

Overview

The media ingestion workflow demonstrates the system's core value proposition: transforming passive media consumption into active knowledge management through automated processing, intelligent analysis, and seamless integration with the user's knowledge base.

User Journey Map

Phase 1: Content Acquisition (User Action)

Trigger: User discovers valuable content (lecture, podcast, video course)

User Actions:

  1. Download or acquire media file (MP4, MP3, WebM, etc.)
  2. Navigate to appropriate domain directory in file system
  3. Place file in correct subfolder (e.g., Neuroscience/Media/Lectures/)
  4. Optionally rename file for clarity

System State: File appears in domain directory, ready for processing

User Expectations:

  • File placement should be intuitive
  • No manual intervention required after placement
  • System should acknowledge file detection

Phase 2: Automated Detection and Processing (Background)

System Actions:

  1. File Watcher Detection: File system monitor detects the new file within 5 seconds (see the watcher sketch after this list)
  2. Metadata Extraction: Extract file metadata (duration, size, format, creation date)
  3. Format Validation: Verify file format is supported
  4. Queue Processing: Add to media processing queue with priority
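A minimal sketch of the detection step, assuming the Python watchdog library and a plain queue hand-off (the plan does not prescribe a specific watcher implementation):

from pathlib import Path
from queue import Queue

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

SUPPORTED_FORMATS = {".mp4", ".mp3", ".webm", ".wav", ".m4a"}  # hypothetical allow-list

class MediaFileHandler(FileSystemEventHandler):
    def __init__(self, job_queue: Queue):
        self.job_queue = job_queue

    def on_created(self, event):
        # Ignore directories and unsupported formats
        if event.is_directory:
            return
        path = Path(event.src_path)
        if path.suffix.lower() in SUPPORTED_FORMATS:
            self.job_queue.put(path)  # hand off to the processing queue

def watch_domain(domain_root: str, job_queue: Queue) -> Observer:
    observer = Observer()
    observer.schedule(MediaFileHandler(job_queue), domain_root, recursive=True)
    observer.start()  # monitoring runs in a background thread
    return observer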

Background Processing:

  1. Transcription Service: Send to Whisper/OpenAI/Google Speech-to-Text (see the sketch after this list)
  2. Transcript Generation: Convert audio/video to timestamped text
  3. Quality Validation: Check transcript accuracy (>90% confidence)
  4. Synchronization: Align transcript with video timeline (if video)
  5. Storage: Save the transcript next to its source media, in the domain's Media/Transcripts/ directory
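As one concrete possibility, a sketch of the transcription step using the open-source whisper package; the service choice is configurable, and per-segment confidence would come from the chosen service's own scoring (an assumption here):

import whisper  # pip install openai-whisper

def transcribe_media(media_path: str, model_name: str = "base") -> dict:
    model = whisper.load_model(model_name)
    result = model.transcribe(media_path)  # returns full text plus timestamped segments
    return {
        "metadata": {"source_file": media_path, "transcription_service": "whisper"},
        "segments": [
            {"start": s["start"], "end": s["end"], "text": s["text"].strip()}
            for s in result["segments"]
        ],
    }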

System State: Media file processed, transcript available

User Feedback: Notification in UI when processing complete

Phase 3: Knowledge Integration (User Interaction)

User Actions:

  1. Open Knowledge Browser for the domain
  2. Navigate to media file in file tree
  3. Click on video file to open in Content Viewer

System Response:

  1. Content Loading: Display video player with controls
  2. Transcript Display: Show synchronized transcript below video
  3. Navigation Integration: Enable click-to-jump between transcript and video

User Value: Can now consume content with a searchable, navigable transcript

Phase 4: Intelligent Analysis (User-Driven)

User Actions:

  1. Click "Run Fabric Pattern" button in Insight/Fabric pane
  2. Select analysis pattern (e.g., "Extract Ideas", "Summarize", "Find Action Items")
  3. Optionally adjust parameters

System Actions:

  1. Content Processing: Send transcript to domain agent
  2. Pattern Execution: Run selected Fabric analysis pattern
  3. Insight Generation: Extract structured insights from content
  4. Result Display: Show formatted results in right pane

Example Output:

## Extracted Ideas
- Neural networks can be understood as parallel distributed processors
- Backpropagation remains the most effective learning algorithm
- Attention mechanisms solve the bottleneck problem in RNNs

## Key Takeaways
- Deep learning has moved from art to science
- Transformer architecture enables better long-range dependencies
- Self-supervised learning reduces annotation requirements

Phase 5: Knowledge Graph Integration (Automatic)

System Actions:

  1. Concept Extraction: Identify key concepts from analysis results
  2. Graph Updates: Add new concepts and relationships to knowledge graph
  3. Embedding Generation: Create vector embeddings for new content (see the sketch after these lists)
  4. Relationship Discovery: Link to existing concepts in domain

Background Processing:

  • Update semantic search index
  • Recalculate concept centrality
  • Generate cross-references to related content
  • Update domain agent context
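A sketch of the embedding step, assuming sentence-transformers as the encoder (the plan does not fix a model):

from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice

def embed_segments(segments: list[dict]) -> list:
    # One vector per transcript segment; these feed the semantic search index
    texts = [seg["text"] for seg in segments]
    return _model.encode(texts, normalize_embeddings=True)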

Phase 6: Cross-Domain Connection (Optional Advanced Usage)

User Actions:

  1. Notice connection between current content and another domain
  2. Switch to Agent Studio mode
  3. Modify Dana agent code to include cross-domain relationships

Example Dana Code Modification:

agent NeuroscienceAgent {
    context: ["Neuroscience/Media/**", "CompSci/Papers/**"]

    query(query) {
        // Search the home domain for the query, and pull related neural network material from CompSci
        neuroscience_results = search_domain("Neuroscience", query)
        compsci_results = search_domain("CompSci", "neural networks")

        // Combine and synthesize results
        return synthesize_results(neuroscience_results, compsci_results)
    }
}

Technical Implementation Details

File System Integration

Directory Structure:

Domain_Name/
├── Media/
│   ├── Lectures/
│   ├── Podcasts/
│   ├── Videos/
│   └── Transcripts/  # Auto-generated
├── Papers/
├── Notes/
└── agent.na         # Domain agent configuration

File Naming Convention:

  • Original: lecture_neural_networks_fundamentals.mp4
  • Transcript: lecture_neural_networks_fundamentals.mp4.transcript.json (a helper that derives this path is sketched below)
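A small helper that derives the transcript path from this convention, assuming media files sit one level below Media/ as in the tree above:

from pathlib import Path

def transcript_path(media_path: Path) -> Path:
    # Media/Lectures/lecture.mp4 -> Media/Transcripts/lecture.mp4.transcript.json
    return media_path.parent.parent / "Transcripts" / (media_path.name + ".transcript.json")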

Processing Pipeline

Queue Management:

from dataclasses import dataclass
from enum import Enum

ProcessingStatus = Enum("ProcessingStatus", "PENDING PROCESSING COMPLETE FAILED")

@dataclass
class MediaProcessingJob:
    file_path: str
    domain_id: str
    priority: int = 1
    retry_count: int = 0
    status: ProcessingStatus = ProcessingStatus.PENDING

Processing Steps (a worker sketch tying these together follows the list):

  1. Validation: Check file integrity and format support
  2. Transcription: Call external API with error handling
  3. Post-processing: Clean transcript, add timestamps
  4. Storage: Save in structured JSON format
  5. Indexing: Update search indices
  6. Notification: Alert user of completion
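A sketch of a worker consuming the queue; the step functions (validate, transcribe_media, and so on) are hypothetical stand-ins for the components above:

import logging

def process_job(job: MediaProcessingJob, max_retries: int = 3) -> None:
    try:
        validate(job.file_path)                      # 1. integrity and format check
        transcript = transcribe_media(job.file_path) # 2. external API call
        transcript = postprocess(transcript)         # 3. clean text, align timestamps
        store_transcript(job.file_path, transcript)  # 4. structured JSON on disk
        update_indices(job.domain_id, transcript)    # 5. search indices
        notify_user(job, status="complete")          # 6. UI notification
        job.status = ProcessingStatus.COMPLETE
    except Exception:
        logging.exception("processing failed: %s", job.file_path)
        job.retry_count += 1
        job.status = (ProcessingStatus.PENDING if job.retry_count < max_retries
                      else ProcessingStatus.FAILED)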

Transcript Format

JSON Structure:

{
  "metadata": {
    "source_file": "lecture.mp4",
    "duration": 3600,
    "transcription_service": "whisper",
    "confidence_score": 0.95,
    "processing_timestamp": "2024-01-15T10:30:00Z"
  },
  "segments": [
    {
      "start": 0.0,
      "end": 5.2,
      "text": "Welcome to this lecture on neural networks.",
      "confidence": 0.98
    },
    {
      "start": 5.2,
      "end": 12.1,
      "text": "Today we'll cover the fundamental concepts...",
      "confidence": 0.96
    }
  ],
  "chapters": [
    {
      "title": "Introduction",
      "start": 0.0,
      "end": 180.0
    },
    {
      "title": "Basic Concepts",
      "start": 180.0,
      "end": 900.0
    }
  ]
}
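A typed loader for this structure (a sketch; field names follow the schema above):

import json
from dataclasses import dataclass

@dataclass
class Segment:
    start: float
    end: float
    text: str
    confidence: float

def load_segments(transcript_file: str) -> list[Segment]:
    with open(transcript_file) as f:
        data = json.load(f)
    return [Segment(**seg) for seg in data["segments"]]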

Synchronization Mechanism

Video-Transcript Sync:

  • Click Transcript: Jump to the corresponding video timestamp (see the lookup sketch after this list)
  • Video Playback: Highlight current transcript segment
  • Search: Find text and jump to video location
  • Export: Generate timestamped notes with video references
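A sketch of the segment lookup behind click-to-jump and playback highlighting, reusing the Segment loader above; segments are assumed sorted by start time:

import bisect

def segment_at(segments: list[Segment], t: float) -> Segment | None:
    # Find the last segment starting at or before time t
    starts = [seg.start for seg in segments]
    i = bisect.bisect_right(starts, t) - 1
    if i >= 0 and segments[i].start <= t < segments[i].end:
        return segments[i]
    return None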

Fabric Analysis Patterns

Pattern Framework:

from dataclasses import dataclass
from typing import Any, Dict

@dataclass
class FabricPattern:
    name: str
    description: str
    input_type: str  # "transcript", "document", "mixed"
    output_format: str  # "bullet_points", "summary", "structured"

    async def execute(self, content: str, context: Dict[str, Any]) -> "PatternResult":
        # Implementation varies by pattern; subclasses override this
        raise NotImplementedError

Built-in Patterns:

  1. Extract Ideas: Identify key concepts and insights (an example wiring is sketched after this list)
  2. Summarize: Create concise content summary
  3. Find Action Items: Extract tasks and follow-ups
  4. Generate Questions: Create study/discussion questions
  5. Extract References: Find citations and sources
  6. Timeline Analysis: Create chronological breakdown
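As an illustration, one way the first pattern might be wired up; run_llm is a hypothetical async completion helper, and the raw string return stands in for the framework's PatternResult:

class ExtractIdeasPattern(FabricPattern):
    async def execute(self, content, context):
        # run_llm is a hypothetical async LLM completion helper
        prompt = "List the key ideas in this transcript as bullet points:\n\n" + content
        return await run_llm(prompt)

extract_ideas = ExtractIdeasPattern(
    name="extract_ideas",
    description="Identify key concepts and insights",
    input_type="transcript",
    output_format="bullet_points",
)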

Error Handling and Recovery

Failure Scenarios:

  • Transcription Failure: Retry with a different service, notify user (see the fallback sketch after this list)
  • File Corruption: Skip processing, log error, allow manual retry
  • Storage Issues: Queue for later processing, alert admin
  • Analysis Errors: Fallback to basic processing, partial results
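A sketch of the transcription fallback, assuming interchangeable service callables (names hypothetical):

SERVICES = [transcribe_whisper, transcribe_openai, transcribe_google]  # hypothetical callables

def transcribe_with_fallback(media_path: str) -> dict:
    errors = []
    for service in SERVICES:
        try:
            return service(media_path)
        except Exception as exc:  # each service gets one attempt
            errors.append(f"{service.__name__}: {exc}")
    raise RuntimeError("all transcription services failed: " + "; ".join(errors))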

User Communication:

  • Processing status indicators in UI
  • Notification system for completion/failures
  • Manual retry options for failed jobs
  • Progress tracking for long-running tasks

Performance Requirements

Processing Times

  • File Detection: <5 seconds
  • Metadata Extraction: <1 second
  • Transcription: <10% of media duration (e.g., 6 min for 1-hour video)
  • Analysis: <30 seconds for typical content
  • UI Updates: <2 seconds for all operations

Scalability Targets

  • Concurrent Processing: 10 media files simultaneously (see the asyncio sketch after this list)
  • Queue Throughput: 50 files per hour
  • Storage Growth: Handle 100GB+ media libraries
  • Search Performance: <500ms for transcript searches
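A sketch of the 10-file concurrency cap using asyncio (assuming an async variant of the job processor):

import asyncio

MAX_CONCURRENT = 10  # matches the scalability target above

async def process_all(jobs: list, process_job_async) -> None:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(job):
        async with semaphore:
            await process_job_async(job)

    await asyncio.gather(*(bounded(job) for job in jobs))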

User Experience Considerations

Progressive Enhancement

  • Basic playback works immediately
  • Transcripts appear asynchronously
  • Analysis results load on demand
  • Advanced features available when processing complete

Accessibility

  • Keyboard navigation for all controls
  • Screen reader support for transcripts
  • High contrast mode for video controls
  • Adjustable playback speeds

Mobile Considerations

  • Responsive video player
  • Touch-friendly transcript navigation
  • Offline transcript access
  • Bandwidth-adaptive quality

Success Metrics

User Engagement

  • Completion Rate: % of videos watched with transcripts
  • Analysis Usage: % of content analyzed with Fabric patterns
  • Time Saved: Average time reduction vs. manual note-taking
  • Knowledge Retention: User-reported learning improvement

Technical Performance

  • Processing Success Rate: >95% of files processed successfully on the first attempt
  • Transcript Accuracy: >90% confidence scores
  • Analysis Quality: >80% user satisfaction with insights
  • System Reliability: <1% of files fail permanently after retries

Future Enhancements

Advanced Features

  • Multi-language Support: Automatic language detection and translation
  • Speaker Diarization: Identify different speakers in recordings
  • Emotion Analysis: Detect speaker enthusiasm and emphasis
  • Concept Mapping: Visual knowledge graphs from transcripts
  • Collaborative Annotations: Shared notes and highlights

Integration Opportunities

  • Calendar Integration: Sync with lecture schedules
  • Note-taking Apps: Export to Roam Research, Obsidian, etc.
  • Learning Platforms: Integration with Coursera, edX, etc.
  • Social Features: Share insights with study groups

This workflow transforms passive media consumption into an active, intelligent knowledge management process, demonstrating the system's core value proposition of making complex information accessible and actionable.