Screen Context AI System
Problem Overview
You're tasked with designing a system that maintains awareness of a user's screen content to answer contextual questions that traditional RAG systems cannot handle.
Core Components:
- Data Ingestion API (POST /ingest)
  - Called every 5 seconds with current screen content
  - Schema to be designed by you
- Query API (POST /chat_completion)
  - Handles user questions about their digital context
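Since the ingestion schema is left open, one possible starting point is sketched below. This is only an illustration under stated assumptions: the field names, the `ScreenSnapshot` and `IngestDeduplicator` classes, and the choice to deduplicate per user and app are all hypothetical and not part of the brief. The sketch shows why deduplication matters when the endpoint is called every 5 seconds: most consecutive snapshots will be identical.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScreenSnapshot:
    """Hypothetical payload for POST /ingest; every field name is an assumption."""

    user_id: str
    captured_at: float         # Unix timestamp in seconds, set by the client
    app_name: str              # e.g. "WhatsApp" or "GitHub"
    window_title: str
    text: str                  # text extracted client-side (OCR, accessibility tree, ...)
    url: Optional[str] = None  # only meaningful for browser windows

    def content_key(self) -> str:
        """Hash the fields that describe what is on screen, ignoring the timestamp."""
        raw = "\x1f".join([self.app_name, self.window_title, self.url or "", self.text])
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class IngestDeduplicator:
    """Rejects a snapshot that matches the last one accepted for the same user and app."""

    def __init__(self) -> None:
        self._last_key: dict[tuple[str, str], str] = {}

    def should_store(self, snap: ScreenSnapshot) -> bool:
        slot = (snap.user_id, snap.app_name)
        key = snap.content_key()
        if self._last_key.get(slot) == key:
            return False  # screen unchanged since the previous call
        self._last_key[slot] = key
        return True
```

A server-side handler could call `should_store` before writing to the database, so that an idle screen costs one hash comparison per call instead of a new stored row.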
Example Use Cases:
- "Which people are in my team?"
- "What WhatsApp messages need replies? Draft responses for me."
- "What PRs need my reviews?"
- [Additional contextual questions]
Technical Parameters:
- You control the client-side implementation and parsing
- You'll have databases of your choice, an embedding model (if needed), and an LLM
- Solution should be generic but can include app-specific optimizations
- Recommended research areas: knowledge graphs, memory systems (Graphiti, mem0, GraphRAG)
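To make the knowledge-graph direction concrete, here is a minimal sketch of a time-aware fact store built from subject-predicate-object triples. It is an assumption-laden illustration, not the API of Graphiti, mem0, or GraphRAG: the `Fact` and `FactStore` names, the `single_valued` flag, and the predicate strings are all hypothetical. The idea it shows is that facts extracted from screen content carry an observation time and can be invalidated when a newer observation contradicts them.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    """One subject-predicate-object triple with a validity interval (hypothetical model)."""

    subject: str
    predicate: str
    obj: str
    observed_at: float
    invalidated_at: Optional[float] = None  # None means the fact is still current


class FactStore:
    def __init__(self) -> None:
        self._facts: list[Fact] = []

    def observe(self, subject: str, predicate: str, obj: str, at: float,
                single_valued: bool = False) -> Fact:
        """Record a fact; for single-valued predicates, retire the conflicting old value."""
        for f in self._facts:
            if f.invalidated_at is None and f.subject == subject and f.predicate == predicate:
                if f.obj == obj:
                    return f  # already known, nothing to add
                if single_valued:
                    f.invalidated_at = at  # superseded by the new observation
        fact = Fact(subject, predicate, obj, at)
        self._facts.append(fact)
        return fact

    def current(self, subject: Optional[str] = None,
                predicate: Optional[str] = None) -> list[Fact]:
        """Return facts that are still valid, optionally filtered by subject or predicate."""
        return [
            f for f in self._facts
            if f.invalidated_at is None
            and (subject is None or f.subject == subject)
            and (predicate is None or f.predicate == predicate)
        ]
```

Under this sketch, a question such as "Which people are in my team?" would reduce to `store.current(predicate="member_of_team")`, with the LLM used to extract triples at ingest time and to phrase the answer at query time.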
Deliverables:
- A comprehensive system design proposal
- A proof-of-concept implementation (format of your choice)
- Research documentation