Architecture Overview
Canonical stores
- PostgreSQL is the system of record.
- pgvector stores memory embeddings.
- Redis backs Celery.
- Neo4j is a graph projection for traversal-heavy reads.
Main subsystems
- app/api: FastAPI routes and schema validation
- app/storage: repository layer around canonical persistence
- app/services: memory commit logic, embeddings, scoring, relevance, staged tool sessions
- app/retrieval: scope resolution, vector search, reranking
- app/llms: role prompts, provider clients, role orchestration
- app/engines: end-to-end orchestration for process, retrieval, graph, snapshots, and maintenance
- app/workers: Celery tasks and schedules
High-level flow
Pre-turn reads
/v1/context and /v1/deep-memory use the same retrieval pipeline:
- resolve scope
- load snapshot refs when relevant
- run vector search
- load high-signal metadata candidates
- apply strict query-relevance gating
- expand graph context from Neo4j using relevant memory seeds only
- rerank deterministically
- call the Context Enhancer or Deep Memory role only if relevant evidence remains
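The gating, reranking, and "only call the role if evidence remains" steps above can be illustrated with a minimal sketch. Everything named here (Candidate, gate_by_relevance, rerank, retrieve, the 0.5 threshold, and the expand_graph and enhance callables) is hypothetical and only stands in for whatever app/retrieval and app/llms actually expose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    text: str
    score: float  # assumed relevance signal in [0, 1]


def gate_by_relevance(candidates, threshold):
    """Strict gating: drop anything below the relevance threshold."""
    return [c for c in candidates if c.score >= threshold]


def rerank(candidates):
    """Deterministic rerank: score descending, memory_id breaks ties."""
    return sorted(candidates, key=lambda c: (-c.score, c.memory_id))


def retrieve(vector_hits, metadata_hits, expand_graph, enhance, threshold=0.5):
    # Merge both candidate sources, keeping the best score per memory.
    best = {}
    for c in list(vector_hits) + list(metadata_hits):
        if c.memory_id not in best or c.score > best[c.memory_id].score:
            best[c.memory_id] = c

    relevant = gate_by_relevance(best.values(), threshold)
    if not relevant:
        return None  # no evidence left, so the LLM role is never called

    # Only memories that passed the gate are used as graph seeds.
    graph_context = expand_graph([c.memory_id for c in relevant])
    return enhance(rerank(relevant), graph_context)
```

The explicit tie-breaker on memory_id is one way to keep the ordering stable across runs; the real reranker may use different keys.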
Post-turn writes
/v1/process is intentionally thin:
- create user if missing
- create containers if missing
- create a job
- return immediately
The job then runs in the background, where the worker:
- stores the normalized turn
- gathers nearby memory and graph context
- runs the Adjudicator tool loop
- validates staged operations
- commits memory and graph updates atomically
- refreshes embeddings
- updates relevance for touched memory
- marks the snapshot dirty
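The split between the thin request path and the deferred work can be sketched as follows. This is an in-memory illustration, not the project's code: the deque stands in for the real task queue, and process, run_next_job, the job status strings, and the adjudicate, validate, and commit callables are all hypothetical names. Marking a job "rejected" when validation fails is likewise an assumption.

```python
import uuid
from collections import deque

users, containers, jobs = set(), set(), {}
queue = deque()  # stand-in for the real task queue


def process(user_id, container_ids, turn):
    """Thin request path: ensure rows exist, enqueue a job, return."""
    users.add(user_id)                # create user if missing
    containers.update(container_ids)  # create containers if missing
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "queued", "turn": turn}
    queue.append(job_id)
    return {"job_id": job_id, "status": "queued"}


def run_next_job(adjudicate, validate, commit):
    """Deferred path: adjudicate, validate staged ops, commit all or nothing."""
    job_id = queue.popleft()
    job = jobs[job_id]
    staged = adjudicate(job["turn"])
    if not all(validate(op) for op in staged):
        job["status"] = "rejected"  # nothing is committed on failure
        return job_id
    commit(staged)  # expected to apply every operation in one transaction
    job["status"] = "done"
    return job_id
```

Validating the whole staged batch before calling commit is what keeps the write all-or-nothing in this sketch; a real implementation would rely on a database transaction for the same guarantee.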
Hourly maintenance
The hourly Cortex flow is mostly programmatic:
- recompute decay and effective relevance
- build maintenance proposals
- send those proposals to Cortex for staged review
- validate and commit approved changes
- generate the latest user/global snapshot summary
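As an illustration of the programmatic part of this flow, the sketch below recomputes effective relevance and turns low-scoring memories into proposals. The document does not state the decay formula, so the exponential half-life, the 168-hour default, the 0.1 archive threshold, and all function and field names are assumptions made for the example.

```python
def decay_factor(age_hours, half_life_hours=168.0):
    """Assumed exponential decay: halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


def effective_relevance(base_relevance, age_hours, half_life_hours=168.0):
    """Base relevance scaled by how long the memory has gone untouched."""
    return base_relevance * decay_factor(age_hours, half_life_hours)


def build_proposals(memories, now_hours, archive_below=0.1):
    """Propose archiving memories whose effective relevance fell too low."""
    proposals = []
    for m in memories:
        age = now_hours - m["last_touched_hours"]
        eff = effective_relevance(m["relevance"], age)
        if eff < archive_below:
            proposals.append(
                {"memory_id": m["id"], "action": "archive", "effective": eff}
            )
    return proposals
```

The proposals are plain data on purpose: in the flow described above they would still go to Cortex for staged review and be validated before anything is committed.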