Architecture Overview

Canonical stores

PostgreSQL is the system of record.
pgvector stores memory embeddings.
Redis backs Celery.
Neo4j is a graph projection for traversal-heavy reads.

Neo4j is not the write source of truth. All canonical memory and graph mutations are written to PostgreSQL first.
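The write-first rule above can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: CanonicalStore stands in for PostgreSQL, GraphProjection stands in for Neo4j, and the outbox-style sync is one possible way to feed a projection, not necessarily how this codebase does it.

```python
from dataclasses import dataclass, field


@dataclass
class CanonicalStore:
    """Hypothetical stand-in for PostgreSQL: mutations are written here first."""
    edges: list[tuple[str, str, str]] = field(default_factory=list)
    outbox: list[tuple[str, str, str]] = field(default_factory=list)

    def write_edge(self, src: str, rel: str, dst: str) -> None:
        # In a real database both appends would share one transaction.
        edge = (src, rel, dst)
        self.edges.append(edge)
        self.outbox.append(edge)


@dataclass
class GraphProjection:
    """Hypothetical stand-in for Neo4j: derived from canonical data only."""
    adjacency: dict[str, set[str]] = field(default_factory=dict)

    def apply(self, src: str, rel: str, dst: str) -> None:
        self.adjacency.setdefault(src, set()).add(dst)


def sync_projection(store: CanonicalStore, graph: GraphProjection) -> int:
    """Drain pending canonical changes into the projection; return the count."""
    applied = 0
    while store.outbox:
        graph.apply(*store.outbox.pop(0))
        applied += 1
    return applied
```

Because the projection is only ever fed from canonical rows, it can be dropped and rebuilt from `store.edges` without losing data.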

Main subsystems

app/api
- FastAPI routes and schema validation
app/storage
- repository layer around canonical persistence
app/services
- memory commit logic, embeddings, scoring, relevance, staged tool sessions
app/retrieval
- scope resolution, vector search, reranking
app/llms
- role prompts, provider clients, role orchestration
app/engines
- end-to-end orchestration for process, retrieval, graph, snapshots, and maintenance
app/workers
- Celery tasks and schedules

High-level flow

Pre-turn reads

/v1/context and /v1/deep-memory use the same retrieval pipeline:

1. resolve scope
2. load snapshot refs when relevant
3. run vector search
4. load high-signal metadata candidates
5. apply strict query-relevance gating
6. expand graph context from Neo4j using relevant memory seeds only
7. rerank deterministically
8. call the Context Enhancer or Deep Memory role only if relevant evidence remains
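The gating, reranking, and conditional role call at the end of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions: Candidate, gate, rerank, retrieve, and the numeric threshold are hypothetical names, and scope resolution, vector search, and graph expansion are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    score: float  # hypothetical query-relevance score in [0, 1]


def gate(candidates: list[Candidate], threshold: float) -> list[Candidate]:
    """Strict relevance gate: drop anything below the threshold."""
    return [c for c in candidates if c.score >= threshold]


def rerank(candidates: list[Candidate]) -> list[Candidate]:
    """Deterministic rerank: score descending, memory_id as a stable tie-break."""
    return sorted(candidates, key=lambda c: (-c.score, c.memory_id))


def retrieve(
    candidates: list[Candidate],
    threshold: float,
    call_role: Callable[[list[Candidate]], str],
) -> Optional[str]:
    """Gate, rerank, and call the LLM role only if evidence remains."""
    evidence = rerank(gate(candidates, threshold))
    if not evidence:
        return None  # nothing relevant survived, so the role call is skipped
    return call_role(evidence)
```

Sorting on a key that includes the identifier means equal scores always come out in the same order, which is one simple way to keep the rerank step reproducible.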

Post-turn writes

/v1/process is intentionally thin:

1. create user if missing
2. create containers if missing
3. create a job
4. return immediately
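A thin handler of this shape might look like the sketch below. It uses plain in-memory structures rather than FastAPI, PostgreSQL, or Celery so that it runs on its own; State, handle_process, and the response fields are assumptions for illustration, not the service's real schema.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical in-memory stand-in for the database tables and task queue."""
    users: set[str] = field(default_factory=set)
    containers: set[tuple[str, str]] = field(default_factory=set)
    jobs: dict[str, dict] = field(default_factory=dict)
    queue: list[str] = field(default_factory=list)


def handle_process(
    state: State, user_id: str, container_ids: list[str], turn: dict
) -> dict:
    """Create-if-missing, enqueue a job, and return without doing the work."""
    state.users.add(user_id)  # set.add is a no-op when the user already exists
    for container_id in container_ids:
        state.containers.add((user_id, container_id))

    job_id = str(uuid.uuid4())
    state.jobs[job_id] = {"user_id": user_id, "turn": turn, "status": "queued"}
    state.queue.append(job_id)  # a background worker picks this up later

    return {"job_id": job_id, "status": "queued"}
```

Keeping the handler limited to idempotent setup plus an enqueue keeps request latency independent of how long the memory work takes.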

The background worker then:

1. stores the normalized turn
2. gathers nearby memory and graph context
3. runs the Adjudicator tool loop
4. validates staged operations
5. commits memory and graph updates atomically
6. refreshes embeddings
7. updates relevance for touched memory
8. marks the snapshot dirty
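The validate-then-commit-atomically steps can be sketched with a single database transaction. This example uses SQLite from the Python standard library purely as a runnable stand-in for PostgreSQL; the table names, operation kinds, and function names are all hypothetical, and the Adjudicator loop, embedding refresh, and relevance update are omitted.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory schema (illustrative, not the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE memories (id TEXT PRIMARY KEY, user_id TEXT, body TEXT);
        CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT NOT NULL);
        CREATE TABLE snapshot_state (user_id TEXT PRIMARY KEY, dirty INTEGER);
        """
    )
    return conn


def validate(ops: list[dict]) -> None:
    """Reject the whole batch if any staged operation has an unknown kind."""
    for op in ops:
        if op.get("kind") not in {"upsert_memory", "add_edge"}:
            raise ValueError(f"unknown operation kind: {op.get('kind')!r}")


def commit_staged(conn: sqlite3.Connection, user_id: str, ops: list[dict]) -> None:
    """Apply all staged operations in one transaction, then mark the snapshot dirty."""
    validate(ops)
    with conn:  # commits on success, rolls back if any statement raises
        for op in ops:
            if op["kind"] == "upsert_memory":
                conn.execute(
                    "INSERT OR REPLACE INTO memories (id, user_id, body) "
                    "VALUES (?, ?, ?)",
                    (op["id"], user_id, op["body"]),
                )
            else:
                conn.execute(
                    "INSERT INTO edges (src, rel, dst) VALUES (?, ?, ?)",
                    (op["src"], op["rel"], op["dst"]),
                )
        conn.execute(
            "INSERT OR REPLACE INTO snapshot_state (user_id, dirty) VALUES (?, 1)",
            (user_id,),
        )
```

If any statement in the batch fails, the memory rows written earlier in the same batch are rolled back with it, so memory and graph updates never land half-applied.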

Hourly maintenance

The hourly Cortex flow is mostly programmatic:

1. recompute decay and effective relevance
2. build maintenance proposals
3. send those proposals to Cortex for staged review
4. validate and commit approved changes
5. generate the latest user/global snapshot summary
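The decay recomputation in the first step could take many forms, and this overview does not specify the formula. The sketch below assumes simple exponential decay with a half-life, which is a common choice but only an assumption here; effective_relevance and half_life_hours are hypothetical names and the one-week default is arbitrary.

```python
from datetime import datetime, timedelta, timezone


def effective_relevance(
    base_relevance: float,
    last_touched: datetime,
    now: datetime,
    half_life_hours: float = 168.0,  # hypothetical default: one week
) -> float:
    """Assumed exponential decay: relevance halves every `half_life_hours`."""
    # Clamp negative ages (clock skew) to zero so relevance never inflates.
    age_hours = max((now - last_touched).total_seconds() / 3600.0, 0.0)
    return base_relevance * 0.5 ** (age_hours / half_life_hours)


if __name__ == "__main__":
    now = datetime(2024, 1, 8, tzinfo=timezone.utc)
    week_ago = now - timedelta(hours=168)
    print(effective_relevance(0.8, week_ago, now))
```

A pure function like this is easy to run over every memory row in an hourly task, since it depends only on stored values and the current time.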
