How Pensiv turns scattered context into compounding intelligence
Five cognitive capabilities, grounded in decades of memory science. Each one solves a problem that similarity search cannot touch.
The Noise Problem
Most memory systems store everything. Every message, every observation, every redundant restatement. The result is a memory that grows in volume but not in value — a haystack that gets bigger with every straw.
How surprise-gating works
Pensiv's surprise gate evaluates every incoming piece of information against what the system already knows. Each input gets a surprise score (0.0–1.0) based on semantic novelty, entity novelty, and relational novelty.
0.0 – 0.3 surprise
Redundant content. The vault already knows this. No write occurs. Noise stays out.
0.3 – 0.7 surprise
Mildly novel content. Fast write: extract entities, basic relations, timestamp. No deep analysis.
0.7 – 1.0 surprise
Genuinely surprising. Deep write: full relation extraction, schema tagging, entity linking, contradiction detection, analogical search for structural matches.
Not a filter. An attention mechanism. The vault allocates compute proportional to surprise.
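The three tiers above can be sketched as a small routing function. This is a minimal illustration, not Pensiv's implementation: the `Novelty` container, the equal weighting of the three signals, and the handling of the boundary values (0.3 routed to the fast path, 0.7 to the deep path) are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class WriteDecision(Enum):
    SKIP = "SKIP"  # redundant: no write occurs
    FAST = "FAST"  # mildly novel: entities, basic relations, timestamp
    DEEP = "DEEP"  # surprising: full extraction and analogical search


@dataclass(frozen=True)
class Novelty:
    """Hypothetical per-signal novelty scores, each in [0.0, 1.0]."""
    semantic: float
    entity: float
    relational: float


def surprise_score(n: Novelty, weights=(1.0, 1.0, 1.0)) -> float:
    """Weighted mean of the three signals (the weighting is an assumption)."""
    raw = (weights[0] * n.semantic
           + weights[1] * n.entity
           + weights[2] * n.relational) / sum(weights)
    return min(1.0, max(0.0, raw))  # clamp to the documented 0.0-1.0 range


def route(score: float) -> WriteDecision:
    """Map a surprise score to a write path.

    Boundary handling is assumed: 0.3 goes to FAST and 0.7 goes to DEEP.
    """
    if score < 0.3:
        return WriteDecision.SKIP
    if score < 0.7:
        return WriteDecision.FAST
    return WriteDecision.DEEP


# A highly novel input lands on the deep write path.
decision = route(surprise_score(Novelty(semantic=0.9, entity=0.8, relational=0.7)))
```

Keeping the score and the routing as separate functions means the thresholds can be tuned without touching how novelty is measured.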
Why this matters
The 100th source doesn't dilute the vault — it sharpens it. Signal compounds. Noise stays out. The vault gets smarter with every source, not noisier.
The Analogy Problem
When an intelligence analyst asks "what patterns from the South China Sea resemble pre-conflict indicators from the Black Sea?" — no keyword matching or vector similarity will answer that. The surface features are maximally dissimilar. But the structure (escalation sequence, naval posturing, diplomatic rhetoric pattern) may be identical.
Structure-Mapping Engine (SME)
Pensiv uses the Structure-Mapping Engine (SME), an implementation of Gentner's structure-mapping theory, to find memories that share structural relationships even when surface features diverge. It maps:
- Objects — entities in the source and target (actors, locations, events)
- Attributes — properties of those entities
- Relations — connections between entities (precedes, causes, opposes)
- Higher-order relations — relations between relations (escalation pattern, feedback loop)
Surface features differ (SCS vs Black Sea, 2024 vs 2022)
Deep structure matches (Actor A → provocation → Actor B → response → escalation)
The algorithm scores structural similarity, prioritizing deep relational matches over surface keyword overlap.
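To make the idea concrete, here is a toy structural matcher. It is far simpler than SME: it only handles first-order relations (no attributes or higher-order relations) and brute-forces every one-to-one entity mapping, which is workable only for a handful of entities. The function name, the triple format, and the entity names are hypothetical.

```python
from itertools import permutations


def structural_match(base, target):
    """Find the one-to-one entity mapping that aligns the most relations.

    `base` and `target` are sets of (relation, subject, object) triples.
    Returns (score, mapping), where score is the fraction of relations
    aligned, measured against the larger of the two cases.
    """
    base_entities = sorted({e for _, s, o in base for e in (s, o)})
    target_entities = sorted({e for _, s, o in target for e in (s, o)})
    # Pad with None so a base with more entities can leave some unmapped.
    pad = [None] * max(0, len(base_entities) - len(target_entities))

    best_score, best_mapping = 0.0, {}
    denominator = max(len(base), len(target)) or 1
    for perm in permutations(target_entities + pad, len(base_entities)):
        mapping = dict(zip(base_entities, perm))
        # Rename base entities and count relations that also hold in target.
        renamed = {(r, mapping[s], mapping[o]) for r, s, o in base}
        score = len(renamed & target) / denominator
        if score > best_score:
            best_score, best_mapping = score, mapping
    return best_score, best_mapping


# Two cases with no entity names in common but the same relational shape.
case_a = {
    ("provokes", "state_a", "state_b"),
    ("responds_to", "state_b", "state_a"),
    ("escalates_against", "state_a", "state_b"),
}
case_b = {
    ("provokes", "navy_x", "coast_guard_y"),
    ("responds_to", "coast_guard_y", "navy_x"),
    ("escalates_against", "navy_x", "coast_guard_y"),
}
score, mapping = structural_match(case_a, case_b)
```

Because the score depends only on which relations line up under the mapping, the two cases match fully even though a keyword comparison of their entity names would find nothing in common.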
Use cases
- Legal precedent — find structurally similar cases with zero keyword overlap
- Competitive intelligence — "which competitor moves resemble patterns that led to market disruption?"
- Clinical research — match biomarker-intervention patterns across unrelated conditions
- Defense analysis — cross-theater pattern matching for pre-escalation indicators
The Staleness Problem
Knowledge decays. A client's technical preference from 2024 may have been superseded. A threat assessment from 6 months ago may no longer reflect reality. Every competitor uses either no decay (append-only) or simple TTL (delete after N days). Neither reflects how knowledge actually ages.
Memory science, not vibes
Pensiv uses FSRS-6 (Free Spaced Repetition Scheduler), adapted for memory importance. Each memory's importance reflects:
- Retrieval frequency — how often it's accessed
- Recency — when it was last accessed
- Structural connectivity — how many other memories reference it
- Supersession state — whether newer information contradicts it
Memory you return to compounds in strength. Memory you ignore ages gracefully. Important memories persist. Irrelevant memories fade.
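One way to picture this is an FSRS-style power-law forgetting curve, scaled by the other signals. The sketch below is an assumption-heavy illustration rather than Pensiv's scoring: the curve shape follows the FSRS family (retrievability is 0.9 when idle time equals stability), but the `decay` value, the stability growth per access, the connectivity boost, and the supersession penalty are all hypothetical choices.

```python
import math


def retrievability(days_idle: float, stability_days: float, decay: float = 0.2) -> float:
    """FSRS-style power-law curve: 1.0 at zero idle time, 0.9 at idle == stability.

    The decay value here is illustrative, not a fitted parameter.
    """
    factor = 0.9 ** (-1.0 / decay) - 1.0
    return (1.0 + factor * days_idle / stability_days) ** (-decay)


def importance(days_idle, access_count, inbound_refs, superseded,
               base_stability_days=30.0, growth=1.5, superseded_penalty=0.2):
    """Combine the four signals into one score (all weights are assumptions)."""
    # Retrieval frequency: each access lengthens stability, slowing decay.
    stability = base_stability_days * growth ** access_count
    # Recency: time since last access drives the forgetting curve.
    score = retrievability(days_idle, stability)
    # Structural connectivity: well-referenced memories get a log-scaled boost.
    score *= 1.0 + math.log1p(inbound_refs)
    # Supersession: contradicted memories are demoted, not removed.
    if superseded:
        score *= superseded_penalty
    return score


often_used = importance(days_idle=60, access_count=5, inbound_refs=0, superseded=False)
never_used = importance(days_idle=60, access_count=0, inbound_refs=0, superseded=False)
```

With these assumptions, `often_used` is larger than `never_used`: the memory that was retrieved repeatedly has decayed less after the same idle period.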
Supersession and contradiction
When new information contradicts a memory, Pensiv doesn't delete the old one. It:
- Marks the old memory as superseded
- Links it to the new memory
- Drops its importance score but preserves it for audit trails
- Surfaces the contradiction to the user if both are retrieved
You can ask "what did we know on date X?" and get the vault's state at that moment.
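The supersession steps and the point-in-time query can be sketched with an in-memory store. This is a simplified model under stated assumptions: the `Memory` and `Vault` classes, their field names, and the 0.2 importance penalty are hypothetical, and a real system would persist this state in the database.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Memory:
    id: str
    text: str
    ingested_at: datetime
    importance: float = 1.0
    superseded_by: Optional[str] = None
    superseded_at: Optional[datetime] = None


class Vault:
    def __init__(self):
        self.memories = {}

    def add(self, memory: Memory) -> None:
        self.memories[memory.id] = memory

    def supersede(self, old_id: str, new: Memory, penalty: float = 0.2) -> None:
        """Mark the old memory superseded, link it, demote it, and keep it."""
        old = self.memories[old_id]
        self.add(new)
        old.superseded_by = new.id
        old.superseded_at = new.ingested_at
        old.importance *= penalty  # demoted but retained for the audit trail

    def as_of(self, when: datetime):
        """Memories that were known and not yet superseded at `when`."""
        return [m for m in self.memories.values()
                if m.ingested_at <= when
                and (m.superseded_at is None or m.superseded_at > when)]

    def contradictions(self, retrieved_ids):
        """(old, new) pairs where both sides appear in one result set."""
        ids = set(retrieved_ids)
        return [(m.id, m.superseded_by) for m in self.memories.values()
                if m.id in ids and m.superseded_by in ids]


vault = Vault()
vault.add(Memory("m1", "Client prefers vendor A", datetime(2024, 1, 10)))
vault.supersede("m1", Memory("m2", "Client switched to vendor B", datetime(2024, 6, 1)))

before = [m.id for m in vault.as_of(datetime(2024, 3, 1))]
after = [m.id for m in vault.as_of(datetime(2024, 7, 1))]
```

Here `before` contains only `m1` and `after` contains only `m2`, while `m1` remains in the vault with its link to `m2` intact.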
The Pattern Problem
After ingesting 500 intelligence reports, a human analyst develops intuitions: "this type of actor tends to follow this sequence before escalating." Those intuitions are patterns — locked in the analyst's head, lost the day they leave.
Automatic pattern discovery
Pensiv's schema inducer discovers these patterns automatically. After ~50 memories in a domain, it:
- Clusters memories by relational structure
- Identifies recurring roles (e.g., "decision authority", "technical veto", "budget gatekeeper")
- Discovers recurring actions (e.g., "escalation sequence", "procurement delay pattern")
- Names the patterns and tracks their fitness over time
After 50 sources, the vault knows the shape of your work. No manual taxonomy required.
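A heavily simplified version of this idea groups memories whose relations have the same shape once entities are replaced by roles. The sketch assumes role abstraction has already happened upstream and uses exact signature matching as a stand-in for real clustering; `induce_schemas`, the triple format, and `min_support` are hypothetical names for the example.

```python
from collections import defaultdict


def induce_schemas(memories, min_support=3):
    """Group memories by relational signature and keep recurring ones.

    `memories` maps a memory id to a list of (role, action, role) triples.
    Exact signature equality stands in for real clustering here.
    """
    clusters = defaultdict(list)
    for memory_id, relations in memories.items():
        signature = tuple(sorted(relations))  # order-independent shape
        clusters[signature].append(memory_id)

    schemas = []
    for signature, members in clusters.items():
        if len(members) >= min_support:
            schemas.append({
                # A readable name built from the actions in the pattern.
                "name": " -> ".join(action for _, action, _ in signature),
                "signature": signature,
                "members": members,
                "support": len(members),
            })
    return schemas


reports = {
    "r1": [("budget gatekeeper", "delays", "decision authority")],
    "r2": [("budget gatekeeper", "delays", "decision authority")],
    "r3": [("technical veto", "blocks", "decision authority")],
}
schemas = induce_schemas(reports, min_support=2)
```

In this toy input only the "delays" pattern recurs, so it is the single schema returned; the "blocks" pattern stays below the support threshold until more reports exhibit it.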
Schema families
Schemas cluster into families — higher-level abstractions that span multiple domains. Example:
- Delay pattern family — procurement delays, technical review delays, regulatory approval delays
- Escalation pattern family — geopolitical escalation, internal conflict escalation, competitive threat escalation
When you query across vaults, schema families enable cross-domain analogies.
The Trust Problem
In defense, intelligence, legal, and clinical contexts, you don't just need the answer — you need to prove how you got it. Which sources contributed? What was the retrieval path? When was each source ingested? Has any contributing fact been contradicted since?
Auditability at every layer
Pensiv provides full decision trails:
Write audit trail
Every memory records:
- Surprise score (0.0–1.0)
- Write decision (SKIP / FAST / DEEP)
- Extraction results (entities, relations, schemas)
- Source provenance (which connector, which document, exact timestamp)
- User attribution (who ingested it, if multi-user vault)
Retrieval trace
Every query returns:
- BM25 scores for each retrieved memory
- Vector similarity scores
- Fusion weights (how BM25 and vector were combined)
- Reranker adjustments
- System 1 vs System 2 path (fast retrieval vs deep analogical)
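A retrieval trace of this kind might look like the sketch below. It assumes a weighted-sum fusion with BM25 scores normalized by the top score; the actual fusion method, the weights, and the `TraceEntry` fields are not specified in this document and are hypothetical here.

```python
from dataclasses import dataclass


@dataclass
class TraceEntry:
    memory_id: str
    bm25: float          # raw lexical score
    vector: float        # similarity score, assumed to be in [0, 1]
    fused: float         # combined score used for ranking
    rerank_delta: float = 0.0  # placeholder for a reranker adjustment


def fuse_with_trace(bm25_scores, vector_scores, w_bm25=0.4, w_vector=0.6):
    """Combine both score sets and return the full trace, best match first."""
    top_bm25 = max(bm25_scores.values(), default=0.0) or 1.0
    entries = []
    for memory_id in sorted(set(bm25_scores) | set(vector_scores)):
        bm25 = bm25_scores.get(memory_id, 0.0)
        vector = vector_scores.get(memory_id, 0.0)
        fused = w_bm25 * (bm25 / top_bm25) + w_vector * vector
        entries.append(TraceEntry(memory_id, bm25, vector, fused))
    entries.sort(key=lambda e: e.fused, reverse=True)
    return {
        "fusion_weights": {"bm25": w_bm25, "vector": w_vector},
        "path": "system_1",  # fast retrieval; a deep analogical path would differ
        "results": entries,
    }


trace = fuse_with_trace(
    bm25_scores={"m1": 12.0, "m2": 6.0},
    vector_scores={"m1": 0.5, "m2": 0.9, "m3": 0.8},
)
ranking = [entry.memory_id for entry in trace["results"]]
```

Returning the raw scores next to the fused score lets a reviewer see why a memory ranked where it did, for example that `m2` outranks `m1` here because its vector similarity outweighs its weaker lexical score.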
Temporal accuracy
- Event timestamps — when things happened in the real world
- Ingestion timestamps — when the system learned about them
- Supersession tracking — when information was contradicted
You can query: "what did we know about X on date Y?" and get the vault's state at that moment.
Compliance standards
This architecture is designed to support the audit and record-keeping requirements of:
- ITAR / EAR — defense export control requirements
- HIPAA — healthcare data audit requirements
- SEC 17a-4 — financial records retention (FSRS-6 maps to retention schedules)
- ICD 203 — intelligence community analytic standards
Auditability is architectural, not an afterthought.
One database to back up, not five
Pensiv runs on PostgreSQL with pgvector and Apache AGE extensions. One database. One backup. No multi-store complexity.
- PostgreSQL 16 — battle-tested, enterprise-grade RDBMS
- pgvector — vector similarity search, native to Postgres
- Apache AGE — graph database, runs inside Postgres
- Your LLM — OpenAI, Anthropic, or local models (Ollama)
If your IT team can back up a database, they can back up Pensiv. No vendor-specific storage. No proprietary formats. Just SQL.