Built on decades of memory science
Every capability in Pensiv is grounded in cognitive science and validated by benchmarks. We ship carefully, not quickly.
Foundational research
Remembering: A Study in Experimental and Social Psychology
Foundational work on reconstructive memory. Bartlett showed that memory is not a passive recording but an active reconstruction shaped by schemas — mental frameworks that organize and interpret information.
His "War of the Ghosts" experiment demonstrated how people remember the gist and structure of a story, not verbatim details, and how memory transforms over time to fit existing schemas.
Schema induction in Pensiv is directly inspired by Bartlett's work. The system discovers recurring structures (roles, actions, constraints) and organizes memory around them — the same way human experts build domain expertise.
Episodic and Semantic Memory
Tulving distinguished episodic memory (personal experiences, tied to time and place) from semantic memory (facts and concepts, abstracted from experience).
This work laid the foundation for understanding how memory systems operate at different levels of abstraction.
Pensiv maintains both episodic memories (the specific call where a client mentioned budget concerns, timestamped) and semantic memories (the pattern that "clients in this vertical always raise security concerns at procurement stage"). Wiki synthesis converts episodic to semantic.
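To make the distinction concrete, here is a minimal sketch (the class and field names are illustrative, not Pensiv's actual data model): episodic records carry timestamps and sources, while semantic records carry an abstracted pattern plus provenance links back to the episodes that support it.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class EpisodicMemory:
    """A specific experience, tied to time and place (Tulving's episodic)."""
    content: str
    timestamp: datetime
    source: str

@dataclass
class SemanticMemory:
    """A pattern abstracted from experience (Tulving's semantic),
    keeping provenance links back to the supporting episodes."""
    pattern: str
    supporting_episodes: list = field(default_factory=list)

episodes = [
    EpisodicMemory("Acme call: security questionnaire raised at procurement",
                   datetime(2025, 3, 4), "call-notes"),
    EpisodicMemory("Globex call: SOC 2 report requested before contract",
                   datetime(2025, 5, 12), "call-notes"),
]
# Wiki synthesis would promote the recurring theme to a semantic memory:
pattern = SemanticMemory(
    "clients in this vertical raise security concerns at procurement stage",
    supporting_episodes=episodes,
)
assert len(pattern.supporting_episodes) == 2
```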
Structure-Mapping: A Theoretical Framework for Analogy
Gentner's structure-mapping theory models analogy as a mapping of relational structure between source and target domains, prioritizing deep structural similarities over surface features. It was later implemented computationally as the Structure-Mapping Engine (SME).
The algorithm has been validated across decades of cognitive science research and remains the gold standard for computational analogy.
Pensiv's analogical retrieval uses SME to find memories with zero keyword overlap but identical deep structure. Legal precedent, competitive moves, clinical patterns — matched by shape, not surface.
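A toy version of the core idea, not SME itself (SME also enforces one-to-one mappings and rewards systematic relation hierarchies): replace entity names with variables so only the relational shape survives, then compare shapes. The domains and relation names below are invented.

```python
def shape(triples):
    """Abstract away surface entities, keeping only relational structure:
    ('blocks', 'Oracle', 'Rimini') becomes ('blocks', 'X0', 'X1')."""
    names = {}
    out = []
    for pred, subj, obj in triples:
        for entity in (subj, obj):
            names.setdefault(entity, f"X{len(names)}")
        out.append((pred, names[subj], names[obj]))
    return frozenset(out)

def structural_similarity(a, b):
    """Jaccard overlap of relational shapes: zero keyword overlap in
    entities can still score 1.0 if the deep structure matches."""
    sa, sb = shape(a), shape(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

# Legal precedent vs. competitive move: different words, same shape.
legal = [("blocks", "Oracle", "Rimini"), ("appeals", "Rimini", "Oracle")]
market = [("blocks", "Incumbent", "Upstart"), ("appeals", "Upstart", "Incumbent")]
assert structural_similarity(legal, market) == 1.0
```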
YARN: Retrieval-Augmented Analogical Reasoning
Showed that LLMs drop below 50% accuracy on far analogies (structurally similar but surface-dissimilar), while combining LLMs with structural mapping maintains 46-52% accuracy on the same cases.
Validated that pure neural approaches fail on deep analogies; symbolic structure-mapping is essential.
YARN validated the approach Pensiv takes: hybrid retrieval combining neural embeddings (for near matches) and structural mapping (for far analogies). No other production memory system implements this.
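One standard way to combine two retrievers' rankings is reciprocal rank fusion; whether Pensiv fuses its neural and structural signals this exact way is an assumption, but the sketch shows how a far analogy surfaced only by structural matching still reaches the top of the final list.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Combine ranked lists from multiple retrievers: each item scores
    1/(k + rank) per list, so items ranked highly anywhere bubble up.
    k=60 is the conventional damping constant."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

neural = ["m2", "m7", "m1"]      # near matches by embedding similarity
structural = ["m9", "m2", "m4"]  # far analogies by structure-mapping
fused = reciprocal_rank_fusion([neural, structural])
assert fused[0] == "m2"  # agreed on by both retrievers
assert fused[1] == "m9"  # structural-only hit still ranks high
```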
Titans: Learning to Memorize at Test Time
Introduced surprise-gated memory at the neural level. The model learns to allocate memory capacity proportional to surprise — novel information gets deep encoding; redundant information is skipped.
Showed massive efficiency gains: models with surprise-gated memory achieve better performance with 10x less memory capacity than append-only approaches.
Pensiv is the only production system that implements surprise-gated writes at the application layer. SKIP/FAST/DEEP triage keeps noise out. Signal compounds; the vault gets sharper with every source, not noisier.
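A minimal sketch of the triage idea at the application layer (the thresholds and the cosine-based surprise measure are invented for illustration, not Pensiv's actual parameters): surprise is how dissimilar a candidate write is from everything already in the vault, and it routes the write to SKIP, FAST, or DEEP.

```python
import math

def surprise(candidate_vec, memory_vecs):
    """1 minus the max cosine similarity to anything already stored:
    0.0 for an exact duplicate, 1.0 when the vault is empty."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0
    if not memory_vecs:
        return 1.0
    return 1.0 - max(cos(candidate_vec, m) for m in memory_vecs)

def triage(s, skip_below=0.2, deep_above=0.7):
    """Route by surprise: redundant info is skipped, incremental info
    gets fast shallow encoding, novel info gets deep encoding."""
    if s < skip_below:
        return "SKIP"
    if s > deep_above:
        return "DEEP"
    return "FAST"

vault = [[1.0, 0.0], [0.7, 0.7]]
assert triage(surprise([1.0, 0.0], vault)) == "SKIP"  # exact duplicate
assert triage(surprise([0.0, 1.0], vault)) == "FAST"  # partially novel
assert triage(surprise([0.0, 1.0], [])) == "DEEP"     # empty vault: all novel
```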
FSRS-6: Free Spaced Repetition Scheduler
Modern spaced repetition algorithm used by millions of learners via Anki. Models memory decay and reinforcement based on retrieval frequency, recency, and difficulty.
Replaces earlier Ebbinghaus-based models with a data-driven approach validated on billions of review logs.
Pensiv adapts FSRS-6 from human learning to organizational memory. Memory you return to compounds in strength. Memory you ignore ages gracefully. The system forgets the way an expert forgets — by deprioritizing, not deleting.
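The compounding-and-aging behavior can be sketched with a toy forgetting curve. FSRS-6 itself fits a power-law retention curve with learned per-memory parameters; the exponential form and the boost multiplier below are simplifications for illustration only.

```python
import math

def retrievability(days_elapsed, stability):
    """Simplified forgetting curve: R = exp(-t / S). Higher stability S
    means slower decay. (FSRS proper uses a fitted power-law curve.)"""
    return math.exp(-days_elapsed / stability)

def on_retrieval(stability, boost=1.8):
    """Each retrieval strengthens the memory, flattening future decay.
    The boost factor is a hypothetical stand-in for FSRS's update rule."""
    return stability * boost

s = 10.0
assert retrievability(0, s) == 1.0                      # fresh memory
assert retrievability(30, s) < retrievability(7, s)     # unused memory ages
s = on_retrieval(s)
# After reinforcement, the same elapsed time leaves more retrievability:
assert retrievability(30, s) > retrievability(30, 10.0)
```

Note the asymmetry this produces: a memory is never deleted, only assigned ever-lower retrievability, which is what "forgetting by deprioritizing" means in practice.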
AutoSchemaKG: Automatic Schema Induction for Knowledge Graphs
Demonstrated automatic schema discovery from unstructured knowledge graphs. The system clusters entities and relations by structural similarity, discovers recurring patterns, and achieves 92% alignment with human-crafted schemas.
Pensiv uses this approach for schema induction. After ~50 memories in a domain, the vault discovers recurring roles, actions, and constraints automatically. No manual taxonomy required.
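A stripped-down sketch of the induction step (AutoSchemaKG clusters entities and relations over a full knowledge graph; only the recurrence-counting core is kept here, and the min_support threshold is an invented parameter): relation types that recur across enough of a domain's memories become slots in the discovered schema.

```python
from collections import Counter

def induce_schema(memories, min_support=0.6):
    """Relation types appearing in at least min_support of a domain's
    memories become schema slots; rare one-offs are left out."""
    counts = Counter()
    for relations in memories:
        for rel in set(relations):  # count each relation once per memory
            counts[rel] += 1
    n = len(memories)
    return sorted(rel for rel, c in counts.items() if c / n >= min_support)

# Three deal memories from a hypothetical sales domain:
deals = [
    ["buyer_raises_objection", "vendor_offers_discount", "legal_reviews"],
    ["buyer_raises_objection", "legal_reviews"],
    ["buyer_raises_objection", "vendor_offers_discount", "champion_escalates"],
]
schema = induce_schema(deals)
# "champion_escalates" appears in only 1 of 3 memories, so it is excluded.
assert schema == ["buyer_raises_objection", "legal_reviews", "vendor_offers_discount"]
```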
Benchmarks & validation
Each capability clears real benchmarks before going live. Here's where we stand:
LoCoMo MRR
Phase 0 baseline across 1,982 queries and 5,882 conversation turns; exceeds Mem0's reported MRR of ~0.620.
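For reference, MRR (mean reciprocal rank) is the average over queries of 1/rank of the first relevant result, with 0 for queries where nothing relevant is retrieved:

```python
def mean_reciprocal_rank(results, relevant):
    """results: query_id -> ranked list of doc ids.
    relevant: query_id -> set of relevant doc ids."""
    total = 0.0
    for query_id, ranked in results.items():
        rr = 0.0
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant[query_id]:
                rr = 1.0 / rank  # reciprocal rank of the first hit
                break
        total += rr
    return total / len(results)

results = {"q1": ["d3", "d1"], "q2": ["d9", "d2", "d4"]}
relevant = {"q1": {"d3"}, "q2": {"d4"}}
# q1 hits at rank 1 (RR = 1.0), q2 at rank 3 (RR = 1/3):
assert mean_reciprocal_rank(results, relevant) == (1.0 + 1/3) / 2
```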
BEIR SciFact MRR
Scientific fact verification retrieval. Validates BM25+vector fusion and reranking pipeline.
Wiki synthesis lift
MRR improvement on cross-domain analogical pairs when synthesized wiki articles are available as retrieval targets.
LongMemEval
Cohort 5 roadmap target. Industry standard for long-context memory systems. We'll publish results when ready — not before.
Test coverage
1,322 tests across write pipeline, retrieval, FSRS decay, schema induction, and provenance chains.
Auditability
Full provenance, confidence decomposition, retrieval trace, temporal accuracy. No competitor provides this.
Research roadmap
LongMemEval submission
Cohort 5 target. Industry-standard benchmark for memory systems. Hindsight scored 91.4%, Zep 63.8%, Mem0 49.0%. We'll publish when results clear our internal quality bar.
Multi-hop analogical reasoning
Extend SME to chain analogies across 3+ hops. "What pattern from domain A resembles B, which resembles our current situation in C?"
Prospective memory
Remember to remember — time-based and condition-based triggers. "When customer mentions budget concerns, surface the decision from last quarter."
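A minimal sketch of how condition- and time-gated triggers might work (the naive substring matcher, field names, and memory ids are all invented for illustration):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Trigger:
    condition: str                          # keyword to watch for (toy matcher)
    memory_id: str                          # memory to resurface when it fires
    fires_after: Optional[datetime] = None  # optional time gate

def check_triggers(triggers, event_text, now):
    """Fire every trigger whose condition appears in the incoming event
    and whose time gate (if any) has already passed."""
    fired = []
    for t in triggers:
        time_ok = t.fires_after is None or now >= t.fires_after
        if time_ok and t.condition in event_text.lower():
            fired.append(t.memory_id)
    return fired

triggers = [
    Trigger("budget", "decision-2025-q2"),
    Trigger("renewal", "pricing-memo", fires_after=datetime(2026, 1, 1)),
]
now = datetime(2025, 6, 1)
# Condition matches and has no time gate, so the old decision resurfaces:
assert check_triggers(triggers, "Client mentioned budget concerns", now) == ["decision-2025-q2"]
# Condition matches but the time gate hasn't passed yet:
assert check_triggers(triggers, "Renewal coming up", now) == []
```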
Undiscovered Public Knowledge (UPK) detection
Swanson's ABC pattern: A connects to B, B connects to C, but nobody has connected A to C yet. Surface non-obvious connections across vault boundaries.
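The ABC pattern itself is simple to state over a link graph; this sketch enumerates candidate A-C pairs that share a bridging B but have no direct edge. The example reproduces Swanson's original discovery, which connected dietary fish oil to Raynaud's syndrome via blood viscosity.

```python
def swanson_abc(edges):
    """Swanson's ABC pattern: A-B and B-C are known links, but A-C is
    not. Return candidate (A, C, bridging B) triples."""
    known = set(edges) | {(b, a) for a, b in edges}
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    candidates = set()
    for b, linked in neighbors.items():
        for a in linked:
            for c in linked:
                if a < c and (a, c) not in known:  # no direct A-C edge
                    candidates.add((a, c, b))
    return sorted(candidates)

edges = [("fish-oil", "blood-viscosity"), ("blood-viscosity", "raynauds")]
# fish-oil and raynauds were never directly linked: the classic UPK find.
assert swanson_abc(edges) == [("fish-oil", "raynauds", "blood-viscosity")]
```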
How we approach validation
Ship carefully, not quickly
Each capability clears real benchmarks before going live. If LongMemEval results don't meet our bar, we don't ship and we don't claim it. Roadmap targets are honest: "Q3 2026" means that's when we'll try, not when we'll succeed.
Cognitive science first, ML second
Every capability starts with decades of memory science (Bartlett, Tulving, Gentner) and adapts it with modern techniques (YARN, Titans, AutoSchema). We don't chase papers. We build on foundations.
Production-grade from day one
Research prototypes are interesting. Production systems are useful. Pensiv is architected for auditability, reliability, and deployment in regulated industries. 81% test coverage. Full provenance. One database to back up.