Pensiv
Research appendix

Built on decades of memory science

Every capability in Pensiv is grounded in cognitive science and validated by benchmarks. We ship carefully, not quickly.

Standing on the shoulders of giants

Foundational research

1932
Cambridge University Press

Remembering: A Study in Experimental and Social Psychology

Frederic Bartlett

Foundational work on reconstructive memory. Bartlett showed that memory is not a passive recording but an active reconstruction shaped by schemas — mental frameworks that organize and interpret information.

His "War of the Ghosts" experiment demonstrated how people remember the gist and structure of a story, not verbatim details, and how memory transforms over time to fit existing schemas.

Relevance to Pensiv

Schema induction in Pensiv is directly inspired by Bartlett's work. The system discovers recurring structures (roles, actions, constraints) and organizes memory around them — the same way human experts build domain expertise.

1972
Psychological Review

Episodic and Semantic Memory

Endel Tulving

Distinguished between episodic memory (personal experiences, tied to time and place) and semantic memory (facts and concepts, abstracted from experience).

This work laid the foundation for understanding how memory systems operate at different levels of abstraction.

Relevance to Pensiv

Pensiv maintains both episodic memories (the specific call where a client mentioned budget concerns, timestamped) and semantic memories (the pattern that "clients in this vertical always raise security concerns at procurement stage"). Wiki synthesis converts episodic to semantic.
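The episodic/semantic split can be sketched as a minimal data model. Everything here is hypothetical illustration, not Pensiv's actual schema: the record fields, the `synthesize` helper, and the `min_support` threshold are all assumptions about how timestamped episodes might be promoted into an abstracted pattern.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class EpisodicMemory:
    content: str          # what happened, verbatim
    timestamp: datetime   # tied to time and place (Tulving's episodic trait)
    source: str
    tags: frozenset = field(default_factory=frozenset)

@dataclass
class SemanticMemory:
    pattern: str          # abstracted from experience
    support: list         # the episodic memories it was synthesized from

def synthesize(episodes, tag, min_support=3):
    """Promote a recurring episodic tag to a semantic pattern once it
    has been observed in enough distinct episodes."""
    matching = [e for e in episodes if tag in e.tags]
    if len(matching) < min_support:
        return None
    return SemanticMemory(pattern=f"Recurring pattern: {tag}", support=matching)

episodes = [
    EpisodicMemory("Client raised security review at procurement",
                   datetime(2024, 3, 1), "call-118",
                   frozenset({"security-at-procurement"})),
    EpisodicMemory("Security questionnaire requested before PO",
                   datetime(2024, 5, 9), "email-204",
                   frozenset({"security-at-procurement"})),
    EpisodicMemory("Legal flagged SOC 2 before contract",
                   datetime(2024, 7, 2), "call-233",
                   frozenset({"security-at-procurement"})),
]
pattern = synthesize(episodes, "security-at-procurement")
print(pattern.pattern)  # -> Recurring pattern: security-at-procurement
```

The key property: the semantic record keeps pointers back to its episodic support, so the abstraction stays auditable.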

1983
Cognitive Science

Structure-Mapping: A Theoretical Framework for Analogy

Dedre Gentner

Proposed structure-mapping theory, a framework for analogical reasoning later implemented computationally as the Structure-Mapping Engine (SME). Structure-mapping aligns relational structure between source and target domains, prioritizing deep structural similarities over surface features.

The algorithm has been validated across decades of cognitive science research and remains the gold standard for computational analogy.

Relevance to Pensiv

Pensiv's analogical retrieval uses SME to find memories with zero keyword overlap but identical deep structure. Legal precedent, competitive moves, clinical patterns — matched by shape, not surface.
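Matching "by shape, not surface" can be illustrated with a toy canonicalization, far simpler than SME itself: abstract entity names into placeholder variables and compare only the relational skeleton. The fact triples and domains below are invented for the example.

```python
def signature(facts):
    """Replace entity names with positional variables, keeping only
    the relation skeleton. facts: (relation, subject, object) triples."""
    names, canon = {}, set()
    for rel, a, b in sorted(facts):   # sort for a deterministic renaming
        for entity in (a, b):
            names.setdefault(entity, f"x{len(names)}")
        canon.add((rel, names[a], names[b]))
    return frozenset(canon)

def structural_similarity(facts1, facts2):
    """Jaccard overlap of canonicalized relational structure."""
    s1, s2 = signature(facts1), signature(facts2)
    return len(s1 & s2) / len(s1 | s2)

# Zero keyword overlap, identical causal chain:
source = [("causes", "pricing pressure", "churn"),
          ("causes", "churn", "revenue drop")]
target = [("causes", "regulation", "filing delay"),
          ("causes", "filing delay", "cost overrun")]
print(structural_similarity(source, target))  # -> 1.0
```

A keyword index scores this pair at zero; the structural signature scores it as a perfect match.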

2024
arXiv:2603.29997

YARN: Retrieval-Augmented Analogical Reasoning

Multiple institutions

Showed that LLMs drop below 50% accuracy on far analogies (structurally similar but surface-dissimilar), and that combining LLMs with structural mapping maintains 46-52% accuracy on these cases.

Validated that pure neural approaches fail on deep analogies; symbolic structure-mapping is essential.

Relevance to Pensiv

YARN validated the approach Pensiv takes: hybrid retrieval combining neural embeddings (for near matches) and structural mapping (for far analogies). No other production memory system implements this.
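One common way to combine two retrievers' ranked lists is reciprocal rank fusion (RRF). The document doesn't say which fusion method Pensiv uses, so treat this as an illustrative sketch; the memory IDs are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """rankings: one ranked list of memory IDs per retriever.
    Each item earns 1/(k + rank) from each list it appears in;
    k=60 is the conventional damping constant."""
    scores = {}
    for ranking in rankings:
        for rank, mem_id in enumerate(ranking, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

neural     = ["m7", "m2", "m9"]   # near matches via embeddings
structural = ["m4", "m7", "m1"]   # far analogies via structure-mapping
fused = reciprocal_rank_fusion([neural, structural])
print(fused[0])  # -> m7 (the only memory both channels agree on)
```

Items surfaced by both channels rise to the top; items found by only one channel still survive into the fused list.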

2024
arXiv:2501.00663 · NeurIPS

Titans: Learning to Memorize at Test Time

Google DeepMind

Introduced surprise-gated memory at the neural level. The model learns to allocate memory capacity proportional to surprise — novel information gets deep encoding; redundant information is skipped.

Showed massive efficiency gains: models with surprise-gated memory achieve better performance with 10x less memory capacity than append-only approaches.

Relevance to Pensiv

Pensiv is the only production system that implements surprise-gated writes at the application layer. SKIP/FAST/DEEP triage keeps noise out. Signal compounds; the vault gets sharper with every source, not noisier.
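A surprise-gated write can be sketched in a few lines. The thresholds, the cosine-based surprise signal, and the SKIP/FAST/DEEP cutoffs below are assumptions for illustration, not Pensiv's actual triage parameters.

```python
import math

def surprise(item_vec, vault_vecs):
    """Surprise as 1 minus the max cosine similarity to anything
    already stored: redundant items score near 0, novel items near 1."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))
    if not vault_vecs:
        return 1.0
    return 1.0 - max(cos(item_vec, v) for v in vault_vecs)

def triage(surprise_score, skip_below=0.2, deep_above=0.7):
    """Allocate encoding effort proportional to surprise."""
    if surprise_score < skip_below:
        return "SKIP"   # redundant: keep noise out
    if surprise_score > deep_above:
        return "DEEP"   # novel: full encoding
    return "FAST"       # routine: shallow write

vault = [[1.0, 0.0]]
print(triage(surprise([1.0, 0.0], vault)))  # -> SKIP (already known)
print(triage(surprise([0.0, 1.0], vault)))  # -> DEEP (orthogonal, novel)
```

This is the application-layer analogue of Titans' neural mechanism: capacity goes where the surprise is.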

2024
GitHub · Anki ecosystem

FSRS-6: Free Spaced Repetition Scheduler

Jarrett Ye et al.

Modern spaced repetition algorithm used by millions of learners via Anki. Models memory decay and reinforcement based on retrieval frequency, recency, and difficulty.

Replaces earlier Ebbinghaus-based models with a data-driven approach validated on billions of review logs.

Relevance to Pensiv

Pensiv adapts FSRS-6 from human learning to organizational memory. Memory you return to compounds in strength. Memory you ignore ages gracefully. The system forgets the way an expert forgets — by deprioritizing, not deleting.
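The compounding/aging dynamic can be sketched with a simplified FSRS-style power forgetting curve. This is the FSRS-4 era form, not the full FSRS-6 model, and the `boost` factor for retrieval is an invented placeholder.

```python
def retrievability(t_days, stability):
    """Simplified power forgetting curve: recall probability t_days
    after last access. By convention, R equals 0.9 when t == stability."""
    return (1.0 + t_days / (9.0 * stability)) ** -1.0

def on_access(stability, boost=1.6):
    """Each retrieval compounds stability, flattening future decay.
    The boost value here is illustrative, not an FSRS parameter."""
    return stability * boost

s = 10.0                                # stability in days
print(round(retrievability(10.0, s), 2))   # -> 0.9 at t == stability
s = on_access(s)                           # memory was used: strengthen
print(retrievability(10.0, s) > 0.9)       # -> True: decay flattened
```

Ignored memories simply keep sliding down the curve; nothing is deleted, only deprioritized.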

2025
arXiv:2505.23628 · ICML

AutoSchemaKG: Automatic Schema Induction for Knowledge Graphs

Multiple institutions

Demonstrated automatic schema discovery from unstructured knowledge graphs. The system clusters entities and relations by structural similarity, discovers recurring patterns, and achieves 92% alignment with human-crafted schemas.

Relevance to Pensiv

Pensiv uses this approach for schema induction. After ~50 memories in a domain, the vault discovers recurring roles, actions, and constraints automatically. No manual taxonomy required.
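At its simplest, schema induction is pattern counting over typed relations. The sketch below assumes memories have already been reduced to typed triples; the type names and support threshold are invented for illustration.

```python
from collections import Counter

def induce_schemas(typed_triples, min_support=3):
    """Count recurring (subject_type, relation, object_type) patterns;
    any pattern seen at least min_support times becomes a schema slot."""
    counts = Counter(typed_triples)
    return [pattern for pattern, c in counts.items() if c >= min_support]

triples = [
    ("Client", "raises", "Objection"),
    ("Client", "raises", "Objection"),
    ("Client", "raises", "Objection"),
    ("Vendor", "ships", "Update"),        # one-off: not yet a schema
]
print(induce_schemas(triples))  # -> [('Client', 'raises', 'Objection')]
```

Real schema induction (as in AutoSchemaKG) clusters by structural similarity rather than exact match, but the principle is the same: recurrence becomes structure.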

Benchmarks & validation

Each capability clears real benchmarks before going live. Here's where we stand:

0.629

LoCoMo MRR

Phase 0 baseline across 1,982 queries and 5,882 conversation turns. Exceeds Mem0's reported ~0.620 performance.

✓ Shipped · Production
0.637

BEIR SciFact MRR

Scientific fact verification retrieval. Validates BM25+vector fusion and reranking pipeline.

✓ Shipped · Production
+0.472

Wiki synthesis lift

MRR improvement on cross-domain analogical pairs when synthesized wiki articles are available as retrieval targets.

✓ Shipped · Production
Q3 2026

LongMemEval

Cohort 5 roadmap target. Industry standard for long-context memory systems. We'll publish results when ready — not before.

Roadmap · Not shipped
81%

Test coverage

1,322 tests across write pipeline, retrieval, FSRS decay, schema induction, and provenance chains.

✓ Continuous
100%

Auditability

Full provenance, confidence decomposition, retrieval trace, temporal accuracy. No competitor provides this.

✓ Architectural

Research roadmap

Q3 2026

LongMemEval submission

Cohort 5 target. Industry-standard benchmark for memory systems. Hindsight scored 91.4%, Zep 63.8%, Mem0 49.0%. We'll publish when results clear our internal quality bar.

Q4 2026

Multi-hop analogical reasoning

Extend SME to chain analogies across 3+ hops. "What pattern from domain A resembles B, which resembles our current situation in C?"

Q1 2027

Prospective memory

Remember to remember — time-based and condition-based triggers. "When customer mentions budget concerns, surface the decision from last quarter."
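A condition-based trigger is essentially a stored predicate checked against each new event. The trigger structure, predicate, and memory ID below are hypothetical placeholders sketching the idea.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Trigger:
    condition: Callable[[str], bool]  # fires when a new event matches
    memory_id: str                    # the memory to surface

def check_triggers(event_text, triggers):
    """Return the memories whose trigger condition matches this event."""
    return [t.memory_id for t in triggers if t.condition(event_text)]

budget_trigger = Trigger(
    condition=lambda e: "budget" in e.lower(),
    memory_id="decision-2024-Q3-pricing",   # hypothetical memory ID
)
print(check_triggers("Customer mentioned budget concerns again",
                     [budget_trigger]))  # -> ['decision-2024-Q3-pricing']
```

The hard research problem is upstream of this loop: learning which conditions are worth watching for in the first place.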

Q2 2027

Undiscovered Public Knowledge (UPK) detection

Swanson's ABC pattern: A connects to B, B connects to C, but nobody has connected A to C yet. Surface non-obvious connections across vault boundaries.
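The ABC pattern reduces to a two-hop graph walk that excludes already-known direct links. The sketch below uses Swanson's own famous 1986 example (fish oil linked to Raynaud's syndrome through blood viscosity) as test data.

```python
def upk_candidates(edges):
    """Swanson ABC: find (a, c) pairs connected through some b
    but with no direct a->c edge recorded yet."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
    direct = set(edges)
    candidates = set()
    for a, bs in adjacency.items():
        for b in bs:
            for c in adjacency.get(b, ()):
                if c != a and (a, c) not in direct:
                    candidates.add((a, c))     # A->B->C, but no A->C
    return candidates

edges = [("fish oil", "blood viscosity"),
         ("blood viscosity", "Raynaud's")]
print(upk_candidates(edges))  # -> {('fish oil', "Raynaud's")}
```

At vault scale the challenge shifts from finding these pairs to ranking them, since most two-hop connections are trivial.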

Research principles

How we approach validation

Ship carefully, not quickly

Each capability clears real benchmarks before going live. If LongMemEval results don't meet our bar, we don't ship and we don't claim it. Roadmap targets are honest: "Q3 2026" means that's when we'll try, not when we'll succeed.

Cognitive science first, ML second

Every capability starts with decades of memory science (Bartlett, Tulving, Gentner) and adapts it with modern techniques (YARN, Titans, AutoSchema). We don't chase papers. We build on foundations.

Production-grade from day one

Research prototypes are interesting. Production systems are useful. Pensiv is architected for auditability, reliability, and deployment in regulated industries. 81% test coverage. Full provenance. One database to back up.

Want to see how it works in practice?