Benchmark Report

Verdict

Memex is the canonical product repository. It absorbs the explanatory strength of the executive guide prototype while keeping its own stronger technical substrate: schemas, skills, evals, synthetic examples, validation scripts, and CI.

Compared repositories

Repository	Role	Score
memex	Canonical local-first, agent-operable second-brain framework	8.1/10
an executive-facing guide prototype	Executive-facing setup guide and GitHub Pages prototype	6.8/10
kepano/obsidian-skills	Obsidian-specific agent skill layer	Reference only
Andrej Karpathy's LLM Wiki	Primary upstream knowledge-base pattern	Lineage
garrytan/gstack	Skill-based agent operating system	Operational reference
garrytan/gbrain	Retrieval, graph, synthesis, and brain layer	Operational reference

Primary credit belongs to Andrej Karpathy's LLM Knowledge Bases / LLM Wiki. Garry Tan's GStack and GBrain are the strongest operational references for turning the pattern into a repeatable agent system. See Lineage and Credits.

Memex vs the executive guide prototype

Memex wins on product credibility:

Agent Skills-compatible workflows in skills/.
Entity schema in schemas/entity.schema.json.
Evaluation rubric and sample query set in evals/.
Synthetic fake vault in examples/fake-vault/.
Validation scripts and GitHub Actions.
Real content and index pattern in content/.

The guide wins on explanation:

Stronger from-scratch setup manual.
Clearer governance and privacy framing.
Better operator playbook.
GitHub Pages landing page and styling.

Integration decision: copy the guide layer into Memex and make Memex the single canonical repo.

Memex vs Obsidian Skills

kepano/obsidian-skills is a strong tool-use layer for Obsidian-flavored markdown. Memex should sit one level higher.

Dimension	Obsidian Skills	Memex
Primary job	Help agents edit Obsidian files correctly	Help agents reason over a citeable knowledge graph
Unit of work	Markdown syntax, bases, canvases, vault operations	Claims, entities, sources, relationships, decisions
Data model	Obsidian-native conventions	Typed entities with provenance and lifecycle
Retrieval	Not the central problem	Core architecture concern
Evaluation	Not primary	Built into repo through evals and rubrics
Privacy stance	General-purpose vault tooling	Explicit public framework and private vault boundary
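To make the "typed entities with provenance and lifecycle" row concrete, here is a minimal Python sketch of what such a record could look like. It is an illustration only: the field names, the lifecycle states, and the `is_citeable` rule are assumptions made for this example and are not taken from schemas/entity.schema.json.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Lifecycle(Enum):
    # Hypothetical lifecycle states; the real schema may define different ones.
    DRAFT = "draft"
    ACTIVE = "active"
    ARCHIVED = "archived"


@dataclass(frozen=True)
class Source:
    # Provenance record: where a claim about the entity came from.
    title: str
    locator: str  # assumed to be a file path or URL


@dataclass
class Entity:
    # Hypothetical typed entity; field names are illustrative only.
    id: str
    type: str  # for example "person" or "decision"
    lifecycle: Lifecycle = Lifecycle.DRAFT
    sources: List[Source] = field(default_factory=list)

    def is_citeable(self) -> bool:
        # Assumed rule: citeable means at least one source and not a draft.
        return bool(self.sources) and self.lifecycle is not Lifecycle.DRAFT
```

The point of the sketch is the contrast in the table: an agent working at this level checks whether a claim has provenance before citing it, rather than only checking that the markdown is well formed.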

What to borrow

From the executive guide prototype: setup clarity, governance language, operator cadence, Pages presentation.
From Obsidian Skills: small focused skill files, editor-native discipline, practical command recipes.

What to avoid

Shipping only documentation without schemas, skills, examples, and tests.
Staying at the markdown-helper layer without retrieval and citation quality.
Using real private vault examples in a public framework repo.
Measuring success by vibe rather than retrieval, citation, contradiction handling, and synthesis quality.
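One way to replace "vibe" with a number is to score each response on the four dimensions named above and combine them. The sketch below assumes a numeric score per dimension and equal weights by default. The dimension keys, the scale, and the weighting are assumptions for illustration and are not drawn from evals/rubric.md.

```python
from typing import Dict, Optional

# The four dimensions named in the text; the key spellings are assumed.
DIMENSIONS = ("retrieval", "citation", "contradiction_handling", "synthesis")


def overall_score(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of per-dimension scores (hypothetical aggregation)."""
    # Default to equal weights when none are supplied.
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        # Refuse to produce a score when a dimension was never assessed.
        raise ValueError(f"missing dimension scores: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight
```

Raising on a missing dimension is a deliberate choice in this sketch: a response that was never checked for contradiction handling should not silently receive an average of the other three.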

Current evidence

Validation passes on the canonical repo:

python3 scripts/validate_index.py
python3 scripts/validate_vault.py examples/fake-vault

The framework now includes a setup manual, architecture polish, a governance guide, an operator playbook, a templates guide, a GitHub Pages homepage, a corrected roadmap, and this benchmark report.

Next benchmark step

Add an executable benchmark runner that reads evals/query-set.sample.yml, runs retrieval against a selected vault or index, and scores responses against evals/rubric.md.
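The runner described above could start as small as the following sketch. It is not the planned implementation. It assumes each query has already been parsed into a dict with `id`, `query`, and `expected_sources` keys (a guess at the shape of evals/query-set.sample.yml, which is not shown here), and it uses source recall as a stand-in metric, since evals/rubric.md is not reproduced in this report. The `keyword_retriever` helper is a toy written for the example.

```python
from typing import Callable, Dict, List

# A retriever maps a query string to a list of source paths.
Retriever = Callable[[str], List[str]]


def keyword_retriever(index: Dict[str, str]) -> Retriever:
    """Toy retriever over an in-memory {path: text} index (hypothetical)."""
    def retrieve(query: str) -> List[str]:
        terms = query.lower().split()
        # Return every path whose text contains at least one query term.
        return [
            path for path, text in index.items()
            if any(term in text.lower() for term in terms)
        ]
    return retrieve


def run_benchmark(queries: List[dict], retrieve: Retriever) -> Dict[str, float]:
    """Return source recall per query id (a stand-in for rubric scoring)."""
    results: Dict[str, float] = {}
    for q in queries:
        expected = set(q["expected_sources"])
        retrieved = set(retrieve(q["query"]))
        # Recall: the fraction of expected sources that were retrieved.
        # A query with no expected sources counts as fully satisfied.
        results[q["id"]] = (
            len(expected & retrieved) / len(expected) if expected else 1.0
        )
    return results
```

Passing the retriever in as a callable keeps the runner independent of whether retrieval runs against a vault on disk or a prebuilt index, which matches the "selected vault or index" wording above. Loading the YAML file itself would need a YAML parser, which is left out here to keep the sketch free of third-party dependencies.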