Benchmark Report
Verdict
Memex is the canonical product repository. It absorbs the explanatory strength of the executive guide prototype while keeping its own stronger technical substrate: schemas, skills, evals, synthetic examples, validation scripts, and CI.
Compared repositories
| Repository | Role | Score |
|---|---|---|
| memex | Canonical local-first, agent-operable second-brain framework | 8.1/10 |
| an executive-facing guide prototype | Executive-facing setup guide and GitHub Pages prototype | 6.8/10 |
| kepano/obsidian-skills | Obsidian-specific agent skill layer | Reference only |
| Andrej Karpathy's LLM Wiki | Primary upstream knowledge-base pattern | Lineage |
| garrytan/gstack | Skill-based agent operating system | Operational reference |
| garrytan/gbrain | Retrieval, graph, synthesis, and brain layer | Operational reference |
Primary credit belongs to Andrej Karpathy's LLM Knowledge Bases / LLM Wiki. Garry Tan's GStack and GBrain are the strongest operational references for turning the pattern into a repeatable agent system. See Lineage and Credits.
Memex vs the executive guide prototype
Memex wins on product credibility:
- Agent Skills-compatible workflows in skills/.
- Entity schema in schemas/entity.schema.json.
- Evaluation rubric and sample query set in evals/.
- Synthetic fake vault in examples/fake-vault/.
- Validation scripts and GitHub Actions.
- Real content and index pattern in content/.
The guide wins on explanation:
- Stronger from-scratch setup manual.
- Clearer governance and privacy framing.
- Better operator playbook.
- GitHub Pages landing page and styling.
Integration decision: copy the guide layer into Memex and make Memex the single canonical repo.
Memex vs Obsidian Skills
kepano/obsidian-skills is a strong tool-use layer for Obsidian-flavored markdown. Memex should sit one level higher.
| Dimension | Obsidian Skills | Memex |
|---|---|---|
| Primary job | Help agents edit Obsidian files correctly | Help agents reason over a citeable knowledge graph |
| Unit of work | Markdown syntax, bases, canvases, vault operations | Claims, entities, sources, relationships, decisions |
| Data model | Obsidian-native conventions | Typed entities with provenance and lifecycle |
| Retrieval | Not the central problem | Core architecture concern |
| Evaluation | Not primary | Built into repo through evals and rubrics |
| Privacy stance | General-purpose vault tooling | Explicit public framework and private vault boundary |
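To make the "typed entities with provenance and lifecycle" row concrete, here is a minimal sketch in Python. It does not reproduce schemas/entity.schema.json; every field name, the lifecycle states, and the `is_citeable` rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Lifecycle(Enum):
    # Hypothetical lifecycle states; the real schema may define others.
    DRAFT = "draft"
    ACTIVE = "active"
    ARCHIVED = "archived"


@dataclass(frozen=True)
class Provenance:
    source_id: str     # identifier of the source note or document (assumed field)
    retrieved_on: str  # ISO 8601 date, kept as a plain string for simplicity


@dataclass
class Entity:
    entity_id: str
    entity_type: str  # e.g. "claim", "decision", "source"
    title: str
    lifecycle: Lifecycle = Lifecycle.DRAFT
    provenance: List[Provenance] = field(default_factory=list)

    def is_citeable(self) -> bool:
        """Treat an entity as citeable only if it records at least one source."""
        return len(self.provenance) > 0
```

For example, `Entity("e1", "claim", "Example claim")` starts in the draft state with no provenance, so `is_citeable()` returns `False` until a `Provenance` record is appended.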
What to borrow
- From the executive guide prototype: setup clarity, governance language, operator cadence, Pages presentation.
- From Obsidian Skills: small focused skill files, editor-native discipline, practical command recipes.
What to avoid
- Shipping only documentation without schemas, skills, examples, and tests.
- Staying at the markdown-helper layer without retrieval and citation quality.
- Using real private vault examples in a public framework repo.
- Measuring success by vibe rather than retrieval, citation, contradiction handling, and synthesis quality.
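One way to replace "vibe" with a number is to score each of the four dimensions named above and combine them. The sketch below is an assumption, not the rubric in evals/rubric.md: the dimension keys, the 0 to 1 scale, and the equal default weights are all hypothetical.

```python
from typing import Dict, Optional

# Dimension names taken from the list above; the keys themselves are assumed.
DIMENSIONS = ("retrieval", "citation", "contradiction_handling", "synthesis")


def overall_score(
    scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of per-dimension scores, each expected in [0, 1]."""
    if weights is None:
        weights = {name: 1.0 for name in DIMENSIONS}  # equal weights by default

    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")

    for name in DIMENSIONS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name} must be between 0 and 1")

    total_weight = sum(weights[name] for name in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    weighted = sum(scores[name] * weights[name] for name in DIMENSIONS)
    return weighted / total_weight
```

Requiring all four dimensions means a run cannot look good simply by omitting the dimension it handles worst.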
Current evidence
Validation passes on the canonical repo:
- python3 scripts/validate_index.py
- python3 scripts/validate_vault.py examples/fake-vault
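The two commands above can be wrapped in a small helper so that local runs and CI report the same result. This is a sketch under one stated assumption: that both scripts signal failure with a non-zero exit code. The helper names `run_validations` and `all_passed` are hypothetical and not part of the repository.

```python
import subprocess
from typing import Callable, Dict, List, Sequence

# The two commands listed above, split into argument lists.
VALIDATION_COMMANDS: List[List[str]] = [
    ["python3", "scripts/validate_index.py"],
    ["python3", "scripts/validate_vault.py", "examples/fake-vault"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def run_validations(
    commands: Sequence[Sequence[str]] = VALIDATION_COMMANDS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> Dict[str, int]:
    """Run every command and map its printable form to its exit code."""
    return {" ".join(cmd): runner(cmd) for cmd in commands}


def all_passed(results: Dict[str, int]) -> bool:
    """Assumes exit code 0 means the validation passed."""
    return all(code == 0 for code in results.values())
```

The `runner` parameter is injectable so the wrapper can be exercised in tests without invoking the real scripts.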
The framework now includes a setup manual, architecture polish, a governance guide, an operator playbook, a templates guide, a GitHub Pages homepage, a corrected roadmap, and this benchmark report.
Next benchmark step
Add an executable benchmark runner that reads evals/query-set.sample.yml, runs retrieval against a selected vault or index, and scores responses against evals/rubric.md.
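A possible shape for the runner's core loop is sketched below. It covers only the retrieval dimension, using recall at k as a stand-in metric; scoring against evals/rubric.md would need further work. The query fields `id`, `question`, and `expected_sources` are assumptions about what evals/query-set.sample.yml contains, and the queries are passed in already parsed so the sketch does not depend on a YAML library.

```python
from typing import Callable, Dict, List, Sequence


def recall_at_k(expected: Sequence[str], retrieved: Sequence[str], k: int) -> float:
    """Fraction of expected source ids found in the top-k retrieved ids."""
    expected_set = set(expected)
    if not expected_set:
        raise ValueError("expected must contain at least one source id")
    hits = expected_set & set(retrieved[:k])
    return len(hits) / len(expected_set)


def run_benchmark(
    queries: Sequence[Dict],
    retrieve: Callable[[str], List[str]],
    k: int = 5,
) -> Dict:
    """Score every query with recall@k and report the mean.

    `retrieve` is any callable mapping a question to a ranked list of
    source ids; it stands in for retrieval against a vault or index.
    """
    per_query: Dict[str, float] = {}
    for query in queries:
        retrieved = retrieve(query["question"])
        per_query[query["id"]] = recall_at_k(query["expected_sources"], retrieved, k)

    mean = sum(per_query.values()) / len(per_query) if per_query else 0.0
    return {"per_query": per_query, "mean_recall": mean}
```

Keeping retrieval behind a plain callable lets the same loop score a selected vault, a prebuilt index, or a stub used in CI.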