From-Scratch Setup Manual¶
1. Goal¶
Build a local-first, agent-operable second brain that a person or team can use for:
- meeting preparation
- relationship intelligence
- company and market memory
- book and research synthesis
- decision logs
- strategic briefs
- recurring operating reviews
The target system keeps human-readable files as the source of truth:
Markdown Vault -> Schema + Index -> Agent Skills -> Query / Brief / Eval
Markdown remains the source of truth. Indexes accelerate retrieval. Agent skills make ingest, query, briefing, markdown editing, and evaluation repeatable.
Intellectual lineage matters. Start with Andrej Karpathy's LLM Knowledge Bases / LLM Wiki pattern, then add Garry Tan's GStack/GBrain operational layer. This repo packages those influences into a reusable framework with schemas, skills, evals, governance, and validation. See Lineage and Credits.
2. Required Components¶
Core¶
- macOS or Linux host
- Git
- GitHub account or GitHub Enterprise
- Python 3
- Optional agent runtime, such as OpenClaw or Codex
- Optional retrieval/indexing layer, such as GBrain
- Optional embedding provider key when semantic search is enabled
Optional Integrations¶
- Telegram, Slack, Discord, or WhatsApp for chat
- Gmail / Google Workspace for email and calendar ingestion
- Notion, Google Drive, or local folders for document ingestion
- Twilio or WebRTC for voice-to-brain
- Supabase for hosted Postgres when local PGLite is no longer enough
3. Recommended Repository Layout¶
Use this repository as the framework and keep real private vault data separate:
memex/ # framework, docs, skills, schemas, evals, synthetic examples
private-vault/ # real private knowledge vault, access-controlled
Keep them separate. This repo can be made public after privacy review. Real vaults should remain private unless explicitly sanitized.
Recommended private vault structure:
private-vault/
├── AGENTS.md
├── RESOLVER.md
├── schema.md
├── log.md
├── people/
├── companies/
├── projects/
├── books/
├── concepts/
├── philosophies/
├── decisions/
├── relationships/
├── sources/
├── raw/
├── inbox/
├── artifacts/
├── briefs/
├── reviews/
└── templates/
4. Install Runtime¶
Clone the framework repo:
git clone https://github.com/YOUR_ORG/memex.git
cd memex
python3 scripts/validate_index.py
python3 scripts/validate_vault.py examples/fake-vault
The validation commands should pass before you connect a real vault.
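As a minimal sketch, the two validation commands above could be wrapped in one check that reports which validator failed. It assumes the scripts signal failure with a non-zero exit code, which the manual does not state; `run_validators` is a hypothetical helper name.

```python
import subprocess
import sys

# Commands taken from the steps above; run this from the memex checkout.
VALIDATORS = [
    [sys.executable, "scripts/validate_index.py"],
    [sys.executable, "scripts/validate_vault.py", "examples/fake-vault"],
]


def run_validators(run=subprocess.run):
    """Run each validator and return the commands that exited non-zero.

    Assumption: a non-zero exit code means validation failed.
    """
    failed = []
    for cmd in VALIDATORS:
        result = run(cmd)
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```

An empty list from `run_validators()` would mean both checks passed and a real vault can be connected.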
5. Optional Retrieval Runtime¶
Install Bun:
curl -fsSL https://bun.sh/install | bash
export PATH="$HOME/.bun/bin:$PATH"
Add the path permanently (append to ~/.bashrc instead if your shell is Bash):
echo 'export PATH="$HOME/.bun/bin:$PATH"' >> ~/.zshrc
Install GBrain if you want local keyword/vector retrieval:
git clone https://github.com/garrytan/gbrain.git ~/gbrain
cd ~/gbrain
bun install
bun link
gbrain --version
6. Create the Private Vault Repo¶
mkdir -p ~/private-vault
cd ~/private-vault
git init
mkdir -p people companies projects books concepts philosophies decisions relationships sources raw inbox artifacts briefs reviews templates
touch AGENTS.md RESOLVER.md schema.md log.md
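First commit (Git does not track empty directories, so only the four Markdown files are recorded until the folders contain files):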
git add .
git commit -m "Initialize private memex vault"
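The same scaffold can be created from Python, which may be convenient when an agent or a setup script provisions vaults. This is a sketch only: the directory and file names are copied from the structure above, while `scaffold_vault` is a hypothetical helper that is not part of the framework.

```python
from pathlib import Path

# Names copied from the recommended private vault structure.
VAULT_DIRS = [
    "people", "companies", "projects", "books", "concepts",
    "philosophies", "decisions", "relationships", "sources", "raw",
    "inbox", "artifacts", "briefs", "reviews", "templates",
]
VAULT_FILES = ["AGENTS.md", "RESOLVER.md", "schema.md", "log.md"]


def scaffold_vault(root):
    """Create the vault skeleton; safe to re-run on an existing vault."""
    root = Path(root).expanduser()
    for name in VAULT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in VAULT_FILES:
        # touch() creates the file if missing and leaves existing content alone.
        (root / name).touch(exist_ok=True)
    return root
```

Calling `scaffold_vault("~/private-vault")` would mirror the `mkdir -p` and `touch` commands; `git init` and the first commit still happen in the shell.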
7. Configure Secrets¶
Set an embedding key only if your retrieval layer needs one:
export OPENAI_API_KEY="YOUR_EMBEDDING_KEY"
Persist it in a secure environment manager, not inside this repo or the private vault.
Do not commit:
- API keys
- OAuth tokens
- private credentials
- raw customer records
- cap tables or board materials unless the repository is private and access-controlled
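One way to back the list above with a check is a small scan that flags lines resembling credentials before they are committed. The patterns below are illustrative assumptions, not a complete secret scanner, and `find_secrets` is a hypothetical helper.

```python
import re

# Heuristic patterns only (assumptions); a real scanner would cover far more.
SECRET_PATTERNS = {
    "openai-style key": re.compile(r"sk-[A-Za-z0-9_-]{20,}"),
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "inline credential": re.compile(
        r"(?i)(api[_-]?key|token|secret|password)\w*\s*[:=]\s*\S+"
    ),
}


def find_secrets(text):
    """Return (line_number, label) pairs for lines that look like secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

Running this over staged files in a pre-commit hook is one possible use. It will produce false positives and false negatives, so it supplements review rather than replacing it.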
8. Initialize Retrieval¶
gbrain init
gbrain doctor --json
Import the private vault without embeddings first:
gbrain import ~/private-vault --no-embed
Generate embeddings:
gbrain embed --stale
Verify:
gbrain health
gbrain query "what are the main themes in this brain?"
Expected healthy state:
Embed coverage: 100.0%
Missing embeddings: 0
Stale pages: 0
Dead links: 0
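A scheduled job could compare a health report against the expected state above. The sketch below assumes the report is plain text with one `Key: value` pair per line, as in the listing; the real `gbrain health` output may be formatted differently, so treat the parser as hypothetical.

```python
# Expected values copied from the healthy state listed above.
EXPECTED = {
    "Embed coverage": "100.0%",
    "Missing embeddings": "0",
    "Stale pages": "0",
    "Dead links": "0",
}


def parse_health(report):
    """Parse 'Key: value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in report.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def unhealthy_fields(report):
    """Return the fields whose values differ from the expected healthy state."""
    fields = parse_health(report)
    return {k: fields.get(k) for k, v in EXPECTED.items() if fields.get(k) != v}
```

An empty dict means the report matches the healthy state. A missing field is reported with the value `None`.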
9. Configure Sync¶
Point GBrain to the private vault repo:
gbrain sync --repo ~/private-vault --no-pull --no-embed
After normal edits:
gbrain sync --repo ~/private-vault --no-pull --no-embed
gbrain embed --stale
If using a hosted Git remote:
gbrain sync --repo ~/private-vault
10. Install and Configure an Agent Runtime¶
Install or deploy OpenClaw according to your runtime:
- local workstation for personal use
- private cloud VM for executive assistant workflows
- managed OpenClaw deployment for team access
The OpenClaw agent needs:
- workspace access to private-vault
- shell access to gbrain
- access to OPENAI_API_KEY for embeddings
- chat surface connection, such as Telegram or Slack
- cron/scheduler access for maintenance jobs
11. Add Agent Operating Instructions¶
Create AGENTS.md in the private vault repo:
# AGENTS.md
This vault is a private executive memory system.
Rules:
- Search before answering questions about people, companies, meetings, books, decisions, or strategy.
- Use GBrain for world knowledge and vault facts.
- Use operational memory for user preferences and assistant behavior.
- Every new claim added to the brain needs provenance.
- Raw sources are append-only.
- Never publish or send private vault content externally without explicit approval.
- Log every durable write in log.md.
Create RESOLVER.md:
# RESOLVER.md
Slug rules:
- lowercase ASCII
- spaces become hyphens
- Turkish characters are folded: ı->i, İ->i, ş->s, ğ->g, ü->u, ö->o, ç->c
- people use firstname-lastname
- companies use common company name
- books use title slug, author only when needed
Never create duplicate entities when a canonical page already exists.
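The slug rules in RESOLVER.md can be sketched as a small function so that agents and scripts derive the same slug for the same name. The folding table comes from the rules above; extending it to the uppercase forms, dropping other non-ASCII characters, and the helper names `slugify` and `person_slug` are assumptions.

```python
import re
import unicodedata

# Folding table from the RESOLVER.md rules, extended to uppercase forms (assumption).
TURKISH_FOLD = str.maketrans({
    "ı": "i", "İ": "i", "ş": "s", "Ş": "s", "ğ": "g", "Ğ": "g",
    "ü": "u", "Ü": "u", "ö": "o", "Ö": "o", "ç": "c", "Ç": "c",
})


def slugify(name):
    """Lowercase ASCII slug with hyphens in place of spaces."""
    # Fold Turkish letters first; dotless ı has no ASCII decomposition
    # and would otherwise be dropped by the encoding step below.
    text = name.translate(TURKISH_FOLD).lower()
    # Strip remaining accents and any other non-ASCII characters.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Collapse runs of spaces and punctuation into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def person_slug(first, last):
    """People use firstname-lastname under the people/ folder."""
    return f"people/{slugify(first + ' ' + last)}"
```

For example, `person_slug("Jane", "Doe")` yields `people/jane-doe`. Checking whether that path already exists before creating a page is one way to honor the no-duplicates rule.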
12. Create Core Templates¶
Use the templates in templates/ for:
- person pages
- company pages
- book pages
- concept pages
- meeting notes
- decision records
Templates are not bureaucracy. They make the brain machine-readable.
12. First Ingestion Pass¶
Start with high-value material:
- top 50 people
- top 30 companies
- active projects
- current strategic theses
- board and investor materials
- important books and articles
- recent meeting notes
For each item:
gbrain put <slug> --content "$(cat path/to/file.md)"
Or bulk import:
gbrain import ~/private-vault --no-embed
gbrain embed --stale
13. Set Up Recurring Jobs¶
Minimum viable cadence:
Every 15 minutes:
gbrain sync --repo ~/private-vault --no-pull --no-embed
Hourly:
gbrain embed --stale
Daily:
gbrain health
create daily memory note
process inbox/
Weekly:
dead-link scan
orphan-page review
stale-claim review
top-project review
OpenClaw cron should run these as isolated jobs when possible.
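If a plain system cron is used instead of OpenClaw's scheduler, the command-backed parts of the cadence above could be rendered as crontab lines. The commands are taken from this manual, but the exact minutes and hours are assumptions, and the daily note, inbox, and weekly review steps are omitted because no commands are given for them.

```python
# Commands come from the cadence above; the schedule times are assumptions.
JOBS = [
    # Every 15 minutes: pick up file changes.
    ("*/15 * * * *", "gbrain sync --repo ~/private-vault --no-pull --no-embed"),
    # Hourly, at minute 5 so it does not start together with a sync run.
    ("5 * * * *", "gbrain embed --stale"),
    # Daily health check.
    ("30 6 * * *", "gbrain health"),
]


def render_crontab(jobs=JOBS):
    """Render jobs as crontab lines: five schedule fields, then the command."""
    return "\n".join(f"{schedule} {command}" for schedule, command in jobs) + "\n"
```

Cron typically runs with a minimal PATH, so `gbrain` may need to be written as an absolute path in the installed crontab.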
14. OpenClaw Workflows¶
Pre-Meeting Brief¶
Input:
Prep me for my meeting with Jane Doe.
Procedure:
- gbrain search "Jane Doe"
- gbrain get people/jane-doe
- query neighboring company/project pages
- produce concise brief with:
  - role
  - history
  - open threads
  - risks
  - one sharp question
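The first two steps of this procedure can be sketched as a helper that shells out to the `gbrain search` and `gbrain get` commands named above and returns their raw output for the agent to summarize. It assumes both commands print results to stdout; `gbrain` and `gather_person_context` are hypothetical wrapper names.

```python
import subprocess


def gbrain(args, run=subprocess.run):
    """Run a gbrain subcommand and return its stdout as text."""
    result = run(["gbrain", *args], capture_output=True, text=True, check=True)
    return result.stdout


def gather_person_context(name, slug, run=subprocess.run):
    """Collect raw material for a brief: search hits plus the person's page."""
    return {
        "search": gbrain(["search", name], run=run),
        "page": gbrain(["get", slug], run=run),
    }
```

A call such as `gather_person_context("Jane Doe", "people/jane-doe")` would cover the lookup steps. Querying neighboring pages and writing the brief itself remain the agent's job.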
Book-to-Brain¶
Input:
Ingest this book summary.
Procedure:
- create or update book entity
- extract thesis
- extract mental models
- connect to concepts and strategic theses
- add provenance
- sync and embed
Decision Log¶
Input:
Record this decision.
Procedure:
- create decision page
- capture context, options, decision, rationale, date, owner
- link people, companies, projects
- add review date
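The fields listed above map naturally onto a small record type. The sketch below renders such a record as a Markdown page; the section layout and the `[[slug]]` link syntax are hypothetical, since the manual does not show the actual decision template.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Decision:
    title: str
    context: str
    options: list
    decision: str
    rationale: str
    decided_on: date
    owner: str
    review_on: date
    # Slugs of related people, companies, and projects.
    links: list = field(default_factory=list)


def render_decision(d):
    """Render a decision as Markdown (hypothetical layout, not the repo's template)."""
    lines = [
        f"# {d.title}",
        "",
        f"- Date: {d.decided_on.isoformat()}",
        f"- Owner: {d.owner}",
        f"- Review date: {d.review_on.isoformat()}",
        "",
        "## Context", d.context, "",
        "## Options", *[f"- {option}" for option in d.options], "",
        "## Decision", d.decision, "",
        "## Rationale", d.rationale, "",
        "## Links", *[f"- [[{slug}]]" for slug in d.links],
    ]
    return "\n".join(lines) + "\n"
```

Keeping the review date as a required field means a decision page cannot be created without one, which matches the last step of the procedure.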
15. Verification Checklist¶
A system is ready when all checks pass:
- gbrain health shows 100% embedding coverage
- gbrain query returns relevant pages
- gbrain get <slug> returns exact entities
- OpenClaw can call GBrain from chat
- OpenClaw can write a new entity
- sync picks up file changes
- embeddings refresh after updates
- no secrets are committed
- the framework repo has no private vault content
- the private vault repo is access-controlled
16. GitHub Pages Publishing¶
Publish only the framework repo after privacy review. Do not publish an unsanitized private vault.
cd memex
git init
git add .
git commit -m "Initial Memex framework"
git branch -M main
git remote add origin git@github.com:YOUR_ORG/memex.git
git push -u origin main
In GitHub:
- Settings
- Pages
- Deploy from branch
- Branch: main
- Folder: / (root)
- Save
17. Common Failure Modes¶
GBrain Search Works, Query Does Not¶
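The most likely cause is a missing OPENAI_API_KEY, which embeddings require. Check that it is set without printing the key, then re-embed:
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set" || echo "OPENAI_API_KEY is missing"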
gbrain embed --stale
Sync Points to Old Repo¶
Run:
gbrain config set sync.repo_path ~/private-vault
gbrain sync --repo ~/private-vault --no-pull --no-embed
Missing Embeddings¶
Run:
gbrain embed --stale
gbrain health
Duplicate People or Companies¶
Do not delete. Merge into a canonical slug and leave a redirect.
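A merge of this kind might look like the sketch below: the duplicate's content is appended to the canonical page, and the duplicate file is reduced to a pointer. The redirect line format, the "Merged from" heading, and the `merge_duplicate` helper are assumptions; the manual does not define a redirect syntax.

```python
from pathlib import Path


def merge_duplicate(vault, duplicate, canonical):
    """Append the duplicate page to the canonical page, then leave a redirect stub.

    Slugs are paths relative to the vault without the .md suffix,
    for example "people/jane-doe".
    """
    vault = Path(vault)
    dup_path = vault / f"{duplicate}.md"
    canon_path = vault / f"{canonical}.md"

    dup_text = dup_path.read_text(encoding="utf-8")
    merged = canon_path.read_text(encoding="utf-8").rstrip("\n")
    merged += f"\n\n## Merged from {duplicate}\n\n{dup_text.strip()}\n"

    # Write the canonical page first so no content is lost if the second write fails.
    canon_path.write_text(merged, encoding="utf-8")
    # Hypothetical redirect format: a one-line pointer to the canonical slug.
    dup_path.write_text(f"Redirect: see [[{canonical}]]\n", encoding="utf-8")
```

The appended section still needs a human or agent pass to reconcile conflicting claims, and the merge should be logged in log.md like any other durable write.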
Agent Gives Unsourced Claims¶
Tighten AGENTS.md: require citation to slugs before strategic claims.
18. Definition of Done¶
The setup is complete when an executive can ask:
What do we know about this person, why does it matter, and what should I ask next?
And the agent can answer from the brain with cited, current, inspectable evidence.