From Obsidian Knowledge Graph to Delegated Librarian

I did not start with an agent problem. I started with a memory problem.

Over time, I accumulated scripts, research notes, meeting logs, health reports, paper summaries, and small local systems spread across repositories and folders.

Everything was technically still available, but availability is not the same thing as retrieval. A fact buried in a repo is not really accessible just because it still exists.

So the first thing I built was not an AI workflow. It was a maintained Obsidian vault.

My setup is fairly simple but structured: I use OpenCode as the local agent orchestrator, Obsidian as the knowledge base, and a small set of model routes behind OpenCode for different task tiers. The raw source stays in the filesystem, the useful parts get curated into linked notes in the vault, and the agent layer works against that curated graph instead of searching everything from scratch.

Step 1: Build the Knowledge Graph First

The vault became the stable layer above the raw files.

Instead of treating every repository, report, or meeting note as something I would rediscover later from scratch, I started converting them into linked notes. That changed the role of the vault. It stopped being a casual note collection and became infrastructure.

The important shift was this:

  • raw sources remained the place where things originally came from
  • the vault became the place where those things became legible

That gave me something like a personal knowledge graph. Pages linked to other pages. Pipelines linked to outputs. Projects linked to meetings. Reports linked to stable summaries. A question no longer had to start from directory spelunking. It could start from a maintained graph.

This was already useful before any agent touched it.

How Information Gets Into the Vault

One of the most important practical decisions was that not all information enters the system in the same form.

Some things arrive as relatively stable source material:

  • a project repository
  • a technical README
  • a local tool
  • a generated report
  • a configuration tree

Other things arrive as event material:

  • a meeting note
  • a request from someone
  • a new decision
  • a change in direction
  • an action list that will evolve over time

Those two streams need different curation workflows.

If they are treated the same way, the vault becomes confused. Stable infrastructure notes start reading like diaries, while meeting logs start pretending to be timeless documentation.

So the vault works because information is not only stored; it is sorted into the right kind of page.

The Dumping Layer: Raw First, Then Curated

The system needs a place where information can be dropped in before it becomes polished knowledge.

That is the role of the source layer.

In practice, the flow is usually:

  1. something new appears in raw form
  2. it is inspected
  3. key ideas are extracted
  4. those ideas are rewritten into the maintained vault

This matters because “dumping information in” should be easy, while “making it useful later” should be deliberate.

If every new source had to be perfectly organized at the moment it arrived, the system would never keep up with real work. If everything stayed in raw form forever, retrieval would remain painful.

The vault sits between those two extremes.
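The raw-then-curated flow is small enough to sketch. Assuming a hypothetical layout where raw material lives under `raw/` and curated notes under `vault/` with matching filename stems, a staleness check over the dumping layer might look like:

```python
from pathlib import Path

def stale_sources(raw_dir: Path, vault_dir: Path) -> list[Path]:
    """Return raw files that are newer than their curated note, or not yet curated.

    Assumes one curated note per raw file, named `<stem>.md` (illustrative convention).
    """
    stale = []
    for src in sorted(raw_dir.rglob("*")):
        if not src.is_file():
            continue
        note = vault_dir / (src.stem + ".md")  # hypothetical naming convention
        if not note.exists() or src.stat().st_mtime > note.stat().st_mtime:
            stale.append(src)
    return stale
```

Nothing in the vault depends on a script like this existing; the point is that "what still needs curation" is a cheap mechanical question, while the curation itself is the deliberate part.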

Workflow 1: Project Documentation Curation

Project documentation is the easier of the two major workflows because it usually has a stable object underneath it.

```mermaid
flowchart TD
    A[New source material] --> B[Read live source]
    B --> C[Extract structure and purpose]
    C --> D[Write or update wiki page]
    D --> E[Link related pages]
    E --> F[Refresh index/navigation]
    F --> G[Stable wiki page in vault]
```

A concrete example is a personal static site built with Hugo.

The live source is not just one script or one README. It has layouts, partial templates, content sections, multilingual strings, data files, JavaScript modules, generated assets, and small helper scripts. The curation work is not to mirror that folder tree into the vault. It is to turn the project into a linked explanation of how the subsystems fit together:

  • what the site is for
  • how configuration, templates, content, and data divide responsibilities
  • which files define reusable page structure
  • where live panels or generated assets enter the site
  • which accessibility or quality checks matter

That kind of source usually produces more than one wiki page: a compact overview, then focused concept pages for configuration, templates, content types, data files, JavaScript modules, and audits when needed. The value is that future questions can start from the curated map instead of from the repo root.

The source might be:

  • a code repository
  • a configuration directory
  • a script collection
  • a report generator
  • a README that describes system behavior

The curation job here is to turn implementation shape into reusable understanding.

That usually means:

  1. identify the project or tool boundary
  2. inspect the live source
  3. extract the structure, purpose, and moving parts
  4. write one or more stable concept pages
  5. connect those pages into the rest of the graph

The end result is not a copy of the repo. It is a maintained explanation of the repo.

This is a very different task from “summarize one file.” The point is to make future retrieval possible:

  • what is this tool for?
  • where does it live?
  • what are its dependencies?
  • how is it configured?
  • what outputs matter?
  • what page should I read next?

In that sense, project documentation curation is a translation layer from implementation to memory.
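The "write one or more stable concept pages" step can be sketched as a stub generator. The frontmatter fields and section headings here are illustrative assumptions, not a fixed schema the vault actually enforces:

```python
from datetime import date

def concept_page_stub(title: str, source_path: str, links: list[str]) -> str:
    """Render a skeleton concept page: frontmatter, purpose slot, and graph links.

    `source` points back at the live source so the page stays a translation
    layer, not a copy of the repo.
    """
    front = "\n".join([
        "---",
        f"title: {title}",
        f"source: {source_path}",
        f"curated: {date.today().isoformat()}",
        "---",
    ])
    body = [f"# {title}", "", "## Purpose", "", "## Moving parts", "", "## Related"]
    body += [f"- [[{link}]]" for link in links]  # Obsidian-style wikilinks
    return front + "\n\n" + "\n".join(body) + "\n"
```

The human (or the curating model) still fills in the judgment-heavy sections; the stub only guarantees that every new page arrives already wired into the graph.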

Workflow 2: Meeting Log Curation

Meeting logs are much more fluid.

```mermaid
flowchart TD
    A[New meeting note] --> B[Read note]
    B --> C[Extract discussion / decisions / actions]
    C --> N[Create next-meeting draft]
    C --> U[Update log and to-do]
    C --> D{Durable facts?}
    D -->|Yes| F[Update stable overview]
    D -->|No| G[Stop at meeting layer]
```

One real example is a recurring research-meeting workflow.

When a new meeting note appears in the raw meeting folder, it is not treated like project documentation. The curation task is chronological and action-oriented:

  • read the new meeting note
  • prepend a dated entry to the meeting log
  • refresh the relevant block in the active to-do page
  • create or update the next meeting draft in the meetings drafts folder
  • only update a stable overview page if the meeting introduced facts that are now durable enough to belong there

That keeps the transient parts of project work in the meeting layer, while letting the stable overview absorb only the decisions that have actually settled.

A meeting note is not a stable object in the same sense as a code repo. It is a time-bound event containing:

  • what was discussed
  • what changed
  • what was decided
  • what remains unclear
  • what someone now has to do

If project documentation captures structure, meeting logs capture movement.

That makes them interesting because they are often where the real state of a project lives before the stable documentation catches up.

The curation workflow is different:

  1. read the raw meeting note
  2. extract discussion, decisions, and action items
  3. add a dated log entry
  4. refresh the active to-do layer
  5. create or update the next draft if the workflow uses one
  6. only then update any stable overview pages if the meeting introduced durable facts

This is important: not every meeting detail belongs in a stable overview page.

That distinction protects the vault from becoming noisy. Meeting logs preserve chronology. Overview pages preserve settled knowledge.
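The "add a dated log entry" step is mechanical enough to sketch. This assumes only a newest-first log file, nothing else about the vault:

```python
from datetime import date
from pathlib import Path

def prepend_log_entry(log_file: Path, summary: str, actions: list[str]) -> None:
    """Insert a dated entry at the top of a meeting log, keeping newest-first order."""
    entry = [f"## {date.today().isoformat()}", "", summary, ""]
    entry += [f"- [ ] {a}" for a in actions]  # action items as open checkboxes
    entry.append("")
    existing = log_file.read_text() if log_file.exists() else ""
    log_file.write_text("\n".join(entry) + "\n" + existing)
```

Prepending rather than appending is the whole trick: the log reads as chronology, and the freshest state of the project is always at the top of the page.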

Why These Two Workflows Matter Together

The vault becomes powerful because it can hold both kinds of knowledge at once without flattening them into one format.

| Workflow | What it preserves |
| --- | --- |
| Project documentation | stable structure |
| Meeting logs | evolving state |

That gives the knowledge graph depth.

If you only keep project documentation, you know what a system is supposed to be.

If you only keep meeting logs, you know what people recently said.

If you keep both, you can answer a much better class of questions:

  • what does this system do?
  • what changed recently?
  • what is still unresolved?
  • which decisions became stable and which are still provisional?

That is one of the reasons the librarian system became necessary later. The vault was no longer just a static wiki. It had enough structure and enough living state that retrieval itself became a serious task.

Putting both workflows together with the model that drives them, the full picture looks something like this:

```mermaid
flowchart TD
    S1[Raw meeting note] --> M
    S2[Project source<br/>repo · config · report] --> M
    M[Strong model<br/>ingest and curate]

    M --> I[Index]
    M --> W[Wiki]
    M --> L[Log]

    subgraph A[Project Documentation]
        A1[Overview] --> A2[Concept]
        A1 --> A3[Research Cohort]
        A1 --> A6[Audit · staleness]
    end

    subgraph B[Meeting Log]
        B1[Overview] --> B2[Concept]
        B1 --> B3[Audit · coverage]
        B1 --> B4[Meeting Log]
        B4 --> B5[Meeting Draft]
    end

    W --> A1
    W --> B1
    B4 --> T[To-Do]
```
Real projects don’t share the same shape. A research-leaning project may anchor around cohort pages; an active operational project revolves around the meeting log and a live next-meeting draft. Even audit pages differ in role — some check existing concept pages for staleness, others scan for new information that has appeared but isn’t yet documented anywhere. The recurring pattern is the cluster shape, not any specific page-type combination.

Curation Is the Real Cost

This is the part I think is easy to underestimate.

The expensive thing is not storing information. The expensive thing is converting it into something that will still make sense later.

Project documentation requires abstraction. Meeting logs require chronology. Both require judgment.

That is why the vault is valuable. It is not a dump. It is a maintained graph where information has already been forced into a form that a future human or a future agent can use.

Step 2: One Strong Model For Curation and Retrieval

Once the vault existed, the next obvious idea was simple: point a strong model at it.

At first, the same model did both jobs. In practice, Claude Sonnet 4.6 was excellent at it. It could:

  • read across pages
  • follow links
  • infer which notes mattered
  • summarize well
  • reconcile slightly messy information without much babysitting

That setup felt magical because it finally matched the structure of the vault. I was no longer asking a model to raw-search my machine every time. I was asking it to walk through a curated graph. The same model that ingested new source material was also the one answering questions about it.

And it worked.

But it also surfaced the next problem very quickly.

Step 3: Retrieval Is Not the Same as Curation

The same model that was excellent at curation was too expensive to use as the default answerer for every small request.

That difference matters.

There are at least two kinds of work in a system like this:

| Curation work | Retrieval work |
| --- | --- |
| deciding what a page should say | answering a narrow question |
| reconciling sources | finding a value in a note |
| updating cross-links | checking whether the vault already knows something |
| keeping structure coherent | following one or two links |

The first kind of work deserves a strong model. The second often does not.

If the same expensive librarian answers both, the system works, but the economics are wrong. A question like “what version of this tool is installed” should not cost the same kind of cognitive budget as a multi-page ingest or a source reconciliation pass.

So the problem stopped being “can the librarian do it?” and became “how should the librarian be organized?”

Step 4: Overhaul the Librarian

The answer was not to remove the librarian. The answer was to demote the librarian from being the sole worker.

The new version became a middle manager. This was the turning point — and the moment I started calling it the librarian rather than just “the strong model with vault access.”

Instead of one model doing everything, I split the role into two layers:

  • a front-desk librarian that understands the vault and decides what kind of job this is
  • a set of delegated workers with different costs and strengths

That preserves the intelligence of the system while making the cost structure much more rational.

In other words, the librarian stops being only an answerer and becomes an organizer.

Step 5: Match Work Complexity to Model Complexity

Once I accepted that the librarian should delegate, the next question became: delegate to whom?

The clean answer was to map work complexity to model complexity.

| Work tier | Typical tasks | Why it exists |
| --- | --- | --- |
| Light work | quick page lookup, link checks, frontmatter checks, tiny wording fixes, narrow factual questions | Cheap and fast |
| Standard work | refresh one page from a source repo, lint one section of the vault, document a local tool or config tree, sync one bounded update across page/index/log | Judgment, but bounded |
| Advanced work | multi-page ingest, audits, contradiction checking, meeting or request ingest, workflow redesign, paper-level evidence accounting across multiple notes | Strongest model belongs here |

What matters here is not just capability. It is discipline. Different tasks deserve different model costs.

```mermaid
flowchart TD
    U[User question] --> L[Librarian<br/>wiki-first intake]
    L --> I[Read index.md]
    I --> W[Read Wiki.md if needed]
    W --> P[Read relevant wiki pages]
    P --> D{Task size and ambiguity}
    D -->|Simple lookup| A[Direct answer by librarian]
    D -->|Narrow low-risk task| WL[wiki-light]
    D -->|Bounded synthesis or maintenance| WS[wiki-standard]
    D -->|Multi-page ingest audit or heavy synthesis| WA[wiki-advanced]
    WL --> C1{Wiki enough?}
    WS --> C2{Wiki enough?}
    WA --> C3{Wiki enough?}
    C1 -->|Yes| R[Return to librarian for final answer]
    C2 -->|Yes| R
    C3 -->|Yes| R
    C1 -->|No| S[Follow cited sources]
    C2 -->|No| S
    C3 -->|No| S
    S --> X[raw or external source paths<br/>listed by relevant wiki pages]
    X --> M[Reconcile with wiki]
    M --> R
    A --> R
```
The Actual Constraint: Three APIs, Three Cost Regimes

The model-routing problem was not abstract. It came from a very practical setup:

| Provider lane | Reality |
| --- | --- |
| OpenAI | paid, most reliable, should be used where accuracy matters most |
| OpenCode free | useful but constrained, best for cheap narrow work or fallback |
| Copilot education | available, but limited enough that it should not become the default heavy worker |

That immediately rules out three naive strategies:

  1. always use the strongest paid model
  2. always use the cheapest free model
  3. always try to maximize “free” before “good”

None of those is actually optimal.

The real objective is to optimize across three dimensions at once:

| Dimension | Question |
| --- | --- |
| capability | is the model good enough for this task? |
| cost | is this task worth paying premium tokens for? |
| reliability | if this model fails or drifts, what is the next acceptable fallback? |

The Routing Principle

The final principle was:

  • expensive models should sit at the work layer where judgment matters
  • cheap models should handle narrow retrieval where mistakes are low-cost
  • the front desk should be smart enough to route, but not so strong that it solves everything itself

That is why the librarian itself became a mid-level manager rather than the strongest worker in the system.

Work-to-Model Mapping Table

This was the practical mapping that emerged from the discussions.

| Work tier | Typical tasks | Primary goal | Best model class |
| --- | --- | --- | --- |
| librarian | intake, classify, decide direct vs delegate | good routing at low cost | small but competent paid model |
| light | lookup, link check, frontmatter check, tiny fix | cheapest acceptable accuracy | fast free model |
| standard | single-page refresh, bounded synthesis, section lint | reliable normal work | strong paid model |
| advanced | ingest, audit, cross-page synthesis, ambiguity resolution | strongest judgment | strongest reliable paid model |

Provider Preference Table

The provider order also needed to be explicit.

| Priority | Provider | Why |
| --- | --- | --- |
| 1 | OpenAI | most reliable paid lane |
| 2 | OpenCode free | best place to save cost on narrow work |
| 3 | Copilot | useful fallback, but too limited to be the main heavy-work lane |

There was one more tie-break rule:

  • if OpenAI and Copilot offer roughly comparable models, prefer OpenAI

That stops the system from wasting a limited fallback provider on work that the main paid lane can already handle better.

Final Tier Table

This is the concrete routing table that emerged. The appendix at the end of this post lists the exact model IDs currently configured; the tier structure is the durable part.

| Role | Primary | Secondary | Tertiary |
| --- | --- | --- | --- |
| librarian | small paid | fast free | fallback |
| wiki-light | fast free | small paid | fast fallback |
| wiki-standard | strong paid | fast free | fallback |
| wiki-advanced | strong paid | strong fallback | free fallback |

This table was not chosen by asking “what is the strongest model?” It was chosen by asking:

  • what work really needs the paid lane?
  • where is a free model good enough?
  • what fallback is acceptable when the preferred lane is unavailable?
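Operationally, each row of that table reduces to a first-available lookup over a preference chain. The labels below mirror the table's model classes, not real model IDs:

```python
# tier -> preference-ordered model classes (labels from the tier table, not IDs)
ROUTES = {
    "librarian":     ["small-paid", "fast-free", "fast-fallback"],
    "wiki-light":    ["fast-free", "small-paid", "fast-fallback"],
    "wiki-standard": ["strong-paid", "fast-free", "fast-fallback"],
    "wiki-advanced": ["strong-paid", "strong-fallback", "free-fallback"],
}

def pick_model(tier: str, available: set[str]) -> str:
    """Return the first available model class in the tier's fallback chain."""
    for model in ROUTES[tier]:
        if model in available:
            return model
    raise RuntimeError(f"no available model for tier {tier!r}")
```

The useful property is that cost policy lives entirely in the ordering of the lists; the lookup itself never has to know why one lane outranks another.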

Why the Librarian Should Not Be the Strongest Model

One of the key design discussions was whether the librarian itself should be the best model.

The answer turned out to be no.

If the librarian is too strong:

  1. it stops delegating because it can solve the task itself
  2. the most expensive model gets consumed by triage work

That defeats the whole economic purpose of delegation.

So the front desk needed to be:

  • good enough to classify correctly
  • cheap enough not to dominate cost
  • weak enough, in a good sense, to keep routing work outward when appropriate

The Review And Escalation Budget

There was also a separate optimization problem: too many workers can waste tokens just as easily as too few.

So the system adopted an implicit review-and-escalation budget.

| Situation | Ideal delegation pattern |
| --- | --- |
| simple lookup | librarian only, or librarian -> light |
| single-page maintenance | librarian -> standard |
| advanced synthesis | librarian -> advanced |
| ambiguous widening task | librarian -> light or standard -> librarian review -> possible escalation |

The target was never maximal delegation. It was minimal necessary delegation.

That distinction matters. The system was designed to avoid two opposite failures:

  • one overpowered librarian doing everything
  • a bureaucratic swarm of agents wasting tokens by passing work around

The point was to find the narrow corridor between them.

Who Reviews Whom?

This part took some discussion, because the most confusing version is a system where workers keep passing work sideways at the same level.

The cleaner rule is:

  • workers do the assigned job
  • the librarian reviews the result
  • the librarian decides whether to finalize, escalate upward, or ask for a tighter second pass

That means the normal control flow is vertical, not lateral.

| Worker | Normal next step |
| --- | --- |
| wiki-light | return to librarian |
| wiki-standard | return to librarian |
| wiki-advanced | return to librarian |

Workers are not supposed to endlessly pass work to workers at the same level.

Do Workers Pass Within the Same Level?

Usually, no.

A wiki-light worker does not normally pass to another wiki-light worker. That would mostly duplicate cost without improving judgment.

A wiki-standard worker also does not normally hand off to a second wiki-standard worker just because the task is large. If the task is large enough to require materially different judgment, it should be reviewed by the librarian and then escalated.

The system becomes much easier to reason about if each worker assumes:

  • do the bounded job
  • stop
  • return for review

Does Light Only Pass to Standard?

Not directly.

The better model is:

  1. wiki-light returns to the librarian
  2. the librarian decides whether the result is enough
  3. if not, the librarian escalates to wiki-standard or wiki-advanced

So the movement is usually:

  • light -> librarian -> standard
  • light -> librarian -> advanced
  • standard -> librarian -> advanced

This keeps the routing logic in one place.
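That vertical control flow can be sketched as a single loop owned by the librarian. `run_worker` and `good_enough` are hypothetical hooks standing in for the actual model calls and the librarian's review:

```python
TIERS = ["wiki-light", "wiki-standard", "wiki-advanced"]

def librarian_loop(task, run_worker, good_enough, start="wiki-light"):
    """Route a task through workers; escalation happens only via librarian review.

    Workers never hand off to each other: each result comes back here,
    and this loop decides whether to accept or move up one tier.
    """
    tier = start
    while True:
        result = run_worker(tier, task)
        if good_enough(result):
            return tier, result
        idx = TIERS.index(tier)
        if idx + 1 == len(TIERS):
            return tier, result  # apex tier: accept, optionally verify downward
        tier = TIERS[idx + 1]
```

Because the loop is the only place a tier change can happen, "light -> librarian -> standard" falls out automatically, and no worker ever needs to know the roster of other workers.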

Can the Flow Ever Move Downward?

Yes, but only as a deliberate review move.

For example:

  • wiki-advanced may do the heavy synthesis
  • then the librarian may ask for a narrower verification pass from wiki-light or wiki-standard

That is not the same as escalation. It is review.

So upward movement means:

  • more complexity
  • more judgment
  • stronger worker

Downward movement means:

  • narrower verification
  • cheaper confirmation
  • smaller follow-up task

The Reviewer Model

This is probably the clearest way to think about it:

  • the librarian is the reviewer and dispatcher
  • the workers are specialists by complexity tier

The workers do not own the final routing logic. The librarian does.

That makes the system much easier to explain than a free-floating swarm of agents.

Step 6: Build the Delegation System

The final system looks more like an organization chart than a single assistant.

There is now:

  • one librarian
  • one light worker
  • one standard worker
  • one advanced worker

The librarian is wiki-first. It starts from the vault, not from the current directory. It reads the maintained graph first, and only follows cited sources when the vault is incomplete, stale, or ambiguous.

```mermaid
flowchart LR
    W[Worker returns result] --> R{Librarian review}
    R -->|Accept| Z[Final answer]
    R -->|Escalate up| U[Stronger tier worker]
    R -.->|Verify narrow, rare| V[Lighter tier worker]
    U --> R
    V --> R
```

Downward verification is the rare branch. The librarian usually handles narrow checks itself. The case where it actually matters is when the result already came from the strongest tier — “escalate up” no longer exists, so dispatching a lighter worker for a bounded second look is the cheapest way to avoid trusting the apex worker blindly.

That means the retrieval flow has become deliberate:

  1. start from the maintained note graph
  2. answer directly if the question is narrow
  3. delegate if the scope widens
  4. follow raw or external sources only when necessary

That last point is important. The system does not begin by searching the whole machine. It begins by trusting the curated graph, then escalating outward only when needed.
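The wiki-first rule itself can be sketched as a two-stage lookup. The one-note-per-question-key layout and the `raw/` sibling directory are simplifying assumptions for illustration:

```python
from pathlib import Path

def wiki_first_lookup(vault: Path, question_key: str) -> tuple[str, str]:
    """Answer from the curated graph first; touch raw sources only if it is absent."""
    page = vault / f"{question_key}.md"      # hypothetical one-key-per-page layout
    if page.exists():
        return "vault", page.read_text()
    raw = vault.parent / "raw" / question_key  # assumed raw layer next to the vault
    if raw.exists():
        return "raw", raw.read_text()
    return "missing", ""
```

The real librarian follows cited source paths listed on wiki pages rather than guessing filenames, but the precedence is the same: curated graph first, raw layer second, whole-machine search never by default.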

Step 7: Let the Librarian Leave the Vault

At first, the wiki logic mostly lived inside the local vault workflow. That was useful, but still too local.

Eventually I wanted something stronger: a librarian I could summon from any working directory on this machine.

So the final move was to configure the librarian globally, outside any one project directory.

Now the behavior I wanted is simple:

  • I can be in another directory entirely
  • I switch to librarian
  • I ask a vault question
  • the librarian still knows that the Obsidian vault is the primary knowledge base

That changes the nature of the system again.

The librarian is no longer just a note about a workflow. It becomes a persistent interface to the knowledge graph.

Why This Feels Different

The interesting thing is that none of this started from model selection.

It started from curation.

If there is no maintained knowledge graph, delegation does not help much. You are just sending agents into chaos. Once the vault exists, delegation becomes powerful because the workers have something stable to stand on.

That is why the ordering matters:

  1. build the vault
  2. curate it seriously
  3. assign one good librarian
  4. discover that retrieval and curation have different economics
  5. overhaul the librarian into a delegated system

I think that sequence is more general than my own setup. A lot of personal AI workflows start too early with orchestration. But orchestration without a maintained knowledge layer is mostly fancy searching.

What made this system useful was not the existence of multiple agents. It was the existence of a curated graph that multiple agents could work against.

The Real Lesson

The real lesson is not “use more agents.”

It is:

  • build a stable knowledge layer first
  • separate curation from retrieval
  • then use delegation to control cost without losing judgment

The first librarian proved the vault could work.

The delegated librarian made it sustainable.

Appendix: Current Model Mapping

This is the concrete routing table encoded in the OpenCode agent config as of the time of writing. The tier labels used throughout the post are the durable part; specific model IDs will change as providers update their offerings.

librarian

| Tier | Model |
| --- | --- |
| Primary | openai/gpt-5.4-mini |
| Secondary | opencode/minimax-m2.5-free |
| Tertiary | github-copilot/claude-sonnet-4.5 |

wiki-light

| Tier | Model |
| --- | --- |
| Primary | opencode/deepseek-v4-flash-free |
| Secondary | openai/gpt-5.4-mini |
| Tertiary | github-copilot/claude-haiku-4.5 |

wiki-standard

| Tier | Model |
| --- | --- |
| Primary | openai/gpt-5.4 |
| Secondary | opencode/minimax-m2.5-free |
| Tertiary | github-copilot/claude-sonnet-4.5 |

wiki-advanced

| Tier | Model |
| --- | --- |
| Primary | openai/gpt-5.4 |
| Secondary | github-copilot/claude-sonnet-4.6 |
| Tertiary | opencode/nemotron-3-super-free |

wiki-advanced ranks Copilot before OpenCode-free in its fallback chain. The general provider preference (OpenAI > OpenCode free > Copilot) does not apply at the apex tier because the strongest available Copilot model is closer in capability to the OpenAI strong tier than the strongest available free model.