Turning a 478-page manual into an MCP knowledge base


The Elektron Octatrack Notebook is 478 pages. It is dense. When I need one answer, I don’t want to read 478 pages to find it, and I don’t want to paste the whole thing into an LLM and hope it doesn’t make something up.

So I built a knowledge base from it and put it behind an MCP (Model Context Protocol) server. The model queries it like a tool and gets back page-accurate sections. No guessing.

The problem with “just ask the LLM”

Two bad options:

  • Dump the whole manual into context. Expensive, slow, and the model still drifts.
  • Ask from memory. The model invents menu paths that don’t exist.

I wanted a third option. Store the manual once, in a form the model can search, and return only the relevant parts. Generative AI builds the base. At query time there is no generation at all — just retrieval.

Three layers

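The KB has three layers. Its 179 retrievable items are the 149 reference sections plus the 30 workflow guides; the concept graph links them: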

  • Reference sections (149) — one per numbered section of the manual, all 16 chapters. Each is a Markdown file with frontmatter: id, chapter, section, source_pages, concepts, usage_modes.
  • Concept graph (168 concepts) — built by inverting the frontmatter. For one concept you get its definition, every section that mentions it, and related concepts.
  • Workflow guides (30) — task-oriented micro-guides for the highest-value jobs, like “sample a live source”. Each guide cites the reference sections it was built from.

The identity of a chunk is its id, not the filename. So I can rename files on disk and the concept links still hold.
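
To make the inversion concrete, here is a minimal sketch. The section ids are the ones used in this post; the concept names are made up, and treating "related" as "co-occurs in the same section" is an assumption for illustration, not necessarily how the real graph decides it.

```python
from collections import defaultdict

# Hypothetical frontmatter records, already parsed from the Markdown files.
# Only the fields needed here are shown; concept names are invented.
sections = [
    {"id": "audio-inputs", "concepts": ["input-gain", "recorder"]},
    {"id": "recording-setup", "concepts": ["recorder", "record-trig"]},
    {"id": "pickup-machine-basics", "concepts": ["pickup-machine", "recorder"]},
]

def invert_frontmatter(sections):
    """Map each concept to the ids of the sections that mention it."""
    mentions = defaultdict(list)
    for section in sections:
        for concept in section["concepts"]:
            # Key on the chunk id, never the filename, so renames are harmless.
            mentions[concept].append(section["id"])
    return dict(mentions)

def related_concepts(sections, concept):
    """Concepts that share at least one section with `concept` (assumed rule)."""
    related = set()
    for section in sections:
        if concept in section["concepts"]:
            related.update(section["concepts"])
    related.discard(concept)
    return sorted(related)

graph = invert_frontmatter(sections)
print(graph["recorder"])
print(related_concepts(sections, "recorder"))
```

Because the graph stores ids only, moving or renaming a file changes nothing as long as its frontmatter id stays the same.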

How a query looks

You ask a normal question:

How do I sample a live audio source on the Octatrack?

The server returns the matching guide (sample-a-live-source) plus the source sections it draws from (audio-inputs, pickup-machine-basics, recording-setup). Page-accurate. No extra AI cost.

Retrieval is local and hybrid:

final_score = vec_similarity + 0.5 × fts_rank + 0.25 × concept_boost

  • vec_similarity: sqlite-vec, embeddings from bge-small-en-v1.5 (384-dim), computed on CPU.
  • fts_rank: FTS5 keyword match, for exact terminology like parameter-lock.
  • concept_boost: a bonus when a query word matches a chunk’s concept tags.

Vector search alone misses exact terms. Keyword search alone misses paraphrases. Together they cover both.
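
The scoring formula can be sketched in a few lines. This version assumes all three signals are already computed and that fts_rank is normalized so higher is better (raw FTS5 bm25 values run the other way: smaller is better). The binary concept boost, the chunk ids, and the numbers are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    chunk_id: str
    vec_similarity: float  # similarity from the vector index
    fts_rank: float        # keyword score, assumed normalized: higher is better
    concepts: frozenset    # concept tags from the chunk's frontmatter

def concept_boost(query: str, concepts: frozenset) -> float:
    """1.0 if any query word equals a concept tag, else 0.0 (assumed binary)."""
    words = {w.strip("?.,!").lower() for w in query.split()}
    return 1.0 if words & concepts else 0.0

def final_score(query: str, c: Candidate) -> float:
    """final_score = vec_similarity + 0.5 * fts_rank + 0.25 * concept_boost"""
    return c.vec_similarity + 0.5 * c.fts_rank + 0.25 * concept_boost(query, c.concepts)

# Hypothetical candidates: the second is closer in embedding space,
# but the first contains the exact term and carries the concept tag.
candidates = [
    Candidate("parameter-locks", 0.62, 0.9, frozenset({"parameter-lock", "trig"})),
    Candidate("sequencer-overview", 0.70, 0.1, frozenset({"sequencer"})),
]
query = "How does parameter-lock work?"
ranked = sorted(candidates, key=lambda c: final_score(query, c), reverse=True)
print([c.chunk_id for c in ranked])
```

The keyword and concept terms act as tie-breakers on top of the vector score: here they lift the exact-terminology match above a chunk that is only semantically nearby.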

The MCP server

The server uses FastMCP and exposes 8 tools:

| Tool | Purpose |
| --- | --- |
| search | Hybrid search over sections and guides |
| get_chunk | Fetch one section or guide by id |
| get_concept | Definition, all mentions, related concepts |
| list_workflows | All 30 guides |
| get_workflow | One full guide with its sources |
| list_usage_modes | The 8 usage modes with counts |
| by_usage_mode | Filter by mode, e.g. sampler |
| compare_approaches | Top guides for a task, side by side |

It runs in Claude Desktop and in Claude Code.
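
For a sense of how little code a tool needs, here is a rough sketch of two of the eight. It assumes the FastMCP class from the official MCP Python SDK (mcp.server.fastmcp); the in-memory store, the server name, and the return shapes are placeholders, not the real implementation, which reads from the SQLite index.

```python
# Hypothetical in-memory store standing in for the SQLite index.
CHUNKS = {
    "audio-inputs": {"id": "audio-inputs", "kind": "section", "usage_modes": ["sampler"]},
    "sample-a-live-source": {"id": "sample-a-live-source", "kind": "guide"},
}

def get_chunk(chunk_id: str) -> dict:
    """Fetch one section or guide by id."""
    if chunk_id not in CHUNKS:
        raise KeyError(f"unknown chunk id: {chunk_id}")
    return CHUNKS[chunk_id]

def by_usage_mode(mode: str) -> list[str]:
    """Return the ids of chunks tagged with the given usage mode."""
    return sorted(
        cid for cid, chunk in CHUNKS.items()
        if mode in chunk.get("usage_modes", [])
    )

def build_server():
    """Register the plain functions above as MCP tools."""
    # Imported here so the functions stay usable without the SDK installed.
    from mcp.server.fastmcp import FastMCP  # assumption: official MCP Python SDK
    server = FastMCP("octatrack-kb")  # placeholder server name
    server.tool()(get_chunk)
    server.tool()(by_usage_mode)
    return server

# To serve over stdio, call: build_server().run()
```

Keeping the tool bodies as ordinary functions means they can be exercised directly, without starting a server or a client.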

Building it

The build is a pipeline of 9 stages. Every stage skips work it already did, so a crashed run resumes where it stopped.

  1. Render the PDF to images (200 DPI).
  2. Build a table of contents from the page text.
  3. Extract each page with vision.
  4. Assemble pages into section chunks.
  5. Tag concepts and usage modes.
  6. Reconcile slug aliases (193 raw slugs down to 168 canonical).
  7. Invert the frontmatter into the concept graph.
  8. Author the 30 guides, then verify them against the source.
  9. Build the SQLite search index.
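
Stage 6 is the easiest one to picture in code. A minimal sketch, assuming the reconciliation ends up as a lookup table from raw slug to canonical slug; the slugs below are invented examples, and how the table itself gets produced is not shown.

```python
# Hypothetical alias table: raw slug -> canonical slug.
ALIASES = {
    "p-lock": "parameter-lock",
    "plock": "parameter-lock",
    "parameter-locks": "parameter-lock",
}

def canonical(slug: str) -> str:
    """Return the canonical form of a slug; unknown slugs map to themselves."""
    return ALIASES.get(slug, slug)

def reconcile(concepts):
    """Canonicalize a chunk's concept list, dropping duplicates, keeping order."""
    result = []
    for slug in concepts:
        slug = canonical(slug)
        if slug not in result:
            result.append(slug)
    return result

print(reconcile(["p-lock", "trig", "parameter-locks"]))
```

Running this before the inversion in stage 7 keeps near-duplicate tags from turning into separate concepts in the graph.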

All the model work runs on my Claude Max subscription, through the claude CLI. Flat rate, no per-page bill.

That has one cost: rate limits. Opus hit its rate-limit window at page 151 of 478. I resumed extraction with Sonnet, which draws less from the quota. Because every stage is idempotent, I just waited and re-ran. It picked up from where it stopped.
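
The resume behaviour comes down to one rule: if the output already exists, skip it. Here is a minimal sketch for the page-extraction stage, assuming one output file per page. The file naming and the extract_page callable are hypothetical stand-ins for the real call to the model.

```python
from pathlib import Path

def extract_pages(page_numbers, out_dir, extract_page):
    """Extract each page once; pages with an existing output file are skipped.

    Returns the page numbers processed in this run.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    done = []
    for n in page_numbers:
        target = out_dir / f"page-{n:04d}.md"
        if target.exists():
            continue  # finished in an earlier run
        text = extract_page(n)  # may raise, for example on a rate limit
        tmp = target.with_suffix(".tmp")
        tmp.write_text(text, encoding="utf-8")
        tmp.replace(target)  # rename last, so a crash never leaves a partial page
        done.append(n)
    return done
```

Since the skip check looks only at what is on disk, the second run can pass a different extract_page (a different model, say) and the first 150 pages are left alone.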

One thing that paid off

The guides are authored by Opus, then checked by a second pass with Sonnet against the source sections. 10 of the 30 needed a fix on that pass. Without the verification step, those 10 would have shipped wrong.

Generate, then verify. The two passes are cheap compared to one wrong answer in a 478-page manual.

What I got

  • A manual I can actually query.
  • Answers tied to real page ranges, not invented ones.
  • No AI running at query time, so it is fast and free to ask.
  • A pipeline I can point at a different manual and run again.

The method is not specific to the Octatrack. Any large, dense manual works the same way.