Structured Digestion as an Alternative to RAG

Most conversations about giving AI access to your documents land on one of two approaches: stuff the raw text into the prompt, or embed chunks into a vector database and retrieve them with RAG. Both work. Both have real limitations. I built HealthCheck to explore a third option.

The Problem

Medical documents are dense, multilingual, and full of structured information trapped in unstructured PDFs. Lab reports contain test names, numeric values, reference ranges, and flags. Prescriptions list medications with dosages and frequencies. Discharge summaries reference diagnoses, procedures, and follow-up plans. All of this arrives as flat text on a page.

The two common approaches struggle here in different ways:

Context stuffing hits token limits fast. A patient with 40+ PDFs can easily exceed a model's context window. You pay full token cost on every request, nothing persists across conversations, and cross-document reasoning is only possible if everything fits in a single prompt.

RAG embeds text chunks into a vector database and retrieves the top-k most similar results per query. But embedding similarity is approximate and struggles with multilingual medical terminology. You cannot do relational queries like "show me how my hemoglobin trended over three years." Chunk boundaries break table structures. And there is no concept of typed entities, so a lab value and a medication name are just vectors in the same space.

The Idea: Structured Digestion

What if the AI read each document once, understood it, and wrote structured records into a relational database? Not chunks of text with embeddings, but typed clinical entities: lab results with values, units, and reference ranges. Medications with dosages and ATC codes. Diagnoses with ICD-10 codes. Procedures, immunizations, allergies, imaging results.

Future queries would hit the database directly. No re-reading documents, no approximate similarity search. Just precise, typed lookups across a normalized schema.
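To make the difference concrete, here is a minimal sketch of a typed lab-result table and the hemoglobin trend query that RAG struggles with. It is written in Python with the standard sqlite3 module for brevity, not in the Swift and GRDB stack HealthCheck uses. The table, the column names, and the sample values are all assumptions made up for illustration, not HealthCheck's real schema or data.

```python
import sqlite3

# Hypothetical, simplified schema. HealthCheck's real tables are not shown in this post.
SCHEMA = """
CREATE TABLE lab_results (
    id          INTEGER PRIMARY KEY,
    test_name   TEXT NOT NULL,
    value       REAL NOT NULL,
    unit        TEXT NOT NULL,
    ref_low     REAL,
    ref_high    REAL,
    measured_on TEXT NOT NULL,   -- ISO 8601 date, so text order equals date order
    source_doc  TEXT NOT NULL    -- which PDF the value was extracted from
);
"""

def in_range(value, low, high):
    """True/False against the reference range, or None if the report gave no range."""
    if low is None or high is None:
        return None
    return low <= value <= high

def lab_trend(conn, test_name):
    """Return (date, value, unit, in_range) rows for one test, oldest first."""
    rows = conn.execute(
        "SELECT measured_on, value, unit, ref_low, ref_high "
        "FROM lab_results WHERE test_name = ? ORDER BY measured_on",
        (test_name,),
    ).fetchall()
    return [(day, value, unit, in_range(value, low, high))
            for day, value, unit, low, high in rows]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Made-up sample values spread across three documents.
conn.executemany(
    "INSERT INTO lab_results "
    "(test_name, value, unit, ref_low, ref_high, measured_on, source_doc) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Hemoglobin", 13.9, "g/dL", 13.5, 17.5, "2023-03-02", "report_2023.pdf"),
        ("Hemoglobin", 12.8, "g/dL", 13.5, 17.5, "2021-02-10", "report_2021.pdf"),
        ("MPV", 10.1, "fL", 7.5, 11.5, "2022-06-18", "report_2022.pdf"),
        ("Hemoglobin", 13.2, "g/dL", 13.5, 17.5, "2022-06-18", "report_2022.pdf"),
    ],
)

trend = lab_trend(conn, "Hemoglobin")
for row in trend:
    print(row)
```

The trend comes back ordered by date regardless of which PDF each value came from, and the reference range travels with the value instead of being lost at a chunk boundary.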

That is the core idea behind HealthCheck.

How It Works

HealthCheck is a Swift MCP (Model Context Protocol) server backed by SQLite. The architecture is deliberately simple: the server is a dumb data layer that parses PDFs and stores and retrieves data. All of the intelligence lives in the AI agent.

The flow looks like this:

Ingest - A PDF is fed through a dual extraction pipeline. PDFKit pulls embedded text while Apple Vision's RecognizeDocumentsRequest runs OCR with structure detection (tables, lists, paragraphs). A reconciler picks the best version per page.
Read - The AI agent requests the extracted text through MCP tools and reads the document, handling multilingual content, OCR artifacts, and ambiguous formatting naturally.
Extract - The agent calls back with structured data: create_lab_result, create_medication, create_diagnosis, and so on. Each entity is typed, validated, and stored in a relational schema.
Query - When you ask a health question later, the agent queries the database directly. "What was my MPV?" becomes a precise lookup, not a similarity search across text chunks.
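The reconciler in the Ingest step is the least obvious piece, so here is one way a per-page choice between embedded text and OCR text could work. This is a sketch in Python, not HealthCheck's Swift code, and the scoring heuristic (count alphanumeric characters, penalize Unicode replacement characters, prefer embedded text on a tie) is purely an assumption. The names PageText, quality, and reconcile are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PageText:
    page: int
    source: str  # "embedded" or "ocr"
    text: str

def quality(text):
    """Crude score: alphanumeric characters count for, replacement characters against."""
    useful = sum(ch.isalnum() for ch in text)
    garbage = text.count("\ufffd")  # U+FFFD marks undecodable characters
    return useful - 5 * garbage

def reconcile(embedded_pages, ocr_pages):
    """Pick the higher-scoring text for each page; ties go to the embedded text."""
    chosen = []
    for page, (embedded, ocr) in enumerate(zip(embedded_pages, ocr_pages), start=1):
        if quality(embedded) >= quality(ocr):
            chosen.append(PageText(page, "embedded", embedded))
        else:
            chosen.append(PageText(page, "ocr", ocr))
    return chosen

# Page 1 has a usable text layer; page 2 is a scan with no embedded text at all.
embedded_pages = ["Hemoglobin 13.2 g/dL", ""]
ocr_pages = ["Hemoglobin l3.2 g/dL", "MPV 10.1 fL"]

pages = reconcile(embedded_pages, ocr_pages)
for p in pages:
    print(p.page, p.source, repr(p.text))
```

A real reconciler would likely also weigh the table and list structure that OCR recovers, which this character-count heuristic ignores.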

The server exposes 37 MCP tools that cover the full lifecycle: 3 for ingestion, 14 for CRUD operations across clinical entities, 18 for querying, and 2 for updates. The database has 18 tables covering patients, encounters, facilities, doctors, and the full range of clinical data types.

Why MCP

The Model Context Protocol is what makes this architecture practical. It gives the AI agent typed tools to interact with the database, turning it from a text generator into a data operator. The agent does not get a raw SQL connection. It gets purpose-built tools like create_lab_result and get_lab_history, each with a defined schema. The server validates inputs; the agent reasons about content.
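As an illustration of "the server validates inputs", here is a sketch of how the arguments to a tool like create_lab_result might be checked before anything is stored. It is Python rather than Swift, it does not use an MCP SDK, and the field names and error messages are assumptions. Only the tool name comes from HealthCheck.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class LabResult:
    test_name: str
    value: float
    unit: str
    measured_on: date

REQUIRED = {"test_name", "value", "unit", "measured_on"}  # assumed field set

def validate_lab_result(args):
    """Check raw tool arguments against a fixed schema; raise ValueError on bad input."""
    missing = REQUIRED - args.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for field in ("test_name", "unit"):
        if not isinstance(args[field], str) or not args[field].strip():
            raise ValueError(f"{field} must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(args["value"], bool) or not isinstance(args["value"], (int, float)):
        raise ValueError("value must be a number")
    try:
        measured_on = date.fromisoformat(args["measured_on"])
    except (TypeError, ValueError):
        raise ValueError("measured_on must be an ISO 8601 date")
    return LabResult(args["test_name"].strip(), float(args["value"]),
                     args["unit"].strip(), measured_on)

ok = validate_lab_result(
    {"test_name": "MPV", "value": 10.1, "unit": "fL", "measured_on": "2022-06-18"}
)
print(ok)

try:
    validate_lab_result(
        {"test_name": "MPV", "value": "high", "unit": "fL", "measured_on": "2022-06-18"}
    )
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

Note that the check is purely structural. It rejects a value that is not a number, but it has no opinion on whether 10.1 fL is a plausible MPV, which is the kind of judgment the agent keeps.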

This separation is important. The MCP server never makes clinical decisions. It does not guess what a lab value means or whether a medication interaction exists. It stores and retrieves structured data. The intelligence stays in the agent.

Privacy by Architecture

All data stays local in a SQLite database on your machine. No embeddings are sent to external services, there is no vector database in the cloud, and no documents are uploaded to a separate processing service. The agent does read each document's extracted text once during ingestion, but at query time it only sees the structured results it explicitly requests through MCP tools. When you ask "what was my last hemoglobin result?", the agent calls get_lab_history and receives just the relevant records, not your entire medical history.

The Trade-offs

This approach has real costs. The AI must read and structure each document upfront, which takes time and API calls. Context stuffing and RAG are faster to set up. The extraction is not perfect, so the system tracks confidence scores and supports a review workflow. The relational schema must be designed ahead of time, which means new entity types require migrations. RAG handles arbitrary content more flexibly.
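A review workflow built on confidence scores can be quite small. The sketch below is in Python with sqlite3, and the confidence and reviewed columns, the 0.8 threshold, and the function names are all assumptions, since HealthCheck's actual review mechanism is not described in this post. The sample rows are made up.

```python
import sqlite3

REVIEW_THRESHOLD = 0.8  # assumed cut-off, not a value taken from HealthCheck

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lab_results ("
    " id INTEGER PRIMARY KEY,"
    " test_name TEXT NOT NULL,"
    " value REAL NOT NULL,"
    " confidence REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1),"
    " reviewed INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO lab_results (test_name, value, confidence) VALUES (?, ?, ?)",
    [("Hemoglobin", 13.2, 0.97), ("MPV", 10.1, 0.55), ("Ferritin", 41.0, 0.72)],
)

def review_queue(conn, threshold=REVIEW_THRESHOLD):
    """Unreviewed records below the threshold, least confident first."""
    return conn.execute(
        "SELECT id, test_name, value, confidence FROM lab_results "
        "WHERE reviewed = 0 AND confidence < ? ORDER BY confidence",
        (threshold,),
    ).fetchall()

def mark_reviewed(conn, record_id, corrected_value=None):
    """Confirm a record, optionally overwriting the extracted value first."""
    if corrected_value is not None:
        conn.execute(
            "UPDATE lab_results SET value = ? WHERE id = ?",
            (corrected_value, record_id),
        )
    conn.execute("UPDATE lab_results SET reviewed = 1 WHERE id = ?", (record_id,))

before = review_queue(conn)
mark_reviewed(conn, before[0][0], corrected_value=10.7)  # a human fixes the MPV value
after = review_queue(conn)
print(before)
print(after)
```

Keeping the score on the record itself means low-confidence extractions stay queryable but can be flagged or filtered until someone has checked them against the source PDF.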

Structured digestion works best for inherently structured data: medical records, financial documents, lab reports. Free-form text like essays or emails would not benefit much from this pattern.

The Comparison

Here is how the three approaches compare across key dimensions. Each line lists context stuffing first, then RAG, then structured digestion:

Data format - Raw text in prompt vs. text chunks with embeddings vs. typed relational records
Query precision - Depends on context vs. approximate similarity vs. exact database queries
Cross-document reasoning - Only if docs fit in context vs. poor (chunks are isolated) vs. native relational joins
Persistence - None vs. embeddings persist vs. full structured persistence
Token efficiency - Send everything every time vs. send top-k chunks vs. send only requested fields
Privacy - Full text sent with every request vs. text sent to an embedding service vs. all data stays local
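The "native relational joins" entry is worth one concrete example: a question that spans a prescription and several lab reports. This is a Python and sqlite3 sketch with a made-up two-table schema and made-up sample rows. The table names, columns, and the labs_during_medication helper are hypothetical and not taken from HealthCheck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medications (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    started_on TEXT NOT NULL,
    stopped_on TEXT,              -- NULL means still active
    source_doc TEXT NOT NULL
);
CREATE TABLE lab_results (
    id INTEGER PRIMARY KEY,
    test_name TEXT NOT NULL,
    value REAL NOT NULL,
    unit TEXT NOT NULL,
    measured_on TEXT NOT NULL,    -- ISO 8601 dates compare correctly as text
    source_doc TEXT NOT NULL
);
""")
conn.execute(
    "INSERT INTO medications (name, started_on, stopped_on, source_doc) "
    "VALUES (?, ?, ?, ?)",
    ("Ferrous sulfate", "2022-07-01", None, "prescription_2022.pdf"),
)
conn.executemany(
    "INSERT INTO lab_results (test_name, value, unit, measured_on, source_doc) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Hemoglobin", 12.8, "g/dL", "2021-02-10", "report_2021.pdf"),
        ("Hemoglobin", 13.2, "g/dL", "2022-06-18", "report_2022a.pdf"),
        ("Hemoglobin", 13.6, "g/dL", "2022-10-05", "report_2022b.pdf"),
        ("Hemoglobin", 13.9, "g/dL", "2023-03-02", "report_2023.pdf"),
    ],
)

def labs_during_medication(conn, medication, test_name):
    """Lab values measured while a medication was active, with both source documents."""
    return conn.execute(
        "SELECT l.measured_on, l.value, l.unit, l.source_doc, m.source_doc "
        "FROM lab_results AS l "
        "JOIN medications AS m "
        "  ON l.measured_on >= m.started_on "
        " AND (m.stopped_on IS NULL OR l.measured_on <= m.stopped_on) "
        "WHERE m.name = ? AND l.test_name = ? "
        "ORDER BY l.measured_on",
        (medication, test_name),
    ).fetchall()

rows = labs_during_medication(conn, "Ferrous sulfate", "Hemoglobin")
for row in rows:
    print(row)
```

Each returned row ties a value from one PDF to a prescription from another, which is the kind of link that isolated chunks in a vector store do not carry.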

Tech Stack

Swift 6 with strict concurrency
MCP via the official Swift SDK (stdio transport)
GRDB.swift / SQLite with Codable records and built-in migrations
PDFKit + Apple Vision for dual text extraction
67 tests across 13 test files using Swift Testing

What I Learned

Building this reinforced something I keep running into: the right data model matters more than the retrieval algorithm. RAG is powerful for unstructured, open-ended content. But when your documents contain inherently structured data, fighting to reconstruct that structure from embedding similarity feels backwards. Sometimes the better approach is to structure it once and query it properly.

MCP turned out to be the right abstraction for this pattern. It creates a clean boundary between the dumb server and the intelligent agent, and it generalizes well. The same pattern could apply to financial records, legal documents, compliance paperwork, or any domain where structured data is trapped in unstructured documents.

This is a proof of concept, not a product. It works end-to-end with real multilingual medical PDFs, but it needs more testing and refinement. The source code is available on GitHub.
