<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Daniel Yarmoluk</title>
    <description>The latest articles on Forem by Daniel Yarmoluk (@daniel_yarmoluk_79a9d0364).</description>
    <link>https://forem.com/daniel_yarmoluk_79a9d0364</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3834735%2F90e3ec95-9989-450b-9224-c2fbd95a921f.png</url>
      <title>Forem: Daniel Yarmoluk</title>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/daniel_yarmoluk_79a9d0364"/>
    <language>en</language>
    <item>
      <title>I mapped LangChain Core as a knowledge graph — here's what the structure reveals</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 01 May 2026 20:17:10 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/i-mapped-langchain-core-as-a-knowledge-graph-heres-what-the-structure-reveals-16a9</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/i-mapped-langchain-core-as-a-knowledge-graph-heres-what-the-structure-reveals-16a9</guid>
      <description>&lt;p&gt;I mapped LangChain Core as a knowledge graph. 180 modules, 650 dependency edges. Here's what the structure reveals that the docs never tell you.&lt;/p&gt;

&lt;p&gt;Finding 1: The messages module has a 70% blast radius.&lt;/p&gt;

&lt;p&gt;Change it and 126 of 180 modules break — directly or transitively. Every callback, every agent, every retriever, every embedding module traces a dependency path back to messages. It is the load-bearing wall of the entire framework. Nothing in the documentation flags this.&lt;/p&gt;

&lt;p&gt;Finding 2: runnables.base requires 147 other modules to fully function.&lt;/p&gt;

&lt;p&gt;That is 82% of the codebase as a prerequisite chain. Before an agent touches runnables.base, it needs ground-truth awareness of almost everything else. Without that map, it is guessing.&lt;/p&gt;

&lt;p&gt;Finding 3: Exactly 7 modules are completely safe to modify without any downstream risk.&lt;/p&gt;

&lt;p&gt;cross_encoders, structured_query, sys_info, version, utils.html, utils.image, utils.mustache. Seven. Out of 180.&lt;/p&gt;

&lt;p&gt;Why this matters for agents:&lt;/p&gt;

&lt;p&gt;A coding agent dispatched to modify LangChain without this map will grep for context, retrieve similar-looking docs, and make a confident, structurally wrong change. The blast radius is invisible to similarity search. It is only visible to graph traversal.&lt;/p&gt;

&lt;p&gt;This is the difference between retrieval and spatial intelligence. RAG finds text that looks relevant. A knowledge graph tells you what actually breaks.&lt;/p&gt;

&lt;p&gt;The dataset is live. The same query interface that works on GLP-1 pharmacology and ICD-10 classification works on a codebase. The domain doesn't matter. The structure does.&lt;/p&gt;

&lt;p&gt;LangChain Core CKG (180 modules, 650 edges): &lt;a href="https://huggingface.co/datasets/danyarm/ckg-benchmark" rel="noopener noreferrer"&gt;https://huggingface.co/datasets/danyarm/ckg-benchmark&lt;/a&gt;&lt;br&gt;
MCP server — query it directly: &lt;a href="https://github.com/Yarmoluk/ckg-mcp" rel="noopener noreferrer"&gt;https://github.com/Yarmoluk/ckg-mcp&lt;/a&gt;&lt;br&gt;
Full benchmark (RAG vs CKG across 54 domains): &lt;a href="https://graphifymd.com/paper.html" rel="noopener noreferrer"&gt;https://graphifymd.com/paper.html&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>knowledgegraph</category>
      <category>agents</category>
    </item>
    <item>
      <title>Beyond RAG: Why I replaced similarity search with graph traversal for AI agent context</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 01 May 2026 19:57:56 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/beyond-rag-why-i-replaced-similarity-search-with-graph-traversal-for-ai-agent-context-2p7b</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/beyond-rag-why-i-replaced-similarity-search-with-graph-traversal-for-ai-agent-context-2p7b</guid>
      <description>&lt;h2&gt;
  
  
  The problem RAG doesn't solve
&lt;/h2&gt;

&lt;p&gt;RAG is good for question answering. It's bad for tasks that require knowing dependencies before you act.&lt;/p&gt;

&lt;p&gt;When an AI coding agent asks "what breaks if I change RunnableSequence?" — RAG retrieves text chunks that mention RunnableSequence. Approximate. Probabilistic. It might miss the 23 modules that directly import it.&lt;/p&gt;

&lt;p&gt;Same problem in life sciences. A PM asks an agent to draft a payer brief for GLP-1 coverage. The agent doesn't know that Insulin Resistance flows upstream to Metabolic Syndrome — which gates Prior Authorization — which determines formulary position. It retrieves similar text and approximates.&lt;/p&gt;

&lt;p&gt;The wrong answer in both cases sounds right. That's the risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;ckg-mcp&lt;/code&gt; — an open-source Model Context Protocol server that delivers compact knowledge graphs (CKGs) to any MCP-compatible agent orchestrator as pre-action structural context.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ckg-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCP config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"ckg"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ckg-mcp"&lt;/span&gt;&lt;span class="p"&gt;}}}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The orchestrator calls tools before dispatching any worker agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;trace_upstream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RunnableSequence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# exact blast radius
&lt;/span&gt;&lt;span class="nf"&gt;trace_downstream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;BaseRunnable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# everything that depends on this
&lt;/span&gt;&lt;span class="nf"&gt;find_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Insulin Resistance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Prior Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# causal chain
&lt;/span&gt;&lt;span class="nf"&gt;get_domain_summary&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;                  &lt;span class="c1"&gt;# full domain stats
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Two demos
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Codebase:&lt;/strong&gt; Mapped LangChain Core — 180 modules, 650 dependency edges. Before a coding agent edits any module, it calls &lt;code&gt;trace_upstream(module)&lt;/code&gt;. Gets back the exact dependency subgraph. Then &lt;code&gt;trace_downstream(module)&lt;/code&gt; for blast radius. Every hop is a real edge. No guessing.&lt;/p&gt;
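&lt;p&gt;Under the hood, a &lt;code&gt;trace_downstream&lt;/code&gt;-style blast-radius query is just a reverse transitive closure over the dependency edges. A minimal sketch, using an illustrative toy edge list rather than the real 650-edge LangChain Core graph:&lt;/p&gt;

```python
from collections import defaultdict, deque

# Toy edge list: (module, module_it_imports). Names are illustrative,
# not the actual LangChain Core dependency graph.
EDGES = [
    ("runnables.base", "messages"),
    ("callbacks", "messages"),
    ("agents", "runnables.base"),
    ("retrievers", "callbacks"),
]

def blast_radius(target, edges):
    """Reverse transitive closure: every module that depends on
    `target`, directly or through any chain of imports."""
    dependents = defaultdict(set)
    for src, dst in edges:
        dependents[dst].add(src)  # dst is imported by src
    seen, queue = set(), deque([target])
    while queue:
        for dep in dependents[queue.popleft()]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen

print(sorted(blast_radius("messages", EDGES)))
# ['agents', 'callbacks', 'retrievers', 'runnables.base']
```

&lt;p&gt;On this toy graph, every other module lands in the blast radius of &lt;code&gt;messages&lt;/code&gt; even though only two import it directly. Similarity search over docs never surfaces that set; edge traversal does.&lt;/p&gt;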

&lt;p&gt;&lt;strong&gt;Clinical:&lt;/strong&gt; The GLP-1 Clinical Pathway is a 146-node graph with 200+ typed dependency edges: mechanism, market, regulatory, clinical. An agent drafting a payer brief queries the graph first and gets the causal chain from mechanism of action through prior authorization to formulary position. Then it writes.&lt;/p&gt;

&lt;p&gt;Both use the same MCP interface. Swap the CSV file, everything else stays the same.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why it's more efficient
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;BERT F1&lt;/th&gt;
&lt;th&gt;Cost/Correct Answer&lt;/th&gt;
&lt;th&gt;Tokens/Query&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CKG&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.857&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.000506&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;274&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG&lt;/td&gt;
&lt;td&gt;0.817&lt;/td&gt;
&lt;td&gt;$0.013046&lt;/td&gt;
&lt;td&gt;~3,100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GraphRAG&lt;/td&gt;
&lt;td&gt;0.825&lt;/td&gt;
&lt;td&gt;$0.020098&lt;/td&gt;
&lt;td&gt;~10,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;~11x more token-efficient than RAG (274 vs ~3,100 tokens per query). 40x cheaper per correct answer than Microsoft GraphRAG. Higher BERT F1 than both.&lt;/p&gt;

&lt;p&gt;The token difference is structural: CKG retrieves exactly what was asked for. RAG passes large text chunks that need synthesis. The model cost difference follows directly.&lt;/p&gt;

&lt;p&gt;Tested across 8,121 queries, 47 domains, BERTScore (roberta-large).&lt;/p&gt;

&lt;h2&gt;
  
  
  Domain-agnostic
&lt;/h2&gt;

&lt;p&gt;The same MCP tools work for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Software codebases&lt;/strong&gt; — blast radius before any edit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clinical pathways&lt;/strong&gt; — causal chain before any payer document&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory frameworks&lt;/strong&gt; — dependency chain before any compliance draft&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Financial instruments&lt;/strong&gt; — prerequisite structure before any analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Educational curricula&lt;/strong&gt; — learning graph before any content generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;52 domains live. Switch by swapping the CSV file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Live demo: &lt;a href="https://huggingface.co/spaces/danyarm/ckg-demo" rel="noopener noreferrer"&gt;huggingface.co/spaces/danyarm/ckg-demo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/Yarmoluk/ckg-mcp" rel="noopener noreferrer"&gt;github.com/Yarmoluk/ckg-mcp&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Benchmark: &lt;a href="https://github.com/Yarmoluk/ckg-benchmark" rel="noopener noreferrer"&gt;github.com/Yarmoluk/ckg-benchmark&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Site (mapped by persona): &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;graphifymd.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Questions: &lt;a href="mailto:graphifymd@protonmail.com"&gt;graphifymd@protonmail.com&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>knowledgegraph</category>
      <category>agents</category>
    </item>
    <item>
      <title>I benchmarked RAG vs GraphRAG vs pre-structured knowledge graphs across 45 domains — here's what happened</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Mon, 27 Apr 2026 23:59:13 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/i-benchmarked-rag-vs-graphrag-vs-pre-structured-knowledge-graphs-across-45-domains-heres-what-51g5</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/i-benchmarked-rag-vs-graphrag-vs-pre-structured-knowledge-graphs-across-45-domains-heres-what-51g5</guid>
      <description>&lt;p&gt;Three retrieval architectures. Same LLM. Same 7,928 queries across 45 domains. Different structure going in.&lt;/p&gt;

&lt;p&gt;Here are the results:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;F1 Score&lt;/th&gt;
&lt;th&gt;Tokens/query&lt;/th&gt;
&lt;th&gt;Cost/query&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RAG (FAISS + Claude)&lt;/td&gt;
&lt;td&gt;0.123&lt;/td&gt;
&lt;td&gt;2,982&lt;/td&gt;
&lt;td&gt;~$0.009&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GraphRAG (Microsoft)&lt;/td&gt;
&lt;td&gt;0.120&lt;/td&gt;
&lt;td&gt;3,450&lt;/td&gt;
&lt;td&gt;~$0.013&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CKG (pre-structured DAG)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.471&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;269&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$0.001&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;CKG is 4x more accurate and uses 11x fewer tokens than RAG.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is a CKG?
&lt;/h2&gt;

&lt;p&gt;A Compact Knowledge Graph (CKG) pre-structures domain knowledge as a directed acyclic graph (DAG). Concepts are nodes. Dependencies are edges. A CSV file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csvs"&gt;&lt;code&gt;&lt;span class="k"&gt;ConceptID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;ConceptLabel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;Dependencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;TaxonomyID&lt;/span&gt;
&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;Calculus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="err"&gt;|&lt;/span&gt;&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;CORE&lt;/span&gt;
&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;Algebra&lt;/span&gt;&lt;span class="p"&gt;,,&lt;/span&gt;&lt;span class="k"&gt;FOUND&lt;/span&gt;
&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="k"&gt;Trigonometry&lt;/span&gt;&lt;span class="p"&gt;,,&lt;/span&gt;&lt;span class="k"&gt;FOUND&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When an agent asks "what do I need to know before Calculus?", CKG traverses edges. No embedding. No similarity search. No hallucination by construction.&lt;/p&gt;
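&lt;p&gt;To make that concrete, here is a minimal sketch of loading the CSV shape above and walking the &lt;code&gt;Dependencies&lt;/code&gt; column with plain Python. This is an illustrative loader, not the benchmark's actual code:&lt;/p&gt;

```python
import csv
import io

# The CSV shape from the post: pipe-separated dependency IDs.
CKG_CSV = """ConceptID,ConceptLabel,Dependencies,TaxonomyID
1,Calculus,2|3,CORE
2,Algebra,,FOUND
3,Trigonometry,,FOUND
"""

def load_ckg(text):
    rows = list(csv.DictReader(io.StringIO(text)))
    labels = {r["ConceptID"]: r["ConceptLabel"] for r in rows}
    deps = {r["ConceptID"]: [d for d in r["Dependencies"].split("|") if d]
            for r in rows}
    return labels, deps

def prerequisites(concept_id, deps, labels):
    """Walk the Dependencies edges transitively. No embeddings,
    no similarity search: just graph traversal."""
    out, stack = [], list(deps.get(concept_id, []))
    while stack:
        node = stack.pop()
        if labels[node] not in out:
            out.append(labels[node])
            stack.extend(deps.get(node, []))
    return out

labels, deps = load_ckg(CKG_CSV)
print(prerequisites("1", deps, labels))  # prerequisites of Calculus
```

&lt;p&gt;The answer to "what do I need before Calculus?" falls out of the edge list itself: Algebra and Trigonometry, nothing more, nothing invented.&lt;/p&gt;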

&lt;h2&gt;
  
  
  Why RAG fails on multi-hop queries
&lt;/h2&gt;

&lt;p&gt;RAG retrieves the most &lt;em&gt;similar&lt;/em&gt; text chunk to a query. For simple lookups, this works. For multi-hop questions — prerequisites, dependency chains, drug interactions, regulatory trees — it fragments the answer across chunks that contradict each other.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;F1 by hop depth:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Hop depth&lt;/th&gt;
&lt;th&gt;CKG&lt;/th&gt;
&lt;th&gt;RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;0.374&lt;/td&gt;
&lt;td&gt;0.312&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;0.512&lt;/td&gt;
&lt;td&gt;0.298&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;0.631&lt;/td&gt;
&lt;td&gt;0.241&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;0.714&lt;/td&gt;
&lt;td&gt;0.198&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;5&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.772&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.187&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;CKG improves continuously with depth. RAG degrades at every hop, from 0.312 at hop 1 to 0.187 at hop 5. The deeper the question, the larger the gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where CKG dominates by query type
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query type&lt;/th&gt;
&lt;th&gt;CKG&lt;/th&gt;
&lt;th&gt;RAG&lt;/th&gt;
&lt;th&gt;Advantage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Aggregate (T4)&lt;/td&gt;
&lt;td&gt;0.964&lt;/td&gt;
&lt;td&gt;0.286&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.4x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Path traversal (T3)&lt;/td&gt;
&lt;td&gt;0.660&lt;/td&gt;
&lt;td&gt;0.201&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.3x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependency (T2)&lt;/td&gt;
&lt;td&gt;0.634&lt;/td&gt;
&lt;td&gt;0.078&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8.1x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-concept (T5)&lt;/td&gt;
&lt;td&gt;0.323&lt;/td&gt;
&lt;td&gt;0.115&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.8x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Entity lookup (T1)&lt;/td&gt;
&lt;td&gt;0.207&lt;/td&gt;
&lt;td&gt;0.094&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2.2x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The biggest win (8.1x) is on dependency queries — the exact query type that matters in clinical, legal, financial, and regulatory domains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structure is the signal — not curation effort
&lt;/h2&gt;

&lt;p&gt;Track 2: I built a GLP-1/pharma domain from the ClinicalTrials.gov API in a single session. No expert curation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;F1 = 0.530&lt;/strong&gt; — higher than the 45-domain average.&lt;/p&gt;

&lt;p&gt;If a domain has knowable dependencies, it can be CKG-ified. The structure drives accuracy, not the effort.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;MCP server&lt;/strong&gt; — works in Claude Code and any MCP-compatible agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;ckg-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your agent gets 4 tools: &lt;code&gt;list_domains&lt;/code&gt;, &lt;code&gt;query_ckg&lt;/code&gt;, &lt;code&gt;get_prerequisites&lt;/code&gt;, &lt;code&gt;search_concepts&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live demo:&lt;/strong&gt; &lt;a href="https://huggingface.co/spaces/danyarm/ckg-demo" rel="noopener noreferrer"&gt;https://huggingface.co/spaces/danyarm/ckg-demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full dataset&lt;/strong&gt; (45 domain CSVs + 7,928 query JSONL + results):&lt;br&gt;
&lt;a href="https://huggingface.co/datasets/danyarm/ckg-benchmark" rel="noopener noreferrer"&gt;https://huggingface.co/datasets/danyarm/ckg-benchmark&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paper + benchmark code:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/Yarmoluk/ckg-benchmark" rel="noopener noreferrer"&gt;https://github.com/Yarmoluk/ckg-benchmark&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One-page summary:&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://github.com/Yarmoluk/ckg-benchmark/blob/main/SUMMARY.md" rel="noopener noreferrer"&gt;https://github.com/Yarmoluk/ckg-benchmark/blob/main/SUMMARY.md&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom domains
&lt;/h2&gt;

&lt;p&gt;The benchmark covers 45 general domains. For clinical, legal, financial, or regulatory domains where dependency structure is critical: &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;graphifymd.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;All code MIT licensed. Data CC BY 4.0. Questions welcome in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>.md is the universal AI interface</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 10:24:28 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/md-is-the-universal-ai-interface-1mo7</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/md-is-the-universal-ai-interface-1mo7</guid>
      <description>&lt;p&gt;I built an interactive healthcare knowledge graph — conditions, medications, drug interactions, diagnostics, billing codes, care pathways — and structured it as a compressed markdown file that any AI model can reason over.&lt;/p&gt;

&lt;p&gt;Not a summary. Not a document. &lt;strong&gt;A traversable knowledge graph in .md format.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;~3,000 tokens instead of ~500,000. Same reasoning quality. 170x more efficient.&lt;/p&gt;

&lt;p&gt;Here's the live interactive demo: &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" alt="Healthcare Knowledge Graph — interactive D3 visualization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;85% of enterprise AI pilots fail to scale. Not because the models are bad. Because the &lt;strong&gt;context&lt;/strong&gt; is.&lt;/p&gt;

&lt;p&gt;An LLM can't reason about drug interactions if it doesn't know that metformin relates to renal function relates to GFR thresholds relates to dosing adjustments. That's not a retrieval problem. That's a relationship problem.&lt;/p&gt;

&lt;p&gt;RAG retrieves text chunks. Knowledge graphs traverse relationships. The difference is the difference between searching a library index and having a librarian who knows which books reference each other — and why.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw clinical data (~2MB)
    ↓
Knowledge graph extraction (200 entities, 500+ relationships)
    ↓
Graph distillation (typed relationships + traversal rules)
    ↓
Compressed .md (~12KB, ~3,000 tokens)
    ↓
Deploy anywhere
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What the .md looks like
&lt;/h2&gt;

&lt;p&gt;Here's a fragment of the cardiology domain graph compressed to markdown:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Entities&lt;/span&gt;

&lt;span class="gu"&gt;### Conditions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Atrial Fibrillation | ICD: I48 | prevalence: 2.7M US
&lt;span class="p"&gt;-&lt;/span&gt; Heart Failure | ICD: I50 | prevalence: 6.2M US
&lt;span class="p"&gt;  -&lt;/span&gt; subtypes: HFrEF (EF≤40%), HFpEF (EF≥50%)

&lt;span class="gu"&gt;### Medications&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Apixaban | class: DOAC | no INR monitoring
&lt;span class="p"&gt;-&lt;/span&gt; Warfarin | class: anticoagulant | INR target: 2-3
&lt;span class="p"&gt;-&lt;/span&gt; Amiodarone | class: antiarrhythmic | ⚠️ toxicity

&lt;span class="gu"&gt;## Relationships&lt;/span&gt;

AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
HFrEF → TREATED_BY → Metoprolol (mortality ↓35%)
Warfarin → INTERACTS_WITH → Amiodarone ⚠️
  ↳ RULE: ↑INR 50-70%. Reduce warfarin dose 30-50%.
Apixaban → REQUIRES → CrCl assessment
  ↳ RULE: Reduce dose if CrCl 15-29, avoid if &amp;lt;15

&lt;span class="gu"&gt;## Traversal Examples&lt;/span&gt;

Q: Patient with AFib + CKD Stage 4. Anticoagulation?
AFib → TREATED_BY → Apixaban
Apixaban → REQUIRES → CrCl
CKD Stage 4 → CrCl 15-29 → DOSE_ADJUST Apixaban
→ Answer: Apixaban 2.5mg BID (reduced dose)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model doesn't guess. It follows the chain. Multi-hop reasoning with an audit trail.&lt;/p&gt;
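&lt;p&gt;That chain-following is mechanically simple. A toy sketch of the audit trail, using a hypothetical single-path subset of the edges in the fragment above:&lt;/p&gt;

```python
# Typed edges mirroring the .md fragment above (illustrative subset,
# reduced to one outgoing edge per node for clarity).
GRAPH = {
    "AFib": [("TREATED_BY", "Apixaban")],
    "Apixaban": [("REQUIRES", "CrCl assessment")],
    "CrCl assessment": [("DOSE_ADJUST_IF", "CrCl 15-29")],
}

def trace(start):
    """Follow typed edges from `start`, recording every hop:
    the audit trail the post describes."""
    trail, node = [], start
    while node in GRAPH:
        relation, nxt = GRAPH[node][0]  # single-path toy graph
        trail.append(f"{node} -{relation}-> {nxt}")
        node = nxt
    return trail

for hop in trace("AFib"):
    print(hop)
```

&lt;p&gt;Each printed hop is a real edge, so the final recommendation can be checked link by link instead of taken on faith.&lt;/p&gt;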

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Raw Data&lt;/th&gt;
&lt;th&gt;Knowledge Graph .md&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Size&lt;/td&gt;
&lt;td&gt;~2MB&lt;/td&gt;
&lt;td&gt;~12KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens&lt;/td&gt;
&lt;td&gt;~500,000&lt;/td&gt;
&lt;td&gt;~3,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Density&lt;/td&gt;
&lt;td&gt;1x&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;170x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compression&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;93%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CO₂ per query&lt;/td&gt;
&lt;td&gt;~0.34 kg&lt;/td&gt;
&lt;td&gt;~0.002 kg&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last line matters. Fewer tokens = less compute = lower energy. 99.4% carbon reduction per query. Structured intelligence is greener intelligence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" alt="Graphify.md — the intelligence is the product"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;March Madness knowledge graph — 68 teams, built live with graduate software engineers&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  It works everywhere
&lt;/h2&gt;

&lt;p&gt;The same .md file works across every AI environment without modification:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Projects&lt;/strong&gt; — upload as project knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; — CLAUDE.md project context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; — custom GPT instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor / Windsurf&lt;/strong&gt; — context file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codex CLI&lt;/strong&gt; — AGENTS.md&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Server&lt;/strong&gt; — serve as tool context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API&lt;/strong&gt; — system prompt injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt; — it's just text. Paste it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No vendor lock-in. No format conversion. No special tooling. Markdown is the universal interface.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why not just use RAG?
&lt;/h2&gt;

&lt;p&gt;RAG retrieves the top-k text chunks that match your query. It's single-hop — find the most similar text, return it.&lt;/p&gt;

&lt;p&gt;A knowledge graph traverses relationships. When you ask about a patient with AFib and kidney disease, the graph follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AFib → treatment options → Apixaban → renal requirements →
CrCl thresholds → CKD staging → dose adjustment rules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's six hops. RAG would need to independently retrieve and stitch together six separate chunks and hope the model connects them. The graph has already connected them.&lt;/p&gt;

&lt;p&gt;Microsoft's 2024 research showed knowledge graphs achieve an 83% win rate vs vector RAG. HopRAG (ACL 2025) showed 77% higher accuracy on multi-hop questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm building
&lt;/h2&gt;

&lt;p&gt;I run &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — we build domain knowledge graphs and compress them to portable .md for any industry. Healthcare is one vertical. We've also built graphs for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/march-madness.html" rel="noopener noreferrer"&gt;March Madness tournament&lt;/a&gt; — 68 teams, real-time scores, built live with grad students&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/linkedin-groups-kg.html" rel="noopener noreferrer"&gt;LinkedIn Groups ecosystem&lt;/a&gt; — 200+ groups, 15 verticals, relationship edges&lt;/li&gt;
&lt;li&gt;Defense, legal, construction, supply chain, GovTech, education — &lt;a href="https://graphifymd.com/verticals.html" rel="noopener noreferrer"&gt;12 verticals mapped&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The methodology works on any domain. If your data has entities and relationships — and all data does — it can be graphed, compressed, and deployed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The interactive demo is live. Hover over nodes to see relationship chains light up:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built entirely with Claude Code. The whole thing — knowledge graph extraction, D3 visualization, .md compression, the site — done solo, in days, not months.&lt;/p&gt;

&lt;p&gt;If you're working on a domain where AI keeps hallucinating or RAG keeps missing context, the problem might not be the model. It might be the structure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel Yarmoluk — &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — &lt;a href="https://calendly.com/daniel-yarmoluk/30min" rel="noopener noreferrer"&gt;Book a call&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>knowledgegraph</category>
      <category>claude</category>
      <category>graphrag</category>
    </item>
    <item>
      <title>Tired of being on the RAG? Try GraphRAG</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 09:17:34 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/tired-of-being-on-the-rag-try-graphrag-2dh8</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/tired-of-being-on-the-rag-try-graphrag-2dh8</guid>
      <description>&lt;p&gt;Are you tired of the aches and pains of Vector search guessing...There's gotta be a better way to endure the toil. &lt;/p&gt;

&lt;p&gt;Maybe just try a knowledge graph at the .md level to fit more through the context pipeline?  There's gotta be a better way...&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>Knowledge Graph Creation</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 08:20:22 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/knowledge-graph-creation-40c3</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/knowledge-graph-creation-40c3</guid>
      <description>&lt;p&gt;I am looking for co-creators, people that want to solve real problems.  If you have a hairy issue, we can knowledge graph and compress it.  If I'm wrong, block me, who cares right?  &lt;/p&gt;

&lt;p&gt;I need to feed my family.  I have real problems.  I'm trying to do real things with my knowledge graph to the .md file level.  If someone has a really complicated layered value chain or problem that you feel would benefit from a knowledge graph compressed to .md level in which you can layer on more complications, please reach out.  &lt;/p&gt;

</description>
      <category>ai</category>
      <category>graphdatabase</category>
    </item>
    <item>
      <title>Looking for audio/video geeks -- knowledge context architect seeking creativity for applications</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 08:16:29 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/looking-for-audiovideo-geeks-knowledge-context-architect-seeking-creativity-for-applications-1gn</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/looking-for-audiovideo-geeks-knowledge-context-architect-seeking-creativity-for-applications-1gn</guid>
      <description>&lt;p&gt;My post is up there, so many rules, jeez...&lt;/p&gt;

&lt;p&gt;The anti-framework -- problem / solution -- cuts through the noise.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>First Post Ever! -- I compressed 2MB of healthcare data into 12KB of markdown - here's the knowledge graph</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 08:14:09 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/first-post-ever-i-compressed-2mb-of-healthcare-data-into-12kb-of-markdown-heres-the-52hi</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/first-post-ever-i-compressed-2mb-of-healthcare-data-into-12kb-of-markdown-heres-the-52hi</guid>
      <description>&lt;p&gt;I built an interactive healthcare knowledge graph — conditions, medications, drug interactions, diagnostics, billing codes, care pathways — and structured it as a compressed markdown file that any AI model can reason over.&lt;/p&gt;

&lt;p&gt;Not a summary. Not a document. &lt;strong&gt;A traversable knowledge graph in .md format.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;~3,000 tokens instead of ~500,000. Same reasoning quality. 170x more efficient.&lt;/p&gt;

&lt;p&gt;Here's the live interactive demo: &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" alt="Healthcare Knowledge Graph — interactive D3 visualization"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;85% of enterprise AI pilots fail to scale. Not because the models are bad. Because the &lt;strong&gt;context&lt;/strong&gt; is.&lt;/p&gt;

&lt;p&gt;An LLM can't reason about drug interactions if it doesn't know that metformin relates to renal function relates to GFR thresholds relates to dosing adjustments. That's not a retrieval problem. That's a relationship problem.&lt;/p&gt;

&lt;p&gt;RAG retrieves text chunks. Knowledge graphs traverse relationships. It's the difference between searching a library index and having a librarian who knows which books reference each other — and why.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw clinical data (~2MB)
    ↓
Knowledge graph extraction (200 entities, 500+ relationships)
    ↓
Graph distillation (typed relationships + traversal rules)
    ↓
Compressed .md (~12KB, ~3,000 tokens)
    ↓
Deploy anywhere
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What the .md looks like
&lt;/h2&gt;

&lt;p&gt;Here's a fragment of the cardiology domain graph compressed to markdown:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Entities&lt;/span&gt;

&lt;span class="gu"&gt;### Conditions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Atrial Fibrillation | ICD: I48 | prevalence: 2.7M US
&lt;span class="p"&gt;-&lt;/span&gt; Heart Failure | ICD: I50 | prevalence: 6.2M US
&lt;span class="p"&gt;  -&lt;/span&gt; subtypes: HFrEF (EF≤40%), HFpEF (EF≥50%)

&lt;span class="gu"&gt;### Medications&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Apixaban | class: DOAC | no INR monitoring
&lt;span class="p"&gt;-&lt;/span&gt; Warfarin | class: anticoagulant | INR target: 2-3
&lt;span class="p"&gt;-&lt;/span&gt; Amiodarone | class: antiarrhythmic | ⚠️ toxicity

&lt;span class="gu"&gt;## Relationships&lt;/span&gt;

AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
HFrEF → TREATED_BY → Metoprolol (mortality ↓35%)
Warfarin → INTERACTS_WITH → Amiodarone ⚠️
  ↳ RULE: ↑INR 50-70%. Reduce warfarin dose 30-50%.
Apixaban → REQUIRES → CrCl assessment
  ↳ RULE: Reduce dose if CrCl 15-29, avoid if &amp;lt;15

&lt;span class="gu"&gt;## Traversal Examples&lt;/span&gt;

Q: Patient with AFib + CKD Stage 4. Anticoagulation?
AFib → TREATED_BY → Apixaban
Apixaban → REQUIRES → CrCl
CKD Stage 4 → CrCl 15-29 → DOSE_ADJUST Apixaban
→ Answer: Apixaban 2.5mg BID (reduced dose)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model doesn't guess. It follows the chain. Multi-hop reasoning with an audit trail.&lt;/p&gt;
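&lt;p&gt;To make "follows the chain" concrete, here's a minimal Python sketch (my own illustration, not Graphify.md's actual tooling) that parses the arrow-notation edges from a fragment like the one above and enumerates every relationship chain reachable from an entity:&lt;/p&gt;

```python
# Minimal sketch: parse "Subject → RELATION → Object" lines from a
# compressed knowledge-graph .md and enumerate multi-hop chains.
# Illustrative only; entity names come from the fragment above.
import re
from collections import defaultdict

KG_MD = """
AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
Warfarin → INTERACTS_WITH → Amiodarone
Apixaban → REQUIRES → CrCl assessment
"""

def parse_edges(md_text):
    """Extract (relation, object) edges keyed by subject."""
    graph = defaultdict(list)
    for line in md_text.splitlines():
        parts = [p.strip() for p in line.split("→")]
        if len(parts) == 3:
            subj, rel, obj = parts
            # drop trailing annotations like "(first-line DOAC)"
            obj = re.sub(r"\s*\(.*\)$", "", obj)
            graph[subj].append((rel, obj))
    return graph

def chains_from(graph, start, max_hops=3):
    """Depth-first enumeration of relationship chains from `start`."""
    found = []
    def walk(node, path, hops):
        for rel, obj in graph.get(node, []):
            chain = path + [rel, obj]
            found.append(chain)
            if hops + 1 < max_hops:
                walk(obj, chain, hops + 1)
    walk(start, [start], 0)
    return found

for chain in chains_from(parse_edges(KG_MD), "AFib"):
    print(" → ".join(chain))
```

&lt;p&gt;The entity and relation names are copied from the fragment; the ↳ RULE annotations would need richer parsing in a real pipeline.&lt;/p&gt;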

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Raw Data&lt;/th&gt;
&lt;th&gt;Knowledge Graph .md&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Size&lt;/td&gt;
&lt;td&gt;~2MB&lt;/td&gt;
&lt;td&gt;~12KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens&lt;/td&gt;
&lt;td&gt;~500,000&lt;/td&gt;
&lt;td&gt;~3,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Density&lt;/td&gt;
&lt;td&gt;1x&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;170x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compression&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~99.4%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CO₂ per query&lt;/td&gt;
&lt;td&gt;~0.34 kg&lt;/td&gt;
&lt;td&gt;~0.002 kg&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last line matters. Fewer tokens = less compute = lower energy. 99.4% carbon reduction per query. Structured intelligence is greener intelligence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" alt="Graphify.md — the intelligence is the product"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;March Madness knowledge graph — 68 teams, built live with graduate software engineers&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  It works everywhere
&lt;/h2&gt;

&lt;p&gt;The same .md file works across every AI environment without modification:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Projects&lt;/strong&gt; — upload as project knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; — CLAUDE.md project context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; — custom GPT instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor / Windsurf&lt;/strong&gt; — context file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codex CLI&lt;/strong&gt; — AGENTS.md&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Server&lt;/strong&gt; — serve as tool context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API&lt;/strong&gt; — system prompt injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt; — it's just text. Paste it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No vendor lock-in. No format conversion. No special tooling. Markdown is the universal interface.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why not just use RAG?
&lt;/h2&gt;

&lt;p&gt;RAG retrieves the top-k text chunks that match your query. It's single-hop — find the most similar text, return it.&lt;/p&gt;

&lt;p&gt;A knowledge graph traverses relationships. When you ask about a patient with AFib and kidney disease, the graph follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AFib → treatment options → Apixaban → renal requirements →
CrCl thresholds → CKD staging → dose adjustment rules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's six hops. RAG would need to independently retrieve six separate chunks, stitch them together, and hope the model connects them. The graph has already connected them.&lt;/p&gt;
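&lt;p&gt;A rough sketch of the difference: once the edges are pre-connected, a single breadth-first search recovers the whole chain in one pass. (My own illustration; the hypothetical edge list below just mirrors the chain above, it is not the real graph.)&lt;/p&gt;

```python
# Illustrative BFS over a pre-connected edge list (hypothetical data
# mirroring the chain above): find the path from a condition to its
# dose-adjustment rule in one traversal instead of stitching retrievals.
from collections import deque

EDGES = {
    "AFib": ["treatment options"],
    "treatment options": ["Apixaban"],
    "Apixaban": ["renal requirements"],
    "renal requirements": ["CrCl thresholds"],
    "CrCl thresholds": ["CKD staging"],
    "CKD staging": ["dose adjustment rules"],
}

def shortest_path(edges, start, goal):
    """Breadth-first search; returns the node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_path(EDGES, "AFib", "dose adjustment rules")
print(" → ".join(path))           # the full chain
print(f"{len(path) - 1} hops")    # number of edges traversed
```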

&lt;p&gt;Microsoft's 2024 research showed knowledge graphs achieve an 83% win rate vs vector RAG. HopRAG (ACL 2025) showed 77% higher accuracy on multi-hop questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm building
&lt;/h2&gt;

&lt;p&gt;I run &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — we build domain knowledge graphs and compress them to portable .md for any industry. Healthcare is one vertical. We've also built graphs for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/march-madness.html" rel="noopener noreferrer"&gt;March Madness tournament&lt;/a&gt; — 68 teams, real-time scores, built live with grad students&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/linkedin-groups-kg.html" rel="noopener noreferrer"&gt;LinkedIn Groups ecosystem&lt;/a&gt; — 200+ groups, 15 verticals, relationship edges&lt;/li&gt;
&lt;li&gt;Defense, legal, construction, supply chain, GovTech, education — &lt;a href="https://graphifymd.com/verticals.html" rel="noopener noreferrer"&gt;12 verticals mapped&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The methodology works on any domain. If your data has entities and relationships — and all data does — it can be graphed, compressed, and deployed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The interactive demo is live. Hover over nodes to see relationship chains light up:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built entirely with Claude Code. The whole thing — knowledge graph extraction, D3 visualization, .md compression, the site — built solo, in days, not months.&lt;/p&gt;

&lt;p&gt;If you're working on a domain where AI keeps hallucinating or RAG keeps missing context, the problem might not be the model. It might be the structure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel Yarmoluk — &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — &lt;a href="https://calendly.com/daniel-yarmoluk/30min" rel="noopener noreferrer"&gt;Book a call&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>healthtech</category>
      <category>knowledgegraph</category>
      <category>claude</category>
    </item>
    <item>
      <title>I compressed 2MB of healthcare data into 12KB of markdown — here's the knowledge graph -- first time to Dev.to in my life...</title>
      <dc:creator>Daniel Yarmoluk</dc:creator>
      <pubDate>Fri, 20 Mar 2026 06:40:25 +0000</pubDate>
      <link>https://forem.com/daniel_yarmoluk_79a9d0364/i-compressed-2mb-of-healthcare-data-into-12kb-of-markdown-heres-the-knowledge-graph-first-3hcm</link>
      <guid>https://forem.com/daniel_yarmoluk_79a9d0364/i-compressed-2mb-of-healthcare-data-into-12kb-of-markdown-heres-the-knowledge-graph-first-3hcm</guid>
      <description>&lt;p&gt;I built an interactive healthcare knowledge graph — conditions, medications, drug interactions, diagnostics, billing codes, care pathways — and structured it as a compressed markdown file that any AI model can reason over.&lt;/p&gt;

&lt;p&gt;Not a summary. Not a document. &lt;strong&gt;A traversable knowledge graph in .md format.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;~3,000 tokens instead of ~500,000. Same reasoning quality. 170x more efficient.&lt;/p&gt;

&lt;p&gt;Here's the live interactive demo: &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdf2o0n068i89dh4w1q9b.png" alt="Healthcare Knowledge Graph — interactive D3 visualization" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;85% of enterprise AI pilots fail to scale. Not because the models are bad. Because the &lt;strong&gt;context&lt;/strong&gt; is.&lt;/p&gt;

&lt;p&gt;An LLM can't reason about drug interactions if it doesn't know that metformin relates to renal function relates to GFR thresholds relates to dosing adjustments. That's not a retrieval problem. That's a relationship problem.&lt;/p&gt;

&lt;p&gt;RAG retrieves text chunks. Knowledge graphs traverse relationships. It's the difference between searching a library index and having a librarian who knows which books reference each other — and why.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw clinical data (~2MB)
    ↓
Knowledge graph extraction (200 entities, 500+ relationships)
    ↓
Graph distillation (typed relationships + traversal rules)
    ↓
Compressed .md (~12KB, ~3,000 tokens)
    ↓
Deploy anywhere
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What the .md looks like
&lt;/h2&gt;

&lt;p&gt;Here's a fragment of the cardiology domain graph compressed to markdown:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Entities&lt;/span&gt;

&lt;span class="gu"&gt;### Conditions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Atrial Fibrillation | ICD: I48 | prevalence: 2.7M US
&lt;span class="p"&gt;-&lt;/span&gt; Heart Failure | ICD: I50 | prevalence: 6.2M US
&lt;span class="p"&gt;  -&lt;/span&gt; subtypes: HFrEF (EF≤40%), HFpEF (EF≥50%)

&lt;span class="gu"&gt;### Medications&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Apixaban | class: DOAC | no INR monitoring
&lt;span class="p"&gt;-&lt;/span&gt; Warfarin | class: anticoagulant | INR target: 2-3
&lt;span class="p"&gt;-&lt;/span&gt; Amiodarone | class: antiarrhythmic | ⚠️ toxicity

&lt;span class="gu"&gt;## Relationships&lt;/span&gt;

AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
HFrEF → TREATED_BY → Metoprolol (mortality ↓35%)
Warfarin → INTERACTS_WITH → Amiodarone ⚠️
  ↳ RULE: ↑INR 50-70%. Reduce warfarin dose 30-50%.
Apixaban → REQUIRES → CrCl assessment
  ↳ RULE: Reduce dose if CrCl 15-29, avoid if &amp;lt;15

&lt;span class="gu"&gt;## Traversal Examples&lt;/span&gt;

Q: Patient with AFib + CKD Stage 4. Anticoagulation?
AFib → TREATED_BY → Apixaban
Apixaban → REQUIRES → CrCl
CKD Stage 4 → CrCl 15-29 → DOSE_ADJUST Apixaban
→ Answer: Apixaban 2.5mg BID (reduced dose)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model doesn't guess. It follows the chain. Multi-hop reasoning with an audit trail.&lt;/p&gt;
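&lt;p&gt;To make "follows the chain" concrete, here's a minimal Python sketch (my own illustration, not Graphify.md's actual tooling) that parses the arrow-notation edges from a fragment like the one above and enumerates every relationship chain reachable from an entity:&lt;/p&gt;

```python
# Minimal sketch: parse "Subject → RELATION → Object" lines from a
# compressed knowledge-graph .md and enumerate multi-hop chains.
# Illustrative only; entity names come from the fragment above.
import re
from collections import defaultdict

KG_MD = """
AFib → TREATED_BY → Apixaban (first-line DOAC)
AFib → RISK_FACTOR_FOR → Stroke (5x risk)
Warfarin → INTERACTS_WITH → Amiodarone
Apixaban → REQUIRES → CrCl assessment
"""

def parse_edges(md_text):
    """Extract (relation, object) edges keyed by subject."""
    graph = defaultdict(list)
    for line in md_text.splitlines():
        parts = [p.strip() for p in line.split("→")]
        if len(parts) == 3:
            subj, rel, obj = parts
            # drop trailing annotations like "(first-line DOAC)"
            obj = re.sub(r"\s*\(.*\)$", "", obj)
            graph[subj].append((rel, obj))
    return graph

def chains_from(graph, start, max_hops=3):
    """Depth-first enumeration of relationship chains from `start`."""
    found = []
    def walk(node, path, hops):
        for rel, obj in graph.get(node, []):
            chain = path + [rel, obj]
            found.append(chain)
            if hops + 1 < max_hops:
                walk(obj, chain, hops + 1)
    walk(start, [start], 0)
    return found

for chain in chains_from(parse_edges(KG_MD), "AFib"):
    print(" → ".join(chain))
```

&lt;p&gt;The entity and relation names are copied from the fragment; the ↳ RULE annotations would need richer parsing in a real pipeline.&lt;/p&gt;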

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Raw Data&lt;/th&gt;
&lt;th&gt;Knowledge Graph .md&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Size&lt;/td&gt;
&lt;td&gt;~2MB&lt;/td&gt;
&lt;td&gt;~12KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tokens&lt;/td&gt;
&lt;td&gt;~500,000&lt;/td&gt;
&lt;td&gt;~3,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Density&lt;/td&gt;
&lt;td&gt;1x&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;170x&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compression&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~99.4%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CO₂ per query&lt;/td&gt;
&lt;td&gt;~0.34 kg&lt;/td&gt;
&lt;td&gt;~0.002 kg&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last line matters. Fewer tokens = less compute = lower energy. 99.4% carbon reduction per query. Structured intelligence is greener intelligence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomc9cdooucot6wc8rboh.png" alt="Graphify.md — the intelligence is the product" width="800" height="418"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;March Madness knowledge graph — 68 teams, built live with graduate software engineers&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  It works everywhere
&lt;/h2&gt;

&lt;p&gt;The same .md file works across every AI environment without modification:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Projects&lt;/strong&gt; — upload as project knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; — CLAUDE.md project context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; — custom GPT instructions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor / Windsurf&lt;/strong&gt; — context file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Codex CLI&lt;/strong&gt; — AGENTS.md&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP Server&lt;/strong&gt; — serve as tool context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API&lt;/strong&gt; — system prompt injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email&lt;/strong&gt; — it's just text. Paste it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No vendor lock-in. No format conversion. No special tooling. Markdown is the universal interface.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why not just use RAG?
&lt;/h2&gt;

&lt;p&gt;RAG retrieves the top-k text chunks that match your query. It's single-hop — find the most similar text, return it.&lt;/p&gt;

&lt;p&gt;A knowledge graph traverses relationships. When you ask about a patient with AFib and kidney disease, the graph follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AFib → treatment options → Apixaban → renal requirements →
CrCl thresholds → CKD staging → dose adjustment rules
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's six hops. RAG would need to independently retrieve six separate chunks, stitch them together, and hope the model connects them. The graph has already connected them.&lt;/p&gt;
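&lt;p&gt;A rough sketch of the difference: once the edges are pre-connected, a single breadth-first search recovers the whole chain in one pass. (My own illustration; the hypothetical edge list below just mirrors the chain above, it is not the real graph.)&lt;/p&gt;

```python
# Illustrative BFS over a pre-connected edge list (hypothetical data
# mirroring the chain above): find the path from a condition to its
# dose-adjustment rule in one traversal instead of stitching retrievals.
from collections import deque

EDGES = {
    "AFib": ["treatment options"],
    "treatment options": ["Apixaban"],
    "Apixaban": ["renal requirements"],
    "renal requirements": ["CrCl thresholds"],
    "CrCl thresholds": ["CKD staging"],
    "CKD staging": ["dose adjustment rules"],
}

def shortest_path(edges, start, goal):
    """Breadth-first search; returns the node path or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

path = shortest_path(EDGES, "AFib", "dose adjustment rules")
print(" → ".join(path))           # the full chain
print(f"{len(path) - 1} hops")    # number of edges traversed
```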

&lt;p&gt;Microsoft's 2024 research showed knowledge graphs achieve an 83% win rate vs vector RAG. HopRAG (ACL 2025) showed 77% higher accuracy on multi-hop questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm building
&lt;/h2&gt;

&lt;p&gt;I run &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — we build domain knowledge graphs and compress them to portable .md for any industry. Healthcare is one vertical. We've also built graphs for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/march-madness.html" rel="noopener noreferrer"&gt;March Madness tournament&lt;/a&gt; — 68 teams, real-time scores, built live with grad students&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://graphifymd.com/linkedin-groups-kg.html" rel="noopener noreferrer"&gt;LinkedIn Groups ecosystem&lt;/a&gt; — 200+ groups, 15 verticals, relationship edges&lt;/li&gt;
&lt;li&gt;Defense, legal, construction, supply chain, GovTech, education — &lt;a href="https://graphifymd.com/verticals.html" rel="noopener noreferrer"&gt;12 verticals mapped&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The methodology works on any domain. If your data has entities and relationships — and all data does — it can be graphed, compressed, and deployed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The interactive demo is live. Hover over nodes to see relationship chains light up:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://graphifymd.com/healthcare-kg-demo.html" rel="noopener noreferrer"&gt;graphifymd.com/healthcare-kg-demo.html&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Built entirely with Claude Code. The whole thing — knowledge graph extraction, D3 visualization, .md compression, the site — built solo, in days, not months.&lt;/p&gt;

&lt;p&gt;If you're working on a domain where AI keeps hallucinating or RAG keeps missing context, the problem might not be the model. It might be the structure.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel Yarmoluk — &lt;a href="https://graphifymd.com" rel="noopener noreferrer"&gt;Graphify.md&lt;/a&gt; — &lt;a href="https://calendly.com/daniel-yarmoluk/30min" rel="noopener noreferrer"&gt;Book a call&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>healthtech</category>
      <category>graphdatabase</category>
    </item>
  </channel>
</rss>
