Retrieval observability where humans and AI debug together

Trace every answer back to exact evidence — in two lines of code. AI agents evaluate via MCP, humans review in the dashboard.

$ pip install sourcemapr
$ sourcemapr server
Server running at http://localhost:5000
SourceMapR Dashboard

Add Retrieval Observability in Two Lines

Add SourceMapR to your existing LangChain or LlamaIndex pipeline. Your code stays the same — we provide retrieval observability automatically.

# pip install sourcemapr llama-index

# Start tracing before your pipeline runs; point it at the local SourceMapR server
from sourcemapr import init_tracing, stop_tracing
init_tracing(endpoint="http://localhost:5000")

# Your existing LlamaIndex code — unchanged
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("./papers").load_data()
index = VectorStoreIndex.from_documents(documents)

response = index.as_query_engine().query("What is attention?")
print(response)

# Stop tracing once the run is complete
stop_tracing()
Then open http://localhost:5000 to see the full evidence lineage.

Retrieval Observability Features

Debug retrieval with full evidence tracing — AI agents evaluate, humans review

PyPI v0.1.0 · MIT License · Local-first SQLite storage · MCP Server

Trace LLM Answers to Sources

Trace every response to the exact chunks that were retrieved. See similarity scores, rankings, and complete evidence lineage.

PDF Chunk Viewer

Click any retrieved chunk to see it highlighted in the original PDF. Optimized for PDF files — HTML and other formats are experimental.

Full LLM Tracing

See the exact prompt sent to the model, the response, token counts, and latency for every query.

Experiment Tracking

Organize runs into experiments. Compare chunking strategies, retrievers, and embedding models side by side.
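
For instance, comparing two chunk sizes could look like the sketch below. This is a hypothetical sketch: the experiment keyword passed to init_tracing is an assumption used only to label the two runs, not a documented SourceMapR argument.

from sourcemapr import init_tracing, stop_tracing
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter

documents = SimpleDirectoryReader("./papers").load_data()

for chunk_size in (256, 1024):
    # "experiment" is an assumed keyword for grouping runs, not confirmed by the docs
    init_tracing(endpoint="http://localhost:5000", experiment=f"chunks-{chunk_size}")
    index = VectorStoreIndex.from_documents(
        documents, transformations=[SentenceSplitter(chunk_size=chunk_size)]
    )
    print(index.as_query_engine().query("What is attention?"))
    stop_tracing()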

Evidence Lineage

Complete trace from document load → parse → chunk → embed → retrieve → answer with full metadata.

Debug Hallucinations

Verify grounding without guessing. See exactly what evidence was used to generate each answer.

MCP for AI Agents

AI agents read queries and write evaluations via Model Context Protocol. Humans review in the dashboard.
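
A minimal sketch of how an agent could connect, using the official MCP Python SDK. The "sourcemapr mcp" launch command and whatever query/evaluation tools it exposes are assumptions; only the SDK calls themselves are standard.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: SourceMapR exposes its MCP server via a "sourcemapr mcp" subcommand
server = StdioServerParameters(command="sourcemapr", args=["mcp"])

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            # An agent would discover the query/evaluation tools here, then call them
            print([tool.name for tool in tools.tools])

asyncio.run(main())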

LLM-as-Judge Evaluations

Score relevance, faithfulness, and completeness. AI agents evaluate at scale, humans review and guide.
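
As one illustration of the pattern (not SourceMapR's built-in evaluator), a judge pass scores a single answer against its retrieved evidence; the model name and rubric below are placeholders.

import json
from openai import OpenAI  # any chat-capable client works; OpenAI is just an example

client = OpenAI()

def judge(question: str, answer: str, evidence: list[str]) -> dict:
    """Score relevance, faithfulness, and completeness on a 1-5 scale."""
    prompt = (
        "You are grading a RAG answer against its retrieved evidence.\n"
        f"Question: {question}\nAnswer: {answer}\n"
        "Evidence:\n" + "\n---\n".join(evidence) + "\n\n"
        'Reply with JSON only: {"relevance": 1-5, "faithfulness": 1-5, "completeness": 1-5}'
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(reply.choices[0].message.content)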

Works with your favorite frameworks

Drop-in instrumentation for LangChain and LlamaIndex. Full retrieval evidence tracing.

LlamaIndex: Full pipeline instrumentation
Supported: SimpleDirectoryReader, SentenceSplitter, VectorStoreIndex, QueryEngine

LangChain: Callback-based tracing (see the sketch below)
Supported: PyPDFLoader, RecursiveCharacterTextSplitter, VectorStore Retrievers, LLM & ChatModel
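
A LangChain sketch that mirrors the LlamaIndex example above, assuming init_tracing() hooks LangChain's callbacks automatically; the file path, embedding model, chat model, and FAISS store are placeholders, not requirements.

# pip install sourcemapr langchain-community langchain-openai langchain-text-splitters pypdf faiss-cpu

from sourcemapr import init_tracing, stop_tracing
init_tracing(endpoint="http://localhost:5000")

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

# Load and chunk a PDF (placeholder path)
docs = PyPDFLoader("./papers/attention.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# Embed, index, and retrieve
retriever = FAISS.from_documents(chunks, OpenAIEmbeddings()).as_retriever()
context = "\n\n".join(doc.page_content for doc in retriever.invoke("What is attention?"))

# Generate an answer from the retrieved context
llm = ChatOpenAI(model="gpt-4o-mini")
answer = llm.invoke(f"Answer using only the context below.\n\n{context}\n\nQuestion: What is attention?")
print(answer.content)

stop_tracing()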

How Retrieval Evidence Tracing Works

SourceMapR traces every step of your retrieval pipeline with complete evidence lineage

Load: Documents
Parse: Extract text
Chunk: Split text
Embed: Vectorize
Retrieve: Find similar
Generate: LLM answer

Complete Retrieval Evidence Lineage

For every answer, SourceMapR shows you: which chunks were retrieved with similarity scores, where they came from in the original document (with PDF highlighting), what prompt was sent to the LLM, and how many tokens were used. Debug hallucinations and verify grounding without guessing.
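
To make that concrete, the record below sketches the kind of lineage a single query might produce. Every field name and value here is illustrative only, not SourceMapR's actual schema.

# Illustrative only: field names and values are assumptions, not SourceMapR's schema
trace = {
    "query": "What is attention?",
    "retrieved_chunks": [
        {"rank": 1, "similarity": 0.87, "source": "papers/attention.pdf", "page": 3,
         "text": "Attention maps a query and a set of key-value pairs to an output..."},
        {"rank": 2, "similarity": 0.81, "source": "papers/attention.pdf", "page": 4,
         "text": "..."},
    ],
    "prompt": "Context information is below. ...",            # exact prompt sent to the LLM
    "answer": "Attention is a mechanism that...",
    "usage": {"prompt_tokens": 912, "completion_tokens": 84},  # token counts
    "latency_ms": 1430,
}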

Current Support

PDF: Full support
HTML & other formats: Experimental
Pipeline tracing: Experimental