Track citations with provenance

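This guide covers four steps: build a CitationRecord from retrieval hits, render it as markdown, annotate claim-level citations, and inspect the record's fields. The snippet below covers the first two.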

from datetime import datetime, timezone
from attune_rag.provenance import build_citation_record, format_citations_markdown

record = build_citation_record(
    query="What is the return policy?",
    hits=retrieval_hits,          # iterable of RetrievalHit objects
    retriever_name="dense-v1",
    retrieved_at=datetime.now(timezone.utc),
)
print(format_citations_markdown(record))

Prerequisites

The project is cloned and installed locally, so that attune_rag is importable.
You have a list of RetrievalHit objects from a RAG pipeline run.

Steps

Build a citation record. Call build_citation_record() with your query, hits, retriever name, and retrieval timestamp. Each hit becomes a CitedSource stored in record.hits. By default, excerpts are truncated to 200 characters; pass excerpt_chars to change that.
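
To make the truncation behavior concrete, here is a minimal sketch of how an excerpt might be cut to a character budget. truncate_excerpt is a hypothetical helper written for illustration and is not part of attune_rag; the library's actual logic (for example, whether it appends an ellipsis or collapses whitespace) may differ.

```python
def truncate_excerpt(text: str, excerpt_chars: int = 200) -> str:
    """Hypothetical helper: cap text at excerpt_chars characters, marking any cut with an ellipsis."""
    if excerpt_chars < 1:
        raise ValueError("excerpt_chars must be positive")
    collapsed = " ".join(text.split())  # collapse newlines and runs of whitespace
    if len(collapsed) <= excerpt_chars:
        return collapsed
    # Reserve one character for the ellipsis so the result never exceeds the budget.
    return collapsed[: excerpt_chars - 1].rstrip() + "…"


long_passage = "Items may be returned within 30 days of purchase. " * 10
excerpt = truncate_excerpt(long_passage, excerpt_chars=80)
print(len(excerpt), excerpt)
```

A smaller budget keeps rendered source lists compact; a larger one gives reviewers more context per source.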

Render the record as markdown. Pass the record to format_citations_markdown(). Supply base_url if you want source links to resolve to a hosted URL.

md = format_citations_markdown(record, base_url="https://docs.example.com")
print(md)

Expected output (shape):

## Sources

1. **policy/returns.md** (faq, score: 0.92)
   > Items may be returned within 30 days of purchase…
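
The base_url parameter is described only briefly above, so the following sketch shows one plausible way a relative source path could be resolved against a hosted base URL. resolve_source_link is a hypothetical helper, not an attune_rag function, and the library may build its links differently.

```python
from typing import Optional
from urllib.parse import quote


def resolve_source_link(template_path: str, base_url: Optional[str] = None) -> str:
    """Hypothetical helper: return a markdown link when base_url is set, else the bare path."""
    if not base_url:
        return template_path
    # Normalize slashes so exactly one separates the base from the path,
    # and percent-encode characters that are unsafe in a URL path.
    url = base_url.rstrip("/") + "/" + quote(template_path.lstrip("/"))
    return f"[{template_path}]({url})"


print(resolve_source_link("policy/returns.md", "https://docs.example.com"))
# [policy/returns.md](https://docs.example.com/policy/returns.md)
print(resolve_source_link("policy/returns.md"))
# policy/returns.md
```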

Annotate claim-level citations. If the Anthropic Citations API returned ClaimCitation objects, use format_claim_citations_markdown() to insert inline footnote markers into the response text.

from attune_rag.provenance import format_claim_citations_markdown

annotated = format_claim_citations_markdown(
    text=response_text,
    citations=claim_citations,   # iterable of ClaimCitation
    base_url="https://docs.example.com",
)
print(annotated)
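
As an illustration of what inline footnote markers look like, here is a minimal, self-contained sketch. The Claim class and add_footnotes function are hypothetical stand-ins: the real ClaimCitation fields and the exact output of format_claim_citations_markdown() are not shown in this guide and may differ. The sketch assumes each claim carries the character offset where it ends and the path of its source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    """Hypothetical stand-in: a claim ending at character offset `end`, backed by `source`."""
    end: int
    source: str


def add_footnotes(text: str, claims: List[Claim]) -> str:
    """Insert [^n] markers after each claim and append the footnote definitions."""
    # Number claims in reading order.
    numbered = list(enumerate(sorted(claims, key=lambda c: c.end), start=1))
    out = text
    # Insert from the end backward so earlier offsets stay valid.
    for n, claim in reversed(numbered):
        out = out[: claim.end] + f"[^{n}]" + out[claim.end:]
    notes = "\n".join(f"[^{n}]: {claim.source}" for n, claim in numbered)
    return f"{out}\n\n{notes}" if notes else out


first = "Returns are free."
text = first + " Refunds take 5 days."
claims = [Claim(end=len(text), source="policy/refunds.md"),
          Claim(end=len(first), source="policy/returns.md")]
print(add_footnotes(text, claims))
```

Working backward through the offsets is the key design choice: inserting a marker shifts every later position, so processing the last claim first keeps the remaining offsets correct.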

Inspect the record fields directly. The CitationRecord dataclass exposes query, hits, retrieved_at, and retriever_name if you need to log or serialize provenance rather than render it.

```python
for source in record.hits:
    print(source.template_path, source.score, source.excerpt)
```
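
For the serialization case, a sketch along the following lines may help. It uses stand-in dataclasses that mirror only the fields named in this guide, so it runs without attune_rag installed; the real CitationRecord and CitedSource may carry additional fields, and record_to_json is a hypothetical helper. It also assumes CitedSource is a dataclass, which this guide does not confirm.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import List


# Stand-in types mirroring only the fields named in this guide;
# the real attune_rag classes may differ.
@dataclass
class CitedSource:
    template_path: str
    score: float
    excerpt: str


@dataclass
class CitationRecord:
    query: str
    retriever_name: str
    retrieved_at: datetime
    hits: List[CitedSource] = field(default_factory=list)


def record_to_json(record: CitationRecord) -> str:
    """Hypothetical helper: serialize a record to JSON with an ISO 8601 timestamp."""
    payload = asdict(record)  # recursively converts nested dataclasses to dicts
    payload["retrieved_at"] = record.retrieved_at.isoformat()  # datetime is not JSON-native
    return json.dumps(payload, indent=2, ensure_ascii=False)


sample = CitationRecord(
    query="What is the return policy?",
    retriever_name="dense-v1",
    retrieved_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    hits=[CitedSource("policy/returns.md", 0.92, "Items may be returned within 30 days of purchase…")],
)
print(record_to_json(sample))
```

Storing the timestamp in ISO 8601 with an explicit UTC offset keeps logged provenance unambiguous across machines and time zones.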

Next: Read the CitationRecord and CitedSource reference to understand how scores and excerpts are populated from your retriever's output.

Tags: provenance, citations, traceability