Work with provenance
Use provenance when you want to record which corpus entries grounded a RAG pipeline answer and render those citations as formatted markdown for display or auditing.
Prerequisites
- Access to `src/attune_rag/provenance.py`
- `RetrievalHit` objects from a completed retrieval pipeline run
Build and format a citation record
- Call `build_citation_record()` to convert your `RetrievalHit` objects into a `CitationRecord`.

      from datetime import datetime, timezone

      from attune_rag.provenance import build_citation_record

      record = build_citation_record(
          query="What is the return policy?",
          hits=retrieval_hits,
          retriever_name="bm25",
          retrieved_at=datetime.now(timezone.utc),
          excerpt_chars=200,  # optional; defaults to 200
      )

  This produces a `CitationRecord` containing the query, a tuple of `CitedSource` entries (each with a `template_path`, `category`, `score`, and optional `excerpt`), the retrieval timestamp, and the retriever name.

- Render the record as a markdown section by passing the `CitationRecord` to `format_citations_markdown()`.

      from attune_rag.provenance import format_citations_markdown

      markdown = format_citations_markdown(record, base_url="https://docs.example.com")
      print(markdown)

  Pass `base_url` to turn each `CitedSource.template_path` into an absolute link. Omit it to use relative paths.

- Annotate response text with claim-level citations by calling `format_claim_citations_markdown()`. Use this step when the Anthropic Citations API has returned `ClaimCitation` objects that map character spans in the response back to specific source documents.

      from attune_rag.provenance import format_claim_citations_markdown

      annotated = format_claim_citations_markdown(
          text=response_text,
          citations=claim_citations,  # Iterable[ClaimCitation]
          base_url="https://docs.example.com",
      )
      print(annotated)

  Each `ClaimCitation` carries a `response_span`, `document_index`, `document_title`, `cited_text`, and `cited_block_index`. The function inserts footnote-style markers into the text at the positions indicated by `response_span`.
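To make the span-based annotation in the last step more concrete, here is a minimal, self-contained sketch of how footnote markers can be inserted at character offsets. It is not the implementation in `attune_rag`. The helper name `insert_footnote_markers`, the `(start, end)` tuple shape for a span, and the `[^n]` marker format are all assumptions made for illustration, so check `src/attune_rag/provenance.py` for the real behavior.

```python
from typing import Iterable, Tuple


def insert_footnote_markers(text: str, spans: Iterable[Tuple[int, int]]) -> str:
    """Insert a [^n] marker after the end offset of each (start, end) span.

    Hypothetical helper: the marker format and span shape are assumptions,
    not taken from attune_rag. Markers are numbered in the order the spans
    are given.
    """
    numbered = [(end, n) for n, (_start, end) in enumerate(spans, start=1)]
    out = text
    # Work from the rightmost span to the leftmost so that inserting a marker
    # never shifts the offsets of spans that are still waiting to be handled.
    for end, n in sorted(numbered, reverse=True):
        out = out[:end] + f"[^{n}]" + out[end:]
    return out


# Illustrative input: two sentences, each treated as one cited span.
first = "Returns are accepted within 30 days."
second = "Refunds take 5 business days."
text = f"{first} {second}"
spans = [(0, len(first)), (len(first) + 1, len(text))]

annotated = insert_footnote_markers(text, spans)
print(annotated)
# Returns are accepted within 30 days.[^1] Refunds take 5 business days.[^2]
```

Inserting from right to left is the key design choice here: each insertion lengthens the string, so handling the last span first keeps every earlier offset valid without recomputing positions.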
Verify success
- `build_citation_record()` returns a `CitationRecord` whose `hits` tuple contains one `CitedSource` per `RetrievalHit`. Inspect `record.hits` to confirm the count and scores match your retrieval results.
- `format_citations_markdown()` returns a non-empty string containing markdown. Check that each `CitedSource.template_path` appears in the output.
- `format_claim_citations_markdown()` returns the original `text` with footnote markers inserted. Confirm that the number of markers equals the number of `ClaimCitation` objects you passed in.
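The checks above could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names (`check_record`, `check_markdown`, `count_markers`) are hypothetical, the objects are `SimpleNamespace` stand-ins that only mimic the attributes named in this guide, and the sample markdown and the footnote marker pattern are hand-written placeholders, not real output from `attune_rag`.

```python
import re
from types import SimpleNamespace


def check_record(record, retrieval_hits) -> None:
    """Confirm the record holds one cited source per retrieval hit."""
    assert len(record.hits) == len(retrieval_hits), "hit count mismatch"


def check_markdown(markdown: str, record) -> None:
    """Confirm the markdown is non-empty and mentions every template_path."""
    assert markdown.strip(), "markdown output is empty"
    missing = [s.template_path for s in record.hits if s.template_path not in markdown]
    assert not missing, f"paths missing from output: {missing}"


def count_markers(annotated: str, marker_pattern: str) -> int:
    """Count footnote markers; the pattern is supplied by the caller because
    the marker format used by the library is not specified in this guide."""
    return len(re.findall(marker_pattern, annotated))


# Stand-in objects (assumptions): they carry only the attributes used above.
retrieval_hits = [object(), object()]
record = SimpleNamespace(
    hits=(
        SimpleNamespace(template_path="policies/returns.md", score=0.91),
        SimpleNamespace(template_path="policies/refunds.md", score=0.78),
    )
)
# Hand-written placeholder text, not output produced by the library.
markdown = "- policies/returns.md\n- policies/refunds.md\n"
annotated = "Returns are accepted within 30 days.[^1] Refunds take 5 days.[^2]"

check_record(record, retrieval_hits)
check_markdown(markdown, record)
marker_count = count_markers(annotated, r"\[\^\d+\]")
print(marker_count)
```

With real objects, pass the `CitationRecord` and your list of `RetrievalHit` objects in place of the stand-ins, and replace the marker pattern with one that matches the markers you see in the annotated text.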
Key files
- `src/attune_rag/provenance.py`