Provenance errors
Common error signatures
Errors in the provenance module typically occur when building, validating, or rendering citation records. Watch for these failure patterns:
- `TypeError` in `build_citation_record()`: A `RetrievalHit` object is missing an expected attribute (such as `score`, `template_path`, or `category`), or `hits` is not iterable. This produces a record with incomplete `CitedSource` entries or fails before the `CitationRecord` is constructed.
- `AttributeError` in `format_citations_markdown()`: The `record` argument is not a valid `CitationRecord` instance, or one of its `CitedSource` entries has a `None` value where a string field (`template_path`, `category`) is required.
- `TypeError` in `format_claim_citations_markdown()`: The `citations` argument is not iterable, or a `ClaimCitation` entry has a malformed `response_span` (for example, a tuple with fewer than two integers).
- Invalid `base_url`: Passing a malformed string as `base_url` to either `format_citations_markdown()` or `format_claim_citations_markdown()` can produce broken links in the rendered markdown without raising an immediate exception.
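Because a malformed `base_url` fails silently, it can help to check it before calling either formatter. The following is a minimal sketch, not part of `attune_rag`: the helper name `is_usable_base_url` is hypothetical, and it assumes `base_url` is meant to be an absolute `http` or `https` URL, which the module may not strictly require.

```python
from urllib.parse import urlsplit


def is_usable_base_url(base_url):
    """Return True if base_url is None or an absolute http(s) URL with a host.

    Hypothetical helper; assumes absolute http(s) URLs are what the
    formatters expect.
    """
    if base_url is None:  # None is the default for both formatters
        return True
    if not isinstance(base_url, str):
        return False
    try:
        parts = urlsplit(base_url.strip())
    except ValueError:  # raised for some malformed inputs, e.g. unbalanced IPv6 brackets
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

A string such as `docs.example.com/templates` (no scheme) is rejected by this check, which is the kind of value that would otherwise surface only as broken links in the rendered output.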
Where errors originate
All three public functions in src/attune_rag/provenance.py are potential raise sites:
- `build_citation_record(query, hits, retriever_name, retrieved_at, excerpt_chars=200)`: Converts raw `RetrievalHit` objects into a `CitationRecord`. Failures here mean no citation data is available downstream. Check that every hit exposes the fields that `CitedSource` requires: `template_path`, `category`, and `score`.
- `format_citations_markdown(record, base_url=None)`: Renders a `CitationRecord` as a markdown section. Fails if `record.hits` is malformed or if individual `CitedSource` fields are `None`.
- `format_claim_citations_markdown(text, citations, base_url=None)`: Annotates response text with footnote-style citations derived from `ClaimCitation` objects. Fails if `citations` contains entries whose `response_span` tuples don't index correctly into `text`.
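The field check for `build_citation_record()` can be run before the call. This is a rough sketch under stated assumptions: `find_incomplete_hits` and `REQUIRED_HIT_FIELDS` are hypothetical names, and the `SimpleNamespace` objects are stand-ins for real `RetrievalHit` instances so the example runs on its own.

```python
from types import SimpleNamespace

# Fields the text above says CitedSource requires from each hit.
REQUIRED_HIT_FIELDS = ("template_path", "category", "score")


def find_incomplete_hits(hits):
    """Return (index, missing_fields) pairs for hits lacking required fields.

    A field counts as missing if the attribute is absent or set to None.
    """
    problems = []
    for index, hit in enumerate(hits):
        missing = [
            name for name in REQUIRED_HIT_FIELDS
            if getattr(hit, name, None) is None
        ]
        if missing:
            problems.append((index, missing))
    return problems


# Stand-in objects; real hits would be RetrievalHit instances.
hits = [
    SimpleNamespace(template_path="errors/provenance.md", category="errors", score=0.82),
    SimpleNamespace(template_path="concepts/citations.md", category=None),
]
print(find_incomplete_hits(hits))  # → [(1, ['category', 'score'])]
```

Reporting the index alongside the missing field names makes it easier to trace a bad hit back to the retriever that produced it.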
How to diagnose
- Identify which function raised. The traceback will point to one of the three functions above. A failure in `build_citation_record()` means the `CitationRecord` was never valid; a failure in a `format_*` function means construction succeeded but the data couldn't be rendered.
- Inspect the `CitationRecord` and its `hits`. After calling `build_citation_record()`, confirm that `record.hits` is a non-empty tuple and that each `CitedSource` has a non-`None` `template_path`, `category`, and a numeric `score`. An `excerpt` of `None` is acceptable; all other fields are required.
- Validate `ClaimCitation` spans against the response text. If the error is in `format_claim_citations_markdown()`, check that each `ClaimCitation.response_span` is a `(start, end)` tuple where both indices are within the bounds of `text`. A span that exceeds `len(text)` will produce an index error.
- Confirm `retrieved_at` is a `datetime` object. `build_citation_record()` stores `retrieved_at` directly on `CitationRecord`. Passing a string or `None` instead of a `datetime` instance will cause failures anywhere that field is serialised or compared.
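The span and `retrieved_at` checks above can be sketched as two small helpers. Both names (`find_bad_spans`, `is_valid_retrieved_at`) are hypothetical and not part of `attune_rag`. The sketch works on plain tuples rather than `ClaimCitation` objects, and it assumes spans are half-open `(start, end)` pairs with `0 <= start <= end <= len(text)`, which may differ from the module's actual convention.

```python
from datetime import datetime


def find_bad_spans(text, spans):
    """Return the spans that are not valid (start, end) index pairs into text."""
    bad = []
    for span in spans:
        ok = (
            isinstance(span, tuple)
            and len(span) == 2
            # bool is a subclass of int, so exclude it explicitly
            and all(isinstance(i, int) and not isinstance(i, bool) for i in span)
            and 0 <= span[0] <= span[1] <= len(text)
        )
        if not ok:
            bad.append(span)
    return bad


def is_valid_retrieved_at(value):
    """Return True only for datetime instances (strings and None are rejected)."""
    return isinstance(value, datetime)


text = "Citations are rendered as footnotes."
spans = [(0, 9), (14, 100), (5,)]
print(find_bad_spans(text, spans))  # → [(14, 100), (5,)]
print(is_valid_retrieved_at("2024-01-01T00:00:00"))  # → False
```

With real data, the spans would come from each `ClaimCitation.response_span`, and the `retrieved_at` check would run on the value passed to `build_citation_record()`.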
Source files
src/attune_rag/provenance.py
Tags: provenance, citations, traceability