Work with retrieval
Use KeywordRetriever when you need to search a corpus by keyword and get back ranked results — for example, to surface the most relevant entries before passing them to a language model.
Prerequisites
attune_raginstalled and importable in your environment- A corpus object that satisfies
CorpusProtocol
Steps
-
Import the retrieval classes.
from attune_rag.retrieval import KeywordRetriever, RetrievalHit -
Instantiate
KeywordRetriever.Pass
min_scoreif you want to discard low-confidence results. Omit it to return all scored hits.retriever = KeywordRetriever(min_score=0.1) -
Call
retrievewith your query, corpus, and result count.kcontrols how many hits are returned (default:3).hits: list[RetrievalHit] = retriever.retrieve( query="how to configure logging", corpus=my_corpus, k=5, ) -
Inspect each
RetrievalHit.Each hit exposes three fields:
Field Type Description entryRetrievalEntryThe matched corpus entry scorefloatRelevance score assigned by the retriever match_reasonstrHuman-readable explanation of why this entry matched for hit in hits: print(hit.score, hit.match_reason, hit.entry) -
Swap in a custom retriever if needed.
If
KeywordRetrieverdoes not suit your use case, implementRetrieverProtocolinstead. Any class with aretrieve(self, query: str, corpus: CorpusProtocol, k: int = 3) -> Iterable[RetrievalHit]method satisfies the protocol and can be used anywhereKeywordRetrieveris accepted.class MyRetriever: def retrieve(self, query: str, corpus: CorpusProtocol, k: int = 3): ... # your logic here
Verify the result
After calling retrieve, confirm that:
hitsis a non-empty list when the corpus contains relevant entries.- Each
hit.scoreis greater than or equal to themin_scoreyou set (if any). - Each
hit.match_reasonis a non-empty string describing the match.
If hits is empty and you expect matches, lower min_score or omit it entirely to check whether the retriever is scoring entries at all.