Work with retrieval

Use KeywordRetriever when you need to search a corpus by keyword and get back ranked results — for example, to surface the most relevant entries before passing them to a language model.

Prerequisites

Steps

  1. Import the retrieval classes.

    from attune_rag.retrieval import KeywordRetriever, RetrievalHit
    
  2. Instantiate KeywordRetriever.

    Pass min_score if you want to discard low-confidence results. Omit it to return all scored hits.

    retriever = KeywordRetriever(min_score=0.1)
    
  3. Call retrieve with your query, corpus, and result count.

    k sets the maximum number of hits returned (default: 3).

    hits: list[RetrievalHit] = retriever.retrieve(
        query="how to configure logging",
        corpus=my_corpus,
        k=5,
    )
    
  4. Inspect each RetrievalHit.

    Each hit exposes three fields:

    Field         Type            Description
    entry         RetrievalEntry  The matched corpus entry
    score         float           Relevance score assigned by the retriever
    match_reason  str             Human-readable explanation of why this entry matched

    for hit in hits:
        print(hit.score, hit.match_reason, hit.entry)
    
  5. Swap in a custom retriever if needed.

    If KeywordRetriever does not suit your use case, implement RetrieverProtocol instead. Any class with a retrieve(self, query: str, corpus: CorpusProtocol, k: int = 3) -> Iterable[RetrievalHit] method satisfies the protocol and can be used anywhere KeywordRetriever is accepted.

    class MyRetriever:
        def retrieve(self, query: str, corpus: CorpusProtocol, k: int = 3):
            ...  # your logic here
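
The custom-retriever step can be sketched end to end as follows. This is an illustration under stated assumptions, not attune_rag's own code: Hit and Retriever are hypothetical stand-ins for RetrievalHit and RetrieverProtocol so the example runs without the library, a plain sequence of strings stands in for a CorpusProtocol corpus, and the word-overlap scoring in OverlapRetriever is invented for the example. The real interfaces may differ.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol, Sequence


@dataclass
class Hit:
    """Hypothetical stand-in for RetrievalHit (same three field names)."""
    entry: str
    score: float
    match_reason: str


class Retriever(Protocol):
    """Hypothetical stand-in for RetrieverProtocol."""

    def retrieve(self, query: str, corpus: Sequence[str], k: int = 3) -> Iterable[Hit]: ...


class OverlapRetriever:
    """Scores each entry by the fraction of query words it contains.

    It does not inherit from Retriever; having a matching retrieve
    method is enough to satisfy the protocol.
    """

    def retrieve(self, query: str, corpus: Sequence[str], k: int = 3) -> list[Hit]:
        words = set(query.lower().split())
        hits = []
        for entry in corpus:
            matched = words & set(entry.lower().split())
            if matched:  # skip entries that share no words with the query
                score = len(matched) / len(words)
                hits.append(Hit(entry, score, f"matched: {sorted(matched)}"))
        # Highest score first, then keep at most k hits.
        hits.sort(key=lambda h: h.score, reverse=True)
        return hits[:k]


def top_entries(retriever: Retriever, query: str, corpus: Sequence[str]) -> list[str]:
    """Accepts any object that satisfies the Retriever protocol."""
    return [hit.entry for hit in retriever.retrieve(query, corpus, k=2)]


corpus = ["configure logging handlers", "install the package", "logging levels explained"]
print(top_entries(OverlapRetriever(), "configure logging", corpus))
```

Because top_entries is typed against the protocol rather than a concrete class, any retriever with a compatible retrieve method can be passed in without changing the calling code.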
    

Verify the result

After calling retrieve, confirm that hits contains no more than k entries and that each hit exposes entry, score, and match_reason.

If hits is empty and you expect matches, lower min_score or omit it entirely to check whether the retriever is scoring entries at all.
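
Checks of this kind can be automated with a small helper. The sketch below is hypothetical: check_hits is not part of attune_rag, Hit is a stand-in for RetrievalHit so the example runs on its own, and it assumes that min_score acts as a lower bound on score and that ranked results are ordered from highest to lowest score. Adjust or drop any check that does not match the retriever's actual behavior.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Hit:
    """Hypothetical stand-in for RetrievalHit."""
    entry: str
    score: float
    match_reason: str


def check_hits(hits: Sequence[Hit], k: int, min_score: Optional[float] = None) -> list[str]:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if len(hits) > k:
        problems.append(f"expected at most {k} hits, got {len(hits)}")
    scores = [h.score for h in hits]
    # Assumption: min_score is a lower bound on the scores that are returned.
    if min_score is not None and any(s < min_score for s in scores):
        problems.append(f"found a score below min_score={min_score}")
    # Assumption: ranked results are ordered from highest to lowest score.
    if scores != sorted(scores, reverse=True):
        problems.append("hits are not sorted by descending score")
    return problems


sample = [Hit("a", 0.9, "keyword match"), Hit("b", 0.4, "keyword match")]
print(check_hits(sample, k=5, min_score=0.1))
```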