Rerank retrieval results with LLMReranker
Use LLMReranker when keyword retrieval returns too many loosely relevant hits and you need Claude to score and reorder them by relevance before presenting results to the user.
Prerequisites
- An Anthropic API key with access to claude-haiku-4-5
- A list of RetrievalHit objects from your keyword retrieval step
- attune_rag.reranker available in your Python environment
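Before starting, a quick preflight check can catch a missing key or package. The following is a minimal sketch, not part of attune_rag: the helper name check_prerequisites is hypothetical, and reading the key from an ANTHROPIC_API_KEY environment variable is an assumption (this guide passes the key directly as api_key). It does not verify that the key has access to claude-haiku-4-5.

```python
import importlib.util
import os


def check_prerequisites(
    env_var: str = "ANTHROPIC_API_KEY",      # assumption: key is kept in this variable
    module: str = "attune_rag.reranker",
) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Check 1: an API key is present in the environment (assumed location).
    if not os.environ.get(env_var):
        problems.append(f"environment variable {env_var} is not set")

    # Check 2: the reranker module can be located without fully importing it.
    try:
        found = importlib.util.find_spec(module) is not None
    except (ImportError, ValueError):
        # ImportError covers a missing parent package (e.g. attune_rag not installed).
        found = False
    if not found:
        problems.append(f"module {module} is not importable")

    return problems
```

Calling check_prerequisites() and printing each returned string gives a short list of what still needs to be set up.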
Steps
- Import LLMReranker from attune_rag.reranker.

      from attune_rag.reranker import LLMReranker

- Instantiate LLMReranker with your configuration.

  Pass your API key and, optionally, adjust candidate_multiplier to control how many candidates Claude evaluates relative to the number of results you want returned. The default candidate_multiplier is 3 and the default timeout is 60.0 seconds.

      reranker = LLMReranker(
          model="claude-haiku-4-5",
          api_key="YOUR_API_KEY",
          candidate_multiplier=3,
          timeout=60.0,
      )

- Call rerank with your query and retrieved hits.

  Pass the user query string and the list of RetrievalHit objects from your retrieval step. rerank returns a new list of RetrievalHit objects sorted from most to least relevant.

      reranked_hits = reranker.rerank(query=user_query, hits=keyword_hits)

  If the API call fails for any reason, rerank falls back to the original keyword-retrieval order, so your application continues to return results.

- Use the reranked list in your response pipeline.

  Replace your existing hit list with reranked_hits. The first item in the list is the hit Claude judged most relevant to the query.

- Run the reranker tests to confirm nothing is broken.

      pytest -k "reranker"
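The fallback behavior and the role of candidate_multiplier described in the steps above can be illustrated with a small self-contained sketch. This is not the attune_rag implementation: Hit is a hypothetical stand-in for RetrievalHit, score_fn stands in for the Claude call, top_k is a hypothetical parameter, and the rule that the candidate pool equals top_k * candidate_multiplier is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for RetrievalHit; the real class may differ."""
    doc_id: str
    text: str


def rerank_with_fallback(
    query: str,
    hits: Sequence[Hit],
    score_fn: Callable[[str, Hit], float],  # stand-in for the LLM relevance call
    top_k: int = 3,                          # hypothetical: number of results wanted
    candidate_multiplier: int = 3,
) -> list[Hit]:
    """Return the top_k best-scoring hits, or the original order if scoring fails."""
    # Assumption: the pool sent for scoring is top_k * candidate_multiplier hits.
    candidates = list(hits[: top_k * candidate_multiplier])
    try:
        scored = [(score_fn(query, hit), hit) for hit in candidates]
    except Exception:
        # Any scoring failure: keep the keyword-retrieval order unchanged.
        return list(hits[:top_k])
    # list.sort is stable, so tied scores keep their keyword-retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored[:top_k]]
```

Passing a score_fn that raises (for example, one that simulates a timeout) returns the first top_k hits in their original order, which mirrors the fallback described in the rerank step.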
Verify the task worked
After calling rerank, inspect the first element of the returned list and confirm it corresponds to the document you would expect to be most relevant for your test query. If the returned order differs from the keyword rank order, the reranker ran and applied Claude's judgment. An identical order is not conclusive on its own, because Claude may agree with the keyword ranking or the fallback may have triggered. To confirm the fallback, make the API unreachable and check that the returned list matches your original hits order exactly.
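The comparison described above can be automated with a small helper. This is a sketch under stated assumptions: compare_orders is a hypothetical name, it works on plain identifiers rather than RetrievalHit objects, and which attribute of a hit serves as a unique identifier is left to you because this guide does not specify one.

```python
from typing import Hashable, Sequence


def compare_orders(
    original_ids: Sequence[Hashable],
    reranked_ids: Sequence[Hashable],
) -> dict:
    """Summarize how a reranked list differs from the original keyword order.

    Assumes identifiers are unique within each list.
    """
    original_rank = {doc_id: i for i, doc_id in enumerate(original_ids)}
    return {
        # True when the reranked list is not simply a prefix of the original order.
        "order_changed": list(reranked_ids) != list(original_ids)[: len(reranked_ids)],
        # Identifiers that were not in the original hits at all (should be empty).
        "unknown_ids": [d for d in reranked_ids if d not in original_rank],
        # Identifiers now placed earlier than their keyword-retrieval position.
        "moved_up": [
            d for i, d in enumerate(reranked_ids)
            if d in original_rank and original_rank[d] > i
        ],
    }
```

With the API unreachable, order_changed should be False and moved_up should be empty, which matches the fallback check described above.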