CLI cautions
What to watch for
The cli module is the command-line entry point for debugging retrieval. It exposes two commands: attune-rag query, which runs a RAG query and prints a grounded answer with citations, and attune-rag corpus-info, which displays corpus statistics.
Because this module bridges user input and the retrieval pipeline, mistakes here can silently produce incorrect output or swallow errors without a non-zero exit code.
Risk areas
main() exit code masking
main(argv) returns an int exit code, but callers that ignore the return value — including shell wrappers or test harnesses that don't assert on it — will miss failure signals. A failed retrieval or misconfigured corpus may print partial output and still return 0 if error handling is incomplete. Always assert on the return value in tests and propagate it to the shell via sys.exit(main()).
build_parser() default argument behavior
build_parser() constructs the argument parser with its own defaults. If you call main(argv=None), it reads from sys.argv[1:]. Passing an empty list ([]) is not the same as passing None — an empty list bypasses sys.argv entirely and will likely trigger a parser error or use unexpected defaults. Be explicit about which argv you intend in tests.
Unvalidated corpus path at parse time Argument parsing succeeds before any corpus files are opened. If a user supplies an invalid path or a missing corpus, the error surfaces later in the retrieval pipeline, not at the CLI boundary. This can produce confusing error messages that point into library internals rather than the bad argument.
How to avoid problems
-
Assert on the return value of
main(). In tests, writeassert main(["query", ...]) == 0rather than callingmain()as a void function. In scripts, usesys.exit(main())so the shell sees failures. -
Pass explicit argv in tests. Always pass a fully constructed
argvlist tomain()in tests rather than relying onNone. This keeps tests hermetic and preventssys.argvfrom leaking between test cases. -
Validate corpus inputs before invoking the CLI in automation. If you are driving
attune-ragprogrammatically, confirm that corpus paths exist and are readable before callingmain(). This surfaces configuration errors with a clear message rather than a traceback from deep in the retrieval stack. -
Treat
build_parser()as an internal detail. The parser structure can change as commands are added or renamed. If you depend on the parser object directly — for example, to extract subcommand names — expect that to break during refactors. Prefer driving the CLI throughmain(argv).
Source files
src/attune_rag/cli.py
Tags: cli, query, corpus-info