Async Python package for calling AI CLI tools (Claude, Gemini, Cursor) via subprocess. Includes LLM cost calculation via LiteLLM pricing data and model listing/validation.
## Installation

```bash
uv add ai-cli-runner
```

## Quick start

```python
import asyncio

from ai_cli_runner import call_ai_cli


async def main():
    result = await call_ai_cli(
        prompt="What is the capital of France?",
        ai_provider="claude",
        ai_model="claude-haiku-4-20250514",
        output_format="json",
    )
    if result.success:
        print(result.text)
    if result.usage:
        print(f"Tokens: in={result.usage.input_tokens} out={result.usage.output_tokens}")
        if result.usage.cost_usd is not None:
            print(f"Cost: ${result.usage.cost_usd:.6f}")


asyncio.run(main())
```

See `examples/` for complete usage:
| Example | What it shows |
|---|---|
| `basic_call.py` | Parallel calls to all 3 providers with token usage |
| `with_pricing.py` | LLM cost tracking via LiteLLM pricing |
| `model_listing.py` | List models, validate names, check CLI availability |

Run any example: `uv run examples/basic_call.py`
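The parallel pattern that `basic_call.py` demonstrates can be sketched with `asyncio.gather`. The stub below stands in for `call_ai_cli` so the snippet is self-contained, and the non-Claude model names are illustrative:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubResult:
    success: bool
    text: str


# Stand-in for ai_cli_runner.call_ai_cli so this sketch runs without the CLIs installed.
async def call_ai_cli_stub(prompt: str, ai_provider: str, ai_model: str) -> StubResult:
    await asyncio.sleep(0)  # simulate awaiting the subprocess
    return StubResult(success=True, text=f"{ai_provider}: ok")


async def main() -> list[StubResult]:
    # Fan one prompt out to all three providers concurrently.
    calls = [
        call_ai_cli_stub("What is 2+2?", "claude", "claude-haiku-4-20250514"),
        call_ai_cli_stub("What is 2+2?", "gemini", "gemini-model"),  # illustrative model name
        call_ai_cli_stub("What is 2+2?", "cursor", "cursor-model"),  # illustrative model name
    ]
    return await asyncio.gather(*calls)


results = asyncio.run(main())
for r in results:
    print(r.text)
```

With the real `call_ai_cli`, each element of `results` is an `AIResult` carrying `.text` and `.usage`.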
## API

```
call_ai_cli(prompt, cwd, ai_provider, ai_model, ai_cli_timeout, cli_flags, output_format) → AIResult
```

Call an AI CLI tool. Pass `output_format="json"` to get structured token usage.

Send a trivial prompt to verify the CLI is installed and working.
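That check can be sketched as a small wrapper. `provider_is_healthy` is a hypothetical helper, not part of the package, and `fake_call` stands in for `call_ai_cli` so the snippet runs on its own:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Result:
    success: bool
    text: str


async def provider_is_healthy(call, ai_provider: str, ai_model: str) -> bool:
    """Send a trivial prompt; True means the CLI answered successfully."""
    result = await call(prompt="Reply with OK.", ai_provider=ai_provider, ai_model=ai_model)
    return bool(result.success)


# Stub standing in for ai_cli_runner.call_ai_cli.
async def fake_call(prompt: str, ai_provider: str, ai_model: str) -> Result:
    return Result(success=True, text="OK")


ok = asyncio.run(provider_is_healthy(fake_call, "claude", "claude-haiku-4-20250514"))
print(ok)  # True when the CLI is installed and responding
```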
### `AIResult`

| Field | Type | Description |
|---|---|---|
| `success` | `bool` | Whether the call succeeded |
| `text` | `str` | Response text |
| `usage` | `AITokenUsage \| None` | Token usage (when `output_format="json"`) |

Supports tuple unpacking (`success, text = await call_ai_cli(...)`) and boolean evaluation (`if result: ...`).
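Those two conveniences can be illustrated with a stand-in class, assuming (as the behaviors above imply) that `AIResult` defines `__iter__` and `__bool__`:

```python
from dataclasses import dataclass


@dataclass
class AIResultLike:
    """Stand-in mirroring the documented conveniences of AIResult."""
    success: bool
    text: str

    def __iter__(self):
        # Enables: success, text = result
        return iter((self.success, self.text))

    def __bool__(self) -> bool:
        # Enables: if result: ...
        return self.success


result = AIResultLike(success=True, text="Paris")
success, text = result  # tuple unpacking
if result:              # truthiness follows .success
    print(text)
```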
### `AITokenUsage`

| Field | Type | Description |
|---|---|---|
| `input_tokens` | `int` | Tokens in the prompt |
| `output_tokens` | `int` | Tokens in the response |
| `cache_read_tokens` | `int` | Tokens read from cache |
| `cache_write_tokens` | `int` | Tokens written to cache |
| `cost_usd` | `float \| None` | Cost in USD (native or LiteLLM-calculated) |
| `duration_ms` | `int \| None` | Wall-clock duration in milliseconds |
| `model` | `str` | Model used |
| `provider` | `str` | Provider name |
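When cost is not reported natively, the USD figure is derived from per-token rates. The arithmetic looks roughly like this; the rates below are illustrative placeholders, not real LiteLLM prices:

```python
# Hypothetical per-token rates; real values come from LiteLLM's pricing data.
INPUT_COST_PER_TOKEN = 3e-6    # $3 per million input tokens (illustrative)
OUTPUT_COST_PER_TOKEN = 15e-6  # $15 per million output tokens (illustrative)

input_tokens = 1200
output_tokens = 350

cost_usd = input_tokens * INPUT_COST_PER_TOKEN + output_tokens * OUTPUT_COST_PER_TOKEN
print(f"${cost_usd:.6f}")  # → $0.008850
```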
Claude reports cost natively. For Gemini and Cursor, costs are calculated using LiteLLM pricing data:

```python
from ai_cli_runner import pricing_cache

await pricing_cache.load()  # call once at startup
# cost_usd is now auto-populated on all output_format="json" calls
```

### Model listing

```python
from ai_cli_runner import model_cache, pricing_cache

await pricing_cache.load()
models = await model_cache.list_models("claude")
is_valid = model_cache.is_valid_model("claude", "claude-haiku-4-20250514")
```

### Providers

| Provider | Binary | Notes |
|---|---|---|
| `claude` | `claude` | `-p` flag for non-interactive mode |
| `gemini` | `gemini` | Stdin prompt |
| `cursor` | `agent` | `--workspace` for cwd |
| Variable | Default | Purpose |
|---|---|---|
| `AI_CLI_TIMEOUT` | `10` | Timeout in minutes for AI CLI calls |
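For example, to raise the limit to 30 minutes for the current process and any subprocesses it spawns (the value is in minutes, per the table above):

```python
import os

# Raise the AI CLI timeout to 30 minutes before making calls.
os.environ["AI_CLI_TIMEOUT"] = "30"
```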
## Development

```bash
uv sync --all-extras
uv run pytest
uv run ruff check .
uv run ruff format .
uv run mypy src/
```