
10. AI Model Providers

samuele edited this page Feb 26, 2026 · 2 revisions


RedAmon supports five AI providers out of the box, giving you access to 400+ language models through a single, unified interface. The model selector in project settings dynamically fetches available models from each configured provider — no hardcoded lists, no manual updates.


Supported Providers

| Provider | Models | API Key Env Variable |
|---|---|---|
| OpenAI (Direct) | ~30 chat models — GPT-5.2, GPT-5, GPT-4.1, o3, o4-mini | OPENAI_API_KEY |
| Anthropic (Direct) | ~15 models — Claude Opus 4.6, Sonnet 4.6/4.5, Haiku 4.5 | ANTHROPIC_API_KEY |
| OpenAI-Compatible | Any self-hosted or third-party OpenAI-compatible API | OPENAI_COMPAT_BASE_URL |
| OpenRouter | 300+ models — Llama 4, Gemini 3, Mistral, Qwen, DeepSeek | OPENROUTER_API_KEY |
| AWS Bedrock | ~60 foundation models — Claude, Titan, Llama, Cohere, Mistral | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY |

How It Works

  1. Provider detection — on startup, the agent checks which API keys are set in .env. Only configured providers are queried.
  2. Dynamic model fetching — the agent's /models endpoint fetches available models from all configured providers in parallel. Results are cached for 1 hour.
  3. Searchable model selector — the project settings UI presents a searchable dropdown grouped by provider. Each model shows its name, context window size, and provider.
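The detection and parallel-fetch steps above can be sketched as follows. This is an illustrative outline, not RedAmon's actual internals — the provider names, env-variable mapping, and function names are assumptions based on the table above:

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping of provider -> required env variables,
# mirroring the detection step described above.
PROVIDER_ENV_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "openai_compat": ["OPENAI_COMPAT_BASE_URL"],
    "openrouter": ["OPENROUTER_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
}

def configured_providers(env=os.environ):
    """Return providers whose required keys are all set and non-empty."""
    return [
        name for name, keys in PROVIDER_ENV_KEYS.items()
        if all(env.get(k) for k in keys)
    ]

def fetch_all_models(fetchers):
    """Query each configured provider's model list in parallel.

    `fetchers` maps a provider name to a zero-arg callable that
    returns that provider's model ids.
    """
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in fetchers.items()}
        return {name: f.result() for name, f in futures.items()}
```

Only providers with all required keys present are queried, which is why unset keys in .env simply make a provider disappear from the selector.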

Model Selector

  • Provider prefix convention — models are stored with a provider prefix (openai_compat/, openrouter/, bedrock/) so the agent knows which SDK to use at runtime. OpenAI and Anthropic models are detected by name pattern (no prefix needed).
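The prefix convention above can be sketched as a small routing function. This is a hypothetical illustration of the idea — the function name and the Claude name-pattern check are assumptions, not RedAmon's code:

```python
def resolve_provider(model_id: str) -> tuple[str, str]:
    """Split a stored model id into (provider, bare model name)."""
    for prefix in ("openai_compat/", "openrouter/", "bedrock/"):
        if model_id.startswith(prefix):
            return prefix.rstrip("/"), model_id[len(prefix):]
    # No prefix: distinguish OpenAI vs Anthropic by name pattern
    # (assumed heuristic: Claude models start with "claude").
    if model_id.startswith("claude"):
        return "anthropic", model_id
    return "openai", model_id
```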

Provider Setup

Add the keys for the providers you want to use in your .env file:

OpenAI (Direct)

OPENAI_API_KEY=sk-proj-...

Get your key from platform.openai.com/api-keys.

Anthropic (Direct)

ANTHROPIC_API_KEY=sk-ant-...

Get your key from console.anthropic.com.

Recommended — Claude Opus 4.6 is the default model and generally provides the best results for autonomous pentesting tasks.

OpenRouter

OPENROUTER_API_KEY=sk-or-...

Get your key from openrouter.ai/settings/keys. OpenRouter provides access to 300+ models from 50+ providers through a single API, including many free models for testing.

AWS Bedrock

AWS_ACCESS_KEY_ID=AKIA...
AWS_SECRET_ACCESS_KEY=...
AWS_DEFAULT_REGION=us-east-1

Create an IAM user with bedrock:InvokeModel and bedrock:ListFoundationModels permissions. Foundation models on Bedrock are automatically enabled across all commercial regions — no manual model access activation required.
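A minimal IAM policy granting the two permissions named above might look like this (scoping Resource to specific model ARNs is stricter; "*" is the simplest working form):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:ListFoundationModels"
      ],
      "Resource": "*"
    }
  ]
}
```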

Recommended region: us-east-1 (N. Virginia) has the widest model availability.

OpenAI-Compatible Provider

OPENAI_COMPAT_BASE_URL=http://host.docker.internal:11434/v1
OPENAI_COMPAT_API_KEY=                # optional (fallback: "ollama")

Any backend exposing /v1/chat/completions and /v1/models endpoints works. The agent container includes host.docker.internal resolution, so local servers on your host machine are reachable from Docker.
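How the two env variables above could combine into concrete endpoint URLs, including the "ollama" fallback key, can be sketched like this. The function name and returned dict are illustrative assumptions, not RedAmon's internals:

```python
import os

def compat_endpoint(env=os.environ):
    """Resolve an OpenAI-compatible endpoint from the environment."""
    base = env.get("OPENAI_COMPAT_BASE_URL", "").rstrip("/")
    if not base:
        raise ValueError("OPENAI_COMPAT_BASE_URL is not set")
    # Empty or missing key falls back to "ollama", as noted above.
    api_key = env.get("OPENAI_COMPAT_API_KEY") or "ollama"
    return {
        "models_url": f"{base}/models",           # -> .../v1/models
        "chat_url": f"{base}/chat/completions",   # -> .../v1/chat/completions
        "api_key": api_key,
    }
```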


Local Models & Self-Hosted Options

Ollama (Recommended for Local)

The easiest way to run local LLMs:

  1. Install Ollama: ollama.com
  2. Pull a model: ollama pull llama3.1:70b
  3. Set in .env:

Ollama on the same machine as RedAmon:

OPENAI_COMPAT_BASE_URL=http://host.docker.internal:11434/v1

Ollama on a remote server (different machine):

OPENAI_COMPAT_BASE_URL=http://192.168.1.50:11434/v1   # replace with your Ollama server's IP or hostname

Use the actual IP address or hostname of the remote machine instead of host.docker.internal. Ensure port 11434 is reachable from the machine running RedAmon (check firewall rules).

Important — Bind Ollama to 0.0.0.0: By default, Ollama only listens on localhost and will reject connections from Docker containers and remote machines. You must configure it to listen on all interfaces:

# If Ollama is managed by systemd (Linux):
sudo mkdir -p /etc/systemd/system/ollama.service.d
echo -e '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"' | sudo tee /etc/systemd/system/ollama.service.d/override.conf
sudo systemctl daemon-reload && sudo systemctl restart ollama

This is required for all Linux setups (local or remote) and for remote access from any OS. macOS and Windows with Docker Desktop handle local container-to-host resolution automatically, but still need OLLAMA_HOST=0.0.0.0 if Ollama must be accessed from a different machine.

Other Self-Hosted Options

| Provider | Description | Example Base URL |
|---|---|---|
| vLLM | High-performance GPU inference | http://host.docker.internal:8000/v1 |
| LM Studio | Desktop app with built-in server | http://host.docker.internal:1234/v1 |
| LocalAI | Open-source OpenAI drop-in, runs on CPU | http://host.docker.internal:8080/v1 |
| Jan | Desktop app with local server mode | http://host.docker.internal:1337/v1 |
| llama.cpp server | Lightweight C++ inference | http://host.docker.internal:8080/v1 |
| OpenLLM | Run open-source LLMs with one command | http://host.docker.internal:3000/v1 |
| text-generation-webui | Gradio UI with OpenAI-compatible API | http://host.docker.internal:5000/v1 |

Gateway / Proxy

| Provider | Description |
|---|---|
| LiteLLM | Proxy for 100+ LLMs in OpenAI format — self-hostable via Docker |

Cloud Providers with OpenAI-Compatible API

| Provider | Description |
|---|---|
| Together AI | 200+ open-source models, serverless |
| Groq | Ultra-fast inference for Llama, Mixtral, Gemma |
| Fireworks AI | Fast open-source model hosting |
| Deepinfra | Pay-per-token open-source models |
| Mistral AI | Mistral/Mixtral via OpenAI-compatible endpoint |
| Perplexity | Sonar models via OpenAI-compatible API |

Important Notes

  • Multiple providers at once — you can configure all five providers simultaneously. The model selector shows all available models from all providers.
  • OpenAI-Compatible caution — RedAmon fetches all models from your compatible endpoint, including non-chat models (embeddings, audio, image). Select a chat-capable model in project settings.
  • Switching models — you can change the model per project at any time. Switch between a free Llama model on OpenRouter for testing and Claude Opus on Anthropic for production assessments.
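For the OpenAI-Compatible caution above, a simple name-based pre-filter can hide obviously non-chat models before they reach a selector. This heuristic is an assumption for illustration, not RedAmon's actual filtering logic:

```python
# Substrings that commonly indicate embedding/audio/image models.
NON_CHAT_HINTS = ("embed", "whisper", "tts", "audio", "image", "dall-e")

def likely_chat_models(model_ids):
    """Drop ids whose names suggest they are not chat models."""
    return [
        m for m in model_ids
        if not any(hint in m.lower() for hint in NON_CHAT_HINTS)
    ]
```

Name-based filtering is inherently approximate; the safest check remains picking a model you know is chat-capable in project settings.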

Next Steps
