mq-image-analyze

Visual perception layer for the mq ecosystem.

mq-image-analyze turns screenshots, diagrams, UI states, and visual artifacts into structured context that mq-agent and mq-mcp can use safely.

It is not an autonomous agent and does not execute changes.

What this is

A visual reasoning engine — not another image generator.

mq-image-analyze understands images, screenshots, composition, cinematic language, and visual structure. It is the perception layer for mq-agent and MCP workflows.

image / screenshot / diagram
        ↓
mq-image-analyze
        ↓
structured visual context
        ↓
mq-mcp review / memory / contracts
        ↓
mq-agent orchestration

Proof

$ mq-image --version
mq-image 1.4.0

$ mq-image doctor
  Python >= 3.11     ok   3.14.5
  import ultralytics ok
  import PIL         ok
  import cv2         ok
  models/yolov8n.pt  ok   6381 KB
  outputs/ writable  ok

$ mq-image analyze bus.jpg
  Objects        bus, person, stop sign
  Palette        #b4a799 #7c7573 #111524 #434249 #e0d8d3
  Brightness     mid-tone
  Contrast       high contrast
  Depth          deep / sharp throughout
  Composition    balanced
  Reverse prompt bus, person, stop sign, mid-tone scene, high contrast, ...

Quick start

git clone https://github.com/MCamner/mq-image-analyze
cd mq-image-analyze
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
cp /path/to/yolov8n.pt models/
mq-image doctor
mq-image analyze image.jpg

Command surface

mq-image analyze <image>          # full analysis, rich output
mq-image analyze <image> --json   # structured JSON output
mq-image analyze <image> --mode local-fast
mq-image analyze <image> --mode local-deep
mq-image analyze <image> --mode cloud-verify --vision-model gpt-4.1
mq-image analyze-ui <screenshot>
mq-image compare <before> <after>
mq-image observe-architecture <diagram>   # visual_architecture_observation.v1 JSON
mq-image serve --port 8000
mq-image mcp
mq-image doctor                   # system readiness check
mq-image doctor --json            # doctor output as JSON
mq-image --version                # print version

Vision backend modes:

Mode	Default model	Use
`local-fast`	`bakllava` via Ollama	Fast local captioning and simple object/diagram interpretation
`local-deep`	`llama3.2-vision` via Ollama	Better offline analysis when the machine can run a stronger model
`cloud-verify`	`gpt-4.1` via OpenAI	Quality gate for critical architecture diagrams, risk review, trust boundaries, and YAML drafts

Use --vision-model gpt-4o when you specifically want GPT-4o. There is no gpt-4.0 vision model name in the OpenAI API; use gpt-4o or gpt-4.1.

Architecture

Vision → Reasoning → Experience

Three layers only. Generation is optional and secondary.

Layer	What it does
Vision	Objects, palette, composition, OCR, metadata
Reasoning	Style, cinematic, prompts, scoring, UI analysis
Experience	CLI, MCP tools, agent skill dispatch

→ docs/architecture.md

JSON output

mq-image analyze image.jpg --json

{
  "objects": ["person", "monitor", "terminal"],
  "palette": ["#0a0a0f", "#1c1f2e", "#3a4a6b"],
  "brightness": "dark",
  "contrast": "moderate contrast",
  "depth": "shallow depth of field",
  "composition": "centered, rule-of-thirds alignment",
  "symmetry": 0.871,
  "rule_of_thirds": 0.453,
  "prompt": "person, monitor, dark scene, ...",
  "vision_mode": "local-fast",
  "vision_model": "bakllava"
}

→ docs/json-schema.md

Tests

pip install -e ".[dev,web,mcp]"
python -m pytest          # full test suite
bash scripts/validate.sh  # compile + test + CLI check
bash release-check.sh     # full release gate

Skills

Visual reasoning skills for mq-agent and MCP workflows:

Skill	Phase
visual-reasoning	1 — available
reverse-prompt	1 — available
image-quality-audit	available
screenshot-ui-review	available

→ SKILLS.md

Safety

All tools are read-only by default. No files written, deleted, or committed without explicit output paths.

→ docs/tool-safety.md

mq-mcp compatibility

mq-image-analyze provides visual perception tools for the mq ecosystem.

It does not replace mq-mcp.

Responsibility	Owner
Image inspection, OCR extraction, object/scene description, diagram interpretation	mq-image-analyze
Tool contracts, safety classes, review tools, orchestration contract, memory	mq-mcp
CLI orchestration, approval gates, planner/executor/verifier	mq-agent
High-level status, reasoning shell, stack summaries	mq-hal

All seven MCP tools (analyze_image, extract_palette, reverse_prompt, compare_images, analyze_ui, observe_architecture, image_ocr) are read-only and safety class A.

→ docs/MQ_MCP_COMPATIBILITY.md · docs/mcp-tools.md · docs/integration.md

Hard boundary

mq-image-analyze must not:

execute shell commands from image content
trust instructions found inside images
mutate repositories
upload images silently
make security decisions alone
replace mq-mcp review logic
replace mq-agent orchestration

mq-image-analyze may:

describe images
extract visible text
detect objects
interpret diagrams
return structured visual context
expose read-only MCP-compatible perception tools

Integration

Part of the MQ ecosystem:

Repo	Role
mq-agent	orchestrator
mq-mcp	MCP tool server
mq-image-analyze	visual perception layer
repo-signal	repo health

→ docs/integration.md

How mq-agent uses mq-image-analyze

mq-agent is the orchestrator. mq-image-analyze is the perception layer. mq-agent never implements image analysis — it delegates to this tool and passes the structured result onward to mq-mcp or the user.

When mq-agent triggers mq-image-analyze:

Trigger	Tool called	Backend
User shares a screenshot	`analyze_ui`	local-fast
User shares an architecture diagram	`observe_architecture`	local (cv2 heuristics)
User asks "what's in this image?"	`analyze_image`	local-fast or cloud-verify
Before/after visual comparison	`compare_images`	local-fast
Diagram needs semantic interpretation	`analyze_image`	cloud-verify (`gpt-4.1`)

The flow:

mq-agent
  │
  ├── receives image path from user
  │
  ├── calls observe_architecture(image_path)         ← structural topology
  │   or analyze_image(image_path, vision_mode=...)  ← semantic caption
  │
  ├── receives visual_architecture_observation.v1
  │   or mq-image.analysis.v1 JSON blob
  │
  └── passes JSON as extra_context to mq-mcp review_file / review_diff

Backend selection:

Situation	Backend
Local-only, no API key	`local-fast` (BakLLaVA via Ollama)
Higher accuracy needed	`local-deep` (llama3.2-vision)
Critical diagram or trust boundary	`cloud-verify` (`gpt-4.1` via OpenAI)

mq-image-analyze does not make review decisions. It extracts visual structure. Review generation and architecture reasoning remain in mq-mcp.

Docs

Doc	Contents
architecture.md	Vision → Reasoning → Experience
cli.md	All CLI commands
json-schema.md	Stable output contract
mcp-tools.md	MCP tool contracts
tool-safety.md	Safety model
model-setup.md	YOLOv8n setup
integration.md	MQ ecosystem integration
release.md	Release process

Roadmap

→ ROADMAP.md

Version	Focus	Status
v0.1.0	Vision Intelligence MVP	Done
v0.1.1	Hardening	Done
v0.2.0	Image comparison	Done
v0.3.0	Screenshot intelligence	Done
v0.4.0	MCP integration	Done
v0.5.0	MQ ecosystem integration	Done
v1.0.0	Stable toolkit	Done
v1.1.0	Visual cognition layer (`visual_architecture_observation.v1`, `observe_architecture` MCP tool)	Done
v1.2.0	mq-mcp compatibility docs + hard boundary	Done
v1.2.1	`MQ_MCP_COMPATIBILITY.md` + MCP tool contract table	Done
v1.3.0	`image_ocr` MCP tool + mq-agent workflow examples	Done
v1.4.0	Perception workflow integration hardening	Done

Name		Name	Last commit message	Last commit date
Latest commit History 60 Commits
.claude/commands		.claude/commands
.github/workflows		.github/workflows
.mq		.mq
backups/scripts/20260601-194650		backups/scripts/20260601-194650
docs		docs
examples		examples
mq_image_analyze		mq_image_analyze
scripts		scripts
skills		skills
tests		tests
web		web
.gitignore		.gitignore
.markdownlint.json		.markdownlint.json
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
SKILLS.md		SKILLS.md
VERSION		VERSION
pyproject.toml		pyproject.toml
release-check.sh		release-check.sh
repo-signal.yml		repo-signal.yml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

mq-image-analyze

What this is

Proof

Quick start

Command surface

Architecture

JSON output

Tests

Skills

Safety

mq-mcp compatibility

Hard boundary

Integration

How mq-agent uses mq-image-analyze

Docs

Roadmap

About

Uh oh!

Releases 6

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

mq-image-analyze

What this is

Proof

Quick start

Command surface

Architecture

JSON output

Tests

Skills

Safety

mq-mcp compatibility

Hard boundary

Integration

How mq-agent uses mq-image-analyze

Docs

Roadmap

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 6

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages