This guide will help you set up the RoK Vision API on your local machine using Docker. This is the recommended way to run the project, as it automatically handles the Python (OCR) and .NET (Orchestrator) dependencies.
Before you begin, ensure you have the following installed:
- Docker Desktop (or Docker Engine + Compose plugin on Linux).
- Git.
- Optional: Postman or Insomnia for API testing.
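The prerequisites above can be verified before cloning. The following is a minimal sketch, not part of the project; the helper names `have` and `check_prereqs` are made up for illustration.

```shell
# Report whether a command is available on PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool; return non-zero if any tool is missing.
check_prereqs() {
  missing=0
  for tool in "$@"; do
    if have "$tool"; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example (uncomment to run):
# check_prereqs docker git && docker compose version
```

If `docker compose version` fails while `docker` is found, the Compose plugin is probably not installed.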
Open your terminal and clone the project:
git clone https://github.com/feels-dev/RokVision.git
cd RokVision
RoK Vision uses docker-compose to orchestrate the Brain (.NET) and the Eye (Python).
Run the following command to build the images and start the containers:
docker compose up --build
Note: The first build might take a few minutes as it downloads the .NET SDK, Python base images, and installs dependencies like PaddleOCR.
Once the logs stop scrolling and you see "Now listening on...", the services are up:
- API Gateway (Swagger UI): http://localhost:5000/swagger
- OCR Engine (Health Check): http://localhost:8000/health
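Instead of watching the logs, readiness can be polled from a second terminal. This is a rough sketch: `wait_for` is a hypothetical helper, and it assumes the two URLs listed above return a successful HTTP status once the services are ready.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command...>
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Pause only when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (uncomment to run while the stack is starting):
# wait_for 30 2 curl -fsS http://localhost:8000/health && echo "OCR engine is up"
# wait_for 30 2 curl -fsS http://localhost:5000/swagger/index.html && echo "API gateway is up"
```

The `-f` flag makes curl exit with an error on HTTP failure codes, so the loop keeps retrying until the endpoint actually answers successfully.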
- Go to http://localhost:5000/swagger/index.html.
- Expand the endpoint you want to test (e.g., /api/governor/analyze).
- Click Try it out.
- In the Images field, upload your screenshot.
- NEW: Ensure the Debug checkbox is checked (or set to true) to receive the full processing logs and timers in the response.
- Click Execute.
Use the following template to test an endpoint, making sure to include the Debug flag to get rich debugging output:
# Governor Profile Example (Single Image)
curl -X 'POST' \
'http://localhost:5000/api/governor/analyze' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'Image=@/path/to/your/governor_screenshot.jpg' \
-F 'Debug=true'
# XP Inventory Example (Multiple Images)
curl -X 'POST' \
'http://localhost:5000/api/xp/analyze' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'Images=@/path/to/xp_scroll_1.jpg' \
-F 'Images=@/path/to/xp_scroll_2.jpg' \
-F 'Debug=true'
The docker-compose.yml file comes pre-configured. However, if you have an NVIDIA GPU, you can enable GPU acceleration for the OCR engine.
Open docker-compose.yml and modify the environment variables under ocr-engine:
environment:
- OCR_USE_GPU=True # Set to True if you have NVIDIA Drivers + CUDA Toolkit
- OCR_ENABLE_MKLDNN=True # CPU Acceleration (Keep True for CPU-only)
- OCR_CPU_THREADS=8 # Adjust based on your CPU cores
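To pick values for these variables, it can help to inspect the host first. The sketch below uses hypothetical helper names (`cpu_cores`, `has_nvidia_gpu`) and only checks the host. A container generally also needs GPU access granted by Docker (typically through the NVIDIA Container Toolkit), and this guide does not say whether the compose file already requests it.

```shell
# Print the number of online CPU cores (a starting point for OCR_CPU_THREADS).
cpu_cores() {
  getconf _NPROCESSORS_ONLN 2>/dev/null || nproc
}

# Succeed only if the NVIDIA driver tools are installed and list a GPU.
has_nvidia_gpu() {
  command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

if has_nvidia_gpu; then
  echo "NVIDIA GPU detected: OCR_USE_GPU=True may be worth trying"
else
  echo "No NVIDIA GPU detected: leave GPU acceleration off"
fi
echo "Host CPU cores: $(cpu_cores)"
```

Setting OCR_CPU_THREADS at or below the reported core count is a reasonable first guess; the best value depends on what else runs on the machine.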
If ports 5000 or 8000 are already in use on your machine, change the left side of the port mapping in docker-compose.yml:
ports:
- "9090:5000" # Maps local 9090 to container 5000
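After remapping a port, every URL in this guide has to use the new host port. One way to keep commands consistent is a small helper, sketched here under the assumption that the API port was the one remapped; `API_PORT` and `api_url` are hypothetical names, not something the project defines.

```shell
# API_PORT is a hypothetical variable used only in this sketch; set it to
# the host (left-hand) port chosen in docker-compose.yml. Defaults to 5000.
api_url() {
  printf 'http://localhost:%s%s\n' "${API_PORT:-5000}" "$1"
}

# Example with "9090:5000" in the port mapping (uncomment to run):
# API_PORT=9090
# curl -X POST "$(api_url /api/governor/analyze)" \
#   -F 'Image=@/path/to/your/governor_screenshot.jpg' \
#   -F 'Debug=true'
api_url /swagger
```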
1. "Failed to solve... parent snapshot does not exist": This happens when the Docker build cache is corrupted. Clear the cache and rebuild without it:
docker builder prune -f
docker compose build --no-cache
2. "OCR Engine not reachable": Ensure both containers are running on the same network. The project uses the default bridge network that docker-compose creates. Check the OCR engine logs with:
docker logs rok-ocr-api
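To confirm that both containers are actually running before reading logs, the checks can be combined as below. This is a sketch with a hypothetical function name (`diagnose`); it assumes the OCR service is named ocr-engine in docker-compose.yml, as in the configuration section above, and that it is run from the project directory.

```shell
# Show container status and recent OCR logs for the compose project.
# Run it from the directory that contains docker-compose.yml.
diagnose() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH"
    return 1
  fi
  # Both services should be listed as running.
  docker compose ps
  # Last 50 log lines of the OCR service (service name assumed: ocr-engine).
  docker compose logs --tail 50 ocr-engine
}

# Example (uncomment to run):
# diagnose
```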