🚀 Getting Started with RoK Vision

This guide walks you through setting up the RoK Vision API on your local machine with Docker. Docker is the recommended way to run the project because it handles the Python (OCR) and .NET (Orchestrator) dependencies automatically.

📋 Prerequisites

Before you begin, ensure you have Git and Docker (with Docker Compose) installed; the steps below rely on both.
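
A quick way to confirm the tooling is in place is a small shell check. This is a minimal sketch: the `check_tools` helper is hypothetical, and the tool list in the example comment is inferred from the commands used later in this guide.

```shell
# Report whether each named command-line tool is available on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (assumed tool list):
#   check_tools git docker && docker compose version
```

If the check passes and `docker compose version` prints a version, the remaining steps should work as written.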


🛠️ Installation

1. Clone the Repository

Open your terminal and clone the project:

git clone https://github.com/feels-dev/RokVision.git
cd RokVision

2. Build and Run (The Easy Way)

RoK Vision uses Docker Compose to orchestrate the Brain (the .NET orchestrator) and the Eye (the Python OCR engine).

Run the following command to build the images and start the containers:

docker compose up --build

Note: The first build might take a few minutes as it downloads the .NET SDK, Python base images, and installs dependencies like PaddleOCR.

3. Verify Deployment

Once the logs stop scrolling and you see "Now listening on...", the services are up. The Swagger UI at http://localhost:5000/swagger/index.html should then load in your browser.
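
Instead of watching the logs, readiness can also be polled from a script. The sketch below defines a generic, hypothetical `retry` helper; the attempt count and delay in the example comment are arbitrary, and the URL is the Swagger address used in the testing section.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: wait up to about a minute for the Swagger page to respond.
#   retry 30 2 curl -fsS -o /dev/null http://localhost:5000/swagger/index.html
```

With `-f`, curl exits non-zero on HTTP error responses, so the loop only stops once the page is actually being served.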


🧪 Testing the API

Option A: Using Swagger (Browser)

  1. Go to http://localhost:5000/swagger/index.html.
  2. Expand the endpoint you want to test (e.g., /api/governor/analyze).
  3. Click Try it out.
  4. In the Image field (Images on endpoints that accept several files), upload your screenshot.
  5. Check the Debug box (or set it to true) to receive the full processing logs and timers in the response.
  6. Click Execute.

Option B: Using cURL (Including Debug Mode)

Use the following template to test an endpoint, making sure to include the Debug flag to get rich debugging output:

# Governor Profile Example (Single Image)
curl -X 'POST' \
  'http://localhost:5000/api/governor/analyze' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'Image=@/path/to/your/governor_screenshot.jpg' \
  -F 'Debug=true'
# XP Inventory Example (Multiple Images)
curl -X 'POST' \
  'http://localhost:5000/api/xp/analyze' \
  -H 'accept: application/json' \
  -H 'Content-Type: multipart/form-data' \
  -F 'Images=@/path/to/xp_scroll_1.jpg' \
  -F 'Images=@/path/to/xp_scroll_2.jpg' \
  -F 'Debug=true'
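
When there are many XP screenshots, the repeated `-F 'Images=@...'` arguments can be generated from a folder. This is a sketch under stated assumptions: the `analyze_xp_dir` helper, the `.jpg` extension filter, and the `ROK_API_URL` and `CURL` override variables are all hypothetical and not part of the project.

```shell
# Send every .jpg in a directory to the XP endpoint as repeated "Images" parts.
# ROK_API_URL (base URL) and CURL (command to run) are hypothetical overrides.
analyze_xp_dir() {
  dir=$1
  set --                                  # reuse positional params as an argument list
  for f in "$dir"/*.jpg; do
    [ -e "$f" ] || continue               # the glob matched nothing
    set -- "$@" -F "Images=@$f"
  done
  if [ "$#" -eq 0 ]; then
    echo "no .jpg files found in $dir" >&2
    return 1
  fi
  # curl sets the multipart Content-Type (with boundary) itself when -F is used.
  ${CURL:-curl} -sS -X POST \
    "${ROK_API_URL:-http://localhost:5000}/api/xp/analyze" \
    -H 'accept: application/json' \
    "$@" \
    -F 'Debug=true'
}

# Example:
#   analyze_xp_dir /path/to/xp_screenshots
```

Files are sent in the shell's glob order, which is alphabetical by name.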

⚙️ Configuration (Advanced)

Environment Variables

The docker-compose.yml file comes pre-configured. However, if you have an NVIDIA GPU, you can enable GPU acceleration for the OCR engine.

Open docker-compose.yml and modify the environment variables under ocr-engine:

environment:
  - OCR_USE_GPU=True       # Set to True if you have NVIDIA Drivers + CUDA Toolkit
  - OCR_ENABLE_MKLDNN=True # CPU Acceleration (Keep True for CPU-only)
  - OCR_CPU_THREADS=8      # Adjust based on your CPU cores
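
To pick starting values for these variables, the host can be probed from the shell. This is only a sketch: the `suggest_ocr_env` helper is hypothetical, it assumes `False` is an accepted value for `OCR_USE_GPU`, and the fallback of 4 threads is arbitrary. A working `nvidia-smi` on the host does not by itself guarantee that containers can use the GPU.

```shell
# Print suggested lines for the ocr-engine environment block.
suggest_ocr_env() {
  # Treat a working nvidia-smi as a hint that an NVIDIA driver is installed.
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    use_gpu=True
  else
    use_gpu=False
  fi
  # Number of online CPU cores; fall back to 4 if getconf is unavailable.
  threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)
  echo "  - OCR_USE_GPU=$use_gpu"
  echo "  - OCR_ENABLE_MKLDNN=True"
  echo "  - OCR_CPU_THREADS=$threads"
}

# Example:
#   suggest_ocr_env
```

Treat the output as a starting point and adjust it by hand before copying it into docker-compose.yml.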

Ports

If port 5000 or 8000 is already in use on your machine, change the left (host) side of the port mapping in docker-compose.yml:

ports:
  - "9090:5000" # Maps local 9090 to container 5000
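
The mapping format is host:container, so only the part before the colon changes. The sketch below makes that explicit with a hypothetical `remap_host_port` helper; it assumes the simple two-part form and does not handle mappings that bind to a specific IP address.

```shell
# Build a new "host:container" mapping that keeps the container port
# of an existing mapping and swaps in a different host port.
remap_host_port() {
  new_host_port=$1
  mapping=$2
  # ${mapping##*:} strips everything up to the last colon,
  # leaving only the container side.
  printf '%s:%s\n' "$new_host_port" "${mapping##*:}"
}

# Example:
#   remap_host_port 9090 5000:5000
```

After changing the host port, the API is reached on the new port, for example http://localhost:9090/swagger/index.html in place of port 5000.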

🐛 Troubleshooting

1. "Failed to solve... parent snapshot does not exist"

This happens if the Docker cache gets corrupted. Run:

docker builder prune -f
docker compose build --no-cache

2. "OCR Engine not reachable"

Ensure both containers are running and attached to the same network. The project uses the default bridge network created by Docker Compose. Check the OCR container's logs with:

docker logs rok-ocr-api
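
Before reading the logs, it can help to confirm that the container exists and see its state. A minimal sketch, assuming the container name `rok-ocr-api` used in this guide; the `container_status` helper and the `DOCKER` override variable are hypothetical.

```shell
# Print "name: status" for containers whose name matches $1,
# including stopped ones. DOCKER is a hypothetical override for the binary.
container_status() {
  ${DOCKER:-docker} ps -a \
    --filter "name=$1" \
    --format '{{.Names}}: {{.Status}}'
}

# Example:
#   container_status rok-ocr-api
```

An "Exited" status, or no output at all, suggests the OCR container never started or has crashed, in which case the logs should show why.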