Data Extraction with GroundX on OpenShift AI

GroundX, from EyeLevel, is an enterprise platform that eliminates LLM hallucinations by grounding AI responses in a company's specific, private data. The platform utilizes advanced computer vision to preserve the context of complex document layouts, such as nested tables and schematics, ensuring high-fidelity search and retrieval. Beyond information discovery, it functions as a powerful tool for automated data extraction, transforming unstructured files into structured, verifiable insights with direct source citations.

When used with OpenShift AI on premises, customers can have complete control of their own data and where it is stored and processed.

Detailed Description

This AI quickstart demonstrates how to use GroundX from EyeLevel for billing data extraction in an on-prem AI environment with OpenShift AI.

You will deploy GroundX along with its supporting components (MinIO, Percona MySQL, and Strimzi Kafka) using two umbrella Helm charts, then open the included Jupyter notebook and follow the data extraction workflow.

See It in Action

Deploy operators and cluster prep (storage class, node labels, Percona operator, MinIO operator).
Deploy application workloads (database cluster, MinIO tenant, Kafka cluster, GroundX, notebook).
Open the Jupyter notebook and run the billing data extraction demo.

Architecture Diagram

Requirements

This quickstart was developed and tested on an OpenShift cluster with the following components and resources. This can be considered the minimum requirements.

Minimum Hardware Requirements

Node Type	Qty	vCPU	Memory (GB)
Control Plane	3	4	16
Worker	3	4	16

NVIDIA GPU with 16 GB of vRAM (optional — see GPU configuration).

Minimum Software Requirements

Software	Version
Red Hat OpenShift	4.20.5
Red Hat OpenShift Service Mesh	2.5.11-0
Red Hat OpenShift Serverless	1.37.0
Red Hat OpenShift AI	2.25
Helm CLI	3.17.1
GroundX	2.9.92

Required User Permissions

The user performing this quickstart should have admin permissions in the cluster (does not require cluster-admin).

Deploy

Deployment uses two Helm umbrella charts installed in sequence:

Chart	Path	Purpose
billing-operators	`helm/billing-operators/`	Operators and cluster prep (storage class, node labels, Percona operator, MinIO operator, optional Strimzi operator)
billing-workloads	`helm/billing-workloads/`	Application workloads (database cluster, MinIO tenant, Kafka cluster, GroundX, Jupyter notebook)

Pre-requisites

The following must already be deployed and functional on the cluster:

Red Hat OpenShift Container Platform
Red Hat OpenShift Service Mesh
Red Hat OpenShift Serverless
Red Hat OpenShift AI
Red Hat Authorino
Node Feature Discovery operator
NVIDIA GPU operator (if using GPU for layout inference)
User has admin permissions in the cluster
The eyelevel project should not exist

Installation

Clone this repo and log in to your cluster:

git clone https://github.com/johnson2500/Billing-extraction-with-GroundX.git
cd Billing-extraction-with-GroundX

oc login --token=<user_token> --server=https://api.<openshift_cluster_fqdn>:6443

Set the GroundX API key by copying your secret file into place:

cp <secret_with_key>.yaml values/values.groundx.secret.yaml

Review and edit the values files to match your environment:
- helm/billing-operators/values.yaml — operator toggles, node labels
- helm/billing-workloads/values.yaml — GroundX config, notebook settings, resource limits
Install using the Makefile (recommended):

# From the repo root — installs operators first, then workloads
make install

Or install each chart manually:

# Create the namespace
oc create namespace eyelevel --dry-run=client -o yaml | oc apply -f -

# Update chart dependencies
cd helm/billing-operators && helm dependency update && cd ../..
cd helm/billing-workloads && helm dependency update && cd ../..

# Phase 1: Operators
helm upgrade --install billing-operators ./helm/billing-operators \
  -f ./helm/billing-operators/values.yaml -n eyelevel

# Phase 2: Workloads
helm upgrade --install billing-workloads ./helm/billing-workloads \
  -f ./helm/billing-workloads/values.yaml -n eyelevel

Verify the Deployment

make get-pods
# or
oc get pods -n eyelevel

All pods should reach Running (or Completed for one-shot Jobs).

GPU Configuration for Layout Inference

To disable GPU for GroundX layout inference, add the following to helm/billing-workloads/values.yaml under groundx.layout.inference:

layout:
  inference:
    resources:
      limits:
        memory: 12Gi
        nvidia.com/gpu: '0' # <-- Set to 0
      requests:
        cpu: 500m
        memory: 2Gi
        nvidia.com/gpu: '0' # <-- Set to 0

To run with a GPU (CPU-only), set nvidia.com/gpu to '1'. See the comments in values/values.groundx.yaml for details.

Demo GroundX

Create Storage Bucket for Models

Use the MinIO console route to access MinIO's UI.
Create a storage bucket called models.

Use the Chart-managed Notebook (Recommended)

The billing-workloads chart can create an OpenShift AI notebook and optionally clone this repository into the notebook PVC.

Configure notebook settings in helm/billing-workloads/values.yaml.
Enable automatic clone by setting:
- notebook.gitClone.enabled: true
- notebook.gitClone.repository: https://github.com/johnson2500/Billing-extraction-with-GroundX.git
- (optional) notebook.gitClone.revision, notebook.gitClone.targetDir, notebook.gitClone.forceReset
Deploy/upgrade the chart:

make install

Open the created notebook from OpenShift AI → Workbenches.

Create a New Workbench Manually

If you are not using the chart-managed notebook, create a workbench in OpenShift AI:

In OpenShift AI, enter the eyelevel project.
Create a workbench:
- Name: groundx-wb
- Image selection: Jupyter | Minimal | CPU | Python 3.12
- Version selection: 2025.2
- Container size: Small
- Accelerator: None
- Add environment variables IMPORTANT READ:
  - Type: Secret → Key / value
    - Key: GROUNDX_ADMIN_API_KEY — Value: <YOUR_GROUNDX_ADMIN_API_KEY>
  - Type: ConfigMap → Key / value
    - Key: GROUNDX_BASE_URL — Value: <GROUNDX_OPENSHIFT_ROUTE>/api: IMPORTANT: This is set in the values.yaml file and needs to have /api appended.
- Click Create connection:
  - Select S3 compatible object storage
  - Connection name: Models-Storage
  - Access key: minio
  - Secret key: minio123
  - Endpoint: http://minio
  - Region: us-east-1
  - Bucket: models
Click Create notebook.
Clone this repo into the workbench if notebook.gitClone.enabled is false.

Run the GroundX Demo

Open the get_started notebook.
In the Initialize Client and Prompt Manager section, set the required variables (OpenShift route to GroundX, API key).
Save and run the notebook.

Uninstall

# From the repo root
make uninstall

Or manually:

cd helm && make uninstall NAMESPACE=eyelevel

This uninstalls the workloads chart first (clearing CRs and finalizers), then the operators chart. The namespace is preserved by default — delete it separately with oc delete project eyelevel if desired.

References

GroundX documentation
OpenShift AI documentation v2.25

Technical Details

Deploying Gemma 3 12B via LLM Service (Optional)

The chart can deploy google/gemma-3-12b-it using the llm-service from the ai-architecture-charts repo.

Enable: In helm/billing-workloads/values.yaml, llm-service.enabled is true by default and global.models.gemma-3-12b-it is configured.
Requirements: Nodes with NVIDIA GPUs (nvidia.com/gpu); the model uses the default GPU device and tolerations from the llm-service chart.
Hugging Face token: If the model is gated, set llm-service.secret.hf_token (e.g. via a values override or sealed secret).
Use with GroundX: After deploy, the model is exposed as a KServe InferenceService. To use it for GroundX extract, set the extract agent to your cluster's endpoint for the gemma-3-12b-it predictor (e.g. the OpenShift route or http://gemma-3-12b-it-predictor.<namespace>.svc.cluster.local/v1), and set modelId to the served model name.

To disable the LLM service, set llm-service.enabled: false in values.

Name		Name	Last commit message	Last commit date
Latest commit History 55 Commits
.github		.github
__pycache__		__pycache__
docs/images		docs/images
helm		helm
prompts		prompts
test-docs		test-docs
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
get_started.ipynb		get_started.ipynb
manager.py		manager.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Data Extraction with GroundX on OpenShift AI

Table of Contents

Detailed Description

See It in Action

Architecture Diagram

Requirements

Minimum Hardware Requirements

Minimum Software Requirements

Required User Permissions

Deploy

Pre-requisites

Installation

Verify the Deployment

GPU Configuration for Layout Inference

Demo GroundX

Create Storage Bucket for Models

Use the Chart-managed Notebook (Recommended)

Create a New Workbench Manually

Run the GroundX Demo

Uninstall

References

Technical Details

Deploying Gemma 3 12B via LLM Service (Optional)

Tags

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Data Extraction with GroundX on OpenShift AI

Table of Contents

Detailed Description

See It in Action

Architecture Diagram

Requirements

Minimum Hardware Requirements

Minimum Software Requirements

Required User Permissions

Deploy

Pre-requisites

Installation

Verify the Deployment

GPU Configuration for Layout Inference

Demo GroundX

Create Storage Bucket for Models

Use the Chart-managed Notebook (Recommended)

Create a New Workbench Manually

Run the GroundX Demo

Uninstall

References

Technical Details

Deploying Gemma 3 12B via LLM Service (Optional)

Tags

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages