
gia-uh/gensie


GenSIE 2026 Public Starter Kit

License: MIT | Python 3.13+ | Docker

GenSIE (General-purpose Schema-guided Information Extraction) is a shared task at IberLEF 2026. This repository provides the official starter kit for participants.

🚀 Quick Start

1. Installation

We recommend using uv for fast dependency management:

git clone <repository-url>
cd gensie
uv sync --group dev

2. Configuration

Create a .env file to configure your inference backend:

OPENAI_API_KEY="your-api-key"
OPENAI_BASE_URL="http://localhost:1234/v1" # Optional: for local LLMs
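
As an illustration of how these two variables might be consumed, here is a minimal sketch that reads them from the process environment using only the standard library. The function name `load_backend_config` and the fallback to the public OpenAI endpoint are assumptions made for this example; the starter kit's own loading code is not shown here and may behave differently.

```python
import os

# Assumption: fall back to the public OpenAI endpoint when no base URL is given.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def load_backend_config(env=None):
    """Return the API key and base URL from a mapping of environment variables.

    Hypothetical helper for illustration; not part of the gensie package.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # OPENAI_BASE_URL is optional; an empty value counts as unset.
    base_url = env.get("OPENAI_BASE_URL", "").strip() or DEFAULT_BASE_URL
    return {"api_key": api_key, "base_url": base_url.rstrip("/")}
```

Passing a plain dictionary instead of `os.environ` makes it easy to check the behavior without touching the real environment, for example `load_backend_config({"OPENAI_API_KEY": "your-api-key"})`.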

3. Serving your Agent

Start the FastAPI server:

uv run gensie serve --port 8000

4. Running Benchmarks

Evaluate your agent against the 40 starter instances:

uv run gensie eval --data data/starter/ --url http://localhost:8000 --pipeline baseline --model gpt-4o-mini

🛠️ How to Participate

  1. Inherit from GenSIEAgent: Implement your extraction logic in src/gensie/.
  2. Register your Pipelines: Configure up to 3 pipelines in OfficialParticipant (see src/gensie/baseline.py).
  3. Submit: Open a Competition Submission Issue to register your team and repository.
  4. Dockerize: Use the provided Dockerfile and docker-compose.yml for testing and final submission:

docker compose up --build

📊 Dataset & Metrics

The kit includes 40 silver-generated instances for initial testing. Official metrics use Flattened Schema Scoring (Micro-F1), which combines exact matches for rigid fields and semantic similarity for free-text fields.
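
To make the idea of flattened scoring concrete, the sketch below flattens nested gold and predicted objects into path/value pairs and pools the counts across instances into a single micro-averaged F1. This is not the official scorer: the path notation, the function names (`flatten`, `exact_match`, `micro_f1`), and the handling of missing or extra fields are assumptions, and the semantic-similarity component for free-text fields is left as a pluggable `score` function that defaults to exact match.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into a {path: leaf_value} mapping."""
    if isinstance(obj, dict):
        items = ((f"{prefix}.{k}" if prefix else str(k), v) for k, v in obj.items())
    elif isinstance(obj, list):
        items = ((f"{prefix}[{i}]", v) for i, v in enumerate(obj))
    else:
        return {prefix: obj}
    flat = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def exact_match(gold, pred):
    """Score for rigid fields: 1.0 on equality, 0.0 otherwise."""
    return 1.0 if gold == pred else 0.0


def micro_f1(pairs, score=exact_match):
    """Micro-F1 over (gold, predicted) pairs, pooling counts across instances.

    `score` returns a value in [0, 1]; a similarity function for free-text
    fields could be plugged in here (assumption: partial credit counts as a
    fractional true positive).
    """
    tp = fp = fn = 0.0
    for gold, pred in pairs:
        g, p = flatten(gold), flatten(pred)
        for path in g.keys() | p.keys():
            if path in g and path in p:
                s = score(g[path], p[path])
                tp += s
                fp += 1 - s
                fn += 1 - s
            elif path in p:
                fp += 1  # predicted a field the gold object lacks
            else:
                fn += 1  # missed a field present in the gold object
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Micro-averaging pools true positives, false positives, and false negatives over all fields of all instances before computing F1, so instances with many fields weigh more than instances with few.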

📜 Documentation

For more details, see our guides.

⚖️ License

This starter kit is licensed under the MIT License.
