GenSIE (General-purpose Schema-guided Information Extraction) is a shared task at IberLEF 2026. This repository provides the official starter kit for participants.
We recommend using uv for fast dependency management:
    git clone <repository-url>
    cd gensie
    uv sync --group dev
Create a .env file to configure your inference backend:
    OPENAI_API_KEY="your-api-key"
    OPENAI_BASE_URL="http://localhost:1234/v1" # Optional: for local LLMs
Start the FastAPI server:
    uv run gensie serve --port 8000
Evaluate your agent against the 40 starter instances:
    uv run gensie eval --data data/starter/ --url http://localhost:8000 --pipeline baseline --model gpt-4o-mini
- Inherit from GenSIEAgent: Implement your extraction logic in src/gensie/.
- Register your Pipelines: Configure up to 3 pipelines in OfficialParticipant (see src/gensie/baseline.py).
- Submit: Open a Competition Submission Issue to register your team and repository.
- Dockerize: Use the provided Dockerfile and docker-compose.yml for testing and final submission.
    docker compose up --build
The kit includes 40 silver-generated instances for initial testing. Official metrics use Flattened Schema Scoring (Micro-F1), which combines exact matches for rigid fields and semantic similarity for free-text fields.
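To build intuition for the metric before running the official evaluator, here is a rough Python sketch of what a flattened, micro-averaged F1 could look like. It is not the official scorer. The helper names (`flatten`, `field_score`, `micro_f1`), the path notation (`a.b[0]`), and the `free_text_paths` parameter are all hypothetical, and `difflib.SequenceMatcher` is only a character-level stand-in for whatever semantic similarity the organizers actually use.

```python
from difflib import SequenceMatcher


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {"a.b[0]": leaf_value} pairs."""
    items = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(value, path))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{index}]"))
    else:
        items[prefix] = obj
    return items


def field_score(gold, pred, free_text=False):
    """Score one leaf in [0, 1].

    Assumption: free-text leaves get a soft similarity score (character-level
    here, NOT true semantic similarity); rigid leaves must match exactly,
    including their type.
    """
    if free_text and isinstance(gold, str) and isinstance(pred, str):
        return SequenceMatcher(None, gold.lower(), pred.lower()).ratio()
    return 1.0 if type(gold) is type(pred) and gold == pred else 0.0


def micro_f1(pairs, free_text_paths=frozenset()):
    """Micro-F1 over all flattened leaves of all (gold, pred) pairs."""
    matched, n_gold, n_pred = 0.0, 0, 0
    for gold, pred in pairs:
        gold_flat, pred_flat = flatten(gold), flatten(pred)
        n_gold += len(gold_flat)
        n_pred += len(pred_flat)
        # Only paths present on both sides can earn credit.
        for path in gold_flat.keys() & pred_flat.keys():
            matched += field_score(
                gold_flat[path], pred_flat[path], path in free_text_paths
            )
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: one wrong rigid field and one missing list element.
gold = {"name": "Ana", "age": 30, "tags": ["a", "b"]}
pred = {"name": "Ana", "age": 31, "tags": ["a"]}
print(round(micro_f1([(gold, pred)]), 3))  # 0.571
```

In this toy case two of the three predicted leaves match two of the four gold leaves, so precision is 2/3, recall is 1/2, and F1 is 4/7. Micro-averaging pools leaf counts across all instances before computing F1, so instances with larger schemas weigh more than they would under a per-instance average.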
For more details, see our guides:
This starter kit is licensed under the MIT License.