name: 🚀 Feature Request
about: Suggest an idea or a new capability for FireForm.
title: "[FEAT]: Batch LLM extraction to reduce N API calls to 1"
labels: enhancement
assignees: ''
📝 Description
Currently, LLM.main_loop() makes one Ollama API call per form field.
For a form with N fields, this results in:
- N prompts
- N network round-trips
- N model inference cycles
- N JSON parses
This creates a scalability bottleneck, especially for multi-form submissions.
We propose adding a batch extraction method that sends all target fields in a single structured prompt and parses a single JSON response.
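As a rough illustration of what "a single structured prompt" could look like, here is a minimal sketch. The helper name `build_batch_prompt`, the prompt wording, and the example field names are all assumptions made for illustration; they are not taken from the FireForm codebase.

```python
import json


def build_batch_prompt(fields: list[str], transcript: str) -> str:
    """Build one prompt that asks for every field at once (hypothetical helper)."""
    # An empty template shows the model the exact keys expected in its reply.
    template = json.dumps({name: None for name in fields}, indent=2)
    return (
        "Extract the following fields from the transcript.\n"
        "Return ONLY a JSON object with exactly these keys. "
        "Use null for missing values and a JSON list for multiple values.\n\n"
        f"Expected JSON shape:\n{template}\n\n"
        f"Transcript:\n{transcript}\n"
    )


# Example field names and transcript are made up for this sketch.
prompt = build_batch_prompt(
    ["name", "incident_date"],
    "John reported a fire on May 3.",
)
print(prompt)
```

Embedding the expected keys as a JSON template is one possible design; listing the field names in plain text would also work, but a template makes the required output shape explicit.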
💡 Rationale
Because the per-field calls run sequentially, latency and inference load grow linearly with the number of fields.
Example:
- 1 form × 10 fields → 10 LLM calls
- 3 forms × 10 fields → 30 LLM calls
Reducing this to a single API call per form would:
- Improve performance
- Reduce response time
- Lower model inference overhead
- Improve scalability for multi-agency submissions
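The call counts above follow from simple arithmetic, sketched below. The function `llm_calls` is a hypothetical helper used only to make the before/after comparison concrete; note that batching gives one call per form, so three forms still need three calls.

```python
def llm_calls(forms: int, fields_per_form: int, batched: bool) -> int:
    """Number of LLM calls needed under each strategy (illustrative only)."""
    # Batched: one call per form. Sequential: one call per field per form.
    return forms if batched else forms * fields_per_form


print(llm_calls(1, 10, batched=False))  # 10
print(llm_calls(3, 10, batched=False))  # 30
print(llm_calls(3, 10, batched=True))   # 3
```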
🛠️ Proposed Solution
Introduce a new method, e.g., LLM.main_loop_batch():
- Build one structured prompt containing:
  - All target fields
  - Transcript/input text
- Instruct the model to return strict JSON
- Parse once
- Strip markdown code fences (```json) if present
- Handle null and list values
- Gracefully fall back to main_loop() if JSON parsing fails
- Logic change in src/
- New prompt for Mistral/Ollama
- Unit tests in tests/test_llm.py
- Update src/filler.py to use batch method
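The parsing and fallback steps listed above could be sketched roughly as follows. This is not the FireForm implementation: `strip_code_fences`, `parse_batch_reply`, and `extract_batch` are hypothetical names, and the model call and the sequential fallback are passed in as plain callables so that no particular Ollama client API is assumed.

```python
import json
import re
from typing import Any, Callable

# Matches an optional ```json ... ``` wrapper around the whole reply.
_FENCE = re.compile(r"^\s*```(?:json)?\s*(.*?)\s*```\s*$", re.DOTALL | re.IGNORECASE)


def strip_code_fences(text: str) -> str:
    """Return the reply without a surrounding markdown code fence, if any."""
    match = _FENCE.match(text)
    return match.group(1) if match else text.strip()


def parse_batch_reply(reply: str, fields: list[str]) -> dict[str, Any]:
    """Parse the reply once and keep only the requested fields."""
    data = json.loads(strip_code_fences(reply))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    # JSON null and missing keys both become None; lists pass through unchanged.
    return {name: data.get(name) for name in fields}


def extract_batch(
    prompt: str,
    fields: list[str],
    call_model: Callable[[str], str],
    sequential_fallback: Callable[[], dict[str, Any]],
) -> dict[str, Any]:
    """One model call per form; fall back to the per-field path on bad JSON."""
    reply = call_model(prompt)
    try:
        return parse_batch_reply(reply, fields)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return sequential_fallback()
```

In this sketch, `sequential_fallback` stands in for the existing main_loop() path. Keeping it as an injected callable is a design choice that makes the fallback easy to replace with a stub in unit tests.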
✅ Acceptance Criteria
- Exactly 1 API call per form regardless of field count (when the model returns valid JSON)
- Batch method correctly populates all fields
- Handles null and list values correctly
- Gracefully falls back to sequential method if invalid JSON is returned
- Feature works in Docker container
- Documentation updated in docs/
- JSON output validates against the schema
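The call-count and fallback criteria could be checked with a fake model that counts invocations, as in the sketch below. Everything here is an assumption for illustration: `CountingModel` is a hypothetical test double, and `extract_fields` is a simplified stand-in for the batch method, not the real code under test.

```python
import json


class CountingModel:
    """Fake model client that records how many times it is called."""

    def __init__(self, reply: str):
        self.reply = reply
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return self.reply


def extract_fields(fields, transcript, call_model):
    """Simplified stand-in for the batch method under test (hypothetical)."""
    reply = call_model(f"Fields: {fields}\nTranscript: {transcript}")
    try:
        data = json.loads(reply)
    except ValueError:
        # Stand-in for the sequential path: one extra call per field.
        return {name: call_model(name) for name in fields}
    return {name: data.get(name) for name in fields}


def test_single_call_regardless_of_field_count():
    for n in (1, 10, 50):
        fields = [f"f{i}" for i in range(n)]
        model = CountingModel(json.dumps({name: None for name in fields}))
        result = extract_fields(fields, "text", model)
        assert model.calls == 1
        assert set(result) == set(fields)


def test_falls_back_on_invalid_json():
    model = CountingModel("not json")
    result = extract_fields(["a", "b"], "text", model)
    # One failed batch call plus one sequential call per field.
    assert model.calls == 3
    assert set(result) == {"a", "b"}
```

In a real test in tests/test_llm.py, the stand-in would be replaced by the actual batch method, with the fake injected wherever the project makes its model call.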
📌 Additional Context
Before:
N fields → N API calls
After:
N fields → 1 API call
This change cuts the number of LLM calls per form from N to 1, and it stays backward compatible because main_loop() remains available as the fallback path.