A powerful tool to validate and enrich BibTeX entries using metadata from Crossref, arXiv, and Google Scholar. It helps ensure your bibliography is accurate, complete, and up-to-date.
- 🔍 DOI Validation: Automatically verifies DOIs against the Crossref API.
- 📄 arXiv Integration: Detects arXiv preprints and fetches updated metadata or official publication DOIs.
- ✨ Smart Enrichment: Fills in missing fields using metadata from multiple reliable academic sources:
  - Crossref - For official DOI metadata.
  - arXiv - For preprint information and updates.
  - Google Scholar - For citation and missing metadata (via scholarly).
  - DBLP - For computer science bibliography.
  - Semantic Scholar - For AI-powered research paper data.
  - PubMed - For biomedical literature.
  - Zenodo - For general repositories and datasets.
  - DataCite - For data DOI registry.
  - OpenAlex - For comprehensive academic metadata.
- ⚖️ Dual Validation: Compares your local BibTeX data with partial API data to highlight conflicts.
- 🖥️ Interactive GUI: A modern web-based interface to review, accept, or reject changes visually with color-coded badges and intuitive controls.
- 📊 Report Generation: Produces detailed validation reports.
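
The DOI validation and arXiv detection features above can be illustrated with a minimal sketch. The helper names here (`normalize_doi`, `crossref_works_url`, `extract_arxiv_id`) are hypothetical and are not taken from the tool's source; the only external facts assumed are the public Crossref REST endpoint `https://api.crossref.org/works/<doi>` and the new-style arXiv identifier format (`YYMM.NNNNN`). The arXiv pattern is a heuristic and can produce false positives on unrelated numbers.

```python
import re
from typing import Optional
from urllib.parse import quote

# Hypothetical helpers for illustration; not the tool's actual functions.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

# New-style arXiv identifiers such as 2101.00001 or 2101.00001v2.
ARXIV_ID = re.compile(r"(?<!\d)(\d{4}\.\d{4,5})(v\d+)?(?!\d)")


def normalize_doi(raw: str) -> str:
    """Strip common resolver prefixes and lowercase (DOIs are case-insensitive)."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def crossref_works_url(doi: str) -> str:
    """Build the Crossref REST API URL for a single work."""
    return "https://api.crossref.org/works/" + quote(normalize_doi(doi), safe="/")


def extract_arxiv_id(text: str) -> Optional[str]:
    """Return the first new-style arXiv ID in text, without its version suffix."""
    match = ARXIV_ID.search(text)
    return match.group(1) if match else None
```

A validator could request the URL returned by `crossref_works_url` and treat an HTTP 404 as an unknown DOI; the request itself is left out so the sketch runs without network access.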
This project is managed with uv. Ensure you have uv installed on your system.
Scenario 1: If you are writing a paper and simply want to use this tool to validate your references.bib file without modifying the tool's code:

Option A: Install as a standalone tool. This installs the tool in an isolated environment, making bibtex-validator available globally or for your specific project without polluting its dependencies.

    # Install directly from the repository
    uv tool install git+https://github.com/bet-lab/reference-validator.git

Run validation:

    bibtex-validator references.bib --gui

Option B: If you are already managing your paper's environment (e.g., for processing scripts) using uv or a pyproject.toml:

    # Add to your existing project
    uv add git+https://github.com/bet-lab/reference-validator.git

Run validation:

    uv run bibtex-validator references.bib

Scenario 2: If you want to modify the source code, fix bugs, or add new features:
- Clone the repository:

      git clone https://github.com/bet-lab/reference-validator.git
      cd reference-validator

- Sync dependencies: Run `uv sync` to create a virtual environment and install all dependencies (including dev dependencies).

      uv sync

- Run from source: You can run the script using `uv run`.

      uv run bibtex-validator references_test.bib --gui

Basic Validation (Dry Run): Checks the file and prints a report without making changes.

    # If installed via Scenario 1 (Option A)
    bibtex-validator references.bib
    # If installed via Scenario 1 (Option B) or Scenario 2
    uv run bibtex-validator references.bib

Auto-Update BibTeX File: Validates and applies enriched metadata directly to your file.

    uv run bibtex-validator references.bib --update

Save Update to New File: Keeps the original file intact and saves the updated version to a new file.

    uv run bibtex-validator references.bib --update --output references_enriched.bib

Save Report to File:

    uv run bibtex-validator references.bib --report validation_report.txt

Launch the modern web-based review interface to manually inspect and approve changes.

    uv run bibtex-validator references.bib --gui

Once running, the web interface opens automatically in your default browser (default address: http://127.0.0.1:8010).

📊 Validation Summary Dashboard
- Attention pie chart showing percentage of entries needing review
- Real-time statistics: Reviews, Conflicts, Differences, Identical
- Global Action: ✅ Accept All Entries
🔍 Field-by-Field Comparison
- Side-by-side comparison of BibTeX vs API values
- Color-coded status badges for each field:
  - Review - New data available
  - Conflict - Significant mismatch
  - Different - Minor formatting difference
  - Identical - Verified match
- Source selection dropdown for fields with multiple data sources
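
The four status badges can be read as a small decision procedure. The following is a rough sketch only: `classify_field`, the normalization rules, and the 0.9 similarity cutoff are assumptions made for illustration and may differ from how the tool actually decides between "Different" and "Conflict".

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Assumed cutoff: below this similarity, a mismatch counts as significant.
CONFLICT_THRESHOLD = 0.9


def _normalize(value: str) -> str:
    """Drop BibTeX braces, collapse whitespace, and ignore case."""
    value = value.replace("{", "").replace("}", "")
    return re.sub(r"\s+", " ", value).strip().lower()


def classify_field(local: Optional[str], remote: Optional[str]) -> str:
    """Return 'review', 'conflict', 'different', 'identical', or 'none'."""
    if not remote:
        return "none"  # the API offers nothing to compare against
    if not local:
        return "review"  # new data available for a missing field
    if local == remote:
        return "identical"  # verified match
    a, b = _normalize(local), _normalize(remote)
    if a == b:
        return "different"  # only formatting differs
    similarity = SequenceMatcher(None, a, b).ratio()
    return "different" if similarity >= CONFLICT_THRESHOLD else "conflict"
```

For example, `classify_field("{Deep} Learning", "Deep learning")` yields `"different"` because the values match once braces and case are ignored, while `classify_field("2019", "2020")` yields `"conflict"`.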
⌨️ Keyboard Navigation
- Arrow keys (← →) to navigate between entries
- Home/End to jump to first/last entry
- PageUp/PageDown to jump by 10 entries
- Esc to clear selection
🎯 Flexible Actions
- Accept/Reject individual fields
- Bulk actions: Reject All / Accept All per entry
- Global batch approval for all entries
- Real-time save with visual feedback
    usage: validate_bibtex.py [-h] [-o OUTPUT] [-r REPORT] [-u] [-d DELAY] [--no-progress] [--gui] [--workers WORKERS] [--port PORT] bib_file

    Validate and enrich BibTeX entries using DOI, arXiv, and Google Scholar

    positional arguments:
      bib_file              Input BibTeX file

    optional arguments:
      -h, --help            show this help message and exit
      -o OUTPUT, --output OUTPUT
                            Output BibTeX file (default: same as input if --update)
      -r REPORT, --report REPORT
                            Output report file (default: print to stdout)
      -u, --update          Update BibTeX file with enriched data
      -d DELAY, --delay DELAY
                            Delay between API requests in seconds (default: 1.0)
      --no-progress         Hide progress indicators
      --gui                 Launch web-based GUI interface 🖥️
      --workers WORKERS     Number of threads for parallel validation (default: 10)
      --port PORT           Port for GUI web server (default: 8010)

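
For readers who want to see how these options fit together, here is a minimal sketch of an `argparse` parser that would accept the same flags and defaults as the help text above. It assumes the CLI is built with `argparse` (the help layout suggests so, but this is not confirmed). `build_parser` and `resolve_output` are hypothetical names, and the rule that `--output` has no effect without `--update` is an assumption.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Recreate the documented options; a sketch, not the tool's source."""
    parser = argparse.ArgumentParser(
        prog="validate_bibtex.py",
        description="Validate and enrich BibTeX entries using DOI, arXiv, and Google Scholar",
    )
    parser.add_argument("bib_file", help="Input BibTeX file")
    parser.add_argument("-o", "--output",
                        help="Output BibTeX file (default: same as input if --update)")
    parser.add_argument("-r", "--report",
                        help="Output report file (default: print to stdout)")
    parser.add_argument("-u", "--update", action="store_true",
                        help="Update BibTeX file with enriched data")
    parser.add_argument("-d", "--delay", type=float, default=1.0,
                        help="Delay between API requests in seconds (default: 1.0)")
    parser.add_argument("--no-progress", action="store_true",
                        help="Hide progress indicators")
    parser.add_argument("--gui", action="store_true",
                        help="Launch web-based GUI interface")
    parser.add_argument("--workers", type=int, default=10,
                        help="Number of threads for parallel validation (default: 10)")
    parser.add_argument("--port", type=int, default=8010,
                        help="Port for GUI web server (default: 8010)")
    return parser


def resolve_output(args: argparse.Namespace) -> Optional[str]:
    """Path the updated file would be written to, or None for a dry run.

    Assumption: --output is only honored together with --update.
    """
    if not args.update:
        return None
    return args.output or args.bib_file
```

With this sketch, `build_parser().parse_args(["references.bib", "--update"])` resolves the output path to `references.bib` itself, matching the documented default of overwriting the input when `--update` is given without `--output`.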
| Key | Action | Description |
|---|---|---|
| ← ↑ | Previous Entry | Navigate to previous entry |
| → ↓ | Next Entry | Navigate to next entry |
| Home | First Entry | Jump to first entry |
| End | Last Entry | Jump to last entry |
| PageUp | Jump Back 10 | Move backward by 10 entries |
| PageDown | Jump Forward 10 | Move forward by 10 entries |
| Esc | Clear Selection | Deselect current entry |

- 📚 Core Libraries
  - bibtexparser (>=1.4.0): For reading and writing BibTeX files.
  - requests (>=2.31.0): For making API calls to academic databases.
- 🔍 Data Sources
  - scholarly (>=1.7.0): For accessing Google Scholar data (optional).
- 🖥️ GUI Components
  - fastapi (>=0.104.0): Modern web framework for the GUI.
  - uvicorn[standard] (>=0.24.0): ASGI server for running the web interface.

These are automatically installed when using uv run or uv sync.
The validator integrates with multiple academic databases and metadata providers:
Crossref, arXiv, Semantic Scholar, DBLP, PubMed, Zenodo, DataCite, OpenAlex