Hirely is a portfolio-grade resume screening platform built with Streamlit and a hybrid NLP ranking engine. It parses resumes, extracts skills from a large skill taxonomy, computes semantic relevance to a job description, and ranks candidates with transparent scoring and persisted results in SQLite.
Hirely helps recruiters and hiring teams:
- Upload and parse PDF resumes.
- Store jobs, candidates, and ranking outputs in a local database.
- Rank candidates using hybrid scoring (semantic relevance + required skill coverage).
- Inspect candidate-level explanations for decision support.
- Evaluate ranking quality with benchmark metrics.
Key features:
- Multi-page Streamlit dashboard for the end-to-end screening workflow, including a Ranker Evaluation page.
- Dynamic skill extraction using a curated JSON taxonomy (`data/skills.json`) with 236 real-world skills.
- Hybrid ranking engine:
  - Semantic similarity (TF-IDF + cosine similarity)
  - Skill match ratio against required skills
- Persistent storage via SQLite (`hirely.db`) with tables for jobs, candidates, and results.
- Evaluation module with Precision@K, Recall@K, and MRR.
- Explainable outputs showing semantic score, skill match %, and missing skills.
Architecture:

```
User Upload Resume
        ↓
PDF Parser (PyMuPDF)
        ↓
Text Preprocessing
        ↓
Skill Extraction (skills.json taxonomy)
        ↓
Semantic Similarity Engine (TF-IDF + Cosine)
        ↓
Hybrid Ranking Engine
        ↓
SQLite Storage (jobs, candidates, results)
        ↓
Streamlit Dashboard
```
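The semantic-similarity stage can be sketched in pure Python; this is an illustrative stand-in, not the pipeline's actual vectorizer (the real implementation likely uses scikit-learn with its own tokenization and stop-word settings):

```python
# Minimal TF-IDF + cosine similarity in pure Python.
# Illustrative only: tokenization, stop-word handling, and idf smoothing
# in the real pipeline may differ.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build smoothed TF-IDF vectors (idf = ln((1+n)/(1+df)) + 1) per doc."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter()
    for tokens in tokenized:
        for term in set(tokens):
            df[term] += 1
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in df}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
            for tokens in tokenized]

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def semantic_similarity(job_text, resume_text):
    """TF-IDF cosine similarity in [0, 1] between job and resume text."""
    job_vec, resume_vec = tfidf_vectors([job_text, resume_text])
    return cosine(job_vec, resume_vec)
```

Identical texts score 1.0, texts with no shared terms score 0.0, and partial overlap lands in between.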
- PDF parsing (`resume_parser.py`) extracts text from uploaded resumes.
- Preprocessing (`ml_pipeline.py`) normalizes and tokenizes text.
- Skill extraction (`skill_extractor.py`) matches normalized terms from `data/skills.json`.
- Semantic scoring computes cosine similarity between the job text and each candidate profile.
- Hybrid ranking blends semantic similarity and skill coverage.
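The skill-extraction step can be sketched as word-boundary matching against the taxonomy. The taxonomy is passed in directly here for illustration; the matching rules in `skill_extractor.py` may differ:

```python
# Sketch of taxonomy-based skill extraction with word-boundary matching.
# In the app the taxonomy would be loaded from data/skills.json.
import re

def extract_skills(text: str, taxonomy: list[str]) -> list[str]:
    """Return taxonomy skills found in the text, preserving taxonomy order."""
    lowered = text.lower()
    found = []
    for skill in taxonomy:
        # \b boundaries keep "Java" from matching inside "JavaScript".
        if re.search(r"\b" + re.escape(skill.lower()) + r"\b", lowered):
            found.append(skill)
    return found
```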
For each candidate:
- `semantic_similarity` ∈ [0, 1]
- `skill_match_ratio` ∈ [0, 1]
Final score:
```
final_score = (0.7 * semantic_similarity + 0.3 * skill_match_ratio) * 100
```
Both signals stay in [0, 1] until the final step, which avoids misleading intermediate scaling and normalizes to 0–100 only at the end.
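Written as a function, with the weights taken from the formula above:

```python
# Hybrid score: 70% semantic relevance, 30% required-skill coverage.
def final_score(semantic_similarity: float, skill_match_ratio: float) -> float:
    """Blend two [0, 1] signals and scale to 0-100 only at the end."""
    return (0.7 * semantic_similarity + 0.3 * skill_match_ratio) * 100
```

The 0.7/0.3 split lets a semantically strong candidate with partial skill coverage still surface, while full-coverage matches rank higher.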
Database tables:

- `jobs`: title, description, required skills
- `candidates`: parsed resume text, cleaned text, extracted skills
- `results`: final scores + explainability fields
Database helper module: `database.py`.
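One possible shape for the schema, with column names inferred from the fields listed above; the authoritative DDL lives in `database.py`:

```python
# Illustrative SQLite schema sketch; column names are assumptions
# based on the fields the README describes, not the project's exact DDL.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    description TEXT NOT NULL,
    required_skills TEXT            -- e.g. comma-separated skill names
);
CREATE TABLE IF NOT EXISTS candidates (
    id INTEGER PRIMARY KEY,
    resume_text TEXT,               -- raw parsed text
    cleaned_text TEXT,              -- preprocessed text
    extracted_skills TEXT
);
CREATE TABLE IF NOT EXISTS results (
    id INTEGER PRIMARY KEY,
    job_id INTEGER REFERENCES jobs(id),
    candidate_id INTEGER REFERENCES candidates(id),
    semantic_score REAL,
    skill_match REAL,
    final_score REAL,
    missing_skills TEXT             -- explainability field
);
"""

def init_db(path: str = "hirely.db") -> sqlite3.Connection:
    """Create the tables if absent and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```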
`evaluation.py` includes a mini benchmark dataset with expected relevant candidates and computes:
- Precision@K
- Recall@K
- Mean Reciprocal Rank (MRR)
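The three metrics are small enough to sketch per query; `evaluation.py` may average them across the benchmark's queries:

```python
# Per-query ranking metrics. `ranked` is the candidate list in score order;
# `relevant` is the set of expected relevant candidates for the query.
def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for c in ranked[:k] if c in relevant) / k

def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of all relevant candidates found in the top k."""
    return sum(1 for c in ranked[:k] if c in relevant) / len(relevant)

def mrr(ranked: list[str], relevant: set[str]) -> float:
    """Reciprocal rank of the first relevant candidate (0.0 if none)."""
    for rank, c in enumerate(ranked, start=1):
        if c in relevant:
            return 1.0 / rank
    return 0.0
```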
Current benchmark output (`python evaluation.py`, k=2):
| precision@k | recall@k | mrr |
|---|---|---|
| 0.75 | 1.00 | 1.00 |
Run:

```
python evaluation.py
```

| Rank | Candidate | Semantic % | Skill Match % | Final Match % | Missing Skills |
|---|---|---|---|---|---|
| 1 | Alice Johnson | 84.3 | 100.0 | 88.99 | None |
| 2 | Ben Torres | 73.6 | 80.0 | 75.52 | Hugging Face Transformers |
| 3 | Carla Smith | 21.0 | 0.0 | 14.70 | Python, SQL, NLP, MLOps |
Add screenshots by running the app locally and capturing the Dashboard, Ranking, and Insights pages.
Install and run:

```
pip install -r requirements.txt
streamlit run app.py
```

Project structure:

```
hirely/
├── app.py
├── ml_pipeline.py
├── resume_parser.py
├── database.py
├── evaluation.py
├── skill_extractor.py
├── utils.py
├── data/
│   └── skills.json
├── requirements.txt
└── README.md
```
- Clean separation of UI, NLP pipeline, and persistence.
- Explainable and reproducible ranking outputs.
- Benchmark-oriented evaluation for technical review.
- Lightweight, deployable architecture with no heavyweight infrastructure.