Automated Resume Parser
Description: Create a system to extract and categorize information from resumes (PDFs/Docs).
Technologies: Python (spaCy, PDFPlumber), Flask, PostgreSQL
Outcome: Extract candidate details (name, skills, education) and store them in a searchable database.
A Flask-based web application that provides an API for parsing resumes and extracting useful information using the pyresparser library.
- Parses resumes in PDF format to extract key details such as name, email, phone, skills, experience, and education.
- Provides a simple API endpoint for uploading and processing resumes.
- Includes a health check endpoint to verify the application's status.
- Python 3.7 or later
- Virtual Environment (optional but recommended)
- Libraries specified in the requirements.txt file
- Clone this repository:
  git clone https://github.com/<your-username>/resume-parser-api.git
  cd resume-parser-api
- Set up a virtual environment (optional but recommended):
  python3 -m venv venv
  source venv/bin/activate  # On Windows, use venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Download NLTK stopwords:
  python -c "import nltk; nltk.download('stopwords')"
- Start the application:
  python app.py
  The application listens on 0.0.0.0:8030, so from the same machine it is reachable at http://localhost:8030.
- Test the health check endpoint:
  curl http://localhost:8030/ping
- Parse a resume:
  - Use a tool like Postman or curl to send a POST request to /resume-parser with a resume file.
    curl -X POST -F "resume=@path/to/resume.pdf" http://localhost:8030/resume-parser
  - Replace path/to/resume.pdf with the actual file path of your resume.
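For callers who would rather script the upload than use curl, the request can be built with the Python standard library alone. The following is a minimal sketch, not part of this repository: the helper name `build_upload_request` is hypothetical, and it assumes the endpoint accepts the same multipart form field (`resume`) that the curl command sends.

```python
import urllib.request
import uuid


def build_upload_request(url: str, filename: str, pdf_bytes: bytes) -> urllib.request.Request:
    """Build a multipart/form-data POST that carries a PDF in the 'resume' field.

    Hypothetical helper; the field name 'resume' mirrors the curl example.
    """
    boundary = uuid.uuid4().hex
    # Escape double quotes so the filename cannot break the header.
    safe_name = filename.replace('"', "%22")
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="resume"; filename="{safe_name}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + pdf_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; reading the response body then yields the JSON text returned by the server. Building the request separately from sending it keeps the helper easy to inspect without a running server.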
- Endpoint: /ping
- Method: GET
- Response:
  { "status": "Healthy" }
- Endpoint: /resume-parser
- Method: POST
- Parameters:
  - resume: A PDF file of the resume.
- Response: Extracted data in JSON format.
  Example:
  { "name": "John Doe", "email": "john.doe@example.com", "phone": "1234567890", "skills": ["Python", "Flask", "Machine Learning"], ... }
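On the client side, the two response shapes above can be handled with a few lines of standard-library Python. This is only a sketch: the function names `is_healthy` and `summarize_candidate` are hypothetical, and it assumes nothing beyond the fields shown in the examples. Because a parser may not find every field in every resume, missing keys are treated as absent rather than as errors.

```python
import json


def is_healthy(body: str) -> bool:
    """Return True if a /ping response body reports the documented status."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "Healthy"


def summarize_candidate(body: str) -> dict:
    """Pick out the fields shown in the example response.

    Keys that are missing from the response come back as None
    (or an empty list for skills) instead of raising KeyError.
    """
    payload = json.loads(body)
    return {
        "name": payload.get("name"),
        "email": payload.get("email"),
        "phone": payload.get("phone"),
        "skills": payload.get("skills") or [],
    }
```

For instance, `is_healthy('{ "status": "Healthy" }')` is true, while a non-JSON body such as an HTML error page yields false.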
The application uses Python's logging module to log debugging information and errors to the console.
Warnings are suppressed to avoid unnecessary clutter in the output. This can be adjusted by modifying the warnings.filterwarnings setting in the code.
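As an illustration of how these two settings typically fit together, here is a minimal sketch using only the standard library. It is not the code from app.py: the function name `configure_diagnostics`, the logger name `resume_parser`, and the log format are all assumptions made for the example.

```python
import logging
import sys
import warnings


def configure_diagnostics(debug: bool = True, suppress_warnings: bool = True) -> logging.Logger:
    """Set up console logging and the global warnings filter.

    Hypothetical helper; the logger name and format are illustrative.
    """
    logger = logging.getLogger("resume_parser")
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    # Attach the console handler only once so repeated calls
    # do not duplicate every log line.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # "ignore" hides all warnings; "default" shows each one once per location.
    warnings.filterwarnings("ignore" if suppress_warnings else "default")
    return logger
```

Calling `configure_diagnostics(suppress_warnings=False)` while debugging a dependency upgrade brings deprecation warnings back into the console output.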
This application can be deployed on platforms such as Heroku, AWS, or Google Cloud, or packaged as a Docker container and run on any container service.
- Fork this repository.
- Create a feature branch.
- Commit your changes and push.
- Open a pull request.
This project is licensed under the MIT License.
Twisha
Feel free to connect for questions or collaborations!