Free-Document-Extractor

An AI tool that extracts text and named entities from PDF documents using Python, pdfplumber, and spaCy

🧠 Free Document Extractor

This is a simple AI-powered tool built using Python that:

Extracts text from PDF documents
Identifies useful entities like names, dates, locations, and more using spaCy

📄 Features

Extracts plain text from text-based .pdf files using pdfplumber (scanned, image-only PDFs have no text layer to extract)
Performs Named Entity Recognition (NER) using spaCy's en_core_web_sm model
Outputs both the full text and the named entities to a .txt file
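The features above can be sketched roughly as follows. This is a minimal illustration, not the contents of document_extractor.py: the function names extract_text, extract_entities, and main are hypothetical, and the only library calls assumed are pdfplumber.open, page.extract_text, spacy.load, and the doc.ents attribute.

```python
def extract_text(pdf_path):
    """Return the text of every page in the PDF, joined by newlines."""
    import pdfplumber  # imported lazily so the other helpers work without it

    with pdfplumber.open(pdf_path) as pdf:
        # A page with no text layer may yield None or an empty string
        pages = [page.extract_text() or "" for page in pdf.pages]
    return "\n".join(pages)


def extract_entities(text, nlp):
    """Run a spaCy pipeline over the text and return (entity, label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def main(pdf_path):
    """Hypothetical entry point: extract text, then print each named entity."""
    import spacy

    nlp = spacy.load("en_core_web_sm")
    text = extract_text(pdf_path)
    for entity, label in extract_entities(text, nlp):
        print(f"{entity} ({label})")
```

Passing the loaded pipeline into extract_entities, rather than loading it inside the function, keeps the model load to a single call when several documents are processed.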

🛠️ Built With

Python 3.10+
pdfplumber
spaCy

💡 How to Run

Install dependencies:

pip install pdfplumber spacy
python -m spacy download en_core_web_sm
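To confirm the installation worked before running the script, a small check like the one below may help. It is only a sketch: missing_dependencies is a hypothetical helper, and it relies on the fact that the spaCy download command installs en_core_web_sm as an importable Python package.

```python
import importlib.util


def missing_dependencies(names=("pdfplumber", "spacy", "en_core_web_sm")):
    """Return the names that cannot be imported in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All dependencies found.")
```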

Run the script:

python document_extractor.py

Enter the full path to your PDF file (e.g. C:/Users/Name/Desktop/sample_resume.pdf)

📦 Output

Extracted text saved in extracted_info.txt

Named entities displayed in terminal
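One way the output step could look is sketched below. The helper name write_report and the section markers in the file are assumptions for illustration; only the default file name extracted_info.txt comes from this README.

```python
from pathlib import Path


def write_report(text, entities, out_path="extracted_info.txt"):
    """Write the full text, then one 'entity (LABEL)' line per named entity."""
    lines = ["=== Extracted Text ===", text, "", "=== Named Entities ==="]
    lines += [f"{entity} ({label})" for entity, label in entities]
    path = Path(out_path)
    # UTF-8 avoids encoding errors on Windows when the PDF has non-ASCII text
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

Here entities is expected to be a list of (text, label) pairs, for example [("Laiba Azhar", "PERSON")].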

📫 Contact

Made with ❤️ by Laiba Azhar
GitHub: laibatechX
