🚀 DarkCrawler


✨ Overview

DarkCrawler is a high-performance, concurrent web crawler built with Python and Flask. It recursively scans websites, detects common admin paths, displays the site structure in a sleek terminal-style interface with a dark green theme, and saves the results as an interactive HTML report for easy analysis.


🖥️ Supported Operating Systems

  • Windows: full support (PowerShell and CMD)
  • macOS: full support (Terminal)
  • Linux: full support (Bash and Terminal)


🎯 Features

  • Recursive website crawling with configurable depth
  • Concurrent crawling using multithreading for blazing speed
  • Detection of common admin and sensitive paths
  • Terminal-style website structure display with intuitive file icons
  • Responsive and accessible web UI with smooth terminal text animation
  • Efficient handling of large websites with thread-safe data structures

🏗️ Architecture & Technical Details

  • Concurrency Model: Uses Python's threading module with a thread pool to fetch multiple URLs in parallel (a minimal sketch follows this list).
  • Queue Management: URLs to visit are managed in a thread-safe queue to avoid race conditions.
  • HTML Parsing: Utilizes BeautifulSoup for robust HTML parsing and link extraction.
  • Admin Path Detection: Checks URLs against a comprehensive list of common admin paths.
  • File Structure: Builds a hierarchical tree structure representing the website's files and directories.
  • Frontend: Flask serves a responsive UI with a terminal-style display and animated typing effect using JavaScript.
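
A minimal sketch of how these pieces could fit together. The names (crawl, worker, ADMIN_PATHS) are illustrative assumptions, not the project's actual functions:

import threading
import queue
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Example subset; the project checks a much longer list of common admin paths.
ADMIN_PATHS = {"/admin", "/login", "/wp-admin", "/administrator"}

def crawl(start_url, max_threads=10, max_depth=7):
    to_visit = queue.Queue()   # thread-safe work queue of (url, depth) pairs
    visited = set()            # shared set of already-seen URLs
    lock = threading.Lock()    # guards the shared visited set and flagged list
    flagged = []               # URLs whose path matches a known admin path
    base_netloc = urlparse(start_url).netloc
    to_visit.put((start_url, 0))

    def worker():
        while True:
            try:
                # Simplification: a worker exits after the queue stays empty for 2 s.
                url, depth = to_visit.get(timeout=2)
            except queue.Empty:
                return
            try:
                with lock:
                    if url in visited or depth > max_depth:
                        continue
                    visited.add(url)
                    if urlparse(url).path.rstrip("/").lower() in ADMIN_PATHS:
                        flagged.append(url)
                resp = requests.get(url, timeout=5)
                soup = BeautifulSoup(resp.text, "html.parser")
                for a in soup.find_all("a", href=True):
                    link = urljoin(url, a["href"])
                    if urlparse(link).netloc == base_netloc:
                        to_visit.put((link, depth + 1))
            except requests.RequestException:
                pass  # skip pages that fail to load
            finally:
                to_visit.task_done()

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(max_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return visited, flagged

The lock around the shared visited set is what keeps the data structures thread-safe when several workers discover the same link at once.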

🛠️ Installation

Clone the repository

git clone https://github.com/mxvirus117/DarkCrawler.git
cd DarkCrawler

Create and activate a virtual environment

Windows

python -m venv venv
venv\Scripts\activate

macOS / Linux

python3 -m venv venv
source venv/bin/activate

Install dependencies

pip install -r requirements.txt
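
If requirements.txt is missing from your checkout, the Technologies Used section below suggests it would contain at least the following (a sketch, not the project's exact pins):

flask
requests
beautifulsoup4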

🚀 Usage

Run the Flask server:

python server.py

Open your browser and navigate to:

http://localhost:5000

Enter the URL you want to scan and start the crawl.
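
server.py itself is not reproduced here; a hypothetical minimal version along these lines (route names, template name, and the crawl helper are assumptions) shows how the UI and crawler connect:

from flask import Flask, render_template, request, jsonify

from crawler import crawl  # hypothetical module containing the crawl sketch above

app = Flask(__name__)

@app.route("/")
def index():
    # Serves the terminal-style UI (template name assumed).
    return render_template("index.html")

@app.route("/crawl", methods=["POST"])
def start_crawl():
    # The form posts the target URL; results go back to the page as JSON.
    target = request.form.get("url", "")
    visited, flagged = crawl(target)
    return jsonify({"pages": sorted(visited), "admin_paths": flagged})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)

Port 5000 matches the localhost address above; adjust app.run() if you need a different host or port.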


⚙️ Configuration

Parameter     Description                                  Default
max_threads   Number of concurrent threads for crawling    10
max_depth     Maximum crawl depth                          7
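
Where these defaults live in the code is not documented here; in the style of the crawl sketch above, overriding them would simply be keyword arguments:

# Hypothetical call; parameter names mirror the table above.
visited, flagged = crawl("https://example.com", max_threads=20, max_depth=3)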

📚 How It Works

  1. The crawler starts from the given URL.
  2. It fetches pages concurrently using multiple threads.
  3. Parses HTML to find links and resources.
  4. Detects common admin paths and flags them.
  5. Builds a hierarchical tree of the website structure (a sketch of this step follows the list).
  6. Displays the results in a terminal-style UI with animated typing effect.
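
A hedged sketch of step 5, turning crawled URLs into a nested tree and printing it terminal-style (the real data structure, icons, and rendering live in the web UI and may differ):

from urllib.parse import urlparse

def build_tree(urls):
    # Nested dicts: each key is a path segment, each value holds its children.
    tree = {}
    for url in urls:
        node = tree
        for part in urlparse(url).path.strip("/").split("/"):
            if part:
                node = node.setdefault(part, {})
    return tree

def print_tree(node, indent=0):
    # Plain-text stand-in for the animated terminal display in the browser.
    for name, children in sorted(node.items()):
        icon = "📁" if children else "📄"
        print("  " * indent + f"{icon} {name}")
        print_tree(children, indent + 1)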

🧩 Technologies Used

  • Python 3.8+
  • Flask web framework
  • Requests for HTTP requests
  • BeautifulSoup for HTML parsing
  • Threading and Queue for concurrency
  • HTML, CSS, and JavaScript for frontend UI


🤝 Contributing

Contributions are welcome! Please open issues or submit pull requests.


📄 License

This project is licensed under the MIT License.


👤 Author

S4Tech | Mr.SaMi

GitHub LinkedIn Twitter
