This project is a web scraper that automates the process of extracting LinkedIn alumni data from a specific university. It collects information such as names, job titles, profile links, experience, education, and certifications.
1. Open a command prompt or terminal, clone the GitHub repository, and open the project folder:

   ```bash
   git clone https://github.com/notyouriiz/Linkedin_Scraper.git
   cd Linkedin_Scraper
   ```
2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
3. Run the scraper:

   - Using Python:

     ```bash
     python app.py
     ```

   - Using Flask:

     ```bash
     flask run
     ```

   - On a custom port:

     ```bash
     python -m flask --app app run --port=5050
     ```

   Then visit the local link that appears.
- Scraped data will be saved in `Data/Linkedin_SCU_Alumni_2025.csv`.
- Automated Login: Logs in automatically with stored credentials and attempts to bypass LinkedIn's bot detection.
- Manual Login: A safer option that avoids triggering LinkedIn's bot detection.
- Dynamic Scrolling: Loads more profiles dynamically for comprehensive data extraction.
- Profile Scraping: Extracts detailed information such as work experience, education, and certifications.
- Data Storage: Saves extracted data into a CSV file for further analysis.
- City-Based Search: Scrapes alumni based on city names from a predefined list.
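The dynamic-scrolling feature above is typically implemented in Selenium as a loop that scrolls to the bottom of the results page until the page height stops growing. The sketch below is illustrative, not the project's actual code; the function name and timing values are assumptions:

```python
import time

def scroll_to_load_all(driver, pause=2.0, max_rounds=20):
    """Scroll to the page bottom repeatedly so lazily loaded profile
    cards render, until the document height stops growing."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give newly loaded cards time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:  # nothing new was loaded; stop
            break
        last_height = new_height
    return last_height
```

The `max_rounds` cap keeps the loop from running forever on pages that keep growing, and the pause between scrolls also makes the traffic look less bot-like.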
Before running the scraper, ensure you have the following dependencies installed:
```bash
pip install -r requirements.txt
```

- `selenium` – Automates web interactions.
- `webdriver-manager` – Manages WebDriver installations.
- `beautifulsoup4` – Parses HTML content.
- `lxml` – Faster XML and HTML parsing.
- `pandas` – Data manipulation and analysis.
- `numpy` – Numerical computing.
- `openpyxl` – Reads and writes Excel files.
- `python-dotenv` – Loads environment variables.
- Install Dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Set Up Credentials

  Create a `.env` file in the project directory and add your LinkedIn credentials:

  ```
  LINKEDIN_EMAIL=your-email@example.com
  LINKEDIN_PASSWORD=your-password
  ```

- Download ChromeDriver

  Ensure you have the appropriate version of ChromeDriver installed. The script will attempt to download it automatically using `webdriver-manager`.

- Prepare City List

  Ensure that the `Data/Person Locations/indonesia_cities.csv` file contains a list of cities in a column named `City`.
- Prepare Class Code

  ```python
  ('div', {
      'class': 'YqprdwMdlHkSDMqLRuVsNMDuqpfpOSlCY EUugwXMAWHNSsJUZCvVoLYGTUzCejokiBUPPY aDbiGyAraCVAtqkDKUGRiLuhDZgkXmYiMA'  # make sure this class code is up to date
  })
  ```

  Make sure the class code from your LinkedIn page is up to date; the class code in the program may differ, because LinkedIn generates these section class names dynamically. This is how the script gets data from the Experience, Education, and Licenses & Certifications sections; the class code shown here is the one used for location information.

  💡 Tip: while inspecting with the cursor, place your cursor on the border of the section to find its current class name.
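For illustration, such a `('div', {'class': ...})` pair can be passed to BeautifulSoup's `find()` to pull out a section. The HTML snippet and class token below are made up, since the real auto-generated class names change over time:

```python
from bs4 import BeautifulSoup

# Illustrative HTML; a real LinkedIn page uses long auto-generated class
# names that change periodically, which is why the code must stay up to date.
html = '<div class="YqprdwMdlHkSDMqLRuVsNMDuqpfpOSlCY">Software Engineer at Example Corp</div>'

soup = BeautifulSoup(html, "html.parser")
# The same tag/attribute pair from the config is passed to find():
section = soup.find("div", {"class": "YqprdwMdlHkSDMqLRuVsNMDuqpfpOSlCY"})
print(section.get_text(strip=True))
```

If `find()` returns `None`, the class name on the live page has most likely changed and needs to be re-copied from the browser's inspector.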
Run the script with the following command:

```bash
python main.py
```

- The script will prompt you to log in manually to LinkedIn.
- After logging in, press Enter in the terminal to continue.
- Alternatively, the script can log in to LinkedIn automatically; make sure the email and password in your `.env` file are correct.
- Don't scrape too much at once, or LinkedIn's anti-scraping system will notice the unusual requests and your account may be restricted.
- Press Enter to continue scraping the next profile.
- Type `next` to skip to the next city.
- Type `exit` to stop the script immediately.
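The interactive commands above map naturally onto a small dispatch helper. This is only a sketch; `handle_command` is a hypothetical name, not necessarily how `main.py` is structured:

```python
def handle_command(command):
    """Translate the interactive prompt input into an action:
    'next' skips to the next city, 'exit' stops the script,
    and plain Enter (empty input) continues with the next profile."""
    command = command.strip().lower()
    if command == "next":
        return "skip_city"
    if command == "exit":
        return "stop"
    return "continue"
```

Normalizing with `strip().lower()` keeps stray whitespace or capitalization from being misread as a profile-continue.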
The extracted data will be saved in `Data/LinkedIn_SCU_Alumni.csv` with the following fields:
- City
- Name
- Job Title
- LinkedIn Profile Link
- Profile Picture URL
- Experience
- Education
- Licenses & Certifications
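As a sketch of the data-storage step, a record with these fields can be written to the CSV with pandas. The record values and the `os.makedirs` call are illustrative assumptions, not the project's actual code:

```python
import os
import pandas as pd

# Hypothetical scraped record matching the CSV fields listed above.
record = {
    "City": "Jakarta",
    "Name": "Jane Doe",
    "Job Title": "Data Analyst",
    "LinkedIn Profile Link": "https://www.linkedin.com/in/example",
    "Profile Picture URL": "https://example.com/photo.jpg",
    "Experience": "Data Analyst at Example Corp",
    "Education": "Example University",
    "Licenses & Certifications": "Google Data Analytics",
}

os.makedirs("Data", exist_ok=True)  # ensure the output folder exists
df = pd.DataFrame([record])
df.to_csv("Data/LinkedIn_SCU_Alumni.csv", index=False)
```

Writing with `index=False` keeps pandas from adding an extra unnamed index column to the CSV.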
- Scraping LinkedIn data is against their terms of service; use this tool responsibly.
- Avoid running the script too frequently to prevent detection.
- Ensure your LinkedIn account is in good standing before scraping.
Author: Faiz Noor Adhytia
Contact: faizadhytia24@gmail.com
