Job List Harvester

HarvestOrb - Web Scraping: Automated LinkedIn Job Searching Project

I started this project because scrolling through and searching job posts on a job site is time-consuming. More importantly, some job postings are out of date.

My first approach was to use BeautifulSoup to extract job posts from LinkedIn. It did not work because LinkedIn prevents web crawlers from indexing its job listing pages and blocks most web crawlers.
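One common way a site tells crawlers what they may fetch is its robots.txt file. The sketch below shows how such rules can be checked with Python's standard library. The rules and bot names are made up for illustration and are not LinkedIn's actual file, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt rules for illustration only; NOT LinkedIn's actual file.
# One named crawler is allowed in, every other crawler is shut out.
EXAMPLE_ROBOTS_TXT = """\
User-agent: FriendlyBot
Allow: /jobs/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named crawler may fetch job pages; an unlisted one may not.
print(is_allowed(EXAMPLE_ROBOTS_TXT, "FriendlyBot", "https://example.com/jobs/123"))
print(is_allowed(EXAMPLE_ROBOTS_TXT, "MyScraper", "https://example.com/jobs/123"))
```

A site can also block crawlers in other ways, such as requiring a login, so this check only covers the robots.txt part.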

After some research, I found that Selenium WebDriver does the trick, because Selenium is not a web scraping tool. It is a browser automation tool built for testing, so it drives a real browser the same way a user would. With Selenium, I don't have to worry about making any unauthorized or suspicious requests to a webpage.

As a result, I scraped the data I wanted, especially each job's posting date, and saved it to an HTML file.
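Here is a minimal sketch of the saving step, assuming each scraped job is a dictionary of strings. The column names (`title`, `company`, `location`, `posted`) and the function names are hypothetical and may not match the project's actual code.

```python
from html import escape
from pathlib import Path

# Hypothetical column names; the project's real fields may differ.
COLUMNS = ["title", "company", "location", "posted"]


def render_jobs_table(jobs: list[dict[str, str]]) -> str:
    """Build a standalone HTML page with one table row per job."""
    header = "".join(f"<th>{escape(col.title())}</th>" for col in COLUMNS)
    rows = []
    for job in jobs:
        # Escape every value so characters like < and & cannot break the page.
        cells = "".join(f"<td>{escape(job.get(col, ''))}</td>" for col in COLUMNS)
        rows.append(f"<tr>{cells}</tr>")
    body = "\n".join(rows)
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Job list</title></head><body>\n"
        f'<table border="1">\n<tr>{header}</tr>\n{body}\n</table>\n'
        "</body></html>\n"
    )


def save_jobs(jobs: list[dict[str, str]], path: str) -> None:
    """Write the rendered table to an HTML file at path."""
    Path(path).write_text(render_jobs_table(jobs), encoding="utf-8")
```

For example, `save_jobs(jobs, "jobs.html")` would write the table to a file that opens in any browser. Missing fields are rendered as empty cells.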

Steps Overview

The automation uses Selenium and BeautifulSoup to extract jobs from the job site. Behind the scenes, it performs these steps:

1. Visit the LinkedIn page
2. Click the 'Sign In' button
3. Enter account credentials
4. Click the 'Log In' or 'Sign In' button
5. Click the 'Jobs' tab
6. Enter a title and location in the text fields (This can be skipped)
7. Click the 'Search' button
8. Click the 'All filters' button (This can be skipped)
9. Check any or all preferred boxes (This can be skipped)
10. Click 'Show results' (This can be skipped)
11. Click a job post shown in the left panel
    a. At the same time, the right panel shows more information about the company, job type, description, and requirements
    b. Repeat steps 11 and 11a until the end of the job list
12. Click the 'Reset' button (This can be skipped)
13. Click 'Sign Out'
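The job-post loop in the steps above (click a post in the left panel, read the right panel, repeat until the list ends) can be sketched roughly as follows. This is an illustration, not the project's actual code. The CSS selectors are hypothetical placeholders because LinkedIn's real markup differs and changes often, and `harvest_jobs` is a made-up name. The function expects a Selenium WebDriver that is already signed in and showing search results.

```python
import time

# Selenium's By.CSS_SELECTOR constant is the plain string "css selector";
# it is spelled out here so this sketch does not need to import Selenium.
CSS = "css selector"

# Hypothetical selectors; replace them with the ones found on the live page.
CARD_SELECTOR = "ul.jobs-list li.job-card"
DETAIL_SELECTOR = "div.job-details"


def harvest_jobs(driver, pause_seconds=1.0, limit=25):
    """Click each job card in the left panel and collect the right panel's text."""
    results = []
    cards = driver.find_elements(CSS, CARD_SELECTOR)
    for card in cards[:limit]:
        card.click()               # select the job post in the left panel
        time.sleep(pause_seconds)  # crude wait for the right panel to refresh
        detail = driver.find_element(CSS, DETAIL_SELECTOR)
        results.append({"summary": card.text, "details": detail.text})
    return results
```

With Selenium installed, the driver would typically come from `webdriver.Chrome()`. A fixed `time.sleep` keeps the sketch short; an explicit wait such as Selenium's `WebDriverWait` is usually more reliable on a page that loads content dynamically.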

The video shows the console output while the automation runs.

It also shows the table of jobs saved in the HTML file.

Looping through 25 job posts takes some time.

Feel free to skip to 3:30 to view the list of jobs stored in the HTML file.