JobMatch is a job recommendation system that uses a user's resume to find the top 10 most similar jobs from LinkedIn, Indeed, and other job portals.

JobMatch - Job Recommendation System

Live application links

codelab

Demo

Technologies Used

Streamlit, FastAPI, Amazon AWS, Google Cloud, GitHub, Python, Pandas, NumPy, OpenAI, Snowflake, HuggingFace, Docker, Airflow, Selenium, Plotly, MongoDB, Pinecone

Overview

JobMatch is a job recommendation system designed to streamline the job search by centralizing listings from multiple portals and analyzing user-uploaded resumes. It provides tailored job recommendations from platforms such as LinkedIn, Indeed, and SimplyHired, and gives users direct links to the recommended job listings for a personalized and efficient search.

Problem Statement

Challenge:

The current job search process is tiresome and exhausting for job seekers. A user has to visit every job portal manually, browse the available jobs, and create a profile describing the target role, all of which consumes a lot of time. We wanted to optimize this process by building an application that collects jobs from different portals and filters them based on the user's resume, easing the search for job seekers.

Solution:

The objective of this project is to develop and deploy an efficient data engineering infrastructure for a job recommendation system, "JobMatch," that matches job seekers with relevant employment opportunities based on their uploaded resumes. Unlike traditional job search platforms, JobMatch aggregates jobs from several portals, such as LinkedIn, Indeed, and SimplyHired, analyzes the content of each user's resume, and recommends suitable openings. The system not only matches job seekers with relevant positions but also provides direct links to the recommended job listings.
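The "top 10 most similar jobs" idea can be illustrated with a minimal sketch. It assumes resumes and job postings have already been turned into embedding vectors and that similarity is measured by cosine similarity; both are assumptions, not confirmed details of JobMatch. The function names `cosine_similarity` and `top_k_jobs` and the toy vectors are hypothetical. In the real project this search is presumably delegated to the vector database listed under Technologies Used rather than done in plain Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_jobs(resume_vec, job_vecs, k=10):
    """Return up to k (job_id, score) pairs, highest similarity first.

    job_vecs is a hypothetical mapping of job id -> embedding vector.
    """
    scored = [
        (job_id, cosine_similarity(resume_vec, vec))
        for job_id, vec in job_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings", purely for illustration.
    jobs = {
        "data-engineer": [0.9, 0.1, 0.0],
        "frontend-dev": [0.1, 0.9, 0.2],
        "ml-engineer": [0.7, 0.2, 0.6],
    }
    resume = [0.8, 0.1, 0.3]
    for job_id, score in top_k_jobs(resume, jobs, k=2):
        print(f"{job_id}: {score:.3f}")
```

Ranking by cosine similarity ignores vector length, so a long and a short posting about the same role can score similarly; other distance metrics are equally plausible choices.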

Architecture:

[Architecture diagram; see architecture-diagram/flow_diagram.png in the project tree]

Codelab

Link: https://codelabs-preview.appspot.com/?file_id=1xOJo6D40dsWjctPaj2Z7uZlOG9cHrW0DRejiGDkK9XM#0

Project Tree

📦 
├─ .gitignore
├─ README.md
├─ airflow
│  ├─ .gitignore
│  ├─ Dockerfile
│  ├─ configuration.properties.example
│  ├─ dags
│  │  ├─ embed
│  │  │  ├─ configuration.properties.example
│  │  │  ├─ connections.py
│  │  │  └─ embed_and_upsert.py
│  │  ├─ extract
│  │  │  ├─ configuration.properties.example
│  │  │  ├─ connections.py
│  │  │  ├─ extract_indeed_jobs.py
│  │  │  ├─ extract_linkedin_jobs.py
│  │  │  └─ extract_simplyhired_jobs.py
│  │  ├─ load
│  │  │  ├─ configuration.properties.example
│  │  │  ├─ connections.py
│  │  │  └─ loading.py
│  │  ├─ pipeline_indeed.py
│  │  ├─ pipeline_linkedin.py
│  │  ├─ pipeline_simplyhired.py
│  │  └─ validate
│  │     ├─ configuration.properties.example
│  │     ├─ connections.py
│  │     └─ validation.py
│  ├─ docker-compose.yaml
│  └─ requirements.txt
├─ architecture-diagram
│  ├─ flow_diagram.ipynb
│  ├─ flow_diagram.png
│  └─ input_icons
│     ├─ huggingface.png
│     ├─ indeed.png
│     ├─ linkedin.png
│     ├─ openai.png
│     ├─ pdf.png
│     ├─ pinecone.png
│     ├─ streamlit.png
│     ├─ user-authentication.png
│     └─ user.png
├─ cloudbuild.yaml
├─ docker-compose.yaml
├─ fastapi-backend
│  ├─ Dockerfile
│  ├─ configuration.properties.example
│  ├─ connections.py
│  ├─ main.py
│  ├─ requirements.txt
│  ├─ routes
│  │  ├─ analyticsRoute.py
│  │  ├─ authRoute.py
│  │  └─ userRoutes.py
│  └─ test_api.py
├─ requirements.txt
├─ setup
│  └─ snowflake_objects.sql
└─ streamlit-frontend
   ├─ Dockerfile
   ├─ components
   │  ├─ analytics.py
   │  ├─ get_job_matches.py
   │  ├─ login_page.py
   │  ├─ signup_page.py
   │  └─ upload_page.py
   ├─ configuration.properties.example
   ├─ main.py
   └─ requirements.txt

Prerequisites

Before running this project, ensure you have the following prerequisites set up:

  • Python: Ensure Python is installed on your system.

  • Docker: Ensure Docker Desktop is installed on your system.

  • Virtual Environment: Set up a virtual environment to manage dependencies and isolate your project's environment from other Python projects. You can create a virtual environment using virtualenv or venv.

  • requirements.txt: Install the required Python dependencies by running the command:

    pip install -r requirements.txt
    
  • Config File: Set up the configuration.properties file with the necessary credentials and configurations.

  • Snowflake: Run setup/snowflake_objects.sql to create the required objects in Snowflake. Also ensure the credentials and settings for connecting to Snowflake are set in the configuration.properties file.

  • Google Cloud Platform: Create a Compute Engine instance on Google Cloud. Ensure the necessary credentials and configurations are set in the configuration.properties file.
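The Config File prerequisite can be partly automated. The sketch below walks a checkout, finds every configuration.properties.example (the project tree shows one in several directories), and copies it to configuration.properties next to it when that file does not exist yet. The function name `create_missing_configs` is hypothetical and the script is not part of the repository; the copied files still need real credentials filled in by hand.

```python
from pathlib import Path
import shutil

EXAMPLE_NAME = "configuration.properties.example"
TARGET_NAME = "configuration.properties"


def create_missing_configs(repo_root):
    """Copy each example config to configuration.properties where none exists.

    Existing configuration.properties files are never overwritten.
    Returns the list of files that were created.
    """
    created = []
    for example in sorted(Path(repo_root).rglob(EXAMPLE_NAME)):
        target = example.with_name(TARGET_NAME)
        if not target.exists():
            shutil.copyfile(example, target)
            created.append(target)
    return created


if __name__ == "__main__":
    # Run from the repository root.
    for path in create_missing_configs("."):
        print(f"created {path}")
```

Because existing files are skipped, the script can be re-run safely after pulling new changes.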

How to Run the Application Locally

To run the application locally, follow these steps:

  1. Clone the repository to get all the source code on your machine.

  2. Run source venv/bin/activate to activate the virtual environment.

  3. Create a configuration.properties file in every directory that contains a configuration.properties.example file. Sample config file:

[auth-api]
SECRET_KEY = 
ALGORITHM = 

[MongoDB]
mongo_url = 
db_name = 
collection_name = 
collection_name_markdown = 

[airflow]
base_url_airflow = 
password = 
username = 

[AWS]
access_key = 
secret_key = 
region_name = 

[s3-bucket]
bucket = 
resumes_folder_name = 
text_folder_name = 

[SNOWFLAKE]
user = 
password = 
account = 
warehouse = 
database = 
schema = 
role = 
jobsTable = 

[OPENAI]
api_key = 

[PINECONE]
pinecone_api_key = 
index = 

  4. Once you have set up your configuration files, run docker-compose up --build to start the application.

  5. Access the Airflow UI by navigating to http://localhost:8080/ in your web browser.

  6. Once the DAGs have run successfully, open the Streamlit application.

  7. Access the Streamlit UI by navigating to http://localhost:8501/ in your web browser.

  8. Enter your username and password if you have already registered. Otherwise, register first and then log in to use the application.
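The sample config file in the steps above uses INI-style sections, so a quick sanity check before starting the containers is to parse it and list any keys that are still blank. This is a minimal sketch using Python's standard configparser module; the helper names `load_config` and `find_empty_values` are hypothetical, and the way the project's own connections.py files read the configuration may differ.

```python
import configparser


def load_config(path):
    """Parse an INI-style configuration.properties file.

    Interpolation is disabled so that secrets containing '%' are read verbatim.
    Note: configparser lowercases key names by default (SECRET_KEY -> secret_key).
    """
    parser = configparser.ConfigParser(interpolation=None)
    with open(path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return parser


def find_empty_values(parser):
    """Return (section, key) pairs whose value is empty or whitespace only."""
    return [
        (section, key)
        for section in parser.sections()
        for key, value in parser.items(section)
        if not value.strip()
    ]


if __name__ == "__main__":
    import sys

    # Usage: python check_config.py path/to/configuration.properties
    config = load_config(sys.argv[1] if len(sys.argv) > 1 else "configuration.properties")
    for section, key in find_empty_values(config):
        print(f"missing value: [{section}] {key}")
```

Running the check against each configuration.properties file catches blank credentials early, before a container fails at connection time.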

Team Contribution

WE ATTEST THAT WE HAVEN'T USED ANY OTHER STUDENTS' WORK IN OUR ASSIGNMENT AND ABIDE BY THE POLICIES LISTED IN THE STUDENT HANDBOOK.

Name Contribution %
Muskan Deepak Raisinghani 33.3%
Rachana Keshav 33.3%
Ritesh Choudhary 33.3%