A resume parser and position parser to help you perform job matching based on keywords.
This project uses Natural Language Processing to identify sections and keywords in a resume or a job position. With these extracted, you have the information needed to analyze how well a resume matches a position.
- Python 3
- spaCy 2.2.3 or later
- spaCy language model: en_core_web_sm
If you do not have spaCy and its language model installed, install them with these two commands:
pip install -U spacy
python -m spacy download en_core_web_sm
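To confirm that the installed spaCy meets the minimum version listed above, a small check along these lines may help. It is only a sketch and not part of this project: `version_tuple` and `spacy_is_recent_enough` are hypothetical helper names, and it relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
import re
from importlib import metadata

# Minimum spaCy version stated in the requirements above.
MINIMUM_SPACY = (2, 2, 3)


def version_tuple(version):
    """Turn '2.2.3' (or '3.0.0rc1') into a tuple of its leading integers."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def spacy_is_recent_enough():
    """Return True if spaCy is installed and at least MINIMUM_SPACY."""
    try:
        installed = metadata.version("spacy")
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= MINIMUM_SPACY


if __name__ == "__main__":
    print("spaCy OK" if spacy_is_recent_enough() else "spaCy missing or too old")
```

This check does not verify that the en_core_web_sm model has been downloaded.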
The two main components (Resume and Position) share the same parameters:
- text: the resume or position as a string.
- language: the language of the resume or position. By default, it is "en" for "English".
- model: the spaCy model. By default, it is "en_core_web_sm" for the English model.
- options: the parsing options, which let you choose which features to run.
By default, only the global keywords are extracted, but you can change this through the options.
Resume and Position share the same structure, so you can use them in the same way. The following examples use the Resume class, but they work the same way with the Position class.
from resume import Resume
resume_text = 'All of your resume content here...'
resume = Resume(text = resume_text)
# resume.keywords now contains the keywords and their context as a dictionary mapping strings to spaCy Span objects
To search for domain-specific keywords, use the options:
from options import Options
from resume import Resume
options = Options(domain_keywords = ['Python', 'JavaScript', 'Agile', 'TypeScript', 'Java', 'Docker'])
resume = Resume(text = resume_text, options = options)
# resume.domain_keywords now contains the domain-specific keywords that matched
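To give an intuition for what domain keyword matching does, here is a rough standard-library approximation. It is not how this project works internally (the project relies on spaCy); `find_domain_keywords` is a hypothetical helper, and the case-insensitive whole-word matching is an assumption made for the sketch.

```python
import re


def find_domain_keywords(text, domain_keywords):
    """Return the domain keywords that occur in text, in the order given."""
    found = []
    for keyword in domain_keywords:
        # Lookarounds are used instead of \b so that keywords ending in
        # punctuation, such as "C++", can still match as whole terms.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


sample = "Built REST services in Java and Python, deployed with Docker."
keywords = ["Python", "JavaScript", "Agile", "TypeScript", "Java", "Docker"]
print(find_domain_keywords(sample, keywords))  # ['Python', 'Java', 'Docker']
```

Matching whole terms matters here: without it, "Java" would also be reported for a resume that only mentions "JavaScript".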
To identify sections, use the options:
from options import Options
from resume import Resume
from sections import Certificate, Education, Experience, Skills, Summary, Volunteering
options = Options(sections = [Certificate, Education, Experience, Skills, Summary, Volunteering])
resume = Resume(text = resume_text, options = options)
# resume.sections now contains the sections
To check whether a section exists in a resume or position (the education section, for example), use:
resume.has_section('education')
List of already defined sections:
- Certificate
- Education
- Experience
- Skills
- Summary
- Requirements
- Responsibilities
- Volunteering
To create a new section, define a class that extends the Section class.
For example, to create a Competition section:
class Competition(Section):
    name = 'Competition'  # Section title, required by spaCy to use the matcher
    keywords = {
        'en': ['competition', 'competitive awards', 'awards'],
    }
Then, you can import it and use it in the options:
options = Options(sections = [Competition])
resume = Resume(text = resume_text, options = options)
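The per-language keywords dictionary is what ties a section to headings in the text. As a simplified illustration only (the project itself uses the spaCy matcher), the sketch below finds lines that consist of exactly one of the section keywords. `find_section_lines` is a hypothetical helper, and treating a heading as a whole line is an assumption of the sketch.

```python
def find_section_lines(text, keywords, language="en"):
    """Return (line_number, line) pairs for lines that look like section headings."""
    terms = [term.lower() for term in keywords.get(language, [])]
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Normalize the line: trim whitespace, lowercase, drop a trailing colon.
        cleaned = line.strip().lower().rstrip(":")
        if cleaned in terms:
            hits.append((number, line.strip()))
    return hits


competition_keywords = {"en": ["competition", "competitive awards", "awards"]}
sample = "Summary\nBackend developer.\nAwards:\nFirst place, regional hackathon 2019"
print(find_section_lines(sample, competition_keywords))  # [(3, 'Awards:')]
```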
You can compare both global and domain-specific keywords between a resume and a position.
The compare_global_keywords method returns two dictionaries (matching and missing keywords), whereas the compare_domain_keywords method returns two lists.
resume = Resume(text = resume_text)
position = Position(text = position_text)
matching_general_keywords, missing_general_keywords = resume.compare_global_keywords(position)
options = Options(domain_keywords = ['JavaScript', 'C++', 'HTML5'])
resume = Resume(text = resume_text, options = options)
position = Position(text = position_text, options = options)
matching_domain_keywords, missing_domain_keywords = position.compare_domain_keywords(resume)
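The shape of the domain keyword result, a list of matching keywords and a list of missing ones, can be illustrated with plain lists. This is a sketch under assumptions, not the project's implementation: `compare_keywords` is a hypothetical helper, "missing" is taken to mean present on the calling side but absent from the other side, and the comparison is assumed to be case-insensitive.

```python
def compare_keywords(own_keywords, other_keywords):
    """Split own_keywords into those found in other_keywords and those absent."""
    other = {keyword.lower() for keyword in other_keywords}
    matching = [k for k in own_keywords if k.lower() in other]
    missing = [k for k in own_keywords if k.lower() not in other]
    return matching, missing


position_keywords = ["JavaScript", "C++", "HTML5"]
resume_keywords = ["HTML5", "javascript", "Python"]
matching, missing = compare_keywords(position_keywords, resume_keywords)
print(matching)  # ['JavaScript', 'HTML5']
print(missing)   # ['C++']
```

Under these assumptions the result depends on which side makes the call: swapping the arguments would report "Python" as missing instead of "C++".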
- Link keywords to sections
- Machine learning model for entity extraction (time required for a skill, job experience recognition...)