A simple resume parser for extracting information from resumes.
- For extracting text from various documents we use pdfminer and doc2text modules. Install them using:
pip install pdfminer # python 2
pip install pdfminer.six # python 3
pip install doc2text
- For NLP operations we use spacy and nltk. Install them using:
# spaCy
pip install spacy
python -m spacy download en_core_web_sm
# nltk
pip install nltk
python -m nltk.downloader words
- For installing the other supporting dependencies, execute:
pip install -r requirements.txt
- Modify skills.csv as per your requirements
- Modify Education Degrees as per your requirements in constants.py
- Place all the resumes that you want to parse in the resumes/ directory
- Run resume_parser.py
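To illustrate how entries in skills.csv could drive skill extraction, here is a minimal sketch. It is not the parser's actual implementation: the CSV layout (every non-empty cell is one skill name) and the helper names `load_skills` and `match_skills` are assumptions made for illustration.

```python
import csv
import io
import re


def load_skills(csv_file):
    """Read skill names from a CSV file object; every non-empty cell is one skill."""
    skills = set()
    for row in csv.reader(csv_file):
        for cell in row:
            name = cell.strip().lower()
            if name:
                skills.add(name)
    return skills


def match_skills(text, skills):
    """Return the known skills that occur in the text as whole words or phrases."""
    lowered = text.lower()
    found = []
    for skill in sorted(skills):
        # Boundary guards keep "c" from matching inside "css" or "c++".
        pattern = r"(?<![a-z0-9+#])" + re.escape(skill) + r"(?![a-z0-9+#])"
        if re.search(pattern, lowered):
            found.append(skill)
    return found


# In-memory stand-in for skills.csv (assumed layout, hypothetical content).
sample_csv = io.StringIO("python,django,c++,machine learning\nc,css\n")
skills = load_skills(sample_csv)

resume_text = "Built Django apps in Python; some C++ and machine learning work."
print(match_skills(resume_text, skills))
```

Adding a skill to the CSV is then enough to have it recognized, which is the point of keeping the list outside the code.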
To run the resume extractor, you can also use the provided CLI:
usage: cli.py [-h] [-f FILE] [-d DIRECTORY]
optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  resume file to be extracted
  -d DIRECTORY, --directory DIRECTORY
                        directory containing all the resumes to be extracted
To extract data from a single resume file, use:
python cli.py -f <resume_file_path>
To extract data from several resumes, place them in a directory and then execute:
python cli.py -d <resume_directory_path>
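The usage text above can be reproduced with argparse. The following is a sketch of how such a command-line interface might be wired up, not the contents of cli.py itself; `build_parser` and `collect_files` are hypothetical names, and listing every regular file in the directory is an assumption.

```python
import argparse
import os


def build_parser():
    """Build a parser with the same options as the usage text shown above."""
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("-f", "--file", help="resume file to be extracted")
    parser.add_argument(
        "-d",
        "--directory",
        help="directory containing all the resumes to be extracted",
    )
    return parser


def collect_files(args):
    """Return the resume paths selected by the parsed arguments."""
    files = []
    if args.file:
        files.append(args.file)
    if args.directory:
        # Assumption: every regular file in the directory is a resume.
        for name in sorted(os.listdir(args.directory)):
            path = os.path.join(args.directory, name)
            if os.path.isfile(path):
                files.append(path)
    return files


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-f", "resume.pdf"])
print(collect_files(args))
```

In a real script, calling `parse_args()` with no argument would read the options from the command line instead.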
- The GUI is built with Django
- Easy extraction and interpretation using GUI
- For running GUI execute:
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
- Visit 127.0.0.1:8000 (the default address of Django's development server) to view the GUI
The module returns a list of dictionary objects with results such as the following:
[
    {
        'education': [('BE', '2014')],
        'email': 'omkarpathak27@gmail.com',
        'mobile_number': '8087996634',
        'name': 'Omkar Pathak',
        'skills': [
            'Flask',
            'Django',
            'Mysql',
            'C',
            'Css',
            'Html',
            'Js',
            'Machine learning',
            'C++',
            'Algorithms',
            'Github',
            'Php',
            'Python',
            'Opencv'
        ]
    }
]
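Since the result is a plain list of dictionaries, it can be post-processed with ordinary Python. The sketch below filters parsed resumes by required skills and serializes the matches to JSON. The helper names `has_skills` and `filter_by_skills` are hypothetical, and the sample record is an abbreviated copy of the output above.

```python
import json

# Abbreviated copy of the sample output shown above.
results = [
    {
        "education": [("BE", "2014")],
        "email": "omkarpathak27@gmail.com",
        "mobile_number": "8087996634",
        "name": "Omkar Pathak",
        "skills": ["Flask", "Django", "Mysql", "C++", "Python", "Opencv"],
    }
]


def has_skills(record, required):
    """True if the record lists every required skill (case-insensitive)."""
    listed = {skill.lower() for skill in record.get("skills", [])}
    return all(name.lower() in listed for name in required)


def filter_by_skills(records, required):
    """Keep only the records that list all of the required skills."""
    return [record for record in records if has_skills(record, required)]


matches = filter_by_skills(results, ["python", "django"])
print([record["name"] for record in matches])

# json.dumps writes the education tuples as JSON arrays.
print(json.dumps(matches, indent=2))
```

The comparison is case-insensitive because the sample output capitalizes skill names ('Mysql', 'Opencv') differently from how they are usually written.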
- Extracting Experience
- Extracting Projects
- Extracting hobbies
- Extracting universities
- Extracting month of passing
- Extracting Awards/ Achievements/ Recognition