ResumeParser

A simple resume parser for extracting information from resumes.

Installation

To extract text from various document formats, we use the pdfminer and doc2text modules. Install them using:

pip install pdfminer        # python 2
pip install pdfminer.six    # python 3
pip install doc2text

For NLP operations, we use spaCy and nltk. Install them using:

# spaCy
pip install spacy
python -m spacy download en_core_web_sm

# nltk
pip install nltk
python -m nltk.downloader words

To install the remaining dependencies, execute:

pip install -r requirements.txt

Modify skills.csv as per your requirements
Modify the education degrees in constants.py as per your requirements
Place all the resumes that you want to parse in the resumes/ directory
Run resume_parser.py
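
To illustrate how the entries in skills.csv might be used, here is a minimal sketch that matches a skills list against resume text. The CSV layout (every non-empty cell is a skill), the function names load_skills and match_skills, and the matching on short word sequences are assumptions made for illustration. resume_parser.py may work differently.

```python
import csv
import io
import re


def load_skills(csv_file):
    """Read skill names from a CSV file object; every non-empty cell is a skill."""
    skills = set()
    for row in csv.reader(csv_file):
        for cell in row:
            cell = cell.strip().lower()
            if cell:
                skills.add(cell)
    return skills


def match_skills(text, skills, max_words=3):
    """Return the sorted skills found in text as sequences of 1..max_words words."""
    # Keep characters such as '+' and '#' so 'c++' and 'c#' stay intact.
    tokens = re.findall(r"[a-z0-9+#.]+", text.lower())
    # Drop sentence-ending periods that cling to a token ('python.' -> 'python').
    tokens = [t.strip(".") for t in tokens if t.strip(".")]
    found = set()
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in skills:
                found.add(candidate)
    return sorted(found)


# Hypothetical skills.csv content, kept in memory so the sketch runs standalone.
sample_csv = io.StringIO("python,django,machine learning,c++,mysql\n")
skills = load_skills(sample_csv)
resume_text = "Built Django apps in Python. Interested in machine learning and C++."
print(match_skills(resume_text, skills))
# ['c++', 'django', 'machine learning', 'python']
```

Matching on word sequences rather than raw substrings avoids false hits such as finding "c" inside "machine", at the cost of missing skills written with unusual punctuation.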

CLI

You can also run the resume extractor using the provided CLI:

usage: cli.py [-h] [-f FILE] [-d DIRECTORY]

optional arguments:
  -h, --help                            show this help message and exit
  -f FILE, --file FILE                  resume file to be extracted
  -d DIRECTORY, --directory DIRECTORY   directory containing all the resumes to be extracted

To extract data from a single resume file, use:

python cli.py -f <resume_file_path>

To extract data from several resumes, place them in a directory and then execute:

python cli.py -d <resume_directory_path>
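
The interface described above can be reproduced with argparse. The following is a minimal sketch, not the actual cli.py: build_parser and collect_files are hypothetical names, walking the directory recursively is an assumption, and the parsing step itself is left out.

```python
import argparse
import os


def build_parser():
    """Build a parser matching the usage text: cli.py [-h] [-f FILE] [-d DIRECTORY]."""
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("-f", "--file", help="resume file to be extracted")
    parser.add_argument(
        "-d", "--directory",
        help="directory containing all the resumes to be extracted",
    )
    return parser


def collect_files(args):
    """Return the resume paths selected by the parsed arguments."""
    paths = []
    if args.file:
        paths.append(args.file)
    if args.directory:
        # Assumption: resumes in subdirectories are included as well.
        for root, _dirs, files in os.walk(args.directory):
            for name in sorted(files):
                paths.append(os.path.join(root, name))
    return paths


if __name__ == "__main__":
    arguments = build_parser().parse_args()
    for path in collect_files(arguments):
        # A real CLI would hand each path to the resume parser here.
        print(path)
```

Both options are optional and can be combined, which matches the bracketed form of the usage line.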

GUI

Built with Django
Easy extraction and interpretation using the GUI
To run the GUI, execute:

python manage.py makemigrations
python manage.py migrate
python manage.py runserver

Visit http://127.0.0.1:8000 (the default address of Django's development server) to view the GUI

Result

The module returns a list of dictionaries, one per parsed resume, in the following form:

[
    {
        'education': [('BE', '2014')],
        'email': 'omkarpathak27@gmail.com',
        'mobile_number': '8087996634',
        'name': 'Omkar Pathak',
        'skills': [
            'Flask',
            'Django',
            'Mysql',
            'C',
            'Css',
            'Html',
            'Js',
            'Machine learning',
            'C++',
            'Algorithms',
            'Github',
            'Php',
            'Python',
            'Opencv'
        ]
    }
]
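
Because the result is a plain list of dictionaries, it can be post-processed with the standard library. The sketch below flattens each record into a CSV row. The helper names flatten and to_csv are hypothetical, the sample is shortened, and the code assumes every education entry is a (degree, year) pair as in the example above.

```python
import csv
import io

# Sample result in the shape shown above (skills list shortened for brevity).
results = [
    {
        'education': [('BE', '2014')],
        'email': 'omkarpathak27@gmail.com',
        'mobile_number': '8087996634',
        'name': 'Omkar Pathak',
        'skills': ['Flask', 'Django', 'Python'],
    }
]


def flatten(record):
    """Turn one result dictionary into a flat row of strings."""
    education = record.get('education') or []
    return {
        'name': record.get('name') or '',
        'email': record.get('email') or '',
        'mobile_number': record.get('mobile_number') or '',
        # Assumption: each education entry is a (degree, year) pair.
        'education': '; '.join(f'{degree} ({year})' for degree, year in education),
        'skills': ', '.join(record.get('skills') or []),
    }


def to_csv(records):
    """Serialize a list of result dictionaries to CSV text."""
    buffer = io.StringIO()
    fields = ['name', 'email', 'mobile_number', 'education', 'skills']
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator='\n')
    writer.writeheader()
    for record in records:
        writer.writerow(flatten(record))
    return buffer.getvalue()


print(to_csv(results))
```

Missing or empty fields become empty strings, so a resume for which some field could not be extracted still produces a complete row.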

To Do

References that helped me get here

codejunction/ResumeParser