ResumeParser

A simple resume parser used for extracting information from resumes

Primary language: Python. License: MIT.

Installation

  • For extracting text from various documents we use pdfminer and doc2text modules. Install them using:
pip install pdfminer        # python 2
pip install pdfminer.six    # python 3
pip install doc2text
  • For NLP operations we use spacy and nltk. Install them using:
# spaCy
pip install spacy
python -m spacy download en_core_web_sm

# nltk
pip install nltk
python -m nltk.downloader words
  • To install the remaining supporting dependencies, execute:
pip install -r requirements.txt
  • Modify skills.csv as per your requirements

  • Modify the education degrees in constants.py as per your requirements

  • Place all the resumes that you want to parse in the resumes/ directory

  • Run resume_parser.py
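
The skill list in skills.csv drives which skills are reported. As a rough illustration of how such a list could be matched against resume text, here is a minimal sketch using only the standard library. The helper names (load_skills, match_skills), the sample CSV layout, and the matching rule are assumptions made for this example, not the project's actual implementation.

```python
import csv
import io
import re


def load_skills(csv_file):
    """Read every non-empty cell of a CSV file into a lowercase set."""
    skills = set()
    for row in csv.reader(csv_file):
        for cell in row:
            cell = cell.strip().lower()
            if cell:
                skills.add(cell)
    return skills


def match_skills(text, skills):
    """Return the skills that occur in text as whole tokens, sorted."""
    lowered = text.lower()
    found = []
    for skill in skills:
        # Lookarounds instead of \b so that names such as "c++" match,
        # while a bare "c" does not match inside "c++" or "machine".
        pattern = r"(?<![\w+#])" + re.escape(skill) + r"(?![\w+#])"
        if re.search(pattern, lowered):
            found.append(skill)
    return sorted(found)


# Hypothetical skills.csv content; the real file layout may differ.
sample_csv = io.StringIO("python,django,c,c++\nmachine learning,opencv\n")
skills = load_skills(sample_csv)

resume_text = "Built Django apps in Python; some C++ and machine learning work."
matched = match_skills(resume_text, skills)
print(matched)
```

Adding a row to the CSV is enough to make a new skill detectable in this sketch, which is the reason for keeping the list in a separate file.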

CLI

You can also run the resume extractor through the provided CLI:

usage: cli.py [-h] [-f FILE] [-d DIRECTORY]

optional arguments:
  -h, --help                            show this help message and exit
  -f FILE, --file FILE                  resume file to be extracted
  -d DIRECTORY, --directory DIRECTORY   directory containing all the resumes to be extracted

For extracting data from a single resume file, use

python cli.py -f <resume_file_path>

For extracting data from several resumes, place them in a directory and then execute

python cli.py -d <resume_directory_path>
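
For readers who want to see how the two options fit together, the following is a minimal sketch of an argparse setup that accepts the same -f and -d flags shown in the usage text. It is not the contents of cli.py; the functions build_parser and collect_paths and the path resumes/sample.pdf are hypothetical names used only for illustration.

```python
import argparse
import os


def build_parser():
    """Create a parser that mirrors the -f / -d options in the usage text."""
    parser = argparse.ArgumentParser(prog="cli.py")
    parser.add_argument("-f", "--file", help="resume file to be extracted")
    parser.add_argument(
        "-d",
        "--directory",
        help="directory containing all the resumes to be extracted",
    )
    return parser


def collect_paths(args):
    """Return the resume paths selected by the parsed arguments."""
    paths = []
    if args.file:
        paths.append(args.file)
    if args.directory:
        # Only regular files directly inside the directory are considered.
        for name in sorted(os.listdir(args.directory)):
            full_path = os.path.join(args.directory, name)
            if os.path.isfile(full_path):
                paths.append(full_path)
    return paths


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["-f", "resumes/sample.pdf"])
print(collect_paths(args))
```

Each collected path would then be handed to the parser in turn, so a single file and a whole directory go through the same code path.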

GUI

  • Built with Django
  • Easy extraction and interpretation using GUI
  • For running GUI execute:
python manage.py makemigrations
python manage.py migrate
python manage.py runserver
  • Visit 127.0.0.1:8000 (the default address of Django's development server) to view the GUI

Result

The module returns a list of dictionaries, one per resume, in the following form:

[
    {
        'education': [('BE', '2014')],
        'email': 'omkarpathak27@gmail.com',
        'mobile_number': '8087996634',
        'name': 'Omkar Pathak',
        'skills': [
            'Flask',
            'Django',
            'Mysql',
            'C',
            'Css',
            'Html',
            'Js',
            'Machine learning',
            'C++',
            'Algorithms',
            'Github',
            'Php',
            'Python',
            'Opencv'
        ]
    }
]
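
Because the result is a plain list of dictionaries, it can be filtered or serialized with ordinary Python. The sketch below assumes the structure shown above, with the skills list abridged; the helper has_skills is a hypothetical name and not part of the module.

```python
import json

# Abridged copy of the sample result shown above.
results = [
    {
        "education": [("BE", "2014")],
        "email": "omkarpathak27@gmail.com",
        "mobile_number": "8087996634",
        "name": "Omkar Pathak",
        "skills": ["Flask", "Django", "Mysql", "Python", "Opencv"],
    }
]


def has_skills(record, required):
    """True if the record lists every required skill (case-insensitive)."""
    present = {skill.lower() for skill in record["skills"]}
    return all(skill.lower() in present for skill in required)


# Shortlist candidates who list both Python and Django.
shortlist = [r["name"] for r in results if has_skills(r, ["python", "django"])]
print(shortlist)

# JSON has no tuple type, so (degree, year) pairs come back as lists.
as_json = json.dumps(results, indent=2)
restored = json.loads(as_json)
print(restored[0]["education"])
```

Note that a JSON round trip turns the education tuples into lists, so code that reads the saved file should not rely on tuple types.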

To Do

  • Extracting Experience
  • Extracting Projects
  • Extracting hobbies
  • Extracting universities
  • Extracting month of passing
  • Extracting awards / achievements / recognition
