bjherger/ResumeParser

Can you help me fetch the name separately?

Closed this issue · 5 comments

Hi, amazing job, but it would be helpful if I could fetch the name from the resume.

@Shrinidhikulkarni7 : I'm not sure that I understand. Which name? The applicant name?

@bjherger Yes, the applicant name.

@Shrinidhikulkarni7 NLTK is a library that can be used to extract human names. Here is the code for that:

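import nltk
from nltk.corpus import stopwords

# Needs the NLTK data packages for stopwords, tokenizing, tagging, and NE chunking.
stop = stopwords.words('english')

def ie_preprocess(document):
    # Not shown in the original snippet; assumed to be the usual NLTK recipe:
    # drop stopwords, split into sentences, tokenize, then POS-tag.
    document = ' '.join(w for w in document.split() if w not in stop)
    sentences = nltk.sent_tokenize(document)
    return [nltk.pos_tag(nltk.word_tokenize(s)) for s in sentences]

def extract_names(document):
    names = []
    for tagged_sentence in ie_preprocess(document):
        for chunk in nltk.ne_chunk(tagged_sentence):
            # Named entities are subtrees; ordinary tokens are (word, tag) tuples.
            if isinstance(chunk, nltk.tree.Tree) and chunk.label() == 'PERSON':
                names.append(' '.join(c[0] for c in chunk))
    return names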

I am using this code in ResumeParser, but it is not a perfect name fetcher. I got output like this:

Names: [u'Brendan,Herger', u'Hiren,Patel,Address', u'Albert,Street', u'Sanket,Rajendra,Mantri', u'William', u'San,Franc']

As you can see, it has treated Address, Albert, and Street as parts of names. I guess that is a limitation of the NLTK library.

I am also looking for a good solution. Hope this helps you. Good luck.
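
One cheap way to reduce false positives like the ones above might be to post-filter the candidates. The sketch below is only an illustration: `clean_candidates`, `HEADER_WORDS`, and `ADDRESS_WORDS` are hypothetical names, not part of NLTK or ResumeParser, and the word lists are assumptions that would need tuning for real resumes.

```python
# Hypothetical word lists (assumptions); extend them for your own resumes.
HEADER_WORDS = {"address", "email", "phone", "mobile", "resume"}
ADDRESS_WORDS = {"street", "road", "avenue", "lane"}


def clean_candidates(candidates):
    """Post-filter name candidates produced by an NER step.

    Each candidate is a list of tokens, e.g. ['Hiren', 'Patel', 'Address'].
    """
    cleaned = []
    for tokens in candidates:
        # A street-type word suggests the chunk is an address, not a person.
        if any(t.lower() in ADDRESS_WORDS for t in tokens):
            continue
        # Drop section-header words that were glued onto a real name.
        kept = [t for t in tokens if t.lower() not in HEADER_WORDS]
        if kept:
            cleaned.append(" ".join(kept))
    return cleaned


print(clean_candidates([
    ["Hiren", "Patel", "Address"],
    ["Albert", "Street"],
    ["Brendan", "Herger"],
]))
# prints ['Hiren Patel', 'Brendan Herger']
```

This does nothing for truncated place names such as "San Franc" or for lone first names, so it is a patch on top of the tagger rather than a fix for it.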

A more robust approach might be to list out people and organizations on the resume. For example, I've worked for Congresswoman Gabby Giffords, so an NER search on my resume would include Brendan Herger (me) and Gabby Giffords (a former employer).

I'm working on an approach similar to @iHirenDev's, which uses Stanford's NER engine through NLTK's interface to it.

Unfortunately, NLTK's interface is pretty clumsy, and requires managing the Stanford NER jar separately.
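
For anyone who wants to try the Stanford route, here is a rough sketch of the idea. NLTK's `StanfordNERTagger` returns `(token, tag)` pairs, with `O` for tokens outside any entity. The helper below groups consecutive tokens that share a tag into people and organizations. `group_entities` is a hypothetical helper, not part of ResumeParser or NLTK, and the jar and model paths in the comment are placeholders.

```python
from itertools import groupby


def group_entities(tagged_tokens, wanted=("PERSON", "ORGANIZATION")):
    """Group consecutive same-tag tokens into entities, keyed by tag."""
    entities = {tag: [] for tag in wanted}
    for tag, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if tag in entities:
            entities[tag].append(" ".join(token for token, _ in group))
    return entities


# Tagging itself needs the separately downloaded Stanford NER jar and model
# (placeholder paths, adjust to your install):
#
#   from nltk.tag import StanfordNERTagger
#   tagger = StanfordNERTagger("path/to/english.all.3class.distsim.crf.ser.gz",
#                              "path/to/stanford-ner.jar")
#   tagged = tagger.tag(tokens)

# Hand-written sample in the same (token, tag) shape, so this runs without the jar.
tagged = [
    ("Brendan", "PERSON"), ("Herger", "PERSON"), ("worked", "O"),
    ("at", "O"), ("Example", "ORGANIZATION"), ("Corp", "ORGANIZATION"),
]
print(group_entities(tagged))
# prints {'PERSON': ['Brendan Herger'], 'ORGANIZATION': ['Example Corp']}
```

One known weakness of this grouping: two different people listed back to back with nothing between them would be merged into a single entity.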

@Shrinidhikulkarni7 : I've included this functionality w/ version 3.x. Please let me know if it doesn't suit your needs.