psClassify

Assigns a probability that a name in the PATSTAT database belongs to a person rather than to a non-person entity (e.g. a company or university).

Primary language: Python

A simple supervised learning algorithm that classifies PATSTAT records into two categories:

  • person names
  • not person names
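To illustrate what "assigning a probability" to a name can look like, here is a minimal sketch in pure Python. It is not the model used by this repository: the actual fitting is done in psClassify_R.r, and the model type and features are not documented here. The logistic regression, the `ORG_TOKENS` keyword list, the three features, and the toy training names are all assumptions made for illustration.

```python
import math

# Hypothetical keyword list; the features psClassify really uses are not documented here.
ORG_TOKENS = {"INC", "LTD", "GMBH", "CORP", "UNIVERSITY", "AG", "LLC"}


def features(name):
    """Turn a raw name into a small numeric feature vector (illustrative only)."""
    tokens = name.upper().replace(",", " ").replace(".", " ").split()
    return [
        1.0,                                                    # bias term
        1.0 if any(t in ORG_TOKENS for t in tokens) else 0.0,   # legal-form or institution keyword
        1.0 if "," in name else 0.0,                            # "SURNAME, FORENAME" pattern
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with plain stochastic gradient ascent."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xj in enumerate(x):
                weights[j] += lr * (y - p) * xj
    return weights


def predict_person_probability(weights, name):
    """Return the estimated probability that `name` is a person name."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, features(name))))


# Toy labelled data (1 = person name, 0 = not a person name); invented for this sketch.
TRAIN = [
    ("SMITH, JOHN", 1),
    ("MUELLER, ANNA", 1),
    ("TANAKA, HIROSHI", 1),
    ("ACME INC", 0),
    ("SIEMENS AG", 0),
    ("STANFORD UNIVERSITY", 0),
]
weights = fit_logistic([features(n) for n, _ in TRAIN], [y for _, y in TRAIN])

for candidate in ("DOE, JANE", "GLOBEX LTD"):
    print(candidate, round(predict_person_probability(weights, candidate), 3))
```

On this toy data, names that look like "SURNAME, FORENAME" score above 0.5 and names containing a legal-form keyword score below 0.5. A real training set would need far more labelled records and richer features.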

psClassify_pre.py extracts the data and prepares it for model fitting.
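As a rough idea of what such a preparation step might produce, the sketch below turns raw name records into a feature table in CSV form that a separate fitting script could read. It does not reproduce psClassify_pre.py: the column names (`person_id`, `name`, `n_tokens`, `has_org_token`, `has_comma`), the `ORG_TOKENS` list, and the sample records are hypothetical.

```python
import csv
import io

# Hypothetical keyword list used to flag likely organisations.
ORG_TOKENS = {"INC", "LTD", "GMBH", "CORP", "UNIVERSITY", "AG", "LLC"}

# Hypothetical output columns; the real script's schema is not documented here.
FIELDNAMES = ["person_id", "name", "n_tokens", "has_org_token", "has_comma"]


def prepare_row(person_id, name):
    """Build one feature row from a raw (id, name) record."""
    tokens = name.upper().replace(",", " ").replace(".", " ").split()
    return {
        "person_id": person_id,
        "name": name,
        "n_tokens": len(tokens),
        "has_org_token": int(any(t in ORG_TOKENS for t in tokens)),
        "has_comma": int("," in name),
    }


def write_features(records, out):
    """Write a header plus one feature row per record to a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    for person_id, name in records:
        writer.writerow(prepare_row(person_id, name))


# Invented sample records; real input would come from the PATSTAT database.
records = [(1, "SMITH, JOHN"), (2, "ACME INC.")]
buffer = io.StringIO()
write_features(records, buffer)
print(buffer.getvalue())
```

Writing to a file-like object keeps the sketch testable in memory; passing an open file handle instead of `io.StringIO()` would write the same table to disk.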

psClassify_R.r fits the model and saves the output to a .csv file.