# Keyterm Importance Analysis for OSU Hackathon TDA Challenge
Our idea was to use term-weighting algorithms to score terms in the applications and the job offers by their relative importance. This let us measure how closely student applications match job offers, and vice versa. However, this frequency analysis approach has two limitations. First, it does not produce a coherent mapping between the two sets. Second, we could not parse the written text of the job offers and applications in context, so we could not differentiate between required skills and desired but optional skills. As a result, we are unable to answer many of the questions asked by the challenge. We can, however, offer a couple of recommendations:
- Students who use the term "engineering" more often in their applications would dramatically increase their similarity to job offers, as it is one of the most important terms in the job offers data set.
- Job offers that used very specific terminology had, perhaps counter-intuitively, higher similarity to student applications. Thus, we recommend that job offers be as specific as possible in the language they use.

Beyond that, our approach could not reveal much more about specific connections between the two data sets.
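
For readers who want a concrete picture of the term-weighting idea described above, here is a minimal sketch. It assumes TF-IDF weighting and cosine similarity, which this write-up does not confirm as the algorithms we used, and the sample texts and function names are hypothetical placeholders rather than challenge data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document (assumed TF-IDF weighting)."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample texts, not taken from the challenge data.
applications = [
    "mechanical engineering student with python experience",
    "marketing intern interested in social media",
]
job_offers = [
    "software engineering role requiring python",
    "social media marketing coordinator",
]

# Weight all documents together so both sets share one vocabulary.
vectors = tfidf_vectors(applications + job_offers)
app_vecs = vectors[:len(applications)]
job_vecs = vectors[len(applications):]

# similarity[i][j] compares application i with job offer j.
similarity = [[cosine_similarity(a, j) for j in job_vecs] for a in app_vecs]
```

Each row of `similarity` corresponds to one application and each column to one job offer. Note that a score like this only reflects shared vocabulary, which is why it cannot tell required skills from optional ones.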