A Text_Mining Project Using Python/RapidMiner
Zhao Hengrui (G1801739D), Luo Shuang (G1702502A), Wang Sixin (G1801764B), Xia Xiaolong (G1801412A)
Copyright@Nanyang Technological University
According to a Pew Research Center study on job-seeking behavior published in 2015, 79% of job seekers used online resources in their most recent search for employment (Aaron, 2015). This result suggests that online job boards have become the major channel for job seekers in the digital era. However, another finding in the study indicates that most job seekers fail to match their experience with the job requirements and spend hours on job boards applying for jobs that do not suit them (Aaron, 2015). Additionally, Dr. John Sullivan published similar research in 2013 that highlighted some interesting aspects: on average, major organizations receive 250 resumes for each job opening, and more than 50% of those resumes do not meet the minimum requirements (John, 2013). This means that the time recruiters spend on those resumes (more than 125 of the 250 per opening) is wasted. From both the candidate's and the recruiter's point of view, this suggests that traditional online job boards neither simplify the job application process nor reduce the effort required from either party.

As this challenge grows, so does the demand to automate resume-job matching. For instance, content-based recommendation (CBR) systems have been introduced to analyze job descriptions and identify potential areas of interest to job seekers (Guo et al., 2016). To apply the concept in the Singapore context, our team conducted a text mining project based on data acquired from a major online job board in Singapore. The primary objective of this project is to create a machine learning model that accelerates the job-resume matching process. The details of the text mining methodology and results are presented in the following sections.
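To make the idea of content-based job-resume matching concrete, here is a minimal sketch in plain Python. It is not the model built in this project: it assumes a simple TF-IDF weighting with cosine similarity, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_jobs`) and the sample job descriptions and resume are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_jobs(resume, jobs):
    """Return (job_title, score) pairs sorted from best to worst match."""
    titles = list(jobs)
    vectors = tfidf_vectors([resume] + [jobs[title] for title in titles])
    resume_vec, job_vecs = vectors[0], vectors[1:]
    scored = [(title, cosine(resume_vec, vec))
              for title, vec in zip(titles, job_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data, not taken from the project's dataset.
jobs = {
    "Data Analyst": "Analyse data with Python and SQL, build dashboards and reports",
    "Accountant": "Prepare financial statements, audit ledgers and manage tax filing",
}
resume = "Experienced in Python, SQL and building reports from data"

ranking = rank_jobs(resume, jobs)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

In this sketch the resume shares several terms with the first job description and only a stop word with the second, so the first job ranks higher. A real pipeline would likely add stop-word removal, stemming, and a much larger corpus before the scores become meaningful.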
MIT LICENSE
[1] Aaron, S. (2015). Searching for Work in the Digital Era. Pew Research Center, November 2015. Retrieved on October 15, 2018, from the website: http://www.pewresearch.org/wp-content/uploads/sites/9/2015/11/PI_2015-11-19-Internet-and-Job-Seeking_FINAL.pdf
[2] John, S. (2013). Why You Can’t Get A Job … Recruiting Explained By the Numbers. Retrieved on October 15, 2018, from the website: https://www.ere.net/why-you-cant-get-a-job-recruiting-explained-by-the-numbers/
[3] Guo, S., Alamudun, F., & Hammond, T. (2016). Résumatcher: A personalized résumé-job matching system. Expert Systems with Applications, 60, 169-182.
[4] JobStreet. (2018). About Us. Retrieved on October 16, 2018, from the website: https://www.jobstreet.com.sg/en/about-us/
[5] Import.io. (2018). Import.io Extract: get structured data from web pages. Retrieved on October 16, 2018, from the website: https://www.import.io/builder/data-extraction/
[6] RapidMiner. (2018). RapidMiner Platform: Lightning Fast Data Science Platform. Retrieved on October 16, 2018, from the website: https://rapidminer.com/products/
[7] North, M. (2012). Data mining for the masses (pp. 91-100). Athens: Global Text Project.
[8] Wowczko, I. A. (2015). Skills and Vacancy Analysis with Data Mining Techniques. Informatics, 2(4), 31.
[9] Patel, B., Kakuste, V., & Eirinaki, M. (2017, April). CaPaR: A Career Path Recommendation Framework. In Big Data Computing Service and Applications (BigDataService), 2017 IEEE Third International Conference on (pp. 23-30). IEEE.
[10] Kwartler, T. (2017). Text mining in practice with R. John Wiley & Sons.
[11] Weiss, S. M., Apte, C., Damerau, F. J., Johnson, D. E., Oles, F. J., Goetz, T., & Hampp, T. (1999). Maximizing text-mining performance. IEEE Intelligent Systems and Their Applications, 14(4), 63-69.
If you fork or reference this project, please credit the source: @Henry Zhao. Thanks.