Online education, also known as e-learning, is a relatively young but fast-growing industry. In particular, MOOC (Massive Open Online Course) platforms are an excellent alternative to classroom training. MOOCs are a recent and widely researched development in distance education. They were introduced in 2006 and emerged as a popular mode of learning in 2012. Some MOOCs charge a nominal fee and provide a certificate of completion, while others are entirely free and provide open access to students across the world. MOOC platforms can give thousands of students direct access to courses via web platforms and mobile apps.
Multiple platforms deliver MOOCs. The most popular are Coursera, edX and Udacity. Courses are delivered in multiple languages and across a variety of specialisations. These companies partner with universities, and university faculty develop and run most of the MOOCs. Coursera alone has over 2,000 courses, 160 specialisations, and 30 million students across the world. With so many courses to choose from, it is difficult for prospective students to select an appropriate course, to judge whether the search results contain suitable courses, and to tell whether the university providing a course is the best fit for the specific domain area. Each of the platforms allows students to perform a keyword-based search. However, it is difficult to know the relevant keywords for a specific role or specialisation. A skill-based course recommender system would therefore help prospective students choose the correct learning path.
This project covers the research involved in building a skill-based MOOC recommender system, focusing primarily on the Data Science domain. Broadly, there are four sub-domains within Data Science: Data Analysis, Data Visualisation, Machine Learning and Big Data Engineering. The project attempts to simplify students' choice of learning path and help them make better decisions. This approach does not expect students or learners to know the keywords or required industry skills in advance. As part of this project, web scraping is applied to extract course metadata from Coursera and to collect the LinkedIn profiles of existing experts in the respective roles. Using text mining algorithms, we extract skills from the LinkedIn profiles. Finally, using the keywords (skills) extracted from the LinkedIn profiles, together with other inputs, we calculate the degree of closeness between each course and the skills, expressed as a probability score for each course across all the skills. This probability score serves as the predicted value, ŷ.
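As a rough illustration of the final scoring step, the sketch below ranks courses against a list of extracted skills. The text above does not specify the closeness measure, so the choices here are assumptions made for illustration only: cosine similarity over raw token counts as the measure, and dividing by the sum of all similarities so that the scores add up to 1. The function names (`tokenize`, `cosine`, `score_courses`) and the sample skills and course descriptions are hypothetical and are not taken from Coursera or LinkedIn.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_courses(skills, courses):
    """Return {course title: score}, with scores normalised to sum to 1.

    Assumption: closeness is cosine similarity between the pooled skill
    tokens and each course description; the project may use another measure.
    """
    skill_vec = Counter(t for s in skills for t in tokenize(s))
    raw = {
        title: cosine(skill_vec, Counter(tokenize(desc)))
        for title, desc in courses.items()
    }
    total = sum(raw.values())
    return {title: (v / total if total else 0.0) for title, v in raw.items()}


# Hypothetical inputs: skills as if mined from profiles, and made-up courses.
skills = ["machine learning", "python", "regression"]
courses = {
    "Intro to Machine Learning": "Learn machine learning and regression with Python",
    "Data Visualisation Basics": "Build charts and dashboards to present data",
}

scores = score_courses(skills, courses)
for title, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{title}: {score:.3f}")
```

Normalising by the total is only one way to turn similarities into probability-like scores. A softmax or a trained classifier would serve the same role and may be closer to what a production recommender needs.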