/text-mining

Text Mining in Education Learning Labs demonstrate how text mining can be applied in STEM education research and provide LASER Institute scholars hands-on experience with popular techniques for collecting, processing, and analyzing text-based data.


Text Mining in STEM Ed Research

The transition to digital learning has made new sources of data available, giving researchers new opportunities to understand and improve STEM learning. Data sources such as digital learning environments and administrative data systems, as well as data produced by social media websites and the mass digitization of academic and practitioner publications, hold enormous potential to address a range of pressing problems in STEM education, but collecting and analyzing text-based data also present unique challenges. The text mining labs address the following critical questions:

  1. What kinds of text data are valuable?
  2. How can we quantify text data?
  3. What kinds of research questions could be addressed with text data?
  4. How can we set up a research agenda that drives innovations in STEM education research with text data?

Lab 1: Text Mining Basics - Tidy Text & Word Counts, as summarized in our Overview Presentation, is a gentle introduction to getting our text “tidy” so we can perform some basic word counts, look at words that occur at a higher rate in a group of documents, examine words that are unique to those document groups, and create visualizations such as word clouds. The focus of our Essential Readings and case study in this lab is to help LASER Scholars gain a general understanding of key text mining concepts and terminology, as well as develop a basic comfort level with quantifying and working with text data. Our Text Mining Case Study: What aspects of online professional development offerings do teachers find most valuable? is guided by work from the Friday Institute and examines teachers' experiences in professional development. Finally, the Intro to Text Mining Badge provides an opportunity to create your own data product and to reflect on how these concepts and techniques might apply to your own research.
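To make the idea of "tidy" text and word counts concrete, here is a minimal sketch. It is not taken from the lab materials: the responses, the group labels, and the stop-word list are all invented for illustration, and Python is used here only as an example language, not necessarily the one the labs use.

```python
import re
from collections import Counter

# Hypothetical survey responses, each tagged with an invented group label.
responses = [
    ("course_a", "The videos were helpful and the forums were helpful too."),
    ("course_a", "Helpful resources and clear videos."),
    ("course_b", "The discussion forums felt slow."),
]

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "and", "were", "too", "felt"}


def tidy_tokens(rows):
    """Yield one (group, word) pair per token: a one-token-per-row layout."""
    for group, text in rows:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOP_WORDS:
                yield group, word


tidy = list(tidy_tokens(responses))

# Basic word counts, overall and within each group.
overall_counts = Counter(word for _, word in tidy)
group_counts = Counter(tidy)

# Words that appear in one group but never in the other.
words_a = {word for group, word in tidy if group == "course_a"}
words_b = {word for group, word in tidy if group == "course_b"}
unique_to_b = words_b - words_a

print(overall_counts.most_common(3))
print(sorted(unique_to_b))
```

Once the text is in one-token-per-row form, counting, grouping, and comparing reduce to ordinary tabular operations, which is the main appeal of the tidy layout.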

Lab 2: Dictionary-Methods - Twitter Sentiment and School Reform, as summarized in our Overview Presentation, moves beyond the basic concepts of text mining and takes a closer look at a dictionary-based text mining technique, sentiment analysis. Our Essential Readings examine the topic of opinion mining, or sentiment analysis, a technique that helps us understand people's opinions about things such as a policy. Our Text Mining Case Study: Do the public like NGSS? investigates the public sentiment expressed toward the Next Generation Science Standards (NGSS) and compares the sentiment for NGSS and the Common Core State Standards using Twitter data. This study is from Josh's team (https://osf.io/xymsd/). Finally, the Sentiment Analysis Badge provides an opportunity to create your own data product and to reflect on how these concepts and techniques might apply to your own research.
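A dictionary-based approach can be sketched in a few lines: look up each word in a lexicon of scored terms and add up the scores. The lexicon values, the example posts, and the `ngss`/`ccss` labels below are all hypothetical and are not drawn from the case study or from any published lexicon; Python is used only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon. Real analyses would use a published,
# validated sentiment lexicon rather than these invented scores.
LEXICON = {
    "love": 2, "great": 2, "helpful": 1,
    "confusing": -1, "hate": -2, "bad": -2,
}


def sentiment_score(text):
    """Sum the lexicon scores of every word in the text (unknown words score 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


# Invented example posts, each tagged with the standard it discusses.
posts = [
    ("ngss", "I love the new standards, great for inquiry"),
    ("ngss", "Rollout was confusing"),
    ("ccss", "I hate the testing, bad idea"),
]

# Average sentiment per group, to compare the two sets of posts.
scores_by_group = defaultdict(list)
for group, text in posts:
    scores_by_group[group].append(sentiment_score(text))

mean_sentiment = {
    group: sum(scores) / len(scores)
    for group, scores in scores_by_group.items()
}
print(mean_sentiment)
```

Note that a plain lookup like this ignores negation ("not great") and sarcasm, which is one reason the choice of lexicon and preprocessing matters in practice.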

Lab 3: Topic Modeling in MOOC-Eds, as summarized in our Overview Presentation, focuses on identifying “topics” by examining how words cohere into different latent, or hidden, themes based on patterns of co-occurrence of words within documents. Our Essential Readings introduce this unsupervised machine learning technique. Our Text Mining Case Study: What are participants discussing in forums? is guided by work from the Friday Institute and explores the ideas and issues that emerged in the discussion forums of a MOOC-Ed course. You can learn more about the work here (https://www.learntechlib.org/p/195234/). Finally, the Topic Modeling Badge provides an opportunity to create your own data product and to reflect on how these concepts and techniques might apply to your own research.
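Topic models rest on the observation that words belonging to the same theme tend to appear in the same documents. The sketch below does not fit a topic model; it only counts within-document word co-occurrence, the raw signal such models build on. The forum posts are invented, and Python is used only for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented forum posts; two loose themes (assessment vs. video lectures).
posts = [
    "students need feedback on assessment",
    "assessment rubric and feedback",
    "video lecture pacing",
    "lecture video captions",
]

# Count how many posts each unordered word pair appears in together.
cooccurrence = Counter()
for post in posts:
    words = sorted(set(post.split()))  # unique words, in a stable order
    for pair in combinations(words, 2):
        cooccurrence[pair] += 1

# Pairs seen together in more than one post hint at a shared theme.
repeated_pairs = sorted(pair for pair, n in cooccurrence.items() if n > 1)
print(repeated_pairs)
```

A full topic model goes further by estimating, for every document, a mixture over themes and, for every theme, a distribution over words, rather than just tallying pairs.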

Lab 4: Text Classification in Open Learning Resources, as summarized in our Overview Presentation, wraps up our work with text mining and examines recent advances in using text classification to build predictive models for intelligent systems. Through our Essential Readings, we'll learn about this supervised machine learning technique. Our Text Mining Case Study: How can we assess students' critical data literacy automatically? is inspired by the need to assess data literacy and to use automated assessment for real-time intervention in data science education. Finally, the Text Classification Badge provides an opportunity to create your own data product and to reflect on how these concepts and techniques might apply to your own research.
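As one illustration of supervised text classification, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing on labeled examples. The choice of Naive Bayes is an assumption made for brevity, not necessarily the method used in the lab, and the training sentences and the `critical`/`not_critical` labels are invented. Python is used only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under Naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # skip words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented, hand-labeled student comments.
examples = [
    ("the chart shows the sample is biased", "critical"),
    ("who collected this data and why", "critical"),
    ("nice graph", "not_critical"),
    ("i like the colors", "not_critical"),
]

model = train(examples)
print(predict(model, "is the sample biased"))
print(predict(model, "nice colors"))
```

With only four training sentences this is a toy; a usable classifier needs far more labeled data and an evaluation on held-out examples.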