
Unstructured Data Analysis (Graduate) @Korea University

Recommended courses


Topic 1: Introduction to Text Analytics

  • Text Analytics: Backgrounds, Applications, Challenges, and Process [Video]
  • Text Analytics Process [Video]

Topic 2: Text Preprocessing

  • Introduction to Natural Language Processing (NLP) [Video]
  • Lexical analysis [Video]
  • Syntax analysis & Other topics in NLP [Video]
  • Reading materials

Topic 3: Neural Networks Basics (Optional, No Video Lectures)

  • Perceptron, Multi-layered Perceptron
  • Convolutional Neural Networks (CNN)
  • Recurrent Neural Networks (RNN)
  • Practical Techniques

Topic 4: Text Representation I: Classic Methods

  • Bag of words, Word weighting, N-grams [Video]

Topic 5: Text Representation II: Distributed Representation

  • Neural Network Language Model (NNLM) [Video]
  • Word2Vec [Video]
  • GloVe [Video]
  • FastText, Doc2Vec, and Other Embeddings [Video]
  • Reading materials

Topic 6: Dimensionality Reduction

  • Dimensionality Reduction Overview, Supervised Feature Selection [Video]
  • Unsupervised Feature Extraction [Video]
  • Reading materials

Topic 7: Topic Modeling as a Distributed Representation

  • Topic modeling overview, Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis (pLSA) [Video]
  • LDA: Document Generation Process [Video]
  • LDA Inference: Collapsed Gibbs Sampling, LDA Evaluation [Video]
  • Reading materials
  • Recommended video lectures

Topic 8: Language Modeling & Pre-trained Models

  • Sequence-to-Sequence Learning [Slide], [Video]
  • Transformer [Slide], [Video]
  • ELMo: Embeddings from Language Models [Slide], [Video]
  • GPT: Generative Pre-Training of a Language Model [Slide], [Video]
  • BERT: Bidirectional Encoder Representations from Transformers [Slide], [Video]
  • GPT-2: Language Models are Unsupervised Multitask Learners
  • Reading materials

Topic 9: Document Classification

  • Document classification overview, Vector Space Models (Naive Bayesian Classifier, k-Nearest Neighbor Classifier) [Slide], [Video]
  • (Optional) Other VSM-based classification (Lecture videos are taken from IMEN415 (Multivariate Data Analysis for Undergraduate Students @Korea University))
  • RNN-based document classification
  • CNN-based document classification
  • Reading materials

Topic 10: Sentiment Analysis

  • Architecture of sentiment analysis
  • Lexicon-based approach
  • Machine learning-based approach
  • Reading materials
