/baby_llm

Large Language Model (LLM)

Primary Language: Jupyter Notebook


Using LangChain and OpenAI


Natural Language Processing (NLP)

  • machine processing and interpretation of human language via machine learning algorithms
  • models are typically trained to perform specific tasks (e.g. translation, sentiment analysis)

Language Modeling (LM)

  • statistical and probabilistic techniques to estimate how likely a given sequence of words is
  • formally, a probability distribution over sequences of words
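
A minimal sketch of the idea above, assuming a bigram (2-gram) approximation in which each word depends only on the word before it. The toy corpus and the function names (`train_bigram`, `next_word_distribution`, `sequence_probability`) are made up for illustration, and no smoothing is applied, so any unseen word pair gets probability zero.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration only.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
]

def train_bigram(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of a sentence.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            counts[prev][word] += 1
    return counts

def next_word_distribution(counts, prev):
    """Maximum-likelihood estimate of P(word | prev)."""
    total = sum(counts[prev].values())
    return {word: n / total for word, n in counts[prev].items()}

def sequence_probability(counts, sentence):
    """P(sentence) as a product of bigram conditional probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        total = sum(counts[prev].values())
        if total == 0 or counts[prev][word] == 0:
            return 0.0  # unseen bigram; this sketch does no smoothing
        prob *= counts[prev][word] / total
    return prob

counts = train_bigram(corpus)
print(next_word_distribution(counts, "the"))
print(sequence_probability(counts, "the cat sat on the mat"))
```

In this toy corpus "the" is followed by four different words once each, so the distribution after "the" is uniform over those four words.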

Large Language Models (LLMs)

  • large-scale neural network language models that act as a next-word prediction engine
  • deep learning models, typically general purpose
  • typically trained on simple tasks (next word prediction)
  • scaling up both the training data and the parameter count yields broad general capabilities, which can then be fine-tuned for specific skills
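
A rough sketch of what "next word prediction" means at the output of such a model, assuming the network has already produced one raw score (logit) per vocabulary word. The three-word vocabulary and the logit values are hypothetical; a real LLM scores tens of thousands of tokens and usually samples rather than always taking the top one.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores for the context "the cat sat on the ...".
vocab = ["mat", "rug", "moon"]
logits = [2.0, 1.0, -1.0]

probs = softmax(logits)

# Greedy decoding: pick the most probable next word.
best_index = max(range(len(vocab)), key=lambda i: probs[i])
next_word = vocab[best_index]

for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
print("predicted next word:", next_word)
```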

Machine Learning Architecture

  • why don't these individual architectures work well for LLMs?

Artificial Neural Network (ANN)

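  • vanishing gradient problem: in deep networks, gradients shrink as they are propagated backward through many layers, so the earliest layers learn very slowly
  • arises in gradient-based learning methods that use backpropagation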
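
A small numeric illustration of the vanishing gradient problem, under strong simplifying assumptions: a chain of single-unit sigmoid layers, the same pre-activation value at every layer, and a weight of 1.0. The function names and defaults are illustrative and not taken from any library. Backpropagation multiplies one local derivative per layer, and the sigmoid derivative is at most 0.25, so the product shrinks quickly with depth.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

def gradient_through_layers(num_layers, pre_activation=0.0, weight=1.0):
    """Gradient magnitude after backpropagating through a chain of
    single-unit sigmoid layers (chain rule: one factor per layer)."""
    grad = 1.0
    for _ in range(num_layers):
        grad *= weight * sigmoid_grad(pre_activation)
    return grad

for depth in (1, 5, 10, 20):
    print(f"depth {depth:2d}: gradient = {gradient_through_layers(depth):.3e}")
```

Even in this best case for the sigmoid (pre-activation of 0), the gradient reaching the first layer of a 10-layer chain is below one millionth of the gradient at the output.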

Recurrent Neural Network (RNN)

Long Short-Term Memory (LSTM)

Gated Recurrent Units (GRU)

Self-Supervised Learning

N-Gram


