/deeplearning_lvm

Large vector models - a transformer architecture for mixed token-text models

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Large vector models

Abstract

A large vector model (LVM) is a generalization of the large language model (LLM) class used for language generation in generative AI. LVMs are transformer-based models designed to predict temporal sequences of tokens. Unlike LLMs, which generate only human-language tokens, an LVM can generate a sequence of vectors in $\mathbb{R}^n$ or a mixture of aligned tokens and vectors. LVMs promise to help with time-series prediction in a variety of applications. In particular, they provide a generalization of binary classification problems that involve an extended patient history.
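To make the idea of aligned token-vector sequences concrete, here is a minimal sketch in pure Python. It is not the code from this repo: the class `TinyLVM`, its weight names, and the choice of one attention head with random, untrained weights are all assumptions made for illustration. Each time step carries a token id and a vector in $\mathbb{R}^n$; the two are embedded into a shared space, passed through one causal self-attention layer, and read out by two heads, one giving a distribution over next tokens and one giving a next vector.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLVM:
    """Hypothetical one-layer, one-head model over aligned (token, vector) steps."""

    def __init__(self, vocab_size, vec_dim, d_model, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.tok_emb = rand_matrix(vocab_size, d_model, rng)  # one row per token id
        self.W_in = rand_matrix(d_model, vec_dim, rng)        # maps R^n into model space
        self.W_q = rand_matrix(d_model, d_model, rng)
        self.W_k = rand_matrix(d_model, d_model, rng)
        self.W_v = rand_matrix(d_model, d_model, rng)
        self.W_tok = rand_matrix(vocab_size, d_model, rng)    # token output head
        self.W_vec = rand_matrix(vec_dim, d_model, rng)       # vector output head

    def embed(self, token_id, vector):
        """Sum the token embedding and the projected vector for one time step."""
        proj = matvec(self.W_in, vector)
        return [t + p for t, p in zip(self.tok_emb[token_id], proj)]

    def forward(self, tokens, vectors):
        """Return one (token_probs, predicted_vector) pair per time step."""
        xs = [self.embed(t, v) for t, v in zip(tokens, vectors)]
        qs = [matvec(self.W_q, x) for x in xs]
        ks = [matvec(self.W_k, x) for x in xs]
        vs = [matvec(self.W_v, x) for x in xs]
        scale = math.sqrt(self.d_model)
        outputs = []
        for i, q in enumerate(qs):
            # Causal attention: step i only sees steps 0..i.
            scores = [sum(a * b for a, b in zip(q, ks[j])) / scale
                      for j in range(i + 1)]
            weights = softmax(scores)
            ctx = [sum(w * vs[j][d] for j, w in enumerate(weights))
                   for d in range(self.d_model)]
            h = [x + c for x, c in zip(xs[i], ctx)]  # residual connection
            outputs.append((softmax(matvec(self.W_tok, h)),
                            matvec(self.W_vec, h)))
        return outputs


# Three aligned steps: a token id plus a vector in R^3 at each step.
model = TinyLVM(vocab_size=5, vec_dim=3, d_model=8)
tokens = [1, 4, 2]
vectors = [[0.1, 0.2, 0.3], [0.0, -1.0, 0.5], [1.0, 1.0, 1.0]]
preds = model.forward(tokens, vectors)
token_probs, next_vector = preds[-1]
```

In this sketch a pure-vector sequence could be handled by reserving one token id as a placeholder, and training would typically pair a cross-entropy loss on the token head with a regression loss on the vector head. Both are assumptions here, not statements about the repo's approach.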

This repo

  • Code implementing the LVM architecture
  • Solutions for simple cases
  • Research paper