MOMENT: A Family of Open Time-series Foundation Models

🔥 News

We just released the small and base versions of the MOMENT model.
🔥🔥🔥 We released MOMENT research code, so you can pre-train your own time series foundation model, with your own data, and reproduce experiments from our paper!
We fixed an issue with Classification where MOMENT was unable to handle multi-channel inputs.
MOMENT was accepted at ICML 2024!
Interested in multimodal time series & text foundation models? Check out our preliminary work on JoLT (Jointly Learned Representations for Time series & Text) [AAAI 2024 Student Abstract, NeurIPS 2023 DGM4H Workshop]. JoLT won the best student abstract presentation at AAAI! Stay tuned for multimodal time series & text foundation models!

📖 Introduction

We introduce MOMENT, a family of open-source foundation models for general-purpose time-series analysis. Pre-training large models on time-series data is challenging due to (1) the absence of a large and cohesive public time-series repository, and (2) diverse time-series characteristics which make multi-dataset training onerous. Additionally, (3) experimental benchmarks to evaluate these models, especially in scenarios with limited resources, time, and supervision, are still in their nascent stages. To address these challenges, we compile a large and diverse collection of public time series, called the Time-series Pile, and systematically tackle time-series-specific challenges to unlock large-scale multi-dataset pre-training. We then build on recent work to design a benchmark to evaluate time-series foundation models on diverse tasks and datasets in limited supervision settings. Experiments on this benchmark demonstrate the effectiveness of our pre-trained models with minimal data and task-specific fine-tuning. Finally, we present several interesting empirical observations about large pre-trained time-series models.

MOMENT: One Model, Multiple Tasks, Datasets & Domains

MOMENT on different datasets and tasks, without any parameter updates:

Imputation: Better than statistical imputation baselines
Anomaly Detection: Second-best $F_1$ among the compared baselines
Classification: More accurate than 11 / 16 compared methods
Short-horizon Forecasting: Better than ARIMA on some datasets

By linear probing (fine-tuning the final linear layer):

Imputation: Better than baselines on 4 / 6 datasets
Anomaly Detection: Best $F_1$
Long-horizon Forecasting: Competitive in some settings
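
To make the term "linear probing" concrete, here is a minimal NumPy sketch: the encoder stays frozen and only a final linear layer is trained on its outputs. The encoder (`W_enc`, `encode`), the toy data, the learning rate, and the number of steps are all hypothetical stand-ins chosen for illustration; this is not MOMENT's training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen encoder: a fixed random projection standing in for a
# pre-trained model. It is never updated below.
W_enc = rng.normal(size=(32, 8)) / np.sqrt(32)
W_enc_before = W_enc.copy()


def encode(x: np.ndarray) -> np.ndarray:
    """Map inputs of shape (n, 32) to frozen features of shape (n, 8)."""
    return np.tanh(x @ W_enc)


# Toy regression data whose targets are linear in the frozen features.
X = rng.normal(size=(256, 32))
true_head = rng.normal(size=(8, 1))
y = encode(X) @ true_head + 0.01 * rng.normal(size=(256, 1))

# Features are computed once because the encoder does not change.
features = encode(X)

# Linear probe: the only trainable parameters are those of the final layer.
W_head = np.zeros((8, 1))
initial_loss = float(np.mean((features @ W_head - y) ** 2))

for _ in range(500):
    residual = features @ W_head - y
    grad = 2.0 * features.T @ residual / len(X)  # gradient of the mean squared error
    W_head -= 0.1 * grad

final_loss = float(np.mean((features @ W_head - y) ** 2))
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Because gradients only flow into `W_head`, the cost per step is that of a small linear model, which is what makes linear probing attractive when compute and labels are limited.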

MOMENT Captures the Language of Time Series

Principal components of the embeddings of synthetically generated sinusoids suggest that MOMENT can capture subtle trend, scale, frequency, and phase information. In each experiment, $c$ controls the factor of interest, for example the power of the trend polynomial $c \in [\frac{1}{8}, 8)$ (Oreshkin et al., 2020). We generate multiple sine waves by varying $c$, derive their sequence-level representations using MOMENT, and visualize them in a 2-dimensional space using PCA.

MOMENT Learns Meaningful Representation of Data

PCA visualizations of representations learned by MOMENT on the ECG5000 dataset from the UCR Classification Archive. Different colors represent different classes. Even without dataset-specific fine-tuning, MOMENT learns distinct representations for different classes.

Architecture in a Nutshell

A time series is broken into disjoint fixed-length sub-sequences called patches, and each patch is mapped into a D-dimensional patch embedding. During pre-training, we mask patches uniformly at random by replacing their patch embeddings with a special mask embedding [MASK]. The goal of pre-training is to learn patch embeddings which can be used to reconstruct the input time series using a light-weight reconstruction head.
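
The patching and masking steps above can be sketched in a few lines of NumPy. Everything concrete here is an assumption made for illustration: the series length (512), patch length (8), embedding size (D = 16), mask ratio (0.3), the random linear patch embedding `W`, and the all-zeros stand-in for the learnable [MASK] embedding. The function names are hypothetical and do not come from the momentfm package.

```python
import numpy as np


def patchify(series: np.ndarray, patch_len: int) -> np.ndarray:
    """Split a 1-D series into disjoint fixed-length patches."""
    if series.ndim != 1:
        raise ValueError("expected a 1-D series")
    if len(series) % patch_len != 0:
        raise ValueError("series length must be a multiple of patch_len")
    return series.reshape(-1, patch_len)


def mask_patch_embeddings(embeddings, mask_embedding, mask_ratio, rng):
    """Replace a uniformly random subset of patch embeddings with [MASK]."""
    n_patches = embeddings.shape[0]
    n_masked = int(round(mask_ratio * n_patches))
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    is_masked = np.zeros(n_patches, dtype=bool)
    is_masked[masked_idx] = True
    out = embeddings.copy()
    out[is_masked] = mask_embedding
    return out, is_masked


rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 8.0 * np.pi, 512))

patches = patchify(series, patch_len=8)   # shape (64, 8)
W = rng.normal(size=(8, 16))              # hypothetical linear patch embedding, D = 16
embeddings = patches @ W                  # shape (64, 16)
mask_embedding = np.zeros(16)             # stand-in for the learnable [MASK] embedding

masked, is_masked = mask_patch_embeddings(embeddings, mask_embedding, 0.3, rng)
```

During pre-training, a reconstruction head would map `masked` back to the original patches, and the reconstruction error would drive learning.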

🧑‍💻 Usage

Recommended Python Version: Python 3.11 (support for additional versions is expected soon).

You can install the momentfm package using pip:

pip install momentfm

Alternatively, to install the latest version directly from the GitHub repository:

pip install git+https://github.com/moment-timeseries-foundation-model/moment.git

To load the pre-trained model for one of the tasks, use one of the following code snippets:

Forecasting

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large", 
    model_kwargs={
        "task_name": "forecasting",
        "forecast_horizon": 96
    },
)
model.init()

Classification

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large", 
    model_kwargs={
        "task_name": "classification",
        "n_channels": 1,
        "num_class": 2
    },
)
model.init()

Anomaly Detection, Imputation, and Pre-training

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large", 
    model_kwargs={"task_name": "reconstruction"},
)
model.init()

Representation Learning

from momentfm import MOMENTPipeline

model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large", 
    model_kwargs={"task_name": "embedding"},
)
model.init()
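
Before calling a loaded model, inputs usually need to be arranged into a fixed-size batch. The sketch below assumes, without confirmation from this section, that the model expects arrays of shape [batch_size, n_channels, context_length] with a context length of 512, and that shorter series are zero-padded on the left together with a mask marking the observed time steps. The helper `pad_to_context` is hypothetical and not part of momentfm; please check the tutorials for the exact forward-pass arguments.

```python
import numpy as np


def pad_to_context(series: np.ndarray, context_length: int = 512):
    """Left-pad a (n_channels, length) array with zeros up to context_length.

    Returns the padded array and a mask of shape (context_length,),
    where 1 marks observed time steps and 0 marks padding.
    """
    series = np.atleast_2d(np.asarray(series, dtype=np.float32))
    n_channels, length = series.shape
    if length > context_length:
        raise ValueError("series is longer than the assumed context length")
    pad = context_length - length
    padded = np.pad(series, ((0, 0), (pad, 0)))  # zeros on the left only
    input_mask = np.ones(context_length, dtype=np.float32)
    input_mask[:pad] = 0.0
    return padded, input_mask


# Three single-channel series of different lengths (synthetic data).
batch = [
    np.random.default_rng(i).normal(size=(1, n))
    for i, n in enumerate([512, 300, 128])
]

results = [pad_to_context(s) for s in batch]
x = np.stack([padded for padded, _ in results])        # shape (3, 1, 512)
input_mask = np.stack([mask for _, mask in results])   # shape (3, 512)
```

The arrays `x` and `input_mask` can then be converted to tensors (for example with `torch.from_numpy`) before being passed to the model.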

🧑‍🏫 Tutorials

Here is a list of tutorials and reproducible experiments to get started with MOMENT for various tasks:

Forecasting
Classification
Anomaly Detection
Imputation
Representation Learning
Real-world Electrocardiogram (ECG) Case Study: This tutorial also shows how to fine-tune MOMENT for a real-world ECG classification problem, including training and inference on multiple GPUs and parameter-efficient fine-tuning (PEFT).

Special thanks to Yifu Cai and Arjun Choudhry for the tutorials!

All these experiments can be reproduced on a single NVIDIA A6000 GPU with 48 GiB RAM.

Tip

Have more questions about using MOMENT? Check out the Frequently Asked Questions; you might find your answer there!

BibTeX

@inproceedings{goswami2024moment,
  title={MOMENT: A Family of Open Time-series Foundation Models},
  author={Mononito Goswami and Konrad Szafer and Arjun Choudhry and Yifu Cai and Shuo Li and Artur Dubrawski},
  booktitle={International Conference on Machine Learning},
  year={2024}
}

⛑️ Research Code

We designed this codebase to be extremely lightweight, and in the process removed a lot of code! We released the complete but messier research code here. This includes code to handle different datasets, and scripts for pre-training, fine-tuning, and evaluating MOMENT alongside other baselines. An early version of this code was available on Anonymous GitHub.

➕ Contributions

We encourage researchers to contribute their methods and datasets to MOMENT. We are actively working on contributing guidelines. Stay tuned for updates!

📰 Coverage

Moment: A Family of Open Time-Series Foundation Models, Medium post by Samuel Chazy
MOMENT: A Foundation Model for Time Series Forecasting, Classification, Anomaly Detection, Towards Data Science article by Nikos Kafritsas
CMU Researchers Propose MOMENT: A Family of Open-Source Machine Learning Foundation Models for General-Purpose Time Series Analysis, MarkTechPost article by Mohammad Asjad
The Rise of Time-Series Foundation Models for Data Analysis and Forecasting, Unite AI blog by Dr. Tehseen Zia
Time Series AI: MOMENT Model, Webinar hosted by Gradient AI
Forecasting Impact, Panel on Foundational Models with Azul Garza Ramírez, Podcast hosted by Mariana Menchero and Faranak Golestaneh on behalf of the International Institute of Forecasters

🤟 Contemporary Work

There's a lot of cool work on building time series forecasting foundation models! Here's an incomplete list. Check out Table 9 in our paper for qualitative comparisons with these studies:

TimeGPT-1 by Nixtla, [Paper, API]
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting by Morgan Stanley and ServiceNow Research, [Paper, Code, Hugging Face]
Tiny Time Mixers (TTMs): Fast Pre-trained Models for Enhanced Zero/Few-Shot Forecasting of Multivariate Time Series by IBM, [Paper, Hugging Face]
Moirai: A Time Series Foundation Model for Universal Forecasting [Paper, Code, Hugging Face]
A decoder-only foundation model for time-series forecasting by Google, [Paper, Code, Hugging Face]
Chronos: Learning the Language of Time Series by Amazon, [Paper, Code, Hugging Face]
Timer: Generative Pre-trained Transformers Are Large Time Series Models by THUML @ Tsinghua University, [Paper, Code]

There's also some recent work on solving multiple time series modeling tasks in addition to forecasting:

TOTEM: TOkenized Time Series EMbeddings for General Time Series Analysis [Paper, Code]

🪪 License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

See MIT LICENSE for details.
