Time-MoE: Foundation Model
Time-MoE is a sparsely activated, billion-parameter foundation model for time series forecasting
Paper: Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts (29 Pages)
Github: https://github.com/Time-MoE/Time-MoE
Researchers from Princeton University, Squirrel AI Learning, and Griffith University propose a new architecture for time series foundation models that leverages a sparse mixture-of-experts (MoE) design.
Hmm.. What’s the background?
The authors trained Time-MoE models on a newly introduced large-scale dataset called Time-300B, which comprises over 300 billion data points spanning nine domains. They scaled Time-MoE up to 2.4 billion parameters, of which 1.1 billion are activated per prediction, making it the first sparsely activated time series model to reach this scale.
Ok, so what is proposed in the research paper?
The paper’s main proposal is Time-MoE, a new, scalable, and unified architecture for pre-training large time series foundation models for universal forecasting. Time-MoE is designed to improve forecasting capability while reducing computational cost, since only a fraction of its experts is activated for any given prediction.
It comprises a family of decoder-only transformer models that operate autoregressively, support flexible forecasting horizons, and accommodate input context lengths of up to 4096.
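To make the sparse mixture-of-experts idea concrete, here is a minimal sketch (in PyTorch, not the authors’ implementation) of a top-k gated MoE feed-forward block of the kind a decoder-only transformer layer might use. The class name `SparseMoEFeedForward` and all sizes (`d_model`, `d_ff`, `num_experts`, `top_k`) are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoEFeedForward(nn.Module):
    """Top-k gated mixture-of-experts feed-forward block (illustrative sizes, not the paper's config)."""

    def __init__(self, d_model: int = 256, d_ff: int = 512, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, num_experts, bias=False)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(num_experts)
            ]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) -- embedded time series tokens
        scores = self.router(x)                          # (batch, seq_len, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)             # normalize over the selected experts only
        out = torch.zeros_like(x)
        # Only the top_k experts run for each token; the rest of the parameters stay idle.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[..., slot] == e           # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[..., slot][mask].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = SparseMoEFeedForward()
    tokens = torch.randn(4, 32, 256)   # toy batch: 4 series, 32 time steps, 256-dim embeddings
    print(layer(tokens).shape)         # torch.Size([4, 32, 256])
```

Because each token only passes through `top_k` of the `num_experts` expert networks, most parameters stay idle on any single forward pass, which is how a model with 2.4 billion total parameters can run with roughly 1.1 billion activated.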
What’s next?
Future work could explore even larger models trained on more massive datasets to further improve forecasting accuracy and unlock new capabilities.
So essentially,
Time-MoE is a sparsely activated, billion-parameter foundation model for time series forecasting.
Learned something new? Consider sharing it!