Commit 166c729

Author: root
Commit message: update readme and notebooks for moe
1 parent 7a62471, commit 166c729

15 files changed: 1171 additions & 289 deletions

CITATION.cff

Lines changed: 1 addition & 1 deletion

````diff
@@ -1,6 +1,6 @@
 cff-version: 1.2.0
 title: Unified Training of Universal Time Series Forecasting Transformers
-message: If you find Chronos models useful for your research, please consider citing the associated paper.
+message: If you find Moirai useful for your research, please consider citing the associated paper.
 authors:
 - family-names: Woo
   given-names: Gerald
````

README.md

Lines changed: 41 additions & 30 deletions

````diff
@@ -2,11 +2,13 @@
 
 Uni2TS is a PyTorch based library for research and applications related to Time Series Forecasting. It provides a unified framework for large-scale pre-training, fine-tuning, inference, and evaluation of Universal Time Series Transformers.
 
-Related reading: [Moirai Paper](https://arxiv.org/abs/2402.02592), [Moirai Salesforce Blog](https://blog.salesforceairesearch.com/moirai/), [Moirai-MoE Paper](https://arxiv.org/abs/2410.10469), [Moirai-MoE AI Horizon Forecast Blog](https://aihorizonforecast.substack.com/p/moirai-moe-upgrading-moirai-with), [Moirai-MoE Jiqizhixin Blog](https://mp.weixin.qq.com/s/LQvlgxx9vU965Yzy6RuBfQ).
+Related reading: [Moirai Paper](https://arxiv.org/abs/2402.02592), [Moirai Salesforce Blog](https://blog.salesforceairesearch.com/moirai/), [Moirai-MoE Paper](https://arxiv.org/abs/2410.10469), [Moirai-MoE Salesforce Blog](https://www.salesforce.com/blog/time-series-morai-moe/), [Moirai-MoE AI Horizon Forecast Blog](https://aihorizonforecast.substack.com/p/moirai-moe-upgrading-moirai-with), [Moirai-MoE Jiqizhixin Blog](https://mp.weixin.qq.com/s/LQvlgxx9vU965Yzy6RuBfQ).
 
 ## 🎉 What's New
 
-* Oct 2024: A new model Moirai-MoE! The preprint is available on [arXiv](https://arxiv.org/abs/2410.10469), along with model weights of [small](https://huggingface.co/Salesforce/moirai-moe-1.0-R-small) and [base](https://huggingface.co/Salesforce/moirai-moe-1.0-R-base), and [code](https://github.com/SalesforceAIResearch/uni2ts/tree/main/project/moirai-moe-1) to get started.
+* Nov 2024: The first general time series forecasting benchmark [GIFT-Eval](https://github.com/SalesforceAIResearch/gift-eval) is released. Moirai-Large achieves the best performance on the [Leaderboard](https://huggingface.co/spaces/Salesforce/GIFT-Eval)!
+
+* Oct 2024: A new model, Moirai-MoE! The preprint is available on [arXiv](https://arxiv.org/abs/2410.10469), along with the model weights of [Moirai-MoE-Small](https://huggingface.co/Salesforce/moirai-moe-1.0-R-small) and [Moirai-MoE-Base](https://huggingface.co/Salesforce/moirai-moe-1.0-R-base). Get started with the [inference code](https://github.com/SalesforceAIResearch/uni2ts/tree/main/project/moirai-moe-1) and [notebook examples](https://github.com/SalesforceAIResearch/uni2ts/tree/main/example)!
 
 * Sep 2024: Released [Evaluation Code](https://github.com/SalesforceAIResearch/uni2ts/tree/main/project/benchmarks) of [TimesFM](https://arxiv.org/abs/2310.10688), [Chronos](https://arxiv.org/abs/2403.07815) and [VisionTS](https://arxiv.org/abs/2408.17253) on Monash, LSF and PF benchmarks.
 
@@ -16,22 +18,6 @@ Related reading: [Moirai Paper](https://arxiv.org/abs/2402.02592), [Moirai Sales
 
 * Mar 2024: Release of Uni2TS library, along with [Moirai Paper](https://arxiv.org/abs/2402.02592), [Moirai-1.0-R Models](https://huggingface.co/collections/Salesforce/moirai-10-r-models-65c8d3a94c51428c300e0742), and [LOTSA Data](https://huggingface.co/datasets/Salesforce/lotsa_data/).
 
-## ✅ TODO
-
-- [ ] Improve docstrings and documentation
-
-[//]: # (- [ ] Support more pre-training paradigms)
-
-[//]: # ( - [ ] (Non-)Contrastive learning)
-
-[//]: # ( - [ ] Masked Autoencoder)
-
-[//]: # ( - [ ] Next token prediction)
-
-[//]: # (- [ ] Decoder Transformer)
-
-[//]: # (- [ ] Data augmentations - down sampling, subsampling, aggregation)
-
 ## ⚙️ Installation
 
 1. Clone repository:
@@ -56,6 +42,11 @@ pip install -e '.[notebook]'
 touch .env
 ```
 
+We also support installation via PyPI.
+```shell
+pip install uni2ts
+```
+
 ## 🏃 Getting Started
 
 Let's see a simple example on how to use Uni2TS to make zero-shot forecasts from a pre-trained model.
@@ -72,8 +63,9 @@ from huggingface_hub import hf_hub_download
 
 from uni2ts.eval_util.plot import plot_single
 from uni2ts.model.moirai import MoiraiForecast, MoiraiModule
+from uni2ts.model.moirai_moe import MoiraiMoEForecast, MoiraiMoEModule
 
-
+MODEL = "moirai-moe"  # model name: choose from {'moirai', 'moirai-moe'}
 SIZE = "small"  # model size: choose from {'small', 'base', 'large'}
 PDT = 20  # prediction length: any positive integer
 CTX = 200  # context length: any positive integer
@@ -104,16 +96,28 @@ test_data = test_template.generate_instances(
 )
 
 # Prepare pre-trained model by downloading model weights from huggingface hub
-model = MoiraiForecast(
-    module=MoiraiModule.from_pretrained(f"Salesforce/moirai-1.0-R-{SIZE}"),
-    prediction_length=PDT,
-    context_length=CTX,
-    patch_size=PSZ,
-    num_samples=100,
-    target_dim=1,
-    feat_dynamic_real_dim=ds.num_feat_dynamic_real,
-    past_feat_dynamic_real_dim=ds.num_past_feat_dynamic_real,
-)
+if MODEL == "moirai":
+    model = MoiraiForecast(
+        module=MoiraiModule.from_pretrained(f"Salesforce/moirai-1.1-R-{SIZE}"),
+        prediction_length=PDT,
+        context_length=CTX,
+        patch_size=PSZ,
+        num_samples=100,
+        target_dim=1,
+        feat_dynamic_real_dim=ds.num_feat_dynamic_real,
+        past_feat_dynamic_real_dim=ds.num_past_feat_dynamic_real,
+    )
+elif MODEL == "moirai-moe":
+    model = MoiraiMoEForecast(
+        module=MoiraiMoEModule.from_pretrained(f"Salesforce/moirai-moe-1.0-R-{SIZE}"),
+        prediction_length=PDT,
+        context_length=CTX,
+        patch_size=16,
+        num_samples=100,
+        target_dim=1,
+        feat_dynamic_real_dim=ds.num_feat_dynamic_real,
+        past_feat_dynamic_real_dim=ds.num_past_feat_dynamic_real,
+    )
 
 predictor = model.create_predictor(batch_size=BSZ)
 forecasts = predictor.predict(test_data.input)
@@ -243,7 +247,14 @@ If you're using this repository in your research or applications, please cite us
   year={2024}
 }
 
-@inproceedings{woo2024unified,
+@article{aksu2024gifteval,
+  title={GIFT-Eval: A Benchmark For General Time Series Forecasting Model Evaluation},
+  author={Aksu, Taha and Woo, Gerald and Liu, Juncheng and Liu, Xu and Liu, Chenghao and Savarese, Silvio and Xiong, Caiming and Sahoo, Doyen},
+  journal={arXiv preprint arXiv:2410.10393},
+  year={2024}
+}
+
+@inproceedings{woo2024moirai,
   title={Unified Training of Universal Time Series Forecasting Transformers},
   author={Woo, Gerald and Liu, Chenghao and Kumar, Akshat and Xiong, Caiming and Savarese, Silvio and Sahoo, Doyen},
   booktitle={Forty-first International Conference on Machine Learning},
````
cli/conf/eval/default.yaml

Lines changed: 1 addition & 1 deletion

````diff
@@ -20,5 +20,5 @@ metrics:
   - _target_: gluonts.ev.metrics.MeanWeightedSumQuantileLoss
     quantile_levels: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
 batch_size: 512
-min_batch_size: 16
+min_batch_size: 1
 device: auto
````
Lines changed: 2 additions & 3 deletions

````diff
@@ -1,8 +1,7 @@
-_target_: uni2ts.model.moirai.MoiraiForecast
+_target_: uni2ts.model.moirai_moe.MoiraiMoEForecast
 module:
-  _target_: uni2ts.model.moirai.MoiraiMoEModule.from_pretrained
+  _target_: uni2ts.model.moirai_moe.MoiraiMoEModule.from_pretrained
   pretrained_model_name_or_path: Salesforce/moirai-moe-1.0-R-base
-mode: autoregressive
 num_samples: 100
 patch_size: 16
 context_length: ???
````
Lines changed: 2 additions & 3 deletions

````diff
@@ -1,8 +1,7 @@
-_target_: uni2ts.model.moirai.MoiraiForecast
+_target_: uni2ts.model.moirai_moe.MoiraiMoEForecast
 module:
-  _target_: uni2ts.model.moirai.MoiraiMoEModule.from_pretrained
+  _target_: uni2ts.model.moirai_moe.MoiraiMoEModule.from_pretrained
   pretrained_model_name_or_path: Salesforce/moirai-moe-1.0-R-small
-mode: autoregressive
 num_samples: 100
 patch_size: 16
 context_length: ???
````
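After this change, the small-model eval config resolves to the following (reconstructed from the `+` and context lines above; the file path is not captured in this view):

```yaml
_target_: uni2ts.model.moirai_moe.MoiraiMoEForecast
module:
  _target_: uni2ts.model.moirai_moe.MoiraiMoEModule.from_pretrained
  pretrained_model_name_or_path: Salesforce/moirai-moe-1.0-R-small
num_samples: 100
patch_size: 16
context_length: ???
```

Note that `mode: autoregressive` is gone entirely: the config no longer needs it because `MoiraiMoEForecast` is now a dedicated class, and `???` is Hydra's marker for a mandatory value supplied at runtime.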

example/moirai_forecast.ipynb

Lines changed: 34 additions & 18 deletions
Large diffs are not rendered by default.

example/moirai_forecast_pandas.ipynb

Lines changed: 128 additions & 65 deletions
Large diffs are not rendered by default.

project/moirai-moe-1/README.md

Lines changed: 3 additions & 4 deletions

````diff
@@ -21,15 +21,15 @@ The pre-trained weights of Moirai-MoE can be found in the following table.
 
 ## Usage
 
-Let's see a simple example on how to use pre-trained Moirai-MoE models to make forecasts.
+Let's see a simple example below on how to use pre-trained Moirai-MoE models to make forecasts. See also the notebooks in the [example folder](https://github.com/SalesforceAIResearch/uni2ts/tree/main/example) to try out Moirai-MoE.
 
 ```python
 import matplotlib.pyplot as plt
 from gluonts.dataset.repository import dataset_recipes
 
 from uni2ts.eval_util.data import get_gluonts_test_dataset
 from uni2ts.eval_util.plot import plot_next_multi
-from uni2ts.model.moirai import MoiraiForecast, MoiraiMoEModule
+from uni2ts.model.moirai_moe import MoiraiMoEForecast, MoiraiMoEModule
 
 SIZE = "small"  # model size: choose from {'small', 'base'}
 CTX = 1000  # context length: any positive integer
@@ -43,11 +43,10 @@ test_data, metadata = get_gluonts_test_dataset(
 # print(sorted(dataset_recipes.keys()))
 
 # Prepare model
-model = MoiraiForecast(
+model = MoiraiMoEForecast(
     module=MoiraiMoEModule.from_pretrained(
         f"Salesforce/moirai-moe-1.0-R-{SIZE}",
     ),
-    mode="autoregressive",
     prediction_length=metadata.prediction_length,
     context_length=CTX,
     patch_size=16,
````
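With `patch_size=16` fixed, the number of tokens the model sees is the series length divided by the patch size, rounded up, since a partial trailing patch still occupies one token. An illustrative calculation for the `CTX = 1000` setting above (the exact token accounting inside uni2ts may differ; `num_patch_tokens` is a hypothetical helper, not uni2ts API):

```python
import math


def num_patch_tokens(length: int, patch_size: int = 16) -> int:
    # Ceil-division: 1000 timesteps at patch size 16 -> 63 tokens,
    # the last token covering only the final 8 timesteps.
    return math.ceil(length / patch_size)


CTX = 1000  # context length from the usage example
print(num_patch_tokens(CTX))  # context tokens fed to the model
```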

src/uni2ts/model/moirai/__init__.py

Lines changed: 0 additions & 2 deletions

````diff
@@ -16,13 +16,11 @@
 from .finetune import MoiraiFinetune
 from .forecast import MoiraiForecast
 from .module import MoiraiModule
-from .module_moe import MoiraiMoEModule
 from .pretrain import MoiraiPretrain
 
 __all__ = [
     "MoiraiFinetune",
     "MoiraiForecast",
     "MoiraiModule",
-    "MoiraiMoEModule",
     "MoiraiPretrain",
 ]
````

src/uni2ts/model/moirai/forecast.py

Lines changed: 21 additions & 148 deletions

````diff
@@ -27,7 +27,6 @@
 from gluonts.transform import (
     AddObservedValuesIndicator,
     AsNumpyArray,
-    CausalMeanValueImputation,
     ExpandDimArray,
     TestSplitSampler,
     Transformation,
@@ -82,7 +81,6 @@ def __init__(
         module: Optional[MoiraiModule] = None,
         patch_size: int | str = "auto",
         num_samples: int = 100,
-        mode: str = "direct",
     ):
         assert (module is not None) or (
             module_kwargs is not None
@@ -334,139 +332,22 @@ def forward(
             idx = val_loss.argmin(dim=0)
             return preds[idx, torch.arange(len(idx), device=idx.device)]
         else:
-            if self.hparams.mode == "direct":
-                distr = self._get_distr(
-                    self.hparams.patch_size,
-                    past_target,
-                    past_observed_target,
-                    past_is_pad,
-                    feat_dynamic_real,
-                    observed_feat_dynamic_real,
-                    past_feat_dynamic_real,
-                    past_observed_feat_dynamic_real,
-                )
-                preds = distr.sample(
-                    torch.Size((num_samples or self.hparams.num_samples,))
-                )
-                return self._format_preds(
-                    self.hparams.patch_size, preds, past_target.shape[-1]
-                )
-
-            elif self.hparams.mode == "autoregressive":
-                context_step = self.context_token_length(self.hparams.patch_size)
-                context_token = self.hparams.target_dim * context_step
-                predict_step = self.prediction_token_length(self.hparams.patch_size)
-                predict_token = self.hparams.target_dim * predict_step
-
-                (
-                    target,
-                    observed_mask,
-                    sample_id,
-                    time_id,
-                    variate_id,
-                    prediction_mask,
-                ) = self._convert(
-                    self.hparams.patch_size,
-                    past_target,
-                    past_observed_target,
-                    past_is_pad,
-                    feat_dynamic_real=feat_dynamic_real,
-                    observed_feat_dynamic_real=observed_feat_dynamic_real,
-                    past_feat_dynamic_real=past_feat_dynamic_real,
-                    past_observed_feat_dynamic_real=past_observed_feat_dynamic_real,
-                )
-                patch_size = (
-                    torch.ones_like(time_id, dtype=torch.long) * self.hparams.patch_size
-                )
-
-                pred_index = torch.arange(
-                    start=context_step - 1, end=context_token, step=context_step
-                )
-                assign_index = torch.arange(
-                    start=context_token,
-                    end=context_token + predict_token,
-                    step=predict_step,
-                )
-
-                if predict_step == 1:
-                    distr = self.module(
-                        target,
-                        observed_mask,
-                        sample_id,
-                        time_id,
-                        variate_id,
-                        prediction_mask,
-                        patch_size,
-                    )
-                    preds = distr.sample(
-                        torch.Size((num_samples or self.hparams.num_samples,))
-                    )
-                    preds[..., assign_index, :] = preds[..., pred_index, :]
-                    return self._format_preds(
-                        self.hparams.patch_size, preds, self.hparams.target_dim
-                    )
-                else:
-                    distr = self.module(
-                        target,
-                        observed_mask,
-                        sample_id,
-                        time_id,
-                        variate_id,
-                        prediction_mask,
-                        patch_size,
-                    )
-                    preds = distr.sample(torch.Size((self.hparams.num_samples,)))
-
-                    expand_target = target.unsqueeze(0).repeat(
-                        self.hparams.num_samples, 1, 1, 1
-                    )
-                    expand_prediction_mask = prediction_mask.unsqueeze(0).repeat(
-                        self.hparams.num_samples, 1, 1
-                    )
-                    expand_observed_mask = observed_mask.unsqueeze(0).expand(
-                        self.hparams.num_samples, -1, -1, -1
-                    )
-                    expand_sample_id = sample_id.unsqueeze(0).expand(
-                        self.hparams.num_samples, -1, -1
-                    )
-                    expand_time_id = time_id.unsqueeze(0).expand(
-                        self.hparams.num_samples, -1, -1
-                    )
-                    expand_variate_id = variate_id.unsqueeze(0).expand(
-                        self.hparams.num_samples, -1, -1
-                    )
-                    expand_patch_size = patch_size.unsqueeze(0).expand(
-                        self.hparams.num_samples, -1, -1
-                    )
-
-                    expand_target[..., assign_index, :] = preds[..., pred_index, :]
-                    expand_prediction_mask[..., assign_index] = False
-
-                    remain_step = predict_step - 1
-                    while remain_step > 0:
-                        distr = self.module(
-                            expand_target,
-                            expand_observed_mask,
-                            expand_sample_id,
-                            expand_time_id,
-                            expand_variate_id,
-                            expand_prediction_mask,
-                            expand_patch_size,
-                        )
-                        preds = distr.sample(torch.Size((1,)))
-                        _, _, bs, token, ps = preds.shape
-                        preds = preds.view(-1, bs, token, ps)
-
-                        pred_index = assign_index
-                        assign_index = assign_index + 1
-                        expand_target[..., assign_index, :] = preds[..., pred_index, :]
-                        expand_prediction_mask[..., assign_index] = False
-
-                        remain_step -= 1
-
-                    return self._format_preds(
-                        self.hparams.patch_size, expand_target, self.hparams.target_dim
-                    )
+            distr = self._get_distr(
+                self.hparams.patch_size,
+                past_target,
+                past_observed_target,
+                past_is_pad,
+                feat_dynamic_real,
+                observed_feat_dynamic_real,
+                past_feat_dynamic_real,
+                past_observed_feat_dynamic_real,
+            )
+            preds = distr.sample(
+                torch.Size((num_samples or self.hparams.num_samples,))
+            )
+            return self._format_preds(
+                self.hparams.patch_size, preds, past_target.shape[-1]
+            )
@@ -1066,20 +947,12 @@ def get_default_transform(self) -> Transformation:
             dtype=np.float32,
         )
         if self.hparams.target_dim == 1:
-            transform += AddObservedValuesIndicator(
-                target_field="target",
-                output_field="observed_target",
-                imputation_method=CausalMeanValueImputation(),
-                dtype=bool,
-            )
             transform += ExpandDimArray(field="target", axis=0)
-            transform += ExpandDimArray(field="observed_target", axis=0)
-        else:
-            transform += AddObservedValuesIndicator(
-                target_field="target",
-                output_field="observed_target",
-                dtype=bool,
-            )
+        transform += AddObservedValuesIndicator(
+            target_field="target",
+            output_field="observed_target",
+            dtype=bool,
+        )
 
         if self.hparams.feat_dynamic_real_dim > 0:
             transform += AsNumpyArray(
````
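The removed branch implemented token-by-token autoregressive decoding, which now lives with `MoiraiMoEForecast` in `uni2ts.model.moirai_moe`; `MoiraiForecast` keeps only direct multi-step prediction. As a toy illustration of the loop shape being moved (predict one step, append it to the context, repeat, mirroring the `while remain_step > 0` loop), here is a generic sketch with a stand-in one-step predictor; nothing below is uni2ts API:

```python
def one_step(context: list[float]) -> float:
    # Stand-in for a model's one-step-ahead prediction:
    # mean of the last two observed/generated values.
    return (context[-1] + context[-2]) / 2.0


def autoregressive_forecast(context: list[float], horizon: int) -> list[float]:
    history = list(context)  # don't mutate the caller's context
    preds = []
    for _ in range(horizon):
        y = one_step(history)
        preds.append(y)
        history.append(y)  # the prediction becomes part of the context
    return preds


print(autoregressive_forecast([1.0, 3.0], 3))  # [2.0, 2.5, 2.25]
```

Direct decoding instead emits all horizon steps from a single forward pass, which is why the surviving `else:` branch reduces to one `_get_distr` call followed by sampling.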

0 commit comments