Forecasting Time-Series - In Depth Tutorial¶
This more advanced tutorial describes how you can exert greater control over AutoGluon’s time-series modeling. As an example forecasting task, we again use the Covid-19 dataset previously described in the Forecasting Time-Series - Quick Start tutorial.
from autogluon.forecasting import ForecastingPredictor
from autogluon.forecasting import TabularDataset
train_data = TabularDataset("https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/train.csv")
save_path = "agModels-covidforecast"
eval_metric = "mean_wQuantileLoss" # just for demonstration, this is already the default evaluation metric
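To make the evaluation metric concrete, here is a minimal sketch of a mean weighted quantile loss in plain Python. It follows the definition as commonly given for GluonTS (pinball loss scaled by 2, divided by the sum of absolute targets, then averaged over quantile levels); treat the exact scaling as an assumption rather than a statement about AutoGluon's internals. The function names and the toy numbers are made up for illustration.

```python
def quantile_loss(targets, preds, q):
    """Pinball (quantile) loss summed over all points, scaled by 2 (assumed convention)."""
    total = 0.0
    for y, p in zip(targets, preds):
        indicator = 1.0 if y <= p else 0.0
        total += abs((p - y) * (indicator - q))
    return 2.0 * total


def mean_weighted_quantile_loss(targets, quantile_preds):
    """Average over quantile levels of (quantile loss / sum of |targets|)."""
    abs_target_sum = sum(abs(y) for y in targets)
    weighted = [
        quantile_loss(targets, preds, q) / abs_target_sum
        for q, preds in quantile_preds.items()
    ]
    return sum(weighted) / len(weighted)


# Toy data (hypothetical): three observed values and forecasts at the
# same quantile levels used in this tutorial.
targets = [10.0, 20.0, 30.0]
quantile_preds = {
    0.1: [8.0, 15.0, 25.0],
    0.5: [12.0, 18.0, 30.0],
    0.9: [15.0, 25.0, 40.0],
}
score = mean_weighted_quantile_loss(targets, quantile_preds)
print(round(score, 4))
```

Lower values of this loss indicate better quantile forecasts. The validation scores in the training logs of this tutorial are negative, which suggests they are reported with the sign flipped so that higher is better.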
Specifying hyperparameters and tuning them¶
While AutoGluon-Forecasting will automatically tune certain hyperparameters of time-series models depending on the presets setting, you can manually control the hyperparameter optimization (HPO) process. The presets argument of predictor.fit() automatically determines which search spaces to consider for certain hyperparameters, as well as how many HPO trials to run when searching for the best value in the chosen search space. Instead of specifying presets, you can specify all of these items yourself. Below we demonstrate how to tune the context_length hyperparameter (see https://ts.gluon.ai/tutorials/forecasting/extended_tutorial.html) for just the MQCNN and DeepAR models. This hyperparameter controls how much past history a trained model conditions on in any one forecast.
import autogluon.core as ag
from gluonts.mx.distribution.neg_binomial import NegativeBinomialOutput
context_search_space = ag.Int(75, 100) # integer values spanning the listed range
epochs = 2 # small value used for quick demo, omit this or use much larger value in real applications!
num_batches_per_epoch = 5 # small value used for quick demo, omit this or use larger value in real applications!
num_hpo_trials = 2 # small value used for quick demo, use much larger value in real applications!
mqcnn_params = {
"context_length": context_search_space,
"epochs": epochs,
"num_batches_per_epoch": num_batches_per_epoch,
}
deepar_params = {
"context_length": context_search_space,
"epochs": epochs,
"num_batches_per_epoch": num_batches_per_epoch,
"distr_output": NegativeBinomialOutput(),
}
predictor = ForecastingPredictor(path=save_path, eval_metric=eval_metric).fit(
train_data, prediction_length=19, quantiles=[0.1, 0.5, 0.9],
index_column="name", target_column="ConfirmedCases", time_column="Date",
hyperparameter_tune_kwargs={'scheduler': 'local', 'searcher': 'bayesopt', 'num_trials': num_hpo_trials},
hyperparameters={"MQCNN": mqcnn_params, "DeepAR": deepar_params}
)
Training with dataset in tabular format... Finish rebuilding the data, showing the top five rows. name 2020-01-22 2020-01-23 2020-01-24 2020-01-25 2020-01-26 0 Afghanistan_ 0.0 0.0 0.0 0.0 0.0 1 Albania_ 0.0 0.0 0.0 0.0 0.0 2 Algeria_ 0.0 0.0 0.0 0.0 0.0 3 Andorra_ 0.0 0.0 0.0 0.0 0.0 4 Angola_ 0.0 0.0 0.0 0.0 0.0 2020-01-27 2020-01-28 2020-01-29 2020-01-30 ... 2020-03-24 0 0.0 0.0 0.0 0.0 ... 74.0 1 0.0 0.0 0.0 0.0 ... 123.0 2 0.0 0.0 0.0 0.0 ... 264.0 3 0.0 0.0 0.0 0.0 ... 164.0 4 0.0 0.0 0.0 0.0 ... 3.0 2020-03-25 2020-03-26 2020-03-27 2020-03-28 2020-03-29 2020-03-30 0 84.0 94.0 110.0 110.0 120.0 170.0 1 146.0 174.0 186.0 197.0 212.0 223.0 2 302.0 367.0 409.0 454.0 511.0 584.0 3 188.0 224.0 267.0 308.0 334.0 370.0 4 3.0 4.0 4.0 5.0 7.0 7.0 2020-03-31 2020-04-01 2020-04-02 0 174.0 237.0 273.0 1 243.0 259.0 277.0 2 716.0 847.0 986.0 3 376.0 390.0 428.0 4 7.0 8.0 8.0 [5 rows x 73 columns] Validation data is None, will do auto splitting... Finished processing data, using 0.307506799697876s. Random seed set to 0 All models will be trained for quantiles [0.1, 0.5, 0.9]. Beginning AutoGluon training ... AutoGluon will save models to agModels-covidforecast/ Start hyperparameter tuning for MQCNN
0%| | 0/2 [00:00<?, ?it/s]
Training model MQCNN/trial_0... Start model training Epoch[0] Learning rate is 0.001 0%| | 0/5 [00:00<?, ?it/s][ANumber of parameters in ForkingSeq2SeqTrainingNetwork: 57784 100%|██████████| 5/5 [00:00<00:00, 51.67it/s, epoch=1/2, avg_epoch_loss=131] Epoch[0] Elapsed time 0.100 seconds Epoch[0] Evaluation metric 'epoch_loss'=130.531328 0it [00:00, ?it/s][ANumber of parameters in ForkingSeq2SeqTrainingNetwork: 57784 10it [00:00, 78.57it/s, epoch=1/2, validation_avg_epoch_loss=133] Epoch[0] Elapsed time 0.129 seconds Epoch[0] Evaluation metric 'validation_epoch_loss'=132.996351 Epoch[1] Learning rate is 0.001 100%|██████████| 5/5 [00:00<00:00, 62.41it/s, epoch=2/2, avg_epoch_loss=1.23] Epoch[1] Elapsed time 0.083 seconds Epoch[1] Evaluation metric 'epoch_loss'=1.226578 10it [00:00, 80.24it/s, epoch=2/2, validation_avg_epoch_loss=132] Epoch[1] Elapsed time 0.127 seconds Epoch[1] Evaluation metric 'validation_epoch_loss'=132.171909 Computing averaged parameters. Loading averaged parameters. End model training Evaluating model MQCNN/trial_0 with metric mean_wQuantileLoss on validation data... 0%| | 0/313 [00:00<?, ?it/s][AForecast is not sample based. Ignoring parameter num_samples from predict method. 100%|██████████| 313/313 [00:00<00:00, 2448.79it/s] 100%|██████████| 313/313 [00:00<00:00, 3571.14it/s] Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 1180.64it/s] Validation score for model MQCNN/trial_0 is -0.9916079769993384 Training model MQCNN/trial_1... 
Start model training Epoch[0] Learning rate is 0.001 0%| | 0/5 [00:00<?, ?it/s][ANumber of parameters in ForkingSeq2SeqTrainingNetwork: 57784 100%|██████████| 5/5 [00:00<00:00, 52.73it/s, epoch=1/2, avg_epoch_loss=132] Epoch[0] Elapsed time 0.097 seconds Epoch[0] Evaluation metric 'epoch_loss'=131.697012 0it [00:00, ?it/s][ANumber of parameters in ForkingSeq2SeqTrainingNetwork: 57784 10it [00:00, 76.57it/s, epoch=1/2, validation_avg_epoch_loss=134] Epoch[0] Elapsed time 0.133 seconds Epoch[0] Evaluation metric 'validation_epoch_loss'=134.007222 Epoch[1] Learning rate is 0.001 100%|██████████| 5/5 [00:00<00:00, 63.04it/s, epoch=2/2, avg_epoch_loss=1.34] Epoch[1] Elapsed time 0.082 seconds Epoch[1] Evaluation metric 'epoch_loss'=1.337818 10it [00:00, 79.43it/s, epoch=2/2, validation_avg_epoch_loss=133] Epoch[1] Elapsed time 0.128 seconds Epoch[1] Evaluation metric 'validation_epoch_loss'=133.267226 Computing averaged parameters. Loading averaged parameters. End model training Evaluating model MQCNN/trial_1 with metric mean_wQuantileLoss on validation data... 0%| | 0/313 [00:00<?, ?it/s][A 100%|██████████| 313/313 [00:00<00:00, 2427.10it/s] 100%|██████████| 313/313 [00:00<00:00, 3321.79it/s] Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 1173.27it/s] Validation score for model MQCNN/trial_1 is -0.9964028387055066 Start hyperparameter tuning for DeepAR
0%| | 0/2 [00:00<?, ?it/s]
Training model DeepAR/trial_0...
Start model training
Epoch[0] Learning rate is 0.001
Number of parameters in DeepARTrainingNetwork: 25843
100%|██████████| 5/5 [00:17<00:00, 3.60s/it, epoch=1/2, avg_epoch_loss=6.86]
Epoch[0] Elapsed time 17.997 seconds
Epoch[0] Evaluation metric 'epoch_loss'=6.862702
Number of parameters in DeepARTrainingNetwork: 25843
10it [00:11, 1.19s/it, epoch=1/2, validation_avg_epoch_loss=43.6]
Epoch[0] Elapsed time 11.899 seconds
Epoch[0] Evaluation metric 'validation_epoch_loss'=43.641521
Epoch[1] Learning rate is 0.001
100%|██████████| 5/5 [00:00<00:00, 14.72it/s, epoch=2/2, avg_epoch_loss=5.64]
Epoch[1] Elapsed time 0.342 seconds
Epoch[1] Evaluation metric 'epoch_loss'=5.637449
10it [00:00, 26.14it/s, epoch=2/2, validation_avg_epoch_loss=32.4]
Epoch[1] Elapsed time 0.385 seconds
Epoch[1] Evaluation metric 'validation_epoch_loss'=32.405570
Computing averaged parameters.
Loading averaged parameters.
End model training
Evaluating model DeepAR/trial_0 with metric mean_wQuantileLoss on validation data...
100%|██████████| 313/313 [00:02<00:00, 138.46it/s]
100%|██████████| 313/313 [00:00<00:00, 3911.75it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 990.81it/s]
Validation score for model DeepAR/trial_0 is -0.8336604527711646
Training model DeepAR/trial_1...
Start model training
Epoch[0] Learning rate is 0.001
Number of parameters in DeepARTrainingNetwork: 25843
100%|██████████| 5/5 [00:14<00:00, 2.98s/it, epoch=1/2, avg_epoch_loss=5.25]
Epoch[0] Elapsed time 14.883 seconds
Epoch[0] Evaluation metric 'epoch_loss'=5.253252
Number of parameters in DeepARTrainingNetwork: 25843
10it [00:13, 1.36s/it, epoch=1/2, validation_avg_epoch_loss=42.9]
Epoch[0] Elapsed time 13.583 seconds
Epoch[0] Evaluation metric 'validation_epoch_loss'=42.857225
Epoch[1] Learning rate is 0.001
100%|██████████| 5/5 [00:00<00:00, 13.32it/s, epoch=2/2, avg_epoch_loss=1.83]
Epoch[1] Elapsed time 0.378 seconds
Epoch[1] Evaluation metric 'epoch_loss'=1.828084
10it [00:00, 24.72it/s, epoch=2/2, validation_avg_epoch_loss=34.5]
Epoch[1] Elapsed time 0.407 seconds
Epoch[1] Evaluation metric 'validation_epoch_loss'=34.543380
Computing averaged parameters.
Loading averaged parameters.
End model training
Evaluating model DeepAR/trial_1 with metric mean_wQuantileLoss on validation data...
100%|██████████| 313/313 [00:02<00:00, 130.84it/s]
100%|██████████| 313/313 [00:00<00:00, 3634.50it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 964.50it/s]
Validation score for model DeepAR/trial_1 is -0.863341598531469
AutoGluon training complete, total runtime = 73.66s ...
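To illustrate what context_length and prediction_length mean for a single series, the following sketch splits a series into the history a model may condition on and the horizon it must forecast. It is a simplified picture under stated assumptions: the helper name last_window is hypothetical, and real GluonTS models sample and pad training windows in more elaborate ways.

```python
def last_window(series, context_length, prediction_length):
    """Split a series into the context a model sees and the horizon it must forecast."""
    if prediction_length <= 0 or context_length <= 0:
        raise ValueError("lengths must be positive")
    if len(series) <= prediction_length:
        raise ValueError("series is too short for this prediction_length")
    history = series[:-prediction_length]
    horizon = series[-prediction_length:]
    # Keep at most the context_length most recent points of the history.
    context = history[-context_length:]
    return context, horizon


# Toy series (hypothetical values) standing in for one country's case counts.
series = list(range(10))
context, horizon = last_window(series, context_length=4, prediction_length=3)
print(context, horizon)
```

A larger context_length exposes more history to the model in each forecast. If it exceeds the available history, this sketch simply returns all of it.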
To ensure quick runtimes, we specified that only 2 HPO trials should be run for tuning each model’s hyperparameters, which is too few for real applications. We specified that HPO should be performed via a Bayesian optimization searcher, with the HPO trials that evaluate candidate hyperparameter configurations executed via a local sequential job scheduler. See the AutoGluon Searcher/Scheduler documentation/tutorials for more details.
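The overall shape of a sequential HPO run can be sketched in a few lines. Note that this sketch uses plain random search, not Bayesian optimization, and replaces model training with a hypothetical toy scoring function; the names run_hpo and toy_validation_score are illustrative, and treating the integer range as inclusive at both ends is an assumption.

```python
import random


def toy_validation_score(config):
    """Hypothetical stand-in for 'train a model and score it on validation data'."""
    return -abs(config["context_length"] - 90) / 100.0


def run_hpo(search_space, fixed, num_trials, seed=0):
    """Run trials one after another, sampling each tunable value at random."""
    rng = random.Random(seed)
    trials = []
    for trial_id in range(num_trials):
        config = dict(fixed)  # fixed hyperparameters are copied unchanged
        for name, (lower, upper) in search_space.items():
            config[name] = rng.randint(lower, upper)  # inclusive bounds (assumption)
        trials.append({
            "trial": trial_id,
            "config": config,
            "score": toy_validation_score(config),
        })
    best = max(trials, key=lambda t: t["score"])  # higher score is better
    return best, trials


best, trials = run_hpo(
    search_space={"context_length": (75, 100)},
    fixed={"epochs": 2, "num_batches_per_epoch": 5},
    num_trials=2,
)
print(best["config"])
```

A Bayesian optimization searcher differs in how it proposes the next configuration: it uses the scores of earlier trials instead of sampling blindly, which matters more as the number of trials grows.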
Above we set the epochs, num_batches_per_epoch, and distr_output hyperparameters to fixed values (the last one for DeepAR only). You may set some hyperparameters to search spaces and others to fixed values. Any hyperparameters for which you do not specify values or search spaces are left at their default values. AutoGluon only trains those models that appear as keys in the hyperparameters dict argument passed into fit(), so in this case only the MQCNN and DeepAR models are trained. Refer to the GluonTS documentation for individual GluonTS models to see all of the hyperparameters you may specify for them.
Viewing additional information¶
We can view a summary of the HPO process, which will show the validation score achieved in each HPO trial as well as which hyperparameter configuration was evaluated in the corresponding trial:
predictor.fit_summary()
Generating leaderboard for all models trained...
* Summary of fit() *
Estimated performance of each model:
model val_score fit_order
0 DeepAR/trial_0 -0.833660 3
1 DeepAR/trial_1 -0.863342 4
2 MQCNN/trial_0 -0.991608 1
3 MQCNN/trial_1 -0.996403 2
Number of models trained: 4
Types of models trained:
{'MQCNNModel', 'DeepARModel'}
Hyperparameter-tuning used: True
User-specified hyperparameters:
{'MQCNN': {'context_length': Int: lower=75, upper=100, 'epochs': 2, 'num_batches_per_epoch': 5}, 'DeepAR': {'context_length': Int: lower=75, upper=100, 'epochs': 2, 'num_batches_per_epoch': 5, 'distr_output': gluonts.mx.distribution.neg_binomial.NegativeBinomialOutput()}}
Feature Metadata (Processed):
(raw dtype, special dtypes):
* Details of Hyperparameter optimization *
HPO for MQCNN model: Num. configurations tried = 2, Time spent = 6.654607772827148s
Best hyperparameter-configuration (validation-performance: mean_wQuantileLoss = -0.9916079769993384):
{'context_length': 88}
HPO for DeepAR model: Num. configurations tried = 2, Time spent = 66.95265889167786s
Best hyperparameter-configuration (validation-performance: mean_wQuantileLoss = -0.8336604527711646):
{'context_length': 88}
* End of fit() summary *
{'model_types': {'MQCNN/trial_0': 'MQCNNModel',
'MQCNN/trial_1': 'MQCNNModel',
'DeepAR/trial_0': 'DeepARModel',
'DeepAR/trial_1': 'DeepARModel'},
'model_performance': {'MQCNN/trial_0': -0.9916079769993384,
'MQCNN/trial_1': -0.9964028387055066,
'DeepAR/trial_0': -0.8336604527711646,
'DeepAR/trial_1': -0.863341598531469},
'model_best': None,
'model_paths': {'MQCNN/trial_0': 'agModels-covidforecast/models/MQCNN/trial_0/',
'MQCNN/trial_1': 'agModels-covidforecast/models/MQCNN/trial_1/',
'DeepAR/trial_0': 'agModels-covidforecast/models/DeepAR/trial_0/',
'DeepAR/trial_1': 'agModels-covidforecast/models/DeepAR/trial_1/'},
'model_fit_times': {'MQCNN/trial_0': 3.8433167934417725,
'MQCNN/trial_1': 0.5129275321960449,
'DeepAR/trial_0': 30.80358338356018,
'DeepAR/trial_1': 29.442209482192993},
'hyperparameter_tune': True,
'hyperparameters_userspecified': {'MQCNN': {'context_length': Int: lower=75, upper=100,
'epochs': 2,
'num_batches_per_epoch': 5},
'DeepAR': {'context_length': Int: lower=75, upper=100,
'epochs': 2,
'num_batches_per_epoch': 5,
'distr_output': gluonts.mx.distribution.neg_binomial.NegativeBinomialOutput()}},
'hpo_results': {'MQCNN': {'best_reward': -0.9916079769993384,
'best_config': {'context_length': 88},
'total_time': 6.654607772827148,
'metadata': {'stop_criterion': {'time_limits': None, 'max_reward': None},
'resources_per_trial': {'num_cpus': 'auto', 'num_gpus': 'auto'}},
'reward_attr': 'validation_performance',
'args': {'util_args': {'train_data_path': 'dataset_train.p',
'val_data_path': 'dataset_val.p',
'directory': 'agModels-covidforecast/models/MQCNN/',
'model': MQCNN,
'time_start': 1632359430.06532,
'time_limit': None},
'freq': 'D',
'prediction_length': 19,
'context_length': 88,
'epochs': 2,
'num_batches_per_epoch': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7effe15964d0>]},
'trial_info': {0: {'config': {'context_length': 88},
'history': [{'epoch': 1,
'trial': 0,
'time_this_iter': 5.026702642440796,
'time_since_start': 5.026702642440796}],
'metadata': {'epoch': 1,
'trial': 0,
'time_this_iter': 5.026702642440796,
'time_since_start': 5.026702642440796},
'validation_performance': -0.9916079769993384},
1: {'config': {'context_length': 97},
'history': [{'epoch': 1,
'trial': 1,
'time_this_iter': 1.5954809188842773,
'time_since_start': 1.5954809188842773}],
'metadata': {'epoch': 1,
'trial': 1,
'time_this_iter': 1.5954809188842773,
'time_since_start': 1.5954809188842773},
'validation_performance': -0.9964028387055066}},
'validation_performance': -0.9916079769993384,
'search_space': OrderedDict([('context_length',
Int: lower=75, upper=100)])},
'DeepAR': {'best_reward': -0.8336604527711646,
'best_config': {'context_length': 88},
'total_time': 66.95265889167786,
'metadata': {'stop_criterion': {'time_limits': None, 'max_reward': None},
'resources_per_trial': {'num_cpus': 'auto', 'num_gpus': 'auto'}},
'reward_attr': 'validation_performance',
'args': {'util_args': {'train_data_path': 'dataset_train.p',
'val_data_path': 'dataset_val.p',
'directory': 'agModels-covidforecast/models/DeepAR/',
'model': DeepAR,
'time_start': 1632359436.7424335,
'time_limit': None},
'freq': 'D',
'prediction_length': 19,
'context_length': 88,
'epochs': 2,
'num_batches_per_epoch': 5,
'distr_output': gluonts.mx.distribution.neg_binomial.NegativeBinomialOutput(),
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f00944cd9d0>]},
'trial_info': {0: {'config': {'context_length': 88},
'history': [{'epoch': 1,
'trial': 0,
'time_this_iter': 34.0602662563324,
'time_since_start': 34.0602662563324}],
'metadata': {'epoch': 1,
'trial': 0,
'time_this_iter': 34.0602662563324,
'time_since_start': 34.0602662563324},
'validation_performance': -0.8336604527711646},
1: {'config': {'context_length': 96},
'history': [{'epoch': 1,
'trial': 1,
'time_this_iter': 32.85720777511597,
'time_since_start': 32.85720777511597}],
'metadata': {'epoch': 1,
'trial': 1,
'time_this_iter': 32.85720777511597,
'time_since_start': 32.85720777511597},
'validation_performance': -0.863341598531469}},
'validation_performance': -0.8336604527711646,
'search_space': OrderedDict([('context_length',
Int: lower=75, upper=100)])}},
'model_hyperparams': {'MQCNN/trial_0': {'freq': 'D',
'prediction_length': 19,
'context_length': 88,
'epochs': 2,
'num_batches_per_epoch': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f00d03cc1d0>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7f00d03cc150>],
'hybridize': False},
'MQCNN/trial_1': {'freq': 'D',
'prediction_length': 19,
'context_length': 97,
'epochs': 2,
'num_batches_per_epoch': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7effe1f31f50>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7effe167c210>],
'hybridize': False},
'DeepAR/trial_0': {'freq': 'D',
'prediction_length': 19,
'context_length': 88,
'epochs': 2,
'num_batches_per_epoch': 5,
'distr_output': gluonts.mx.distribution.neg_binomial.NegativeBinomialOutput(),
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f00944cdd90>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7f00944cded0>]},
'DeepAR/trial_1': {'freq': 'D',
'prediction_length': 19,
'context_length': 96,
'epochs': 2,
'num_batches_per_epoch': 5,
'distr_output': gluonts.mx.distribution.neg_binomial.NegativeBinomialOutput(),
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f00944fb9d0>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7effe01ca950>]}},
'leaderboard': model val_score fit_order
0 DeepAR/trial_0 -0.833660 3
1 DeepAR/trial_1 -0.863342 4
2 MQCNN/trial_0 -0.991608 1
3 MQCNN/trial_1 -0.996403 2}
The 'best_config' field in this summary indicates the hyperparameter configuration that performed best for each model. We can alternatively use the leaderboard to view the performance of each evaluated model/hyperparameter configuration:
predictor.leaderboard()
Generating leaderboard for all models trained...
| | model | val_score | fit_order |
|---|---|---|---|
| 0 | DeepAR/trial_0 | -0.833660 | 3 |
| 1 | DeepAR/trial_1 | -0.863342 | 4 |
| 2 | MQCNN/trial_0 | -0.991608 | 1 |
| 3 | MQCNN/trial_1 | -0.996403 | 2 |
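The same information can be pulled out of the summary dictionary programmatically. The sketch below assumes the dictionary returned by fit_summary() has the nested 'hpo_results' / 'trial_info' shape printed earlier; to stay self-contained it uses a hand-copied excerpt of that output rather than a live predictor, and the helper name best_trial is hypothetical.

```python
# Excerpt of the 'hpo_results' entry printed by fit_summary() above,
# reduced to the fields used here.
hpo_results = {
    "MQCNN": {
        "trial_info": {
            0: {"config": {"context_length": 88},
                "validation_performance": -0.9916079769993384},
            1: {"config": {"context_length": 97},
                "validation_performance": -0.9964028387055066},
        }
    },
    "DeepAR": {
        "trial_info": {
            0: {"config": {"context_length": 88},
                "validation_performance": -0.8336604527711646},
            1: {"config": {"context_length": 96},
                "validation_performance": -0.863341598531469},
        }
    },
}


def best_trial(trial_info):
    """Return (trial id, config, score) of the trial with the highest validation score."""
    trial_id, info = max(
        trial_info.items(), key=lambda kv: kv[1]["validation_performance"]
    )
    return trial_id, info["config"], info["validation_performance"]


best_per_model = {
    model: best_trial(result["trial_info"]) for model, result in hpo_results.items()
}
overall_best = max(best_per_model, key=lambda m: best_per_model[m][2])

for model, (trial_id, config, score) in best_per_model.items():
    print(f"{model}/trial_{trial_id}: {config} (val_score={score:.6f})")
print("best model family:", overall_best)
```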
Here is yet another way to see which model AutoGluon believes to be the best (based on validation score), which is the model automatically used for prediction by default:
predictor._trainer.get_model_best()
'DeepAR/trial_0'
We can also view information about any model AutoGluon has trained:
models_trained = predictor._trainer.get_model_names_all()
specific_model = predictor._trainer.load_model(models_trained[0])
specific_model.get_info()
{'name': 'MQCNN/trial_0',
'model_type': 'MQCNNModel',
'eval_metric': 'mean_wQuantileLoss',
'fit_time': 3.8433167934417725,
'predict_time': 1.0593109130859375,
'val_score': -0.9916079769993384,
'hyperparameters': {'freq': 'D',
'prediction_length': 19,
'context_length': 88,
'epochs': 2,
'num_batches_per_epoch': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f0094533610>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7effe156ead0>],
'hybridize': False}}
Evaluating trained models¶
Given some more recent held-out test data, here’s how to evaluate just the default model AutoGluon uses for forecasting, without evaluating all of the other models as leaderboard() does:
test_data = TabularDataset("https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/test.csv")
predictor.evaluate(test_data) # to evaluate specific model, can also specify optional argument: model
Loaded data from: https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/test.csv | Columns = 3 / 3 | Rows = 28483 -> 28483
Does not specify model, will by default use the model with the best validation score for evaluation
100%|██████████| 313/313 [00:02<00:00, 131.64it/s]
100%|██████████| 313/313 [00:00<00:00, 3785.06it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 961.39it/s]
0.8176319396478796
Be aware that without extra held-out test_data, AutoGluon’s reported validation scores may be slightly optimistic due to adaptive decisions such as selecting models/hyperparameters based on the validation data. It is therefore always a good idea to use some truly held-out test data for an unbiased final evaluation after training has completed.
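Why selection makes validation scores optimistic can be seen in a small simulation. This is an illustrative sketch with made-up Gaussian noise, not a measurement of AutoGluon: every candidate model is given the same true score of 0, yet the validation score of whichever candidate looks best is systematically above its score on independent test data.

```python
import random
import statistics


def simulate_selection(num_models=4, num_repeats=2000, seed=0):
    """Pick the model with the best noisy validation score; compare to its test score."""
    rng = random.Random(seed)
    selected_val, selected_test = [], []
    for _ in range(num_repeats):
        # All models are equally good (true score 0); only the noise differs.
        val = [rng.gauss(0.0, 1.0) for _ in range(num_models)]
        test = [rng.gauss(0.0, 1.0) for _ in range(num_models)]
        best = max(range(num_models), key=lambda i: val[i])
        selected_val.append(val[best])
        selected_test.append(test[best])
    return statistics.mean(selected_val), statistics.mean(selected_test)


val_mean, test_mean = simulate_selection()
print(f"selected model, validation: {val_mean:.2f}, held-out test: {test_mean:.2f}")
```

The gap grows with the number of candidates compared on the same validation data, which is one reason to keep a separate test set when running many HPO trials.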
The ground-truth time-series targets are often not available when we produce forecasts and only become available later. In such a workflow, we may first produce predictions using AutoGluon and then evaluate them later without having to recompute the predictions:
predictions = predictor.predict(train_data) # before test data have been observed
predictor = ForecastingPredictor.load(save_path) # reload predictor in future after test data are observed
ForecastingPredictor.evaluate_predictions(forecasts=predictions,
targets=test_data,
index_column=predictor.index_column,
time_column=predictor.time_column,
target_column=predictor.target_column,
eval_metric=predictor.eval_metric)
Does not specify model, will by default use the model with the best validation score for prediction
Predicting with model DeepAR/trial_0
Loading predictor from path agModels-covidforecast/
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 958.89it/s]
0.817738727172412
Static features¶
In some forecasting problems involving multiple time-series, each individual time-series may be associated with some static features that do not change over time. For example, when forecasting demand for products over time, each product may be associated with an item category (a categorical static feature) and an item vector embedding from a recommender system (numeric static features). AutoGluon lets you provide such static features so that its models condition their predictions on them:
static_features = TabularDataset("https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/toy_static_features.csv")
static_features.head()
Loaded data from: https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/toy_static_features.csv | Columns = 3 / 3 | Rows = 313 -> 313
| | name | static_cat_feature | static_real_feature |
|---|---|---|---|
| 0 | Afghanistan_ | 2 | 67.03 |
| 1 | Albania_ | 1 | 4.33 |
| 2 | Algeria_ | 5 | 19.63 |
| 3 | Andorra_ | 2 | 42.49 |
| 4 | Angola_ | 3 | 21.58 |
Note that each unique value of index_column in our time series data must be represented as a row in static_features (in a column whose name matches the index_column) that contains the feature values corresponding to this individual series. AutoGluon can automatically infer which static features are categorical vs. numeric when they are passed into fit():
predictor_static = ForecastingPredictor(path=save_path, eval_metric=eval_metric).fit(
train_data, static_features=static_features, prediction_length=19, quantiles=[0.1, 0.5, 0.9],
index_column="name", target_column="ConfirmedCases", time_column="Date",
presets="low_quality" # last argument is just here for quick demo, omit it in real applications!
)
Warning: path already exists! This predictor may overwrite an existing predictor! path="agModels-covidforecast" presets is set to be low_quality Training with dataset in tabular format... Finish rebuilding the data, showing the top five rows. name 2020-01-22 2020-01-23 2020-01-24 2020-01-25 2020-01-26 0 Afghanistan_ 0.0 0.0 0.0 0.0 0.0 1 Albania_ 0.0 0.0 0.0 0.0 0.0 2 Algeria_ 0.0 0.0 0.0 0.0 0.0 3 Andorra_ 0.0 0.0 0.0 0.0 0.0 4 Angola_ 0.0 0.0 0.0 0.0 0.0 2020-01-27 2020-01-28 2020-01-29 2020-01-30 ... 2020-03-24 0 0.0 0.0 0.0 0.0 ... 74.0 1 0.0 0.0 0.0 0.0 ... 123.0 2 0.0 0.0 0.0 0.0 ... 264.0 3 0.0 0.0 0.0 0.0 ... 164.0 4 0.0 0.0 0.0 0.0 ... 3.0 2020-03-25 2020-03-26 2020-03-27 2020-03-28 2020-03-29 2020-03-30 0 84.0 94.0 110.0 110.0 120.0 170.0 1 146.0 174.0 186.0 197.0 212.0 223.0 2 302.0 367.0 409.0 454.0 511.0 584.0 3 188.0 224.0 267.0 308.0 334.0 370.0 4 3.0 4.0 4.0 5.0 7.0 7.0 2020-03-31 2020-04-01 2020-04-02 0 174.0 237.0 273.0 1 243.0 259.0 277.0 2 716.0 847.0 986.0 3 376.0 390.0 428.0 4 7.0 8.0 8.0 [5 rows x 73 columns] Validation data is None, will do auto splitting... static feature column static_cat_feature has 10 or less unique values, assuming it is categorical. Fitting IdentityFeatureGenerator... Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Fitting IdentityFeatureGenerator... Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Fitting CategoryFeatureGenerator... Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Fitting CategoryMemoryMinimizeFeatureGenerator... Using previous inferred feature columns... Static Cat Features Dataframe including ['static_cat_feature'] Static Real Features Dataframe including ['static_real_feature'] Finished processing data, using 1.2074267864227295s. 
Random seed set to 0 All models will be trained for quantiles [0.1, 0.5, 0.9]. Beginning AutoGluon training ... AutoGluon will save models to agModels-covidforecast/ Fitting model: SFF ... Training model SFF... Start model training Epoch[0] Learning rate is 0.001 0%| | 0/10 [00:00<?, ?it/s]Number of parameters in SimpleFeedForwardTrainingNetwork: 31523 100%|██████████| 10/10 [00:02<00:00, 3.70it/s, epoch=1/5, avg_epoch_loss=0.645] Epoch[0] Elapsed time 2.705 seconds Epoch[0] Evaluation metric 'epoch_loss'=0.645329 0it [00:00, ?it/s]Number of parameters in SimpleFeedForwardTrainingNetwork: 31523 10it [00:00, 252.92it/s, epoch=1/5, validation_avg_epoch_loss=9.65] Epoch[0] Elapsed time 0.041 seconds Epoch[0] Evaluation metric 'validation_epoch_loss'=9.653509 Epoch[1] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 195.29it/s, epoch=2/5, avg_epoch_loss=-.11] Epoch[1] Elapsed time 0.053 seconds Epoch[1] Evaluation metric 'epoch_loss'=-0.109619 10it [00:00, 313.17it/s, epoch=2/5, validation_avg_epoch_loss=9.15] Epoch[1] Elapsed time 0.033 seconds Epoch[1] Evaluation metric 'validation_epoch_loss'=9.146872 Epoch[2] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 194.88it/s, epoch=3/5, avg_epoch_loss=0.0362] Epoch[2] Elapsed time 0.053 seconds Epoch[2] Evaluation metric 'epoch_loss'=0.036189 10it [00:00, 303.55it/s, epoch=3/5, validation_avg_epoch_loss=8.81] Epoch[2] Elapsed time 0.034 seconds Epoch[2] Evaluation metric 'validation_epoch_loss'=8.807447 Epoch[3] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 195.24it/s, epoch=4/5, avg_epoch_loss=0.842] Epoch[3] Elapsed time 0.053 seconds Epoch[3] Evaluation metric 'epoch_loss'=0.842182 10it [00:00, 308.68it/s, epoch=4/5, validation_avg_epoch_loss=8.6] Epoch[3] Elapsed time 0.034 seconds Epoch[3] Evaluation metric 'validation_epoch_loss'=8.597749 Epoch[4] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 192.90it/s, epoch=5/5, avg_epoch_loss=-.19] Epoch[4] Elapsed time 0.054 
seconds Epoch[4] Evaluation metric 'epoch_loss'=-0.189805 10it [00:00, 306.27it/s, epoch=5/5, validation_avg_epoch_loss=8.34] Epoch[4] Elapsed time 0.034 seconds Epoch[4] Evaluation metric 'validation_epoch_loss'=8.344242 Computing averaged parameters. Loading averaged parameters. End model training 100%|██████████| 313/313 [00:00<00:00, 3948.88it/s] 100%|██████████| 313/313 [00:00<00:00, 3734.04it/s] Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 943.49it/s] Fitting model: MQCNN ... Training model MQCNN... Start model training Epoch[0] Learning rate is 0.001 0%| | 0/10 [00:00<?, ?it/s]Number of parameters in ForkingSeq2SeqTrainingNetwork: 58218 100%|██████████| 10/10 [00:00<00:00, 63.30it/s, epoch=1/5, avg_epoch_loss=100] Epoch[0] Elapsed time 0.160 seconds Epoch[0] Evaluation metric 'epoch_loss'=100.159810 0it [00:00, ?it/s]Number of parameters in ForkingSeq2SeqTrainingNetwork: 58218 10it [00:00, 85.18it/s, epoch=1/5, validation_avg_epoch_loss=426] Epoch[0] Elapsed time 0.119 seconds Epoch[0] Evaluation metric 'validation_epoch_loss'=425.933828 Epoch[1] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 68.97it/s, epoch=2/5, avg_epoch_loss=98.5] Epoch[1] Elapsed time 0.147 seconds Epoch[1] Evaluation metric 'epoch_loss'=98.538936 10it [00:00, 85.11it/s, epoch=2/5, validation_avg_epoch_loss=423] Epoch[1] Elapsed time 0.119 seconds Epoch[1] Evaluation metric 'validation_epoch_loss'=422.785052 Epoch[2] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 69.43it/s, epoch=3/5, avg_epoch_loss=96.8] Epoch[2] Elapsed time 0.146 seconds Epoch[2] Evaluation metric 'epoch_loss'=96.827922 10it [00:00, 85.13it/s, epoch=3/5, validation_avg_epoch_loss=419] Epoch[2] Elapsed time 0.120 seconds Epoch[2] Evaluation metric 'validation_epoch_loss'=418.784891 Epoch[3] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 69.81it/s, epoch=4/5, avg_epoch_loss=94.4] Epoch[3] Elapsed time 0.145 seconds Epoch[3] Evaluation metric 'epoch_loss'=94.388038 
10it [00:00, 83.42it/s, epoch=4/5, validation_avg_epoch_loss=413] Epoch[3] Elapsed time 0.121 seconds Epoch[3] Evaluation metric 'validation_epoch_loss'=413.194976 Epoch[4] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 69.57it/s, epoch=5/5, avg_epoch_loss=90.9] Epoch[4] Elapsed time 0.145 seconds Epoch[4] Evaluation metric 'epoch_loss'=90.850428 10it [00:00, 84.72it/s, epoch=5/5, validation_avg_epoch_loss=404] Epoch[4] Elapsed time 0.120 seconds Epoch[4] Evaluation metric 'validation_epoch_loss'=403.642245 Computing averaged parameters. Loading averaged parameters. End model training 100%|██████████| 313/313 [00:00<00:00, 2614.64it/s] 100%|██████████| 313/313 [00:00<00:00, 3741.37it/s] Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 934.85it/s] Fitting model: DeepAR ... Training model DeepAR... Start model training Epoch[0] Learning rate is 0.001 0%| | 0/10 [00:00<?, ?it/s]Number of parameters in DeepARTrainingNetwork: 26218 100%|██████████| 10/10 [00:03<00:00, 2.78it/s, epoch=1/5, avg_epoch_loss=-2.29] Epoch[0] Elapsed time 3.600 seconds Epoch[0] Evaluation metric 'epoch_loss'=-2.289171 0it [00:00, ?it/s]Number of parameters in DeepARTrainingNetwork: 26218 10it [00:00, 11.91it/s, epoch=1/5, validation_avg_epoch_loss=8.88] Epoch[0] Elapsed time 0.841 seconds Epoch[0] Evaluation metric 'validation_epoch_loss'=8.879016 Epoch[1] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 36.57it/s, epoch=2/5, avg_epoch_loss=-.744] Epoch[1] Elapsed time 0.275 seconds Epoch[1] Evaluation metric 'epoch_loss'=-0.744277 10it [00:00, 59.21it/s, epoch=2/5, validation_avg_epoch_loss=8.75] Epoch[1] Elapsed time 0.171 seconds Epoch[1] Evaluation metric 'validation_epoch_loss'=8.748208 Epoch[2] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 37.73it/s, epoch=3/5, avg_epoch_loss=-.418] Epoch[2] Elapsed time 0.267 seconds Epoch[2] Evaluation metric 'epoch_loss'=-0.418038 10it [00:00, 59.09it/s, epoch=3/5, validation_avg_epoch_loss=8.51] 
Epoch[2] Elapsed time 0.171 seconds Epoch[2] Evaluation metric 'validation_epoch_loss'=8.507644 Epoch[3] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 37.18it/s, epoch=4/5, avg_epoch_loss=-.681] Epoch[3] Elapsed time 0.271 seconds Epoch[3] Evaluation metric 'epoch_loss'=-0.681060 10it [00:00, 58.33it/s, epoch=4/5, validation_avg_epoch_loss=8.3] Epoch[3] Elapsed time 0.173 seconds Epoch[3] Evaluation metric 'validation_epoch_loss'=8.304434 Epoch[4] Learning rate is 0.001 100%|██████████| 10/10 [00:00<00:00, 37.22it/s, epoch=5/5, avg_epoch_loss=-1.9] Epoch[4] Elapsed time 0.271 seconds Epoch[4] Evaluation metric 'epoch_loss'=-1.904672 10it [00:00, 58.79it/s, epoch=5/5, validation_avg_epoch_loss=8.03] Epoch[4] Elapsed time 0.172 seconds Epoch[4] Evaluation metric 'validation_epoch_loss'=8.026868 Computing averaged parameters. Loading averaged parameters. End model training 100%|██████████| 313/313 [00:00<00:00, 313.62it/s] 100%|██████████| 313/313 [00:00<00:00, 3835.08it/s] Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 943.59it/s] AutoGluon training complete, total runtime = 15.19s ...
Recall that we use presets = "low_quality" only to ensure this example runs quickly. This is NOT a good setting: either omit this argument or set presets = "best_quality" if you want to benchmark the best accuracy that AutoGluon can obtain!
If you provided static features to fit(), then the static features must also be provided when calling leaderboard(), evaluate(), or predict():
predictor_static.leaderboard(test_data, static_features=static_features)
Using previous inferred feature columns...
Static Cat Features Dataframe including ['static_cat_feature']
Static Real Features Dataframe including ['static_real_feature']
Generating leaderboard for all models trained...
Additional data provided, testing on the additional data...
100%|██████████| 313/313 [00:00<00:00, 3942.52it/s]
100%|██████████| 313/313 [00:00<00:00, 3808.06it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 943.20it/s]
100%|██████████| 313/313 [00:00<00:00, 2445.15it/s]
100%|██████████| 313/313 [00:00<00:00, 3911.75it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 950.05it/s]
100%|██████████| 313/313 [00:00<00:00, 314.12it/s]
100%|██████████| 313/313 [00:00<00:00, 3918.63it/s]
Running evaluation: 100%|██████████| 313/313 [00:00<00:00, 951.37it/s]
| | model | val_score | fit_order | test_score |
|---|---|---|---|---|
| 0 | SFF | -0.660771 | 1 | -0.267795 |
| 1 | DeepAR | -0.764094 | 3 | -0.590745 |
| 2 | MQCNN | -0.920152 | 2 | -0.865667 |
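To make the val_score and test_score columns easier to interpret, here is a minimal sketch of how a mean weighted quantile loss can be computed. It assumes the GluonTS-style definition (twice the summed pinball loss divided by the summed absolute target, averaged over quantiles). The function names and the toy numbers are hypothetical and are not part of AutoGluon's API. The negative leaderboard values are consistent with the loss being negated so that higher scores are better, but that convention is an assumption here.

```python
import numpy as np


def weighted_quantile_loss(y_true, y_pred, q):
    """Weighted quantile loss for one quantile level q (assumed GluonTS-style definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = y_true - y_pred
    # Pinball loss: under-prediction is weighted by q, over-prediction by (1 - q).
    pinball = np.maximum(q * diff, (q - 1.0) * diff)
    return 2.0 * pinball.sum() / np.abs(y_true).sum()


def mean_weighted_quantile_loss(y_true, quantile_preds):
    """Average the weighted quantile loss over a dict {quantile level: predictions}."""
    losses = [weighted_quantile_loss(y_true, pred, q) for q, pred in quantile_preds.items()]
    return float(np.mean(losses))


# Hypothetical targets and quantile forecasts, for illustration only.
y_true = [10.0, 20.0, 30.0]
quantile_preds = {
    0.1: [8.0, 15.0, 25.0],
    0.5: [12.0, 18.0, 30.0],
    0.9: [15.0, 26.0, 38.0],
}
loss = mean_weighted_quantile_loss(y_true, quantile_preds)
score = -loss  # assumed "higher is better" convention, as in the leaderboard
```

Under this definition a perfect forecast has a loss of 0, so scores closer to 0 indicate better quantile forecasts.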
AutoGluon forecast predictions will now be based on the static features in addition to the historical time-series observations:
predictions = predictor_static.predict(test_data, static_features=static_features)
print(predictions["Afghanistan_"])
Using previous inferred feature columns...
Static Cat Features Dataframe including ['static_cat_feature']
Static Real Features Dataframe including ['static_real_feature']
Does not specify model, will by default use the model with the best validation score for prediction
Predicting with model SFF
0.1 0.5 0.9
2020-04-22 848.724426 1433.520386 2100.333252
2020-04-23 525.479980 1203.280273 1828.540649
2020-04-24 406.676422 1320.063599 2089.422852
2020-04-25 135.429993 1292.610840 2219.305908
2020-04-26 59.995323 1354.163574 2438.460938
2020-04-27 586.250061 1494.719849 2511.511475
2020-04-28 -85.886353 1529.408325 3028.596924
2020-04-29 367.283447 1467.030396 2635.018311
2020-04-30 559.825256 1562.025513 2969.750732
2020-05-01 -58.859753 1607.525146 2800.288818
2020-05-02 311.700958 1617.724609 3071.249023
2020-05-03 -322.320007 1401.588867 2945.138428
2020-05-04 -319.960236 1088.040771 2829.933105
2020-05-05 -403.656525 1754.966675 3624.205811
2020-05-06 -348.057587 1366.859619 4198.331543
2020-05-07 -34.126774 1619.350708 3284.425293
2020-05-08 -1168.174927 1551.574707 3387.256348
2020-05-09 -1259.406860 1204.493530 4268.099121
2020-05-10 -189.284729 1399.963745 3504.257812
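Several of the 0.1-quantile forecasts above are negative, which cannot occur for case counts. The following is a minimal sketch of one possible post-processing step, not something the predictor does for you. It assumes the forecast for a single series is a pandas DataFrame indexed by date with one column per quantile. The frame below is built by hand from three rows of the output above, and the string column labels are an assumption.

```python
import pandas as pd

# Hand-built stand-in for predictions["Afghanistan_"], using three rows
# copied from the output above. Column labels are assumed to be strings.
forecast = pd.DataFrame(
    {
        "0.1": [848.724426, 525.479980, -85.886353],
        "0.5": [1433.520386, 1203.280273, 1529.408325],
        "0.9": [2100.333252, 1828.540649, 3028.596924],
    },
    index=pd.to_datetime(["2020-04-22", "2020-04-23", "2020-04-28"]),
)

# Case counts cannot be negative, so clip every quantile at zero.
clipped = forecast.clip(lower=0.0)

# Width of the central 80% prediction interval (0.1 to 0.9 quantile).
interval_width = clipped["0.9"] - clipped["0.1"]

print(clipped)
print(interval_width)
```

Clipping changes only the values that were below zero, so the median and upper quantiles in this example are left as predicted.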