Forecasting Time-Series - Quick Start
Via a simple fit() call, AutoGluon can train models to produce forecasts for time series data. This tutorial demonstrates how to quickly use AutoGluon to produce forecasts of Covid-19 cases in a country given historical data from each country.
Let’s first import AutoGluon’s ForecastingPredictor and TabularDataset classes, where the latter is used to load time-series data stored in a tabular file format:
from autogluon.forecasting import ForecastingPredictor
from autogluon.forecasting import TabularDataset
We load the time-series data to use for training from a CSV file into an AutoGluon TabularDataset object. This object is essentially equivalent to a Pandas DataFrame, and the same methods can be applied to both.
train_data = TabularDataset("https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/train.csv")
print(train_data[50:60])
Date ConfirmedCases name
50 2020-03-12 7.0 Afghanistan_
51 2020-03-13 7.0 Afghanistan_
52 2020-03-14 11.0 Afghanistan_
53 2020-03-15 16.0 Afghanistan_
54 2020-03-16 21.0 Afghanistan_
55 2020-03-17 22.0 Afghanistan_
56 2020-03-18 22.0 Afghanistan_
57 2020-03-19 22.0 Afghanistan_
58 2020-03-20 24.0 Afghanistan_
59 2020-03-21 24.0 Afghanistan_
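Because TabularDataset behaves like a pandas DataFrame, the usual pandas methods can be used to inspect it, for example (output not shown):
print(train_data["name"].nunique())                         # number of distinct time-series (countries)
print(train_data["Date"].min(), train_data["Date"].max())   # earliest and latest observed dates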
Note that we loaded data from a CSV file stored in the cloud (an AWS S3 bucket), but you can specify a local file path instead if you have already downloaded the CSV file to your own machine (e.g., using wget).
Our goal is to train models on this data that can forecast Covid case counts in each country at future dates. This corresponds to a forecasting problem with many related individual time-series (one per country). Each row in the table train_data corresponds to one observation of one time-series at a particular time.
The dataset you use for autogluon.forecasting should usually contain three columns: a time_column with the time information (here “Date”), an index_column with a categorical index ID that specifies which (out of multiple) time-series is being observed (here “name”, where each country corresponds to a different time-series in our example), and a target_column with the observed value of this particular time-series at this particular time (here “ConfirmedCases”). When forecasting future values of one particular time-series, AutoGluon may rely on historical observations of not only this time-series but also all of the other time-series in the dataset. You can use NA to represent missing observations in the data. If your data only contain observations of a single time-series, then index_column can be omitted. Currently, only continuous numeric values are supported in the target_column.
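As a concrete illustration of this layout, a tiny dataset containing two time-series could be constructed as follows (a sketch with hypothetical toy values, not part of the Covid data):
import pandas as pd
toy_data = pd.DataFrame({
    "Date": ["2020-03-12", "2020-03-13", "2020-03-12", "2020-03-13"],   # time_column
    "ConfirmedCases": [7.0, 11.0, 2.0, 3.0],                            # target_column
    "name": ["Afghanistan_", "Afghanistan_", "Albania_", "Albania_"],   # index_column
})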
Now let’s use AutoGluon to train some forecasting models. Below we instruct AutoGluon to fit models that can forecast up to 19 time points into the future (prediction_length) and save them in the folder save_path. Because of the inherent uncertainty involved in this prediction problem, these models are trained to probabilistically forecast 3 different quantiles of the “ConfirmedCases” distribution: the central 0.5 quantile (median), a low 0.1 quantile, and a high 0.9 quantile. The first of these can be used as our forecasted value, while the latter two can be used as a prediction interval for this value (we are 80% confident the true value lies within this interval).
save_path = "agModels-covidforecast"
predictor = ForecastingPredictor(path=save_path).fit(
    train_data,
    prediction_length=19,
    index_column="name",
    target_column="ConfirmedCases",
    time_column="Date",
    quantiles=[0.1, 0.5, 0.9],
    presets="low_quality",  # just here for a quick demo; omit it in real applications!
)
Warning: path already exists! This predictor may overwrite an existing predictor! path="agModels-covidforecast"
presets is set to be low_quality
Training with dataset in tabular format...
Finish rebuilding the data, showing the top five rows.
...
Validation data is None, will do auto splitting...
Random seed set to 0
All models will be trained for quantiles [0.1, 0.5, 0.9].
Beginning AutoGluon training ...
AutoGluon will save models to agModels-covidforecast/
Fitting model: SFF ...
Fitting model: MQCNN ...
Fitting model: DeepAR ...
AutoGluon training complete, total runtime = 17.75s
...
Note: We used presets="low_quality" above to ensure this example runs quickly, but this is NOT a good setting! To obtain good performance in real applications, you should either delete this argument or set presets to one of: "best_quality", "high_quality", "good_quality", "medium_quality". Higher-quality presets will generally produce superior forecasting accuracy but take longer to train and may produce less efficient models. The low_quality presets are intended just for quickly verifying that AutoGluon can be run on your data; you should generally use the best_quality presets instead when benchmarking the accuracy of AutoGluon.
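For a real application, the same fit() call could simply use a higher-quality preset, e.g. (a sketch based on the call above; the save folder name here is hypothetical, and training will take considerably longer):
predictor = ForecastingPredictor(path="agModels-covidforecast-hq").fit(
    train_data,
    prediction_length=19,
    index_column="name",
    target_column="ConfirmedCases",
    time_column="Date",
    quantiles=[0.1, 0.5, 0.9],
    presets="best_quality",  # slower to train, but much better accuracy than "low_quality"
)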
We can print a summary of what happened during fit():
predictor.fit_summary()
Generating leaderboard for all models trained...
* Summary of fit() *
Estimated performance of each model:
    model  val_score  fit_order
0     SFF  -0.676774          1
1  DeepAR  -0.836025          3
2   MQCNN  -0.928002          2
Number of models trained: 3
Types of models trained: {'SimpleFeedForwardModel', 'MQCNNModel', 'DeepARModel'}
Hyperparameter-tuning used: False
User-specified hyperparameters: toy
Feature Metadata (Processed):
(raw dtype, special dtypes):
* End of fit() summary *
{'model_types': {'SFF': 'SimpleFeedForwardModel',
'MQCNN': 'MQCNNModel',
'DeepAR': 'DeepARModel'},
'model_performance': {'SFF': -0.6767744950415352,
'MQCNN': -0.92800191364702,
'DeepAR': -0.83602518266847},
'model_best': None,
'model_paths': {'SFF': 'agModels-covidforecast/models/SFF/',
'MQCNN': 'agModels-covidforecast/models/MQCNN/',
'DeepAR': 'agModels-covidforecast/models/DeepAR/'},
'model_fit_times': {'SFF': 5.295125722885132,
'MQCNN': 1.3960041999816895,
'DeepAR': 6.93402624130249},
'hyperparameter_tune': False,
'hyperparameters_userspecified': 'toy',
'model_hyperparams': {'SFF': {'freq': 'D',
'prediction_length': 19,
'epochs': 5,
'num_batches_per_epoch': 10,
'context_length': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f45d0362210>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7f45d0362a90>]},
'MQCNN': {'freq': 'D',
'prediction_length': 19,
'epochs': 5,
'num_batches_per_epoch': 10,
'context_length': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f45d03623d0>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7f468a5b3f10>],
'hybridize': False},
'DeepAR': {'freq': 'D',
'prediction_length': 19,
'epochs': 5,
'num_batches_per_epoch': 10,
'context_length': 5,
'use_feat_static_cat': False,
'use_feat_static_real': False,
'cardinality': None,
'quantiles': [0.1, 0.5, 0.9],
'callbacks': [<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.EpochCounter at 0x7f468a5cc150>,
<autogluon.forecasting.models.gluonts_model.abstract_gluonts.callback.TimeLimitCallback at 0x7f468a5cc7d0>]}},
'leaderboard': model val_score fit_order
0 SFF -0.676774 1
1 DeepAR -0.836025 3
2 MQCNN -0.928002 2}
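Since fit_summary() returns this dictionary, its individual fields can also be accessed programmatically, for example:
summary = predictor.fit_summary()
print(summary["model_performance"])  # validation score of each trained model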
Now let’s load some more recent test data to examine the forecasting performance of our trained models:
test_data = TabularDataset("https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/test.csv")
Loaded data from: https://autogluon.s3-us-west-2.amazonaws.com/datasets/CovidTimeSeries/test.csv | Columns = 3 / 3 | Rows = 28483 -> 28483
The below code is unnecessary here; it is included just to demonstrate how to reload a trained Predictor object from disk (for example, in a new Python session):
predictor = ForecastingPredictor.load(save_path)
Loading predictor from path agModels-covidforecast/
We can view the test performance of each model AutoGluon has trained via the leaderboard() function, where higher scores correspond to better predictive performance (in this case, where the evaluation metric corresponds to a loss, we append a negative sign to the loss to ensure higher = better):
predictor.leaderboard(test_data)
Generating leaderboard for all models trained...
Additional data provided, testing on the additional data...
    model  val_score  fit_order  test_score
0     SFF  -0.676774          1   -0.278595
1  DeepAR  -0.836025          3   -0.747373
2   MQCNN  -0.928002          2   -0.878101
Here test_score quantifies the performance of predictions on the held-out part of the test data (time points after the latest time observed in the original training data), while val_score quantifies the performance of predictions on an internal validation set that AutoGluon held out during fit(). By default, the validation set is comprised of the latest time points in train_data, but you can also manually provide your own validation data through the fit() argument val_data. You can also call predictor.leaderboard() without any test_data argument to only display val_score. By default, AutoGluon scores probabilistic forecasts of multiple time-series via the weighted quantile loss, but you can specify a different eval_metric in fit() to instruct AutoGluon to optimize for a different evaluation metric instead (e.g., eval_metric="MAPE"). For more details about the individual time-series models that AutoGluon can train, you can view the GluonTS documentation or the AutoGluon source code folder autogluon/forecasting/models/.
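For instance, a fit() call optimizing MAPE rather than the default weighted quantile loss might look like this (a sketch reusing the arguments from above; the save folder name is hypothetical):
predictor_mape = ForecastingPredictor(path="agModels-covidforecast-mape").fit(
    train_data,
    prediction_length=19,
    index_column="name",
    target_column="ConfirmedCases",
    time_column="Date",
    eval_metric="MAPE",  # optimize mean absolute percentage error instead
)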
We can also make forecasts further into the future based on the most recent data. When we call predict(), AutoGluon automatically forecasts with the model that had the best validation performance during training (this is the model at the top of leaderboard() when called without any data). The predictions returned by predict() form a dictionary whose keys index each time-series (in this example, country) and whose values are DataFrames containing quantile forecasts for each time-series (in this example, predicted quantiles of the case counts in each country at future dates following those observed in test_data).
predictions = predictor.predict(test_data)
print(predictions['Afghanistan_']) # quantile forecasts for the Afghanistan time-series
Does not specify model, will by default use the model with the best validation score for prediction
Predicting with model SFF
0.1 0.5 0.9
2020-04-22 238.829971 1182.778687 1932.760254
2020-04-23 292.640137 1286.876831 2406.811279
2020-04-24 129.684311 1362.938843 2444.654297
2020-04-25 1.438057 1273.847778 2290.441895
2020-04-26 263.676788 1384.483032 2708.584229
2020-04-27 215.322830 1575.142334 3049.258789
2020-04-28 73.187408 1584.535278 2793.485840
2020-04-29 -237.644989 1579.732056 3183.700439
2020-04-30 84.962097 1472.296021 3427.439941
2020-05-01 -736.102356 1614.746216 3914.478516
2020-05-02 134.536484 1722.437500 3610.033447
2020-05-03 -392.093353 1552.886841 3763.156250
2020-05-04 284.896149 1357.460815 3524.158203
2020-05-05 -886.666504 1731.957031 4186.833008
2020-05-06 -824.011108 1287.756348 3940.486572
2020-05-07 -1039.792236 1383.604980 4043.714844
2020-05-08 -1171.404175 1779.110962 3805.483154
2020-05-09 -1153.900146 1306.392456 3292.706787
2020-05-10 -945.160034 1354.559814 4442.953125
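Since each entry of predictions is just a DataFrame, the point forecast and prediction interval can be extracted directly (a sketch, assuming the quantile columns are labeled by the strings shown above):
forecast = predictions["Afghanistan_"]
median_forecast = forecast["0.5"]    # point forecast (median)
interval = forecast[["0.1", "0.9"]]  # bounds of the 80% prediction interval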
Instead of forecasting with the model that had the best validation score, you can specify which model to use for prediction, as well as instruct AutoGluon to only predict certain time-series of interest:
model_touse = "MQCNN"
time_series_to_predict = ["Germany_", "Zimbabwe_"]
predictions = predictor.predict(test_data, model=model_touse, time_series_to_predict=time_series_to_predict)
Predicting with model MQCNN
In predict(), AutoGluon makes predictions for prediction_length (= 19 in this example) time points into the future, after the last time observed in the dataset fed into predict(). In evaluate() and leaderboard(), AutoGluon makes predictions for the first prediction_length time points exceeding the last time observed in the train_data originally fed into fit(), and then scores these predictions against the target values at the corresponding times in the dataset fed into these methods. Because some models base their predictions on lengthy histories, it is important that in either case the test_data you provide contains the train_data as a subset! You can verify that the train_data are contained within the test_data in the example above.
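One quick way to check this is a pandas sanity test (a minimal sketch; it assumes both datasets use exactly the three columns shown earlier):
cols = ["Date", "ConfirmedCases", "name"]
merged = train_data[cols].merge(test_data[cols].drop_duplicates(), on=cols, how="left", indicator=True)
assert (merged["_merge"] == "both").all()  # every training row also appears in test_data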
After you no longer need a particular trained Predictor, remember to delete the save_path folder to free disk space on your machine. This is especially important to avoid running out of space if training many Predictors in sequence.
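For example (a minimal sketch using Python's standard library; be sure the Predictor is truly no longer needed, as this is irreversible):
import shutil
shutil.rmtree(save_path)  # deletes the folder containing all models saved by this Predictor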