TimeSeriesPredictor.evaluate
- TimeSeriesPredictor.evaluate(data: TimeSeriesDataFrame | DataFrame | Path | str, model: str | None = None, metrics: str | TimeSeriesScorer | List[str | TimeSeriesScorer] | None = None, display: bool = False, use_cache: bool = True) → Dict[str, float]
Evaluate the forecast accuracy for the given dataset.
This method measures the forecast accuracy using the last self.prediction_length time steps of each time series in data as a hold-out set.
Note
Metrics are always reported in 'higher is better' format. This means that error metrics such as MASE or MAPE will be multiplied by -1, so their values will be negative. This convention spares the user from having to know whether a given metric is an error or a score when comparing evaluation results.
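For example, here is a minimal sketch of how the sign convention shows up, assuming predictor is an already fitted TimeSeriesPredictor and test_data is a compatible dataset:

```python
# Minimal sketch, assuming `predictor` is a fitted TimeSeriesPredictor
# and `test_data` holds at least prediction_length + 1 steps per series.
scores = predictor.evaluate(test_data)

# Error metrics such as MASE come back sign-flipped, e.g. {"MASE": -0.85},
# so a value closer to 0 means a more accurate forecast.
print(scores)
```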
- Parameters:
data (Union[TimeSeriesDataFrame, pd.DataFrame, Path, str]) –
The data to evaluate the best model on. The last prediction_length time steps of each time series in data will be held out for prediction, and forecast accuracy will be calculated on these time steps.
Must include both historic and future data (i.e., the length of all time series in data must be at least prediction_length + 1).
The names and dtypes of columns and static features in data must match the train_data used to train the predictor.
If the provided data is a pandas.DataFrame, AutoGluon will attempt to convert it to a TimeSeriesDataFrame. If a str or a Path is provided, AutoGluon will attempt to load this file.
model (str, optional) – Name of the model that you would like to evaluate. By default, the best model during training (with highest validation score) will be used.
metrics (str, TimeSeriesScorer or List[Union[str, TimeSeriesScorer]], optional) – Metric or a list of metrics to compute scores with. Defaults to self.eval_metric. Supports both metric names as strings and custom metrics based on TimeSeriesScorer.
display (bool, default = False) – If True, the scores will be printed.
use_cache (bool, default = True) – If True, will attempt to use the cached predictions. If False, cached predictions will be ignored. This argument is ignored if cache_predictions was set to False when creating the TimeSeriesPredictor.
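A brief sketch of the model and use_cache options; the model name "DeepAR" is illustrative and must match one of the models actually trained by this predictor:

```python
# Evaluate a specific trained model instead of the best one, and
# recompute predictions rather than reusing cached ones.
scores = predictor.evaluate(test_data, model="DeepAR", use_cache=False)
```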
- Returns:
scores_dict – Dictionary where keys = metrics, values = performance along each metric. For consistency, error metrics will have their signs flipped to obey this convention. For example, negative MAPE values will be reported. To get the eval_metric score, use output[predictor.eval_metric.name].
- Return type:
Dict[str, float]
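A short usage sketch tying the pieces together; the names predictor and test_data are illustrative, while "MASE" and "MAPE" are built-in metric names:

```python
# Compute several metrics at once and print them (display=True).
output = predictor.evaluate(test_data, metrics=["MASE", "MAPE"], display=True)

# Keys of the returned dict are metric names; error metrics have flipped signs.
mape = output["MAPE"]

# Score for the metric this predictor was configured to optimize.
eval_metric_score = output[predictor.eval_metric.name]
```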