TimeSeriesPredictor.fit
- TimeSeriesPredictor.fit(train_data: TimeSeriesDataFrame | DataFrame | Path | str, tuning_data: TimeSeriesDataFrame | DataFrame | Path | str | None = None, time_limit: int | None = None, presets: str | None = None, hyperparameters: str | Dict[str | Type, Any] | None = None, hyperparameter_tune_kwargs: str | Dict | None = None, excluded_model_types: List[str] | None = None, num_val_windows: int = 1, val_step_size: int | None = None, refit_every_n_windows: int | None = 1, refit_full: bool = False, enable_ensemble: bool = True, skip_model_selection: bool = False, random_seed: int | None = 123, verbosity: int | None = None) -> TimeSeriesPredictor
- Fit probabilistic forecasting models to the given time series dataset. - Parameters:
- train_data (Union[TimeSeriesDataFrame, pd.DataFrame, Path, str]) – Training data in the TimeSeriesDataFrame format.
  Time series with length <= (num_val_windows + 1) * prediction_length will be ignored during training. See num_val_windows for details.
  If known_covariates_names were specified when creating the predictor, train_data must include the columns listed in known_covariates_names, with the covariate values aligned with the target time series.
  Columns of train_data other than target and those listed in known_covariates_names will be interpreted as past_covariates, that is, covariates that are known only in the past.
  If train_data contains covariates or static features, they will be interpreted as follows:
  - columns with int, bool and float dtypes are interpreted as continuous (real-valued) features
  - columns with object, str and category dtypes are interpreted as categorical features
  - columns with other dtypes are ignored
  To ensure that the column type is interpreted correctly, please convert it to one of the above dtypes. For example, to ensure that column “store_id” with dtype int is interpreted as a category, change its dtype to category:
      data.static_features["store_id"] = data.static_features["store_id"].astype("category")
  If the provided data is a pandas.DataFrame, AutoGluon will attempt to convert it to a TimeSeriesDataFrame. If a str or a Path is provided, AutoGluon will attempt to load this file.
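The length rule for train_data can be illustrated with a small sketch. The helper name `is_used_for_training` and the item lengths are hypothetical; the code only restates the inequality quoted above and is not AutoGluon's internal implementation.

```python
def is_used_for_training(series_length: int, prediction_length: int, num_val_windows: int = 1) -> bool:
    """Return True if a series is longer than (num_val_windows + 1) * prediction_length."""
    # Series with length <= this threshold are ignored during training.
    threshold = (num_val_windows + 1) * prediction_length
    return series_length > threshold


# Hypothetical item ids mapped to their number of time steps.
lengths = {"A": 10, "B": 48, "C": 49, "D": 200}

# With prediction_length=24 and num_val_windows=1 the threshold is 48 time steps.
kept = [
    item_id
    for item_id, n in lengths.items()
    if is_used_for_training(n, prediction_length=24, num_val_windows=1)
]
print(kept)
```

Note that a series whose length is exactly equal to the threshold (item "B" here) is still ignored, because the comparison in the rule is `<=`.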
- tuning_data (Union[TimeSeriesDataFrame, pd.DataFrame, Path, str], optional) – Data reserved for model selection and hyperparameter tuning, rather than training individual models. Also used to compute the validation scores. Note that only the last prediction_length time steps of each time series are used for computing the validation score.
  If tuning_data is provided, multi-window backtesting on training data will be disabled, num_val_windows will be set to 0, and refit_full will be set to False.
  Leaving this argument empty and letting AutoGluon automatically generate the validation set from train_data is a good default.
  The names and dtypes of columns and static features in tuning_data must match those in train_data.
  If the provided data is a pandas.DataFrame, AutoGluon will attempt to convert it to a TimeSeriesDataFrame. If a str or a Path is provided, AutoGluon will attempt to load this file.
- time_limit (int, optional) – Approximately how long fit() will run (wall-clock time in seconds). If not specified, fit() will run until all models have completed training.
- presets (str, optional) – Optional preset configurations for various arguments in fit().
  Can significantly impact predictive accuracy, memory footprint, inference latency of trained models, and various other properties of the returned predictor. It is recommended to specify presets and avoid specifying most other fit() arguments or model hyperparameters prior to becoming familiar with AutoGluon. For example, set presets="high_quality" to get a high-accuracy predictor, or set presets="fast_training" to get results quickly. Any user-specified arguments in fit() will override the values used by presets.
  Available presets:
  - "fast_training": fit simple statistical models (ETS, Theta, Naive, SeasonalNaive) + fast tree-based models RecursiveTabular and DirectTabular. These models are fast to train but may not be very accurate.
  - "medium_quality": all models mentioned above + deep learning model TemporalFusionTransformer + Chronos-Bolt (small). Produces good forecasts with reasonable training time.
  - "high_quality": all ML models available in AutoGluon + additional statistical models (NPTS, AutoETS, DynamicOptimizedTheta). Much more accurate than medium_quality, but takes longer to train.
  - "best_quality": same models as in "high_quality", but performs validation with multiple backtests. Usually better than high_quality, but takes even longer to train.
  Available presets with the new, faster Chronos-Bolt model:
  - "bolt_{model_size}": where model size is one of tiny, mini, small, base. Uses the Chronos-Bolt pretrained model for zero-shot forecasting. See the documentation for ChronosModel or see Hugging Face for more information.
  Available presets with the original Chronos model. Note that as of v1.2 we recommend using the new, faster Chronos-Bolt models instead of the original Chronos models.
  - "chronos_{model_size}": where model size is one of tiny, mini, small, base, large. Uses the Chronos pretrained model for zero-shot forecasting. See the documentation for ChronosModel or see Hugging Face for more information. Note that a GPU is required for model sizes small, base and large.
  - "chronos": alias for "chronos_small".
  - "chronos_ensemble": builds an ensemble of seasonal naive, tree-based and deep learning models with fast inference and "chronos_small".
  - "chronos_large_ensemble": builds an ensemble of seasonal naive, tree-based and deep learning models with fast inference and "chronos_large".
  Details for these presets can be found in autogluon/timeseries/configs/presets_configs.py. If not provided, user-provided values for hyperparameters and hyperparameter_tune_kwargs will be used (defaulting to their default values specified below).
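The rule that user-specified arguments override preset values can be pictured as a dictionary merge. This is only a sketch: the helper `resolve_fit_kwargs` is hypothetical, and the preset contents shown are placeholder values rather than the real contents of any AutoGluon preset.

```python
from typing import Any, Dict


def resolve_fit_kwargs(preset_kwargs: Dict[str, Any], user_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two argument dicts; entries in user_kwargs win over preset entries."""
    # Later entries in a dict literal overwrite earlier ones with the same key.
    return {**preset_kwargs, **user_kwargs}


# Placeholder preset contents, for illustration only (not an actual preset definition).
example_preset = {"hyperparameters": "light", "num_val_windows": 1}

# The user explicitly asks for three backtest windows.
resolved = resolve_fit_kwargs(example_preset, {"num_val_windows": 3})
print(resolved)
```

In this sketch the preset still decides `hyperparameters`, while the explicitly passed `num_val_windows` replaces the preset's value.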
- hyperparameters (str or dict, optional) – Determines what models are trained and what hyperparameters are used by each model.
  If str is passed, will use a preset hyperparameter configuration defined in autogluon/timeseries/trainer/models/presets.py. Supported values are "default", "light" and "very_light".
  If dict is provided, the keys are strings or types that indicate which models to train. Each value is itself a dict containing hyperparameters for each of the trained models, or a list of such dicts. Any hyperparameters not specified here will be set to their defaults. For example:
      predictor.fit(
          ...,
          hyperparameters={
              "DeepAR": {},
              "Theta": [
                  {"decomposition_type": "additive"},
                  {"seasonal_period": 1},
              ],
          }
      )
  The above example will train three models:
  - DeepAR with default hyperparameters
  - Theta with additive seasonal decomposition (all other parameters set to their defaults)
  - Theta with seasonality disabled (all other parameters set to their defaults)
  The full list of available models and their hyperparameters is provided in Forecasting Time Series - Model Zoo.
  The hyperparameters for each model can be fixed values (as shown above), or search spaces over which hyperparameter optimization is performed. A search space should only be provided when hyperparameter_tune_kwargs is given (i.e., hyperparameter tuning is used). For example:
      from autogluon.common import space

      predictor.fit(
          ...,
          hyperparameters={
              "DeepAR": {
                  "hidden_size": space.Int(20, 100),
                  "dropout_rate": space.Categorical(0.1, 0.3),
              },
          },
          hyperparameter_tune_kwargs="auto",
      )
  In the above example, multiple versions of the DeepAR model with different values of the parameters “hidden_size” and “dropout_rate” will be trained.
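To see why the first hyperparameters example yields three models, the dict can be expanded into one entry per model configuration. The helper `expand_model_configs` below is a hypothetical illustration that handles string keys only; it is not how AutoGluon processes the argument internally.

```python
from typing import Any, Dict, List, Tuple, Union

# A model entry is either a single hyperparameter dict or a list of such dicts.
ModelSpec = Union[Dict[str, Any], List[Dict[str, Any]]]


def expand_model_configs(hyperparameters: Dict[str, ModelSpec]) -> List[Tuple[str, Dict[str, Any]]]:
    """Return one (model_name, hyperparameter_dict) pair per model that would be trained."""
    configs: List[Tuple[str, Dict[str, Any]]] = []
    for model_name, spec in hyperparameters.items():
        # Normalize a single dict to a one-element list.
        spec_list = spec if isinstance(spec, list) else [spec]
        for params in spec_list:
            configs.append((model_name, dict(params)))
    return configs


configs = expand_model_configs(
    {
        "DeepAR": {},
        "Theta": [
            {"decomposition_type": "additive"},
            {"seasonal_period": 1},
        ],
    }
)
for name, params in configs:
    print(name, params)
```

Each dict in a list counts as a separate model, so one DeepAR entry plus two Theta entries gives three configurations.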
- hyperparameter_tune_kwargs (str or dict, optional) – Hyperparameter tuning strategy and kwargs (for example, how many HPO trials to run). If None, then hyperparameter tuning will not be performed.
  If type is str, then this argument specifies a preset. Valid preset values:
  - "auto": Performs HPO via Bayesian optimization search on GluonTS-backed neural forecasting models and random search on other models using the local scheduler.
  - "random": Performs HPO via random search.
  You can also provide a dict to specify searchers and schedulers. Valid keys:
  - "num_trials": How many HPO trials to run
  - "scheduler": Which scheduler to use. Valid values:
    - "local": Local scheduler that schedules trials FIFO
  - "searcher": Which searching algorithm to use. Valid values:
    - "local_random": Uses the "random" searcher
    - "random": Perform random search
    - "bayes": Perform HPO with HyperOpt on GluonTS-backed models via Ray Tune. Perform random search on other models.
    - "auto": alias for "bayes"
  The "scheduler" and "searcher" keys are required when providing a dict.
  Example:
      predictor.fit(
          ...,
          hyperparameter_tune_kwargs={
              "num_trials": 5,
              "searcher": "auto",
              "scheduler": "local",
          },
      )
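The accepted shapes of hyperparameter_tune_kwargs can be summarized as a small validation routine. The function `check_tune_kwargs` and its error messages are hypothetical; the sketch only encodes the rules stated above (None disables tuning, a str must be a listed preset, a dict must contain "scheduler" and "searcher") and does not reproduce AutoGluon's own checks.

```python
from typing import Any, Dict, Optional, Union

VALID_PRESETS = ("auto", "random")
REQUIRED_KEYS = ("scheduler", "searcher")


def check_tune_kwargs(tune_kwargs: Optional[Union[str, Dict[str, Any]]]) -> bool:
    """Return True if tuning would be performed; raise ValueError on malformed input."""
    if tune_kwargs is None:
        # No tuning strategy given: hyperparameter tuning is not performed.
        return False
    if isinstance(tune_kwargs, str):
        if tune_kwargs not in VALID_PRESETS:
            raise ValueError(f"Unknown preset {tune_kwargs!r}; expected one of {VALID_PRESETS}")
        return True
    missing = [key for key in REQUIRED_KEYS if key not in tune_kwargs]
    if missing:
        raise ValueError(f"Missing required keys: {missing}")
    return True


print(check_tune_kwargs(None))
print(check_tune_kwargs("auto"))
print(check_tune_kwargs({"num_trials": 5, "searcher": "auto", "scheduler": "local"}))
```

A dict such as `{"num_trials": 5}` would be rejected by this sketch because both required keys are absent.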
- excluded_model_types (List[str], optional) – Banned subset of model types to avoid training during fit(), even if present in hyperparameters. For example, the following code will train all models included in the high_quality preset except DeepAR:
      predictor.fit(
          ...,
          presets="high_quality",
          excluded_model_types=["DeepAR"],
      )
- num_val_windows (int, default = 1) – Number of backtests done on train_data for each trained model to estimate the validation performance. If num_val_windows > 1 is provided, this value may be automatically reduced to ensure that the majority of time series in train_data are long enough for the chosen number of backtests.
  Increasing this parameter increases the training time roughly by a factor of num_val_windows // refit_every_n_windows. See refit_every_n_windows and val_step_size for details.
  For example, for prediction_length=2, num_val_windows=3 and val_step_size=1 the folds are:
      |-------------------|
      | x x x x x y y - - |
      | x x x x x x y y - |
      | x x x x x x x y y |
  where x are the train time steps and y are the validation time steps.
  This argument has no effect if tuning_data is provided.
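The fold layout in the diagram can be reproduced with a few lines of arithmetic. This is a sketch under assumptions: the function names are hypothetical, windows are listed oldest first, and the series length of 9 is read off the diagram. It is not AutoGluon's splitter implementation.

```python
from typing import List, Optional, Tuple


def backtest_windows(
    series_length: int,
    prediction_length: int,
    num_val_windows: int,
    val_step_size: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Return (train_end, val_end) per window: train is [0, train_end), validation is [train_end, val_end)."""
    # val_step_size defaults to prediction_length, as described for that argument.
    step = prediction_length if val_step_size is None else val_step_size
    windows = []
    for i in range(num_val_windows):
        # The last window ends at the end of the series; earlier windows are shifted back by `step`.
        val_end = series_length - (num_val_windows - 1 - i) * step
        train_end = val_end - prediction_length
        if train_end <= 0:
            raise ValueError("Series is too short for the requested number of windows")
        windows.append((train_end, val_end))
    return windows


def render(series_length: int, windows: List[Tuple[int, int]]) -> List[str]:
    """Draw each window using x (train), y (validation) and - (unused)."""
    rows = []
    for train_end, val_end in windows:
        cells = ["x"] * train_end + ["y"] * (val_end - train_end) + ["-"] * (series_length - val_end)
        rows.append(" ".join(cells))
    return rows


windows = backtest_windows(series_length=9, prediction_length=2, num_val_windows=3, val_step_size=1)
rows = render(9, windows)
for row in rows:
    print(row)
```

With val_step_size left as None, the same call would shift consecutive windows by prediction_length (2 steps) instead of 1, so the validation windows would no longer overlap.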
- val_step_size (int or None, default = None) – Step size between consecutive validation windows. If set to None, defaults to the prediction_length provided when creating the predictor.
  This argument has no effect if tuning_data is provided.
- refit_every_n_windows (int or None, default = 1) – When performing cross validation, each model will be retrained every refit_every_n_windows validation windows, where the number of validation windows is specified by num_val_windows. Note that in the default setting where num_val_windows=1, this argument has no effect.
  If set to None, models will only be fit once for the first (oldest) validation window. By default, refit_every_n_windows=1, i.e., all models will be refit for each validation window.
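One way to read the refit rule is as a schedule over window indices. The sketch below assumes that windows are counted from the oldest (index 0) and that a refit happens whenever the index is a multiple of refit_every_n_windows; the function name is hypothetical and the exact counting used inside AutoGluon may differ.

```python
from typing import List, Optional


def refit_schedule(num_val_windows: int, refit_every_n_windows: Optional[int] = 1) -> List[int]:
    """Return the indices of validation windows (0 = oldest) at which a model is retrained."""
    if refit_every_n_windows is None:
        # Fit once, on the first (oldest) validation window only.
        return [0] if num_val_windows > 0 else []
    if refit_every_n_windows < 1:
        raise ValueError("refit_every_n_windows must be a positive integer or None")
    return [i for i in range(num_val_windows) if i % refit_every_n_windows == 0]


print(refit_schedule(4, 1))     # refit for every window
print(refit_schedule(4, 2))     # refit for every second window
print(refit_schedule(4, None))  # fit once, reuse the model for later windows
print(refit_schedule(1, 3))     # a single window: the setting has no effect
```

Fewer refits mean less training time, at the cost of validating later windows with a model that has not seen the most recent training data.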
- refit_full (bool, default = False) – If True, after training is complete, AutoGluon will attempt to re-train all models using all of the training data (including the data initially reserved for validation). This argument has no effect if tuning_data is provided.
- enable_ensemble (bool, default = True) – If True, the TimeSeriesPredictor will fit a simple weighted ensemble on top of the models specified via hyperparameters.
- skip_model_selection (bool, default = False) – If True, the predictor will not compute the validation score. For example, this argument is useful if we want to use the predictor as a wrapper for a single pre-trained model. If set to True, then the hyperparameters dict must contain exactly one model without hyperparameter search spaces or an exception will be raised.
- random_seed (int or None, default = 123) – If provided, fixes the seed of the random number generator for all models. This guarantees reproducible results for most models (except those trained on GPU because of the non-determinism of GPU operations). 
- verbosity (int, optional) – If provided, overrides the verbosity value used when creating the TimeSeriesPredictor. See documentation for TimeSeriesPredictor for more details.