TimeSeriesPredictor.feature_importance
- TimeSeriesPredictor.feature_importance(data: TimeSeriesDataFrame | DataFrame | Path | str | None = None, model: str | None = None, metric: str | TimeSeriesScorer | None = None, features: list[str] | None = None, time_limit: float | None = None, method: Literal['naive', 'permutation'] = 'permutation', subsample_size: int = 50, num_iterations: int | None = None, random_seed: int | None = 123, relative_scores: bool = False, include_confidence_band: bool = True, confidence_level: float = 0.99) → DataFrame
Calculates feature importance scores for the given model by replacing each feature either with a shuffled version of the same feature (also known as permutation feature importance) or with a constant value representing the feature's median or mode, and computing the resulting decrease in the model's predictive performance.
A feature's importance score represents the performance drop that results when the model makes predictions on a perturbed copy of the data in which this feature's values have been randomly shuffled across rows. A feature score of 0.01 indicates that the predictive performance dropped by 0.01 when the feature was randomly shuffled or replaced. The higher a feature's score, the more important the feature is to the model's performance. An illustrative sketch of the shuffling procedure is included after the Returns section below.
If a feature has a negative score, the feature is likely harmful to the final model, and a model trained with the feature removed would be expected to achieve better predictive performance. Note that calculating feature importance can be a computationally expensive process, particularly if the model uses many features. In many cases, this can take longer than the original model training. Roughly, the runtime equals the number of features in the data multiplied by num_iterations (or 1 when method="naive") and by the time taken when evaluate() is called on a dataset with subsample_size items.
- Parameters:
data (TimeSeriesDataFrame, pd.DataFrame, Path or str, optional) –
The data to evaluate feature importances on. The last prediction_length time steps of each item will be held out for prediction, and forecast accuracy will be calculated on these time steps. More accurate feature importances will be obtained from new data that was held out during fit().
The names and dtypes of columns and static features in data must match the train_data used to train the predictor.
If the provided data is a pandas.DataFrame, AutoGluon will attempt to convert it to a TimeSeriesDataFrame. If a str or a Path is provided, AutoGluon will attempt to load this file.
If data is not provided, then the validation (tuning) data provided during training (or the data held out for validation, if tuning_data was not explicitly provided to fit()) will be used.
model (str, optional) – Name of the model that you would like to evaluate. By default, the best model from training (the one with the highest validation score) will be used.
metric (str or TimeSeriesScorer, optional) – Metric to be used for computing feature importance. If None, the eval_metric specified during initialization of the TimeSeriesPredictor will be used.
features (list[str], optional) – List of feature names for which feature importances are calculated and returned. By default, importances for all features will be returned.
method ({"permutation", "naive"}, default = "permutation") –
Method to be used for computing feature importance.
- naive: computes feature importance by replacing the values of each feature with a constant value and computing feature importances as the relative improvement in the evaluation metric. The constant value is the median for real-valued features and the mode for categorical features, for both covariates and static features, obtained from the feature values in the provided data.
- permutation: computes feature importance by shuffling the values of the feature across different items and time steps. Each feature is shuffled num_iterations times, and feature importances are computed as the relative improvement in the evaluation metric. Refer to https://explained.ai/rf-importance/ for an explanation of permutation importance.
subsample_size (int, default = 50) – The number of items to sample from data when computing feature importance. Larger values increase the accuracy of the feature importance scores. Runtime scales linearly with subsample_size.
time_limit (float, optional) – Time in seconds to limit the calculation of feature importance. If None, feature importance is computed without early stopping. If method="permutation", a minimum of one full shuffle set will always be evaluated. If evaluating a single shuffle set takes longer than time_limit, the method will still take the time needed to finish that shuffle set before returning, regardless of time_limit.
num_iterations (int, optional) – The number of different iterations of the data that are evaluated. If method="permutation", this is interpreted as the number of shuffle sets (equivalent to num_shuffle_sets in TabularPredictor.feature_importance()). If method="naive", the constant-replacement approach is repeated num_iterations times, and a different subsample of data (of size subsample_size) is taken in each iteration. Default is 1 for method="naive" and 5 for method="permutation". The value is ignored if method="naive" and subsample_size is greater than the number of items in data, as additional iterations would be redundant. Larger values increase the quality of the importance estimates. It is generally recommended to increase subsample_size before increasing num_iterations. Runtime scales linearly with num_iterations.
random_seed (int or None, default = 123) – If provided, fixes the seed of the random number generator for all models. This guarantees reproducible results for feature importance.
relative_scores (bool, default = False) – By default, this method returns the expected average absolute improvement in the eval metric due to the feature. If True, the statistics will be computed over the relative (percentage) improvements instead.
include_confidence_band (bool, default = True) – If True, the returned DataFrame will include two additional columns specifying a confidence interval for the true underlying importance value of each feature. Increasing subsample_size and num_iterations will tighten the confidence interval.
confidence_level (float, default = 0.99) – This argument is only considered when include_confidence_band=True, and can be used to specify the confidence level used for constructing confidence intervals. For example, if confidence_level is set to 0.99, then the returned DataFrame will include columns p99_high and p99_low, which indicate that the true feature importance will be between p99_high and p99_low 99% of the time (99% confidence interval). More generally, if confidence_level = 0.XX, then the columns containing the XX% confidence interval will be named pXX_high and pXX_low.
- Returns:
index: The feature name. 'importance': The estimated feature importance score. 'stddev': The standard deviation of the feature importance score. If NaN, then not enough num_iterations were used.
- Return type:
pd.DataFrame of feature importance scores with 2 columns
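A minimal usage sketch is shown below. It assumes that predictor is an already-fitted TimeSeriesPredictor and that test_data is held-out data whose columns and static features match the training data; both names are placeholders rather than part of this API.

```python
# Minimal sketch, assuming `predictor` is a fitted TimeSeriesPredictor and
# `test_data` is held-out data matching the training columns and static features
# (both names are placeholders).
importance_df = predictor.feature_importance(
    data=test_data,
    method="permutation",  # default; use method="naive" for constant replacement
    subsample_size=100,    # more items -> more accurate scores, longer runtime
    num_iterations=5,      # number of shuffle sets for method="permutation"
)

# The result is indexed by feature name; higher importance means a larger
# performance drop when that feature is perturbed, and negative values suggest
# the feature may be harmful.
print(importance_df.sort_values("importance", ascending=False))

# With include_confidence_band=True (the default) and confidence_level=0.99,
# the frame also contains p99_low / p99_high columns bounding the true importance.
print(importance_df[["importance", "stddev", "p99_low", "p99_high"]])
```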
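The permutation procedure described above can also be illustrated with a short, self-contained sketch. This is only an illustration of the general technique, not AutoGluon's internal implementation; the helper name and the score_fn callable (a higher-is-better scorer over a pandas DataFrame) are hypothetical.

```python
# Illustrative sketch of permutation importance (not AutoGluon's internal code).
# `score_fn` is a hypothetical callable returning a higher-is-better score for a
# DataFrame of feature values; the helper name is invented for this example.
from typing import Callable

import numpy as np
import pandas as pd


def permutation_importance_sketch(
    score_fn: Callable[[pd.DataFrame], float],
    data: pd.DataFrame,
    feature: str,
    num_iterations: int = 5,
    seed: int = 123,
) -> float:
    rng = np.random.default_rng(seed)
    baseline = score_fn(data)
    drops = []
    for _ in range(num_iterations):
        shuffled = data.copy()
        # Shuffle this feature's values across rows: its marginal distribution
        # is preserved, but its relationship to the target is broken.
        shuffled[feature] = rng.permutation(shuffled[feature].to_numpy())
        drops.append(baseline - score_fn(shuffled))
    # The importance estimate is the average score drop across shuffle sets.
    return float(np.mean(drops))
```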