autogluon.tabular.models

Note

This documentation is for advanced users, and is not comprehensive.

For a stable public API, refer to TabularPredictor.

Models

AbstractModel

Abstract model implementation from which all AutoGluon models inherit.

LGBModel

LightGBM model: https://lightgbm.readthedocs.io/en/latest/

CatBoostModel

CatBoost model: https://catboost.ai/

XGBoostModel

XGBoost model: https://xgboost.readthedocs.io/en/latest/

RFModel

Random Forest model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

XTModel

Extra Trees model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html#sklearn.ensemble.ExtraTreesClassifier

KNNModel

KNearestNeighbors model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

LinearModel

Linear model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

TabularNeuralNetModel

Class for neural network models that operate on tabular data.

NNFastAiTabularModel

Class for fastai v1 neural network models that operate on tabular data.

AbstractModel

class autogluon.tabular.models.AbstractModel(path: str, name: str, problem_type: str, eval_metric: Union[str, autogluon.core.metrics.Scorer] = None, hyperparameters=None, feature_metadata: autogluon.core.features.feature_metadata.FeatureMetadata = None, num_classes=None, stopping_metric=None, features=None, **kwargs)[source]

Abstract model implementation from which all AutoGluon models inherit.

Parameters
path (str): directory where to store all outputs.
name (str): name of subdirectory inside path where model will be saved.
problem_type (str): type of problem this model will handle. Valid options: [‘binary’, ‘multiclass’, ‘regression’].
eval_metric (str or autogluon.core.metrics.Scorer): objective function the model intends to optimize. If None, will be inferred based on problem_type.
hyperparameters (dict): various hyperparameters that will be used by model (can be search spaces instead of fixed values).
feature_metadata (autogluon.core.features.feature_metadata.FeatureMetadata): contains feature type information that can be used to identify special features such as text ngrams and datetimes, as well as which features are numerical vs. categorical.
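For orientation, a minimal sketch of constructing a concrete model subclass with these constructor arguments (most workflows should instead go through TabularPredictor, which creates and manages models automatically). The directory names and hyperparameter values below are illustrative assumptions, not defaults:

    from autogluon.tabular.models import RFModel

    # Illustrative sketch: construct a concrete model with the common
    # AbstractModel constructor arguments described above.
    model = RFModel(
        path='ag_models/',            # directory where all outputs are stored
        name='RandomForest_custom',   # subdirectory inside `path` for this model
        problem_type='binary',        # one of 'binary', 'multiclass', 'regression'
        eval_metric='roc_auc',        # metric the model optimizes and reports
        hyperparameters={'n_estimators': 300},  # model-specific hyperparameters
    )

After construction, the model is trained with fit and persisted with save / load (documented below).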
Attributes
path_suffix

Methods

can_infer()

compute_feature_importance(X, y[, features, …])

compute_permutation_importance(X, y, features)

convert_to_refit_full_template()

convert_to_template()

delete_from_disk()

get_disk_size()

get_memory_size()

get_model_feature_importance()

get_trained_params()

is_fit()

is_valid()

load(path[, reset_paths, verbose])

Loads the model from disk to memory.

reduce_memory_size([remove_fit, …])

rename(name)

Renames the model and updates self.path to reflect the updated name.

reset_metrics()

save([path, verbose])

Saves the model to disk.

set_contexts(path_context)

create_contexts

fit

get_info

hyperparameter_tune

load_info

predict

predict_proba

preprocess

save_info

score

score_with_y_pred_proba

classmethod load(path: str, reset_paths=True, verbose=True)[source]

Loads the model from disk to memory.

Parameters

path : str

Path to the saved model, minus the file name. This should generally be a directory path ending with a ‘/’ character (or appropriate path separator value depending on OS). The model file is typically located in path + cls.model_file_name.

reset_paths : bool, default True

Whether to reset the self.path value of the loaded model to be equal to path. It is highly recommended to keep this value as True unless accessing the original self.path value is important. If False, the actual valid path and self.path may differ, leading to strange behaviour and potential exceptions if the model needs to load any other files at a later time.

verbose : bool, default True

Whether to log the location of the loaded file.

Returns

model : cls

Loaded model object.

rename(name: str)[source]

Renames the model and updates self.path to reflect the updated name.

save(path: str = None, verbose=True) → str[source]

Saves the model to disk. Parameters ———- path : str, default None

Path to the saved model, minus the file name. This should generally be a directory path ending with a ‘/’ character (or appropriate path separator value depending on OS). If None, self.path is used. The final model file is typically saved to path + self.model_file_name.

verbosebool, default True

Whether to log the location of the saved file.

pathstr

Path to the saved model, minus the file name. Use this value to load the model from disk via cls.load(path), cls being the class of the model object, such as model = RFModel.load(path)
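A brief usage sketch of the save / load round trip, assuming `model` is an already-fit model instance (e.g. one produced by a trained TabularPredictor):

    # Persist a fitted model and restore it later.
    path = model.save()              # returns the directory the model was saved to
    loaded = type(model).load(path)  # cls.load(path) restores the model from disk
    # `loaded` can now be used in place of the original model object.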

LGBModel

class autogluon.tabular.models.LGBModel(**kwargs)[source]

LightGBM model: https://lightgbm.readthedocs.io/en/latest/

Hyperparameter options: https://lightgbm.readthedocs.io/en/latest/Parameters.html
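These hyperparameters are normally supplied through TabularPredictor rather than by instantiating LGBModel directly. A minimal sketch, assuming `train_data` is a DataFrame containing a 'class' label column (the parameter values are illustrative only):

    from autogluon.tabular import TabularPredictor

    # Sketch: forward LightGBM hyperparameters via the 'GBM' model key.
    predictor = TabularPredictor(label='class').fit(
        train_data,
        hyperparameters={
            'GBM': {'num_leaves': 64, 'learning_rate': 0.05},  # passed through to LightGBM
        },
    )

Other model keys (e.g. 'CAT', 'XGB', 'RF', 'KNN') accept their respective libraries' hyperparameters in the same way.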

CatBoostModel

class autogluon.tabular.models.CatBoostModel(**kwargs)[source]

CatBoost model: https://catboost.ai/

Hyperparameter options: https://catboost.ai/docs/concepts/python-reference_parameters-list.html

XGBoostModel

class autogluon.tabular.models.XGBoostModel(**kwargs)[source]

XGBoost model: https://xgboost.readthedocs.io/en/latest/

Hyperparameter options: https://xgboost.readthedocs.io/en/latest/parameter.html

RFModel

class autogluon.tabular.models.RFModel(**kwargs)[source]

Random Forest model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

KNNModel

class autogluon.tabular.models.KNNModel(**kwargs)[source]

KNearestNeighbors model (scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

TabularNeuralNetModel

class autogluon.tabular.models.TabularNeuralNetModel(**kwargs)[source]

Class for neural network models that operate on tabular data. These networks use different types of input layers to process different types of data in various columns.

Attributes:

_types_of_features (dict): keys = ‘continuous’, ‘skewed’, ‘onehot’, ‘embed’, ‘language’; values = column names of the DataFrame corresponding to the features of this type

feature_arraycol_map (OrderedDict): maps feature-name -> list of column-indices in the DataFrame corresponding to this feature

feature_type_map (OrderedDict): maps feature-name -> feature_type string (options: ‘vector’, ‘embed’, ‘language’)

processor (sklearn.ColumnTransformer): scikit-learn preprocessor object.

Note: This model always assumes higher values of self.eval_metric indicate better performance.

NNFastAiTabularModel

class autogluon.tabular.models.NNFastAiTabularModel(**kwargs)[source]

Class for fastai v1 neural network models that operate on tabular data.

Hyperparameters:

‘y_scaler’: on regression problems, the model can give unreasonable predictions on unseen data. This hyperparameter lets you pass a scaler for the y values to address that problem. Note that intermediate iteration metrics will be affected by this transform, so intermediate iteration scores will differ from the final ones (the final scores are correct). https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing

‘layers’: list of hidden layers sizes; None - use model’s heuristics; default is None

‘emb_drop’: embedding layers dropout; default is 0.1

‘ps’: dropout for the linear layers, given as a list of values applied to each layer in ‘layers’; default is [0.1]

‘bs’: batch size; default is 256

‘lr’: maximum learning rate for one cycle policy; default is 1e-2; see also https://fastai1.fast.ai/train.html#fit_one_cycle, One-cycle policy paper: https://arxiv.org/abs/1803.09820

‘epochs’: number of epochs; default is 30

‘early.stopping.min_delta’: minimum change to qualify as an improvement for early stopping; default is 0.0001

‘early.stopping.patience’: number of epochs to wait for improvement before stopping early; default is 10

See https://fastai1.fast.ai/callbacks.tracker.html#EarlyStoppingCallback for more details on early stopping.

‘smoothing’: If > 0, then use LabelSmoothingCrossEntropy loss function for binary/multi-class classification; otherwise use default loss function for this type of problem; default is 0.0. See: https://docs.fast.ai/layers.html#LabelSmoothingCrossEntropy
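A minimal sketch of supplying these hyperparameters through TabularPredictor; the 'FASTAI' model key routes them to this model, and `train_data` with a 'class' label column is an assumption:

    from autogluon.tabular import TabularPredictor

    # Sketch: override a few of the fastai hyperparameters listed above.
    predictor = TabularPredictor(label='class').fit(
        train_data,
        hyperparameters={
            'FASTAI': {
                'layers': [200, 100],  # hidden layer sizes; None lets the model decide
                'emb_drop': 0.1,       # embedding layers dropout
                'bs': 256,             # batch size
                'lr': 1e-2,            # max learning rate for the one-cycle policy
                'epochs': 30,          # number of epochs
            },
        },
    )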

Ensemble Models

BaggedEnsembleModel

Bagged ensemble meta-model which fits a given model multiple times across different splits of the training data.

StackerEnsembleModel

Stack ensemble meta-model which functions identically to BaggedEnsembleModel with the additional capability to leverage base models.

WeightedEnsembleModel

Weighted ensemble meta-model that implements Ensemble Selection: https://www.cs.cornell.edu/~alexn/papers/shotgun.icml04.revised.rev2.pdf

BaggedEnsembleModel

class autogluon.core.models.BaggedEnsembleModel(model_base: autogluon.core.models.abstract.abstract_model.AbstractModel, random_state=0, **kwargs)[source]

Bagged ensemble meta-model which fits a given model multiple times across different splits of the training data.
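A rough sketch of wrapping a base model template in a bagged ensemble. The exact required keyword arguments can vary between AutoGluon versions; the names, paths, and metric below are illustrative assumptions:

    from autogluon.core.models import BaggedEnsembleModel
    from autogluon.tabular.models import RFModel

    # Sketch: the bagged ensemble clones and fits `model_base` once per training split.
    base = RFModel(path='ag_models/', name='RF', problem_type='binary', eval_metric='roc_auc')
    bagged = BaggedEnsembleModel(model_base=base, random_state=0,
                                 path='ag_models/', name='RF_BAG')

In typical use, bagging is instead enabled from the TabularPredictor level (see the stacking sketch below).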

StackerEnsembleModel

class autogluon.core.models.StackerEnsembleModel(base_model_names=None, base_models_dict=None, base_model_paths_dict=None, base_model_types_dict=None, base_model_types_inner_dict=None, base_model_performances_dict=None, **kwargs)[source]

Stack ensemble meta-model which functions identically to BaggedEnsembleModel with the additional capability to leverage base models.

By specifying base models during init, stacker models can use the base model predictions as features during training and inference.

This property allows for significantly improved model quality in many situations compared to non-stacking alternatives.

Stacker models can act as base models to other stacker models, enabling multi-layer stack ensembling.
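In practice, bagging and multi-layer stack ensembling are usually enabled through TabularPredictor.fit rather than by constructing these meta-models by hand. A minimal sketch, reusing the assumed `train_data` and 'class' label from above (the fold and level counts are illustrative):

    from autogluon.tabular import TabularPredictor

    # Sketch: 5-fold bagging plus one stacking layer on top of the bagged base models.
    predictor = TabularPredictor(label='class').fit(
        train_data,
        num_bag_folds=5,     # each model is wrapped in a BaggedEnsembleModel over 5 splits
        num_stack_levels=1,  # adds one layer of stacker models over the base layer
    )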

WeightedEnsembleModel

class autogluon.core.models.WeightedEnsembleModel(**kwargs)[source]

Weighted ensemble meta-model that implements Ensemble Selection: https://www.cs.cornell.edu/~alexn/papers/shotgun.icml04.revised.rev2.pdf

An autogluon.core.models.GreedyWeightedEnsembleModel must be specified as the model_base for this model to function properly.

Experimental Models

FastTextModel


TextPredictionV1Model


FastTextModel

class autogluon.tabular.models.FastTextModel(**kwargs)[source]

TextPredictionV1Model

class autogluon.tabular.models.TextPredictionV1Model(**kwargs)[source]