Version 0.7.0¶
We’re happy to announce the AutoGluon 0.7 release. This release contains a new experimental module autogluon.eda for exploratory
data analysis. AutoGluon 0.7 offers conda-forge support, enhancements to Tabular, MultiModal, and Time Series
modules, and many quality of life improvements and fixes.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on. Loading models trained in different versions of AutoGluon is not supported.
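One way to enforce this is to record the AutoGluon version alongside each saved model and compare it before loading. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the installed distribution is named `autogluon.core`.

```python
from importlib import metadata


def installed_autogluon_version(dist_name="autogluon.core"):
    """Return the installed version string, or None if the package is absent.

    The distribution name is an assumption; adjust it to your installation.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def versions_match(trained_version, installed_version):
    """True only when the model was trained with exactly the installed version."""
    return trained_version is not None and trained_version == installed_version
```

A caller could refuse to load a predictor when `versions_match(recorded_version, installed_autogluon_version())` is false, where `recorded_version` is whatever was stored at training time.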
This release contains 170 commits from 19 contributors!
See the full commit change-log here: https://github.com/autogluon/autogluon/compare/v0.6.2...v0.7.0
Special thanks to @MountPOTATO who is a first-time contributor to AutoGluon this release!
Full Contributor List (ordered by # of commits):
@Innixma, @zhiqiangdon, @yinweisu, @gradientsky, @shchur, @sxjscience, @FANGAreNotGnu, @yongxinw, @cheungdaven, @liangfu, @tonyhoo, @bryanyzhu, @suzhoum, @canerturkmen, @giswqs, @gidler, @yzhliu, @Linuxdex and @MountPOTATO
AutoGluon 0.7 supports Python versions 3.8, 3.9, and 3.10. Python 3.7 is no longer supported as of this release.
Changes¶
NEW: AutoGluon available on conda-forge¶
As of the 0.7 release, AutoGluon is available on conda-forge (#612)!
Kudos to the following individuals for making this happen:
@giswqs for leading the entire effort and being a 1-man army driving this forward.
@h-vetinari for providing excellent advice for working with conda-forge and some truly exceptional feedback.
@arturdaraujo, @PertuyF, @ngam and @priyanga24 for their encouragement, suggestions, and feedback.
The conda-forge team for their prompt and effective reviews of our (many) PRs.
@gradientsky for testing M1 support during the early stages.
@sxjscience, @zhiqiangdon, @canerturkmen, @shchur, and @Innixma for helping upgrade our downstream dependency versions to be compatible with conda.
Everyone else who has supported this process either directly or indirectly.
NEW: autogluon.eda (Exploratory Data Analysis)¶
We are happy to announce the AutoGluon Exploratory Data Analysis (EDA) toolkit. Starting with v0.7, AutoGluon can analyze and visualize different aspects of data and models. We invite you to explore the following tutorials: Quick Fit, Dataset Overview, Target Variable Analysis, Covariate Shift Analysis. Other materials can be found in the EDA section of the website.
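As a rough sketch of how the four analyses named above might be called from a notebook: the wrapper function is hypothetical, and the `autogluon.eda.auto` function names and arguments are given as we understand them from the tutorials, so check them against the EDA documentation for your installed version.

```python
def run_basic_eda(train_data, test_data, label):
    """Run the analyses covered by the four tutorials on pandas DataFrames.

    `label` is the name of the target column. The output is rendered inline,
    so this is intended for use inside a notebook.
    """
    # Imported lazily so that defining this helper does not require autogluon.eda.
    import autogluon.eda.auto as auto

    # Dataset Overview: summary statistics and feature types.
    auto.dataset_overview(train_data=train_data, test_data=test_data, label=label)
    # Target Variable Analysis: distribution of the label and related features.
    auto.target_analysis(train_data=train_data, label=label)
    # Covariate Shift Analysis: checks whether train and test data differ.
    auto.covariate_shift_detection(train_data=train_data, test_data=test_data, label=label)
    # Quick Fit: trains a quick model and reports on its behavior.
    auto.quick_fit(train_data, label)
```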
General¶
AutoMM¶
AutoGluon MultiModal (a.k.a. AutoMM) supports three new features: 1) document classification; 2) named entity recognition for the Chinese language; 3) few-shot learning with SVM.
Meanwhile, we removed autogluon.text and autogluon.vision, as these features are supported in autogluon.multimodal.
New features¶
Document Classification
NER for Chinese Language
Support Chinese named entity recognition
See tutorials
Contributors and commits: @cheungdaven (#2676, #2709)
Few Shot Learning with SVM
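For illustration, Chinese NER might be set up roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the label column name, the `model.ner_text.checkpoint_name` hyperparameter key, and the `hfl/chinese-lert-small` checkpoint are taken from our recollection of the tutorial rather than from these notes.

```python
def build_ner_fit_args(checkpoint_name="hfl/chinese-lert-small", time_limit=300):
    """Build fit() arguments that select a Chinese text backbone (assumed key)."""
    return {
        "hyperparameters": {"model.ner_text.checkpoint_name": checkpoint_name},
        "time_limit": time_limit,  # seconds
    }


def train_chinese_ner(train_data, label="entity_annotations", **overrides):
    """Train an NER predictor; `label` is a hypothetical annotation column name."""
    # Imported lazily so that defining this helper does not require autogluon.multimodal.
    from autogluon.multimodal import MultiModalPredictor

    fit_args = build_ner_fit_args()
    fit_args.update(overrides)
    predictor = MultiModalPredictor(problem_type="ner", label=label)
    predictor.fit(train_data=train_data, **fit_args)
    return predictor
```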
Other Enhancements¶
Add matcher realtime inference support. @zhiqiangdon (#2613)
Add matcher HPO. @zhiqiangdon (#2619)
Add YOLOX models (small, large, and x-large) and update presets for object detection. @FANGAreNotGnu (#2644, #2867, #2927, #2933)
Add AutoMM presets. @zhiqiangdon (#2620, #2749, #2839)
Add model dump for models from HuggingFace, timm and mmdet. @suzhoum @FANGAreNotGnu @liangfu (#2682, #2700, #2737, #2840)
Bug fix / refactor for NER. @cheungdaven (#2659, #2696, #2759, #2773)
MultiModalPredictor import time reduction. @sxjscience (#2718)
Bug Fixes / Code and Doc Improvements¶
NER example with visualization. @sxjscience (#2698)
Bug fixes / Code and Doc Improvements. @sxjscience @tonyhoo @giswqs (#2708, #2714, #2739, #2782, #2787, #2857, #2818, #2858, #2859, #2891, #2918, #2940, #2906, #2907)
Support of Label-Studio file export in AutoMM and added examples. @MountPOTATO (#2615)
Added example of few-shot memory bank model with feature extraction based on Tip-adapter. @Linuxdex (#2822)
Deprecations¶
autogluon.vision namespace is deprecated. @bryanyzhu (#2790, #2819, #2832)
autogluon.text namespace is deprecated. @sxjscience @Innixma (#2695, #2847)
Tabular¶
TabularPredictor’s inference speed has been heavily optimized, with an average 250% speedup for real-time inference. This means that TabularPredictor can satisfy <10 ms end-to-end latency on many datasets when using infer_limit, and the high_quality preset can satisfy <100 ms end-to-end latency on many datasets by default.
TabularPredictor’s "multimodal" hyperparameter preset now leverages the full capabilities of MultiModalPredictor, resulting in stronger performance on datasets containing a mix of tabular, image, and text features.
Performance Improvements¶
Upgraded versions of all dependency packages to use the latest releases. @Innixma (#2823, #2829, #2834, #2887, #2915)
Accelerated ensemble inference speed by 150% by removing TorchThreadManager context switching. @liangfu (#2472)
Accelerated FastAI neural network inference speed by 100x+ and training speed by 10x on datasets with many features. @Innixma (#2909)
(From 0.6.1) Avoid unnecessary DataFrame copies to accelerate feature preprocessing by 25%. @liangfu (#2532)
(From 0.6.1) Refactor NN_TORCH model to be dataset iterable, leading to a 100% inference speedup. @liangfu (#2395)
MultiModalPredictor is now used as a member of the ensemble when TabularPredictor.fit is passed hyperparameters="multimodal". @Innixma (#2890)
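The latency budget and the multimodal preset described in this section might be used along these lines. This is a minimal sketch, not the project's reference usage: the helper names are hypothetical, and it assumes `infer_limit` is expressed in seconds per row and paired with `infer_limit_batch_size`.

```python
def latency_fit_kwargs(latency_ms, batch_size=1):
    """Translate a per-row latency budget in milliseconds into fit() kwargs."""
    return {
        "infer_limit": latency_ms / 1000.0,  # assumed unit: seconds per row
        "infer_limit_batch_size": batch_size,  # rows per request when measuring latency
    }


def fit_low_latency(train_data, label, latency_ms=10):
    """Fit a predictor that tries to respect an end-to-end latency budget."""
    # Imported lazily so that defining this helper does not require autogluon.tabular.
    from autogluon.tabular import TabularPredictor

    return TabularPredictor(label=label).fit(train_data, **latency_fit_kwargs(latency_ms))


def fit_multimodal(train_data, label, time_limit=None):
    """Fit with the "multimodal" hyperparameter preset named in these notes."""
    from autogluon.tabular import TabularPredictor

    return TabularPredictor(label=label).fit(
        train_data, hyperparameters="multimodal", time_limit=time_limit
    )
```

For real-time serving, a batch size of 1 matches the single-row case that the <10 ms figure refers to.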
API Enhancements¶
Deprecations¶
Bug Fixes / Doc Improvements¶
Fixed incorrect time_limit estimation in NN_TORCH model. @Innixma (#2909)
Fixed error when fitting with only text features. @Innixma (#2705)
Fixed error when calibrate=True, use_bag_holdout=True in TabularPredictor.fit. @Innixma (#2715)
Fixed error when tuning n_estimators with RandomForest / ExtraTrees models. @Innixma (#2735)
Fixed missing onnxruntime dependency on Linux/MacOS when installing optional dependency skl2onnx. @liangfu (#2923)
Fixed edge-case RandomForest error on Windows. @yinweisu (#2851)
Added compile_models to the deployment tutorial. @liangfu (#2717)
autogluon.timeseries¶
New features¶
TimeSeriesPredictor now supports past covariates (a.k.a. dynamic features or related time series whose values are not known for the time steps to be predicted). @shchur (#2665, #2680)
New models from StatsForecast were introduced in TimeSeriesPredictor for various presets (medium_quality, high_quality and best_quality). @shchur (#2758)
Support missing value imputation for TimeSeriesDataFrame, which allows users to customize the fill logic for missing values and fill gaps in irregularly sampled time series. @shchur (#2781)
Improve quantile forecasting performance of the AutoGluon-Tabular forecaster using the empirical noise distribution. @shchur (#2740)
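To make the distinction between covariate types concrete, the sketch below separates columns into known and past covariates and then fits a predictor with one of the presets listed above. It rests on assumptions: the helper names and the `item_id` / `timestamp` column names are hypothetical, and it relies on our understanding that columns other than the target and the declared known covariates are treated as past covariates.

```python
def split_covariates(columns, target, known_covariates_names):
    """Partition column names into (known covariates, past covariates)."""
    known = set(known_covariates_names)
    known_cols = [c for c in columns if c in known]
    # Anything that is neither the target nor a known covariate is only
    # available for past time steps.
    past_cols = [c for c in columns if c != target and c not in known]
    return known_cols, past_cols


def fit_forecaster(train_df, prediction_length, target="target",
                   known_covariates_names=(), presets="medium_quality"):
    """Fit a forecaster on a long-format DataFrame (hypothetical column names)."""
    # Imported lazily so that defining this helper does not require autogluon.timeseries.
    from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor

    data = TimeSeriesDataFrame.from_data_frame(
        train_df, id_column="item_id", timestamp_column="timestamp"
    )
    predictor = TimeSeriesPredictor(
        prediction_length=prediction_length,
        target=target,
        known_covariates_names=list(known_covariates_names) or None,
    )
    predictor.fit(data, presets=presets)
    return predictor
```

For example, with columns `target`, `price`, and `promo`, declaring `promo` as known leaves `price` as a past covariate.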