.. _sec_tabularprediction_text_multimodal:

Explore Models for Data Tables with Text and Categorical Features
=================================================================

This tutorial shows how to use AutoGluon on tabular data that mixes text
and categorical features. Such data, in which text appears alongside
other features, is common in real-world applications. For example, when
building a sentiment analysis model for users' tweets, we can use not
only the raw text of each tweet but also other features such as the
topic of the tweet and the user profile. In the following, we
investigate different ways to ensemble the state-of-the-art (pretrained)
language models in AutoGluon TextPrediction with the other models used
in AutoGluon's TabularPredictor. For more details about the inner
workings of the neural network architecture used in AutoGluon
TextPrediction, refer to Section
":ref:`sec_textprediction_architecture`" in
:ref:`sec_textprediction_heterogeneous`.

.. code:: python

    %matplotlib inline
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import pprint
    import random
    from autogluon.text import TextPrediction
    from autogluon.tabular import TabularPredictor
    import mxnet as mx

    # Fix the random seeds for reproducibility.
    np.random.seed(123)
    random.seed(123)
    mx.random.seed(123)

Product Sentiment Analysis Dataset
----------------------------------

In the following, we will use the product sentiment analysis dataset
from this `MachineHack hackathon `__. The goal of this task is to
predict the user's sentiment towards a product given a review in raw
text and the product's type, e.g., Tablet, Mobile, etc. We split the
original training data into 90% for training and 10% for development.

.. code:: python

    !mkdir -p product_sentiment_machine_hack
    !wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv -O product_sentiment_machine_hack/train.csv
    !wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv -O product_sentiment_machine_hack/dev.csv
    !wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv -O product_sentiment_machine_hack/test.csv

.. parsed-literal::
    :class: output

    --2021-02-23 19:24:46-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv
    Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
    Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 689486 (673K) [text/csv]
    Saving to: ‘product_sentiment_machine_hack/train.csv’

    product_sentiment_m 100%[===================>] 673.33K 2.09MB/s in 0.3s

    2021-02-23 19:24:46 (2.09 MB/s) - ‘product_sentiment_machine_hack/train.csv’ saved [689486/689486]

    --2021-02-23 19:24:47-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv
    Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
    Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 75517 (74K) [text/csv]
    Saving to: ‘product_sentiment_machine_hack/dev.csv’

    product_sentiment_m 100%[===================>] 73.75K --.-KB/s in 0.1s

    2021-02-23 19:24:48 (508 KB/s) - ‘product_sentiment_machine_hack/dev.csv’ saved [75517/75517]

    --2021-02-23 19:24:48-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv
    Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
    Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 312194 (305K) [text/csv]
    Saving to: ‘product_sentiment_machine_hack/test.csv’

    product_sentiment_m 100%[===================>] 304.88K 1.18MB/s in 0.3s

    2021-02-23 19:24:49 (1.18 MB/s) - ‘product_sentiment_machine_hack/test.csv’ saved [312194/312194]

.. code:: python

    feature_columns = ['Product_Description', 'Product_Type']
    label = 'Sentiment'

    train_df = pd.read_csv('product_sentiment_machine_hack/train.csv')
    dev_df = pd.read_csv('product_sentiment_machine_hack/dev.csv')
    test_df = pd.read_csv('product_sentiment_machine_hack/test.csv')

    # Keep only the feature columns and the label (the test set has no label).
    train_df = train_df[feature_columns + [label]]
    dev_df = dev_df[feature_columns + [label]]
    test_df = test_df[feature_columns]
    print('Number of training samples:', len(train_df))
    print('Number of dev samples:', len(dev_df))
    print('Number of test samples:', len(test_df))

.. parsed-literal::
    :class: output

    Number of training samples: 5727
    Number of dev samples: 637
    Number of test samples: 2728

There are two features in the dataset: the user's review of the product
and the product's type. There are four sentiment classes, and the train
and dev sets were split using stratified sampling.

.. code:: python

    train_df

.. raw:: html
Product_Description Product_Type Sentiment
0 Just heard that Apple is opening a store in do... 2 3
1 Tristan H, apture: being fast & iterative ... 9 2
2 Hey, you lucky dogs at #SXSW with iPads -- che... 6 3
3 RT @mention THIS was the best thing I saw at #... 9 2
4 Apple is opening temp retail store in Austin t... 2 3
... ... ... ...
5722 RT @mention At #SXSW and want to win an iPad? ... 9 2
5723 RT @mention I mean, sliced bread is great. But... 3 3
5724 Apple cited as the opposite of crowdsourcing -... 2 1
5725 Good CNN article on why #SXSW is important to ... 7 3
5726 ‰ÛÏ@mention Google to Launch Major New Social ... 3 3

5727 rows × 3 columns

.. code:: python

    dev_df

.. raw:: html
Product_Description Product_Type Sentiment
0 Do it. RT @mention Come party w/ Google tonigh... 3 3
1 Line for iPads at #SXSW. Doesn't look too bad!... 6 3
2 First up: iPad Design Headaches (2 Tablets, Ca... 6 2
3 #SXSW: Mint Talks Mobile App Development Chall... 9 2
4 ‰ÛÏ@mention Apple store downtown Austin open t... 9 2
... ... ... ...
632 Bet on a GoogleBuzz-like #fail. People don't c... 9 0
633 RT > @mention Guy gets tattoo at SXSW so he... 9 2
634 #austinites #sxsw and check it out on #iphone ... 9 2
635 New @mention for iPhone+Android.. No more serv... 0 3
636 Why isn't news industry spending more R&D?... 9 2

637 rows × 3 columns

.. code:: python

    test_df

.. raw:: html
Product_Description Product_Type
0 RT @mention Going to #SXSW? The new iPhone gui... 7
1 RT @mention 95% of iPhone and Droid apps have ... 9
2 RT @mention Thank you to @mention for letting ... 9
3 #Thanks @mention we're lovin' the @mention app... 7
4 At #sxsw? @mention / @mention wanna buy you a ... 9
... ... ...
2723 RT @mention eww and LOL. RT @mention Just saw ... 9
2724 Free 22 track #sxsw sampler album on iTunes. #... 9
2725 Setting up for the Google #gsdm #sxsw party. ... 3
2726 RT @mention #SXSW Come see Bitbop in Austin #g... 9
2727 So many Google products. isn't it time to tra... 5

2728 rows × 2 columns
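
The dataset description above notes that the train and dev sets come
from a 90%/10% stratified split. The idea can be sketched as follows.
This is a minimal, standard-library-only illustration and not the
procedure actually used to produce ``train.csv`` and ``dev.csv``; the
helper name ``stratified_split`` and the toy labels are hypothetical.

.. code:: python

    import random
    from collections import Counter, defaultdict


    def stratified_split(labels, dev_frac=0.1, seed=123):
        """Return (train_idx, dev_idx) so that each class keeps roughly
        the same share in both parts."""
        rng = random.Random(seed)
        by_class = defaultdict(list)
        for idx, y in enumerate(labels):
            by_class[y].append(idx)
        train_idx, dev_idx = [], []
        for indices in by_class.values():
            rng.shuffle(indices)
            # Hold out about dev_frac of every class (at least one sample).
            n_dev = max(1, round(len(indices) * dev_frac))
            dev_idx.extend(indices[:n_dev])
            train_idx.extend(indices[n_dev:])
        return sorted(train_idx), sorted(dev_idx)


    # Toy labels with an imbalanced class distribution (made up for illustration).
    toy_labels = [0] * 10 + [1] * 30 + [2] * 200 + [3] * 160
    train_idx, dev_idx = stratified_split(toy_labels)
    print(Counter(toy_labels[i] for i in train_idx))
    print(Counter(toy_labels[i] for i in dev_idx))

Splitting per class rather than over the whole table keeps rare classes
represented in the dev set, which matters when one class dominates.
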

What happens if we ignore all the non-text features? ---------------------------------------------------- First of all, let's try to ignore all the non-text features. We will use the TextPrediction model in AutoGluon to train a predictor with text data only. This will internally use the ELECTRA-small model as the backbone. As we can see, the result is not very good. .. code:: python predictor_text_only = TextPrediction.fit(train_df[['Product_Description', 'Sentiment']], label=label, time_limits=None, ngpus_per_trial=1, hyperparameters='default_no_hpo', eval_metric='accuracy', stopping_metric='accuracy', output_directory='ag_text_only') .. parsed-literal:: :class: output 2021-02-23 19:24:49,386 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/ag_text_prediction.log All Logs will be saved to ag_text_only/ag_text_prediction.log 2021-02-23 19:24:49,404 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset: Train Dataset: 2021-02-23 19:24:49,405 - autogluon.text.text_prediction.text_prediction - INFO - Columns: - Text( name="Product_Description" #total/missing=4581/0 length, min/avg/max=11/104.81707050862258/170 ) - Categorical( name="Sentiment" #total/missing=4581/0 num_class (total/non_special)=4/4 categories=[0, 1, 2, 3] freq=[87, 280, 2721, 1493] ) Columns: - Text( name="Product_Description" #total/missing=4581/0 length, min/avg/max=11/104.81707050862258/170 ) - Categorical( name="Sentiment" #total/missing=4581/0 num_class (total/non_special)=4/4 categories=[0, 1, 2, 3] freq=[87, 280, 2721, 1493] ) 2021-02-23 19:24:49,406 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset: Tuning Dataset: 2021-02-23 19:24:49,407 - autogluon.text.text_prediction.text_prediction - INFO - Columns: - Text( name="Product_Description" #total/missing=1146/0 length, min/avg/max=29/104.95986038394415/178 ) - Categorical( name="Sentiment" #total/missing=1146/0 num_class (total/non_special)=4/4 
categories=[0, 1, 2, 3] freq=[13, 79, 667, 387] ) Columns: - Text( name="Product_Description" #total/missing=1146/0 length, min/avg/max=29/104.95986038394415/178 ) - Categorical( name="Sentiment" #total/missing=1146/0 num_class (total/non_special)=4/4 categories=[0, 1, 2, 3] freq=[13, 79, 667, 387] ) WARNING: changing multiprocessing start method to forkserver 2021-02-23 19:24:49,415 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/main.log All Logs will be saved to ag_text_only/main.log .. parsed-literal:: :class: output 0%| | 0/1 [00:00 0 .. parsed-literal:: :class: output █ .. parsed-literal:: :class: output 0.8482 = Validation accuracy score 12.06s = Training runtime 0.36s = Validation runtime Fitting model: KNeighborsUnif ... 0.8534 = Validation accuracy score 0.02s = Training runtime 0.02s = Validation runtime Fitting model: KNeighborsDist ... 0.8534 = Validation accuracy score 0.02s = Training runtime 0.02s = Validation runtime Fitting model: RandomForestGini ... 0.8709 = Validation accuracy score 1.03s = Training runtime 0.08s = Validation runtime Fitting model: RandomForestEntr ... 0.8709 = Validation accuracy score 1.04s = Training runtime 0.08s = Validation runtime Fitting model: ExtraTreesGini ... 0.8464 = Validation accuracy score 1.15s = Training runtime 0.08s = Validation runtime Fitting model: ExtraTreesEntr ... 0.8464 = Validation accuracy score 1.15s = Training runtime 0.08s = Validation runtime Fitting model: LightGBM ... 0.8831 = Validation accuracy score 1.08s = Training runtime 0.01s = Validation runtime Fitting model: LightGBMXT ... 0.8534 = Validation accuracy score 1.19s = Training runtime 0.01s = Validation runtime Fitting model: CatBoost ... 0.8726 = Validation accuracy score 1.04s = Training runtime 0.01s = Validation runtime Fitting model: XGBoost ... 0.8778 = Validation accuracy score 1.84s = Training runtime 0.02s = Validation runtime Fitting model: LightGBMLarge ... 
0.8813 = Validation accuracy score 3.45s = Training runtime 0.01s = Validation runtime .. parsed-literal:: :class: output █ .. parsed-literal:: :class: output Fitting model: WeightedEnsemble_L1 ... 0.8883 = Validation accuracy score 0.37s = Training runtime 0.0s = Validation runtime AutoGluon training complete, total runtime = 44.09s ... TabularPredictor saved. To load, use: TabularPredictor.load("model1/") .. code:: python predictor_model1.leaderboard(dev_df, silent=True) .. parsed-literal:: :class: output █ .. raw:: html
model score_test score_val pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L1 0.890110 0.888307 10.636745 0.579389 20.212959 0.005916 0.000465 0.374030 1 True 14
1 LightGBMLarge 0.886970 0.881326 0.015931 0.006296 3.445040 0.015931 0.006296 3.445040 0 True 13
2 CatBoost 0.886970 0.872600 0.018929 0.006263 1.039262 0.018929 0.006263 1.039262 0 True 11
3 RandomForestGini 0.886970 0.870855 0.115808 0.077118 1.025226 0.115808 0.077118 1.025226 0 True 5
4 XGBoost 0.885400 0.877836 0.079946 0.019477 1.838979 0.079946 0.019477 1.838979 0 True 12
5 RandomForestEntr 0.885400 0.870855 0.118663 0.075674 1.039675 0.118663 0.075674 1.039675 0 True 6
6 KNeighborsUnif 0.883830 0.853403 0.032106 0.019376 0.019685 0.032106 0.019376 0.019685 0 True 3
7 KNeighborsDist 0.883830 0.853403 0.041204 0.019170 0.019528 0.041204 0.019170 0.019528 0 True 4
8 LightGBM 0.882261 0.883072 0.011820 0.006165 1.076681 0.011820 0.006165 1.076681 0 True 9
9 NeuralNetMXNet 0.877551 0.872600 0.046874 0.034172 4.098654 0.046874 0.034172 4.098654 0 True 1
10 LightGBMXT 0.869702 0.853403 0.013507 0.008016 1.188565 0.013507 0.008016 1.188565 0 True 10
11 ExtraTreesEntr 0.868132 0.846422 0.150189 0.080109 1.153447 0.150189 0.080109 1.153447 0 True 8
12 ExtraTreesGini 0.866562 0.846422 0.175136 0.079671 1.148813 0.175136 0.079671 1.148813 0 True 7
13 NeuralNetFastAI 0.854003 0.848168 10.244841 0.364427 12.060060 10.244841 0.364427 12.060060 0 True 2
Using the product type (a categorical column) turns out to be essential
for good performance on this task: the accuracy is much higher than that
of the model trained on the text column alone.

Model 2: Extract Text Embedding and Use Tabular Predictor
---------------------------------------------------------

Our second attempt at combining text and other features uses the trained
TextPrediction model to extract embeddings and then builds a
TabularPredictor on top of them. The AutoGluon TextPrediction model
offers the ``extract_embedding()`` functionality (see
:ref:`sec_textprediction_extract_embedding` for details), so we can
build a two-stage model. In the first stage, we use the text-only model
to extract sentence embeddings. In the second stage, we use
TabularPredictor to get the final model.

.. code:: python

    train_sentence_embeddings = predictor_text_only.extract_embedding(train_df)
    dev_sentence_embeddings = predictor_text_only.extract_embedding(dev_df)
    print(train_sentence_embeddings)

.. parsed-literal::
    :class: output

    /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/text/src/autogluon/text/text_prediction/dataset.py:321: SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
      df[col_name] = df[col_name].fillna('').apply(str)
    /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
      self._build_cache(*args)

.. parsed-literal::
    :class: output

    [[-0.061683 0.789052 -0.614252 -0.383968 ... -0.202426 1.144868 0.039427 -0.13562 ]
     [-0.269277 0.177113 -0.197375 -0.172229 ...
-0.584261 0.625235 -0.355088 0.211777] [-0.83451 0.264692 -0.687199 0.234191 ... -0.605356 0.332709 0.029832 -0.160492] [-0.271491 0.149054 -0.506492 -0.09476 ... -0.809414 0.29643 0.31992 -0.096194] ... [-0.054611 -0.060668 -0.49929 -0.170906 ... -0.284143 0.805138 0.430891 -0.191869] [-0.277809 0.150503 -0.625322 -0.241075 ... -0.525608 0.909708 -0.124487 -0.031551] [-0.64508 -0.07616 -0.567146 0.192171 ... -1.166889 0.589877 0.242167 -0.549045] [-0.837447 0.347837 -0.525436 -0.440289 ... -0.742066 0.565927 -0.054493 -0.411046]] .. code:: python merged_train_data = train_df.join(pd.DataFrame(train_sentence_embeddings)) merged_dev_data = dev_df.join(pd.DataFrame(dev_sentence_embeddings)) print(merged_train_data) .. parsed-literal:: :class: output Product_Description Product_Type \ 0 Just heard that Apple is opening a store in do... 2 1 Tristan H, apture: being fast & iterative ... 9 2 Hey, you lucky dogs at #SXSW with iPads -- che... 6 3 RT @mention THIS was the best thing I saw at #... 9 4 Apple is opening temp retail store in Austin t... 2 ... ... ... 5722 RT @mention At #SXSW and want to win an iPad? ... 9 5723 RT @mention I mean, sliced bread is great. But... 3 5724 Apple cited as the opposite of crowdsourcing -... 2 5725 Good CNN article on why #SXSW is important to ... 7 5726 ‰ÛÏ@mention Google to Launch Major New Social ... 3 Sentiment 0 1 2 3 4 5 \ 0 3 -0.061683 0.789052 -0.614252 -0.383968 0.794183 -0.581301 1 2 -0.269277 0.177113 -0.197375 -0.172229 0.547932 -0.265157 2 3 -0.834510 0.264692 -0.687199 0.234191 1.018778 -0.689753 3 2 -0.271491 0.149054 -0.506492 -0.094760 0.644704 -0.521082 4 3 0.180634 0.237529 -0.668062 -0.119891 0.387544 -0.314172 ... ... ... ... ... ... ... ... 
5722 2 -0.771102 0.284567 -0.285301 -0.168485 0.645094 0.036831 5723 3 -0.054611 -0.060668 -0.499290 -0.170906 -0.367915 0.331775 5724 1 -0.277809 0.150503 -0.625322 -0.241075 0.157916 0.060280 5725 3 -0.645080 -0.076160 -0.567146 0.192171 0.524227 -0.318997 5726 3 -0.837447 0.347837 -0.525436 -0.440289 0.857342 -0.283967 6 ... 246 247 248 249 250 \ 0 0.919014 ... -0.357193 0.390834 0.833298 -0.115630 0.786055 1 1.170505 ... -0.212880 0.123253 0.668844 -0.826962 0.772176 2 1.244796 ... -0.300159 0.729561 0.551330 -0.400327 0.671096 3 1.293411 ... -0.233223 0.269830 0.657665 -0.285630 0.512562 4 1.199535 ... -0.490884 0.116289 0.888642 -0.219426 0.773025 ... ... ... ... ... ... ... ... 5722 1.063322 ... -0.032328 0.582992 0.674876 -0.196406 0.172549 5723 0.314198 ... -0.639878 -0.178192 0.481809 -0.700696 -0.039856 5724 0.656375 ... -0.459748 -0.046848 0.820173 -0.563527 0.208965 5725 1.467807 ... -0.601973 0.303506 0.423291 -0.275483 0.578006 5726 1.174435 ... -0.239029 0.465101 0.134878 -0.096706 0.547499 251 252 253 254 255 0 0.243626 -0.202426 1.144868 0.039427 -0.135620 1 -0.053033 -0.584261 0.625235 -0.355088 0.211777 2 0.120588 -0.605356 0.332709 0.029832 -0.160492 3 0.320794 -0.809414 0.296430 0.319920 -0.096194 4 0.546610 -0.479898 1.102872 0.085935 -0.345065 ... ... ... ... ... ... 5722 0.049309 -0.042409 0.503684 -0.348889 -0.508032 5723 1.003379 -0.284143 0.805138 0.430891 -0.191869 5724 0.631439 -0.525608 0.909708 -0.124487 -0.031551 5725 0.421409 -1.166889 0.589877 0.242167 -0.549045 5726 0.293577 -0.742066 0.565927 -0.054493 -0.411046 [5727 rows x 259 columns] .. code:: python predictor_model2 = TabularPredictor(label=label, eval_metric='accuracy', path='model2').fit(merged_train_data) .. parsed-literal:: :class: output Beginning AutoGluon training ... AutoGluon will save models to "model2/" AutoGluon Version: 0.1.0b20210223 Train Data Rows: 5727 Train Data Columns: 258 Preprocessing data ... 
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed). 4 unique label values: [3, 2, 1, 0] If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression']) Train Data Class Count: 4 Using Feature Generators to preprocess the data ... Fitting AutoMLPipelineFeatureGenerator... Available Memory: 13929.9 MB Train Data (Original) Memory Usage: 6.87 MB (0.0% of available memory) Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features. Stage 1 Generators: Fitting AsTypeFeatureGenerator... Stage 2 Generators: Fitting FillNaFeatureGenerator... Stage 3 Generators: Fitting IdentityFeatureGenerator... Fitting CategoryFeatureGenerator... Fitting CategoryMemoryMinimizeFeatureGenerator... Fitting TextSpecialFeatureGenerator... Fitting BinnedFeatureGenerator... Fitting DropDuplicatesFeatureGenerator... Fitting TextNgramFeatureGenerator... Fitting CountVectorizer for text features: ['Product_Description'] CountVectorizer fit with vocabulary size = 725 Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality. Reducing Vectorizer vocab size from 725 to 303 to avoid OOM error Stage 4 Generators: Fitting DropUniqueFeatureGenerator... Types of features in original data (raw dtype, special dtypes): ('float', []) : 256 | ['0', '1', '2', '3', '4', ...] ('int', []) : 1 | ['Product_Type'] ('object', ['text']) : 1 | ['Product_Description'] Types of features in processed data (raw dtype, special dtypes): ('float', []) : 256 | ['0', '1', '2', '3', '4', ...] 
('int', []) : 1 | ['Product_Type'] ('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...] ('int', ['text_ngram']) : 304 | ['__nlp__.11', '__nlp__.6th', '__nlp__.about', '__nlp__.all', '__nlp__.amp', ...] 2.4s = Fit runtime 258 features in original data used to generate 599 features in processed data. Train Data (Processed) Memory Usage: 7.89 MB (0.1% of available memory) Data preprocessing and feature engineering runtime = 2.48s ... AutoGluon will gauge predictive performance using evaluation metric: 'accuracy' To change this, specify the eval_metric argument of fit() Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573 Fitting model: NeuralNetMXNet ... 0.8918 = Validation accuracy score 6.3s = Training runtime 0.04s = Validation runtime Fitting model: NeuralNetFastAI ... .. parsed-literal:: :class: output █ .. parsed-literal:: :class: output 0.8534 = Validation accuracy score 18.82s = Training runtime 0.52s = Validation runtime Fitting model: KNeighborsUnif ... 0.8551 = Validation accuracy score 0.02s = Training runtime 0.13s = Validation runtime Fitting model: KNeighborsDist ... 0.8586 = Validation accuracy score 0.03s = Training runtime 0.12s = Validation runtime Fitting model: RandomForestGini ... 0.8691 = Validation accuracy score 2.91s = Training runtime 0.08s = Validation runtime Fitting model: RandomForestEntr ... 0.8551 = Validation accuracy score 5.16s = Training runtime 0.08s = Validation runtime Fitting model: ExtraTreesGini ... 0.8168 = Validation accuracy score 1.25s = Training runtime 0.08s = Validation runtime Fitting model: ExtraTreesEntr ... 0.8028 = Validation accuracy score 1.31s = Training runtime 0.08s = Validation runtime Fitting model: LightGBM ... 
0.8935 = Validation accuracy score 9.69s = Training runtime 0.01s = Validation runtime Fitting model: LightGBMXT ... 0.8726 = Validation accuracy score 9.19s = Training runtime 0.01s = Validation runtime Fitting model: CatBoost ... 0.8883 = Validation accuracy score 19.0s = Training runtime 0.01s = Validation runtime Fitting model: XGBoost ... 0.8935 = Validation accuracy score 39.02s = Training runtime 0.05s = Validation runtime Fitting model: LightGBMLarge ... 0.8935 = Validation accuracy score 57.6s = Training runtime 0.02s = Validation runtime .. parsed-literal:: :class: output █ .. parsed-literal:: :class: output Fitting model: WeightedEnsemble_L1 ... 0.8988 = Validation accuracy score 0.38s = Training runtime 0.0s = Validation runtime AutoGluon training complete, total runtime = 186.44s ... TabularPredictor saved. To load, use: TabularPredictor.load("model2/") .. code:: python predictor_model2.leaderboard(merged_dev_data, silent=True) .. parsed-literal:: :class: output █ .. raw:: html
model score_test score_val pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L1 0.893250 0.898778 10.744249 0.715136 98.361377 0.005639 0.000464 0.375998 1 True 14
1 LightGBM 0.891680 0.893543 0.029830 0.007839 9.685987 0.029830 0.007839 9.685987 0 True 9
2 CatBoost 0.886970 0.888307 0.028248 0.012147 19.002439 0.028248 0.012147 19.002439 0 True 11
3 NeuralNetMXNet 0.886970 0.891798 0.053362 0.042966 6.302544 0.053362 0.042966 6.302544 0 True 1
4 LightGBMLarge 0.886970 0.893543 0.059007 0.016558 57.602369 0.059007 0.016558 57.602369 0 True 13
5 XGBoost 0.886970 0.893543 0.160011 0.054713 39.018259 0.160011 0.054713 39.018259 0 True 12
6 LightGBMXT 0.872841 0.872600 0.018458 0.009431 9.185145 0.018458 0.009431 9.185145 0 True 10
7 KNeighborsUnif 0.855573 0.855148 0.153559 0.128035 0.021495 0.153559 0.128035 0.021495 0 True 3
8 KNeighborsDist 0.854003 0.858639 0.123584 0.121682 0.030962 0.123584 0.121682 0.030962 0 True 4
9 NeuralNetFastAI 0.850863 0.853403 10.367347 0.518267 18.815119 10.367347 0.518267 18.815119 0 True 2
10 RandomForestGini 0.830455 0.869110 0.101238 0.078978 2.912344 0.101238 0.078978 2.912344 0 True 5
11 RandomForestEntr 0.808477 0.855148 0.099813 0.078739 5.161032 0.099813 0.078739 5.161032 0 True 6
12 ExtraTreesGini 0.800628 0.816754 0.150204 0.082309 1.251960 0.150204 0.082309 1.251960 0 True 7
13 ExtraTreesEntr 0.778650 0.802792 0.126574 0.082750 1.306403 0.126574 0.082750 1.306403 0 True 8
The performance is slightly better than that of the first model: on the
dev set, the weighted ensemble reaches an accuracy of 0.8932, compared
with 0.8901 for Model 1.

Model 3: Use the Neural Network in AutoGluon-Text in Tabular Weighted Ensemble
------------------------------------------------------------------------------

Another option is to include the neural network from AutoGluon-Text
directly as one of the candidate models of TabularPredictor. We can do
this by changing the ``hyperparameters`` argument. Note that, for the
purpose of this tutorial, we set the ``hyperparameters`` manually; we
will release some good pre-configurations soon.

.. code:: python

    # Candidate models: LightGBM (default and extra-trees variants),
    # CatBoost, and the AutoGluon-Text neural network.
    tabular_multimodel_hparam_v1 = {
        'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
        'CAT': {},
        'TEXT_NN_V1': {},
    }

    predictor_model3 = TabularPredictor(label=label, eval_metric='accuracy', path='model3').fit(
        train_df, hyperparameters=tabular_multimodel_hparam_v1
    )

.. parsed-literal::
    :class: output

    Beginning AutoGluon training ...
    AutoGluon will save models to "model3/"
    AutoGluon Version: 0.1.0b20210223
    Train Data Rows: 5727
    Train Data Columns: 2
    Preprocessing data ...
    AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
        4 unique label values: [3, 2, 1, 0]
        If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
    Train Data Class Count: 4
    Using Feature Generators to preprocess the data ...
    Fitting AutoMLPipelineFeatureGenerator...
        Available Memory: 13504.55 MB
        Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
        Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
        Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
        Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
        Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting IdentityFeatureGenerator...
            Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator... Fitting CategoryMemoryMinimizeFeatureGenerator... Fitting TextSpecialFeatureGenerator... Fitting BinnedFeatureGenerator... Fitting DropDuplicatesFeatureGenerator... Fitting TextNgramFeatureGenerator... Fitting CountVectorizer for text features: ['Product_Description'] CountVectorizer fit with vocabulary size = 725 Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality. Reducing Vectorizer vocab size from 725 to 271 to avoid OOM error Stage 4 Generators: Fitting DropUniqueFeatureGenerator... Types of features in original data (raw dtype, special dtypes): ('int', []) : 1 | ['Product_Type'] ('object', ['text']) : 1 | ['Product_Description'] Types of features in processed data (raw dtype, special dtypes): ('int', []) : 1 | ['Product_Type'] ('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...] ('int', ['text_ngram']) : 272 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...] ('object', ['text']) : 1 | ['Product_Description_raw_text'] 2.1s = Fit runtime 2 features in original data used to generate 312 features in processed data. Train Data (Processed) Memory Usage: 2.8 MB (0.0% of available memory) Data preprocessing and feature engineering runtime = 2.13s ... AutoGluon will gauge predictive performance using evaluation metric: 'accuracy' To change this, specify the eval_metric argument of fit() Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573 Fitting model: LightGBM ... 0.8796 = Validation accuracy score 1.02s = Training runtime 0.01s = Validation runtime Fitting model: LightGBMXT ... 0.8586 = Validation accuracy score 1.22s = Training runtime 0.01s = Validation runtime Fitting model: CatBoost ... 
0.8726 = Validation accuracy score 0.95s = Training runtime 0.02s = Validation runtime Fitting model: TextNeuralNetV1 ... All Logs will be saved to model3/models/TextNeuralNetV1/TextNeuralNetV1/main.log Starting Hyperparameter Tuning ... (num_trials=1) .. parsed-literal:: :class: output 0%| | 0/1 [00:00
model score_test score_val pred_time_test pred_time_val fit_time pred_time_test_marginal pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L1 0.896389 0.893543 0.999798 0.672808 92.509537 0.009143 0.000531 0.149596 1 True 5
1 TextNeuralNetV1 0.888540 0.891798 0.960964 0.649250 90.191949 0.960964 0.649250 90.191949 0 True 4
2 LightGBM 0.886970 0.879581 0.008237 0.006089 1.020931 0.008237 0.006089 1.020931 0 True 1
3 CatBoost 0.886970 0.872600 0.017194 0.015831 0.948771 0.017194 0.015831 0.948771 0 True 3
4 LightGBMXT 0.868132 0.858639 0.012497 0.007196 1.219221 0.012497 0.007196 1.219221 0 True 2
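
The ``WeightedEnsemble_L1`` row in the leaderboard above combines the
predictions of the individual models. The general idea of a weighted
ensemble can be sketched in a few lines of plain Python. This is only an
illustration under simplifying assumptions: the probabilities and
weights below are made up, the helper ``weighted_ensemble`` is
hypothetical, and AutoGluon's actual procedure for choosing the weights
is not shown.

.. code:: python

    def weighted_ensemble(model_probs, weights):
        """Blend per-model class probabilities for one sample with a
        weighted average (weights are normalized to sum to 1)."""
        total = sum(weights.values())
        n_classes = len(next(iter(model_probs.values())))
        blended = [0.0] * n_classes
        for name, probs in model_probs.items():
            for k, p in enumerate(probs):
                blended[k] += weights[name] / total * p
        return blended


    # Made-up class probabilities for a single review (4 sentiment classes).
    model_probs = {
        'TextNeuralNetV1': [0.05, 0.10, 0.25, 0.60],
        'LightGBM': [0.05, 0.05, 0.60, 0.30],
        'CatBoost': [0.10, 0.10, 0.50, 0.30],
    }
    # Made-up ensemble weights; a larger weight means more trust in that model.
    weights = {'TextNeuralNetV1': 0.6, 'LightGBM': 0.2, 'CatBoost': 0.2}

    blended = weighted_ensemble(model_probs, weights)
    predicted_class = max(range(len(blended)), key=blended.__getitem__)
    print(blended, predicted_class)

In this toy case the text network outvotes the two tree models because
it carries most of the weight, even though both tree models prefer a
different class.
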
Model 4: K-Fold Bagging and Stack Ensemble
------------------------------------------

A more advanced strategy is to use 5-fold bagging together with stack
ensembling, which is expected to improve the final performance.

.. code:: python

    # num_bag_folds=5 enables 5-fold bagging; num_stack_levels=1 adds one stacking layer.
    predictor_model4 = TabularPredictor(label=label, eval_metric='accuracy', path='model4').fit(
        train_df, hyperparameters=tabular_multimodel_hparam_v1, num_bag_folds=5, num_stack_levels=1
    )

.. parsed-literal::
    :class: output

    Beginning AutoGluon training ...
    AutoGluon will save models to "model4/"
    AutoGluon Version: 0.1.0b20210223
    Train Data Rows: 5727
    Train Data Columns: 2
    Preprocessing data ...
    AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
        4 unique label values: [3, 2, 1, 0]
        If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
    Train Data Class Count: 4
    Using Feature Generators to preprocess the data ...
    Fitting AutoMLPipelineFeatureGenerator...
        Available Memory: 13421.11 MB
        Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
        Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
        Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
        Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
        Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting IdentityFeatureGenerator...
            Fitting RenameFeatureGenerator...
            Fitting CategoryFeatureGenerator...
            Fitting CategoryMemoryMinimizeFeatureGenerator...
            Fitting TextSpecialFeatureGenerator...
            Fitting BinnedFeatureGenerator...
            Fitting DropDuplicatesFeatureGenerator...
            Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description'] CountVectorizer fit with vocabulary size = 725 Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality. Reducing Vectorizer vocab size from 725 to 265 to avoid OOM error Stage 4 Generators: Fitting DropUniqueFeatureGenerator... Types of features in original data (raw dtype, special dtypes): ('int', []) : 1 | ['Product_Type'] ('object', ['text']) : 1 | ['Product_Description'] Types of features in processed data (raw dtype, special dtypes): ('int', []) : 1 | ['Product_Type'] ('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...] ('int', ['text_ngram']) : 266 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...] ('object', ['text']) : 1 | ['Product_Description_raw_text'] 2.2s = Fit runtime 2 features in original data used to generate 306 features in processed data. Train Data (Processed) Memory Usage: 2.76 MB (0.0% of available memory) Data preprocessing and feature engineering runtime = 2.19s ... AutoGluon will gauge predictive performance using evaluation metric: 'accuracy' To change this, specify the eval_metric argument of fit() Fitting model: LightGBM_BAG_L0 ... 0.8797 = Validation accuracy score 8.12s = Training runtime 0.07s = Validation runtime Fitting model: LightGBMXT_BAG_L0 ... 0.8598 = Validation accuracy score 7.38s = Training runtime 0.07s = Validation runtime Fitting model: CatBoost_BAG_L0 ... 0.8745 = Validation accuracy score 4.73s = Training runtime 0.08s = Validation runtime Fitting model: TextNeuralNetV1_BAG_L0 ... All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F1/S1F1/main.log Starting Hyperparameter Tuning ... (num_trials=1) .. parsed-literal:: :class: output 0%| | 0/1 [00:00

.. csv-table::
    :header: "", "model", "score_test", "score_val", "pred_time_test", "pred_time_val", "fit_time", "pred_time_test_marginal", "pred_time_val_marginal", "fit_time_marginal", "stack_level", "can_infer", "fit_order"

    0, TextNeuralNetV1_BAG_L0, 0.897959, 0.883010, 4.781237, 35.009925, 503.525820, 4.781237, 35.009925, 503.525820, 0, True, 4
    1, WeightedEnsemble_L1, 0.897959, 0.883010, 4.783215, 35.010866, 503.988360, 0.001978, 0.000941, 0.462540, 1, True, 5
    2, CatBoost_BAG_L1, 0.896389, 0.884058, 5.075670, 35.320508, 535.708009, 0.046778, 0.094063, 11.955566, 1, True, 8
    3, LightGBM_BAG_L1, 0.893250, 0.884582, 5.080883, 35.270539, 533.896461, 0.051992, 0.044093, 10.144018, 1, True, 6
    4, WeightedEnsemble_L2, 0.893250, 0.884582, 5.082560, 35.271442, 534.292905, 0.001677, 0.000904, 0.396444, 2, True, 9
    5, LightGBMXT_BAG_L1, 0.893250, 0.880915, 5.109002, 35.294523, 533.344140, 0.080111, 0.068078, 9.591697, 1, True, 7
    6, CatBoost_BAG_L0, 0.886970, 0.874454, 0.042751, 0.083205, 4.729065, 0.042751, 0.083205, 4.729065, 0, True, 3
    7, LightGBM_BAG_L0, 0.886970, 0.879693, 0.118651, 0.068003, 8.121579, 0.118651, 0.068003, 8.121579, 0, True, 1
    8, LightGBMXT_BAG_L0, 0.877551, 0.859787, 0.086252, 0.065313, 7.375978, 0.086252, 0.065313, 7.375978, 0, True, 2

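
To make the bagging-and-stacking idea in Model 4 more concrete, here is a minimal sketch of how out-of-fold predictions can be produced with 5 folds. It is an illustration only and not AutoGluon's implementation: the helper names (``kfold_indices``, ``fit_majority``, ``out_of_fold_predictions``) and the toy majority-class learner are assumptions made for this example. Each row is predicted by a model that was trained on the other folds, and those predictions can then serve as extra input features for the models at the next stack level.

.. code:: python

    import random
    from collections import Counter


    def kfold_indices(n_rows, n_folds, seed=123):
        """Shuffle the row indices and split them into n_folds nearly equal folds."""
        indices = list(range(n_rows))
        random.Random(seed).shuffle(indices)
        return [indices[i::n_folds] for i in range(n_folds)]


    def fit_majority(labels):
        """Toy base learner (hypothetical): always predicts the most frequent label."""
        return Counter(labels).most_common(1)[0][0]


    def out_of_fold_predictions(labels, n_folds=5):
        """Return one prediction per row, made by a model that never saw that row."""
        folds = kfold_indices(len(labels), n_folds)
        oof = [None] * len(labels)
        for held_out in folds:
            held_out_set = set(held_out)
            train_labels = [y for i, y in enumerate(labels) if i not in held_out_set]
            prediction = fit_majority(train_labels)
            for i in held_out:
                oof[i] = prediction
        return oof


    # Made-up labels in the same 0..3 range as the Sentiment column.
    labels = [2, 2, 3, 2, 1, 2, 3, 2, 2, 0]
    oof = out_of_fold_predictions(labels, n_folds=5)

    # Pair each label with its out-of-fold prediction; a stacker model would
    # receive the prediction as an additional feature.
    stacked_rows = list(zip(labels, oof))

Because every prediction comes from a model that did not train on the corresponding row, the stacker sees inputs that resemble what it will get at test time, which is the usual motivation for combining bagging with stacking.
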
Model 5: Multimodal embedding + TabularPredictor
------------------------------------------------

Because the neural network in TextPrediction can handle multimodal data directly, we can first fit a TextPrediction model on all the columns and then use it as an embedding extractor. This can be viewed as an improved version of Model 2.

.. code:: python

    predictor_text_multimodal = TextPrediction.fit(train_df,
                                                   label=label,
                                                   time_limits=None,
                                                   eval_metric='accuracy',
                                                   stopping_metric='accuracy',
                                                   hyperparameters='default_no_hpo',
                                                   output_directory='predictor_text_multimodal')

    train_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(train_df)
    dev_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(dev_df)

    predictor_model5 = TabularPredictor(label=label, eval_metric='accuracy', path='model5').fit(train_df)

.. parsed-literal::
    :class: output

    2021-02-23 19:42:14,695 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/ag_text_prediction.log
    All Logs will be saved to predictor_text_multimodal/ag_text_prediction.log
    2021-02-23 19:42:14,715 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
    Train Dataset:
    2021-02-23 19:42:14,716 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text(
       name="Product_Description"
       #total/missing=4581/0
       length, min/avg/max=11/104.7640253219821/178
    )
    - Categorical(
       name="Product_Type"
       #total/missing=4581/0
       num_class (total/non_special)=10/10
       categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
       freq=[38, 43, 336, 218, 15, 152, 474, 233, 142, 2930]
    )
    - Categorical(
       name="Sentiment"
       #total/missing=4581/0
       num_class (total/non_special)=4/4
       categories=[0, 1, 2, 3]
       freq=[75, 276, 2717, 1513]
    )
    Columns:
    - Text(
       name="Product_Description"
       #total/missing=4581/0
       length, min/avg/max=11/104.7640253219821/178
    )
    - Categorical(
       name="Product_Type"
       #total/missing=4581/0
       num_class (total/non_special)=10/10
       categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
       freq=[38, 43, 336, 218, 15, 152, 474, 233, 142, 2930]
    )
    - Categorical(
       name="Sentiment"
       #total/missing=4581/0
       num_class (total/non_special)=4/4
       categories=[0, 1, 2, 3]
       freq=[75, 276, 2717, 1513]
    )
    2021-02-23 19:42:14,717 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
    Tuning Dataset:
    2021-02-23 19:42:14,718 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text(
       name="Product_Description"
       #total/missing=1146/0
       length, min/avg/max=32/105.1719022687609/159
    )
    - Categorical(
       name="Product_Type"
       #total/missing=1146/0
       num_class (total/non_special)=10/10
       categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
       freq=[8, 13, 75, 55, 2, 42, 123, 62, 36, 730]
    )
    - Categorical(
       name="Sentiment"
       #total/missing=1146/0
       num_class (total/non_special)=4/4
       categories=[0, 1, 2, 3]
       freq=[25, 83, 671, 367]
    )
    Columns:
    - Text(
       name="Product_Description"
       #total/missing=1146/0
       length, min/avg/max=32/105.1719022687609/159
    )
    - Categorical(
       name="Product_Type"
       #total/missing=1146/0
       num_class (total/non_special)=10/10
       categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
       freq=[8, 13, 75, 55, 2, 42, 123, 62, 36, 730]
    )
    - Categorical(
       name="Sentiment"
       #total/missing=1146/0
       num_class (total/non_special)=4/4
       categories=[0, 1, 2, 3]
       freq=[25, 83, 671, 367]
    )
    Label columns=['Sentiment'], Feature columns=['Product_Description', 'Product_Type'], Problem types=['classification'], Label shapes=[4]
    Eval Metric=accuracy, Stop Metric=accuracy, Log Metrics=['acc', 'log_loss', 'accuracy']
    2021-02-23 19:42:14,722 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/main.log
    All Logs will be saved to predictor_text_multimodal/main.log
    Starting Hyperparameter Tuning ... (num_trials=1)

.. parsed-literal::
    :class: output

    0%| | 0/1 [00:00


.. csv-table::
    :header: "", "model", "score_test", "score_val", "pred_time_test", "pred_time_val", "fit_time", "pred_time_test_marginal", "pred_time_val_marginal", "fit_time_marginal", "stack_level", "can_infer", "fit_order"

    0, CatBoost, 0.886970, 0.872600, 0.014327, 0.005457, 0.895682, 0.014327, 0.005457, 0.895682, 0, True, 11
    1, RandomForestGini, 0.886970, 0.879581, 0.120903, 0.079546, 0.959055, 0.120903, 0.079546, 0.959055, 0, True, 5
    2, WeightedEnsemble_L1, 0.886970, 0.890052, 10.760290, 0.648267, 29.524343, 0.006635, 0.000556, 0.402977, 1, True, 14
    3, LightGBM, 0.885400, 0.877836, 0.012522, 0.006084, 1.307731, 0.012522, 0.006084, 1.307731, 0, True, 9
    4, NeuralNetMXNet, 0.885400, 0.874346, 0.043936, 0.037668, 5.120788, 0.043936, 0.037668, 5.120788, 0, True, 1
    5, LightGBMLarge, 0.883830, 0.884817, 0.028177, 0.007567, 4.015435, 0.028177, 0.007567, 4.015435, 0, True, 13
    6, KNeighborsUnif, 0.883830, 0.853403, 0.031159, 0.019027, 0.017875, 0.031159, 0.019027, 0.017875, 0, True, 3
    7, KNeighborsDist, 0.883830, 0.853403, 0.040395, 0.017965, 0.017411, 0.040395, 0.017965, 0.017411, 0, True, 4
    8, XGBoost, 0.883830, 0.877836, 0.113737, 0.011080, 2.421276, 0.113737, 0.011080, 2.421276, 0, True, 12
    9, RandomForestEntr, 0.883830, 0.876091, 0.119312, 0.080413, 0.995966, 0.119312, 0.080413, 0.995966, 0, True, 6
    10, ExtraTreesGini, 0.875981, 0.853403, 0.177299, 0.080179, 1.055089, 0.177299, 0.080179, 1.055089, 0, True, 7
    11, NeuralNetFastAI, 0.874411, 0.849913, 10.059017, 0.280836, 17.511071, 10.059017, 0.280836, 17.511071, 0, True, 2
    12, LightGBMXT, 0.871272, 0.858639, 0.012264, 0.006685, 1.283650, 0.012264, 0.006685, 1.283650, 0, True, 10
    13, ExtraTreesEntr, 0.869702, 0.849913, 0.169439, 0.082980, 1.080275, 0.169439, 0.082980, 1.080275, 0, True, 8

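
Note that the ``fit`` call shown for Model 5 receives ``train_df`` only, so the extracted embeddings are not part of its input. If the intent is to let the tabular models use them, one possible approach is to append the embedding dimensions as extra numeric columns before fitting. The sketch below uses toy data and a hypothetical helper, ``join_embeddings``. It assumes that ``extract_embedding`` returns a 2-D array with one row per input row, which is not confirmed by the text above.

.. code:: python

    import numpy as np
    import pandas as pd


    def join_embeddings(df, embeddings, prefix="emb_"):
        """Return a copy of df with one extra numeric column per embedding dimension."""
        embeddings = np.asarray(embeddings)
        if embeddings.ndim != 2 or embeddings.shape[0] != len(df):
            raise ValueError("expected one embedding row per DataFrame row")
        columns = [f"{prefix}{i}" for i in range(embeddings.shape[1])]
        # Reuse df's index so the join aligns row by row even if it is not 0..n-1.
        emb_df = pd.DataFrame(embeddings, index=df.index, columns=columns)
        return df.join(emb_df)


    # Toy stand-ins for train_df and for the array of extracted embeddings.
    toy_df = pd.DataFrame(
        {
            "Product_Description": ["great tablet", "poor battery"],
            "Product_Type": [9, 6],
            "Sentiment": [3, 1],
        },
        index=[10, 11],
    )
    toy_embeddings = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])

    merged = join_embeddings(toy_df, toy_embeddings)

With the real data, the merged training frame would be passed to ``fit`` in place of ``train_df``, and the development frame would need the same treatment before evaluation. The scores this would produce are not shown here and may differ from the leaderboard above.
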
Model 6: Use a larger backbone
------------------------------

Next, we switch to a larger backbone: ELECTRA-base. Performance improves with the larger backbone, but training takes longer and inference becomes more expensive.

.. code:: python

    from autogluon.text.text_prediction.text_prediction import ag_text_prediction_params

    # Text network settings for the ELECTRA-base backbone.
    text_nn_params = ag_text_prediction_params.create('default_electra_base_no_hpo')

    tabular_multimodel_hparam_v2 = {
        'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
        'CAT': {},
        'TEXT_NN_V1': text_nn_params,
    }

    predictor_model6 = TabularPredictor(label=label, eval_metric='accuracy', path='model6').fit(
        train_df,
        hyperparameters=tabular_multimodel_hparam_v2
    )

.. parsed-literal::
    :class: output

    Beginning AutoGluon training ...
    AutoGluon will save models to "model6/"
    AutoGluon Version: 0.1.0b20210223
    Train Data Rows: 5727
    Train Data Columns: 2
    Preprocessing data ...
    AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
        4 unique label values: [3, 2, 1, 0]
        If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
    Train Data Class Count: 4
    Using Feature Generators to preprocess the data ...
    Fitting AutoMLPipelineFeatureGenerator...
        Available Memory: 13221.79 MB
        Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
        Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
        Stage 1 Generators:
            Fitting AsTypeFeatureGenerator...
        Stage 2 Generators:
            Fitting FillNaFeatureGenerator...
        Stage 3 Generators:
            Fitting IdentityFeatureGenerator...
            Fitting IdentityFeatureGenerator...
            Fitting RenameFeatureGenerator...
            Fitting CategoryFeatureGenerator...
            Fitting CategoryMemoryMinimizeFeatureGenerator...
            Fitting TextSpecialFeatureGenerator...
            Fitting BinnedFeatureGenerator...
            Fitting DropDuplicatesFeatureGenerator...
            Fitting TextNgramFeatureGenerator...
            Fitting CountVectorizer for text features: ['Product_Description']
            CountVectorizer fit with vocabulary size = 725
            Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
            Reducing Vectorizer vocab size from 725 to 256 to avoid OOM error
        Stage 4 Generators:
            Fitting DropUniqueFeatureGenerator...
        Types of features in original data (raw dtype, special dtypes):
            ('int', []) : 1 | ['Product_Type']
            ('object', ['text']) : 1 | ['Product_Description']
        Types of features in processed data (raw dtype, special dtypes):
            ('int', []) : 1 | ['Product_Type']
            ('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
            ('int', ['text_ngram']) : 257 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
            ('object', ['text']) : 1 | ['Product_Description_raw_text']
        2.2s = Fit runtime
        2 features in original data used to generate 297 features in processed data.
        Train Data (Processed) Memory Usage: 2.71 MB (0.0% of available memory)
    Data preprocessing and feature engineering runtime = 2.21s ...
    AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
        To change this, specify the eval_metric argument of fit()
    Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
    Fitting model: LightGBM ...
        0.8778 = Validation accuracy score
        1.16s = Training runtime
        0.01s = Validation runtime
    Fitting model: LightGBMXT ...
        0.8569 = Validation accuracy score
        1.37s = Training runtime
        0.02s = Validation runtime
    Fitting model: CatBoost ...
        0.8726 = Validation accuracy score
        0.9s = Training runtime
        0.02s = Validation runtime
    Fitting model: TextNeuralNetV1 ...
    All Logs will be saved to model6/models/TextNeuralNetV1/TextNeuralNetV1/main.log
    Starting Hyperparameter Tuning ... (num_trials=1)

.. parsed-literal::
    :class: output

    0%| | 0/1 [00:00


.. csv-table::
    :header: "", "model", "score_test", "score_val", "pred_time_test", "pred_time_val", "fit_time", "pred_time_test_marginal", "pred_time_val_marginal", "fit_time_marginal", "stack_level", "can_infer", "fit_order"

    0, TextNeuralNetV1, 0.899529, 0.916230, 3.329292, 2.263031, 319.174685, 3.329292, 2.263031, 319.174685, 0, True, 4
    1, WeightedEnsemble_L1, 0.899529, 0.916230, 3.337787, 2.263554, 319.317756, 0.008495, 0.000524, 0.143071, 1, True, 5
    2, CatBoost, 0.886970, 0.872600, 0.015280, 0.015155, 0.900599, 0.015280, 0.015155, 0.900599, 0, True, 3
    3, LightGBM, 0.885400, 0.877836, 0.007218, 0.006019, 1.159134, 0.007218, 0.006019, 1.159134, 0, True, 1
    4, LightGBMXT, 0.868132, 0.856894, 0.013989, 0.017886, 1.365870, 0.013989, 0.017886, 1.365870, 0, True, 2

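
In the leaderboard above, ``WeightedEnsemble_L1`` reports the same ``score_test`` and ``score_val`` as ``TextNeuralNetV1``, which would be consistent with the ensemble placing all of its weight on that single model. The sketch below illustrates the general idea of a weighted ensemble over class probabilities. The probabilities, weights, and the helper ``weighted_average`` are made up for illustration, and this is not how AutoGluon selects its ensemble weights.

.. code:: python

    def weighted_average(prob_by_model, weights):
        """Blend per-model class probabilities using weights that sum to 1."""
        if abs(sum(weights.values()) - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")
        n_classes = len(next(iter(prob_by_model.values())))
        blended = [0.0] * n_classes
        for name, probs in prob_by_model.items():
            for k, p in enumerate(probs):
                blended[k] += weights[name] * p
        return blended


    # Hypothetical class probabilities for one review over the 4 sentiment classes.
    probs = {
        "text_nn": [0.05, 0.10, 0.60, 0.25],
        "gbm": [0.10, 0.10, 0.30, 0.50],
    }

    # A mixed blend: both models contribute to the final prediction.
    blended = weighted_average(probs, {"text_nn": 0.75, "gbm": 0.25})
    predicted_class = max(range(len(blended)), key=blended.__getitem__)

    # A degenerate blend: all weight on one model reproduces that model's output.
    only_text = weighted_average(probs, {"text_nn": 1.0, "gbm": 0.0})

When one model is clearly stronger than the rest on the validation data, the degenerate case is a plausible outcome, and the ensemble row then adds only a small overhead in the marginal time columns.
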
Major Takeaways
---------------

After performing these comparisons, we have the following takeaways:

- The multimodal text neural network used in TextPrediction is a good choice for tabular data with text and categorical features.
- K-fold bagging / stacking and weighted ensembling are helpful.
- A larger backbone improves performance, at the cost of longer training and more expensive inference. This aligns with observations in recent papers, e.g., *Scaling Laws for Autoregressive Generative Modeling*.