Explore Models for Data Tables with Text and Categorical Features¶
We introduce how to use AutoGluon to deal with tabular data that involves text and categorical features. This type of data, i.e., data that contains text alongside other features, is prevalent in real-world applications. For example, when building a sentiment analysis model for users’ tweets, we can use not only the raw text of each tweet but also other features such as the tweet’s topic and the user’s profile. In the following, we investigate different ways to ensemble the state-of-the-art (pretrained) language models in AutoGluon TextPrediction with the other models used in AutoGluon’s TabularPredictor. For more details about the inner workings of the neural network architecture used in AutoGluon TextPrediction, you may refer to the section “What’s happening inside?” in Text Prediction - Heterogeneous Data Types.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pprint
import random
from autogluon.text import TextPrediction
from autogluon.tabular import TabularPredictor
import mxnet as mx
np.random.seed(123)
random.seed(123)
mx.random.seed(123)
Product Sentiment Analysis Dataset¶
In the following, we use the product sentiment analysis dataset from this MachineHack hackathon. The goal of the task is to predict a user’s sentiment towards a product, given the raw text of the review and the product’s type, e.g., Tablet, Mobile, etc. We have split the original training data into 90% for training and 10% for development.
!mkdir -p product_sentiment_machine_hack
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv -O product_sentiment_machine_hack/train.csv
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv -O product_sentiment_machine_hack/dev.csv
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv -O product_sentiment_machine_hack/test.csv
--2021-02-23 19:24:46-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 689486 (673K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/train.csv’
product_sentiment_m 100%[===================>] 673.33K 2.09MB/s in 0.3s
2021-02-23 19:24:46 (2.09 MB/s) - ‘product_sentiment_machine_hack/train.csv’ saved [689486/689486]
--2021-02-23 19:24:47-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75517 (74K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/dev.csv’
product_sentiment_m 100%[===================>] 73.75K --.-KB/s in 0.1s
2021-02-23 19:24:48 (508 KB/s) - ‘product_sentiment_machine_hack/dev.csv’ saved [75517/75517]
--2021-02-23 19:24:48-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 312194 (305K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/test.csv’
product_sentiment_m 100%[===================>] 304.88K 1.18MB/s in 0.3s
2021-02-23 19:24:49 (1.18 MB/s) - ‘product_sentiment_machine_hack/test.csv’ saved [312194/312194]
feature_columns = ['Product_Description', 'Product_Type']
label = 'Sentiment'
train_df = pd.read_csv('product_sentiment_machine_hack/train.csv')
dev_df = pd.read_csv('product_sentiment_machine_hack/dev.csv')
test_df = pd.read_csv('product_sentiment_machine_hack/test.csv')
train_df = train_df[feature_columns + [label]]
dev_df = dev_df[feature_columns + [label]]
test_df = test_df[feature_columns]
print('Number of training samples:', len(train_df))
print('Number of dev samples:', len(dev_df))
print('Number of test samples:', len(test_df))
Number of training samples: 5727
Number of dev samples: 637
Number of test samples: 2728
There are two features in the dataset: the user’s review of the product and the product’s type. There are four sentiment classes, and the train/dev split was created with stratified sampling so that the class proportions are preserved in both sets.
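For reference, a stratified 90/10 split like this one can be reproduced with scikit-learn’s train_test_split. The snippet below is only a minimal sketch: it recombines the two downloaded files to mimic the original (unsplit) training data, so the exact rows will differ from the split shipped with the dataset.
from sklearn.model_selection import train_test_split

# Recombine train and dev to mimic the original, unsplit training data
full_df = pd.concat([train_df, dev_df], ignore_index=True)
train_split, dev_split = train_test_split(
    full_df,
    test_size=0.1,                 # hold out 10% for development
    stratify=full_df[label],       # preserve the class proportions in both splits
    random_state=123)
print(len(train_split), len(dev_split))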
train_df
| Product_Description | Product_Type | Sentiment
---|---|---|---
0 | Just heard that Apple is opening a store in do... | 2 | 3 |
1 | Tristan H, apture: being fast & iterative ... | 9 | 2 |
2 | Hey, you lucky dogs at #SXSW with iPads -- che... | 6 | 3 |
3 | RT @mention THIS was the best thing I saw at #... | 9 | 2 |
4 | Apple is opening temp retail store in Austin t... | 2 | 3 |
... | ... | ... | ... |
5722 | RT @mention At #SXSW and want to win an iPad? ... | 9 | 2 |
5723 | RT @mention I mean, sliced bread is great. But... | 3 | 3 |
5724 | Apple cited as the opposite of crowdsourcing -... | 2 | 1 |
5725 | Good CNN article on why #SXSW is important to ... | 7 | 3 |
5726 | ÛÏ@mention Google to Launch Major New Social ... | 3 | 3 |
5727 rows × 3 columns
dev_df
| Product_Description | Product_Type | Sentiment
---|---|---|---
0 | Do it. RT @mention Come party w/ Google tonigh... | 3 | 3 |
1 | Line for iPads at #SXSW. Doesn't look too bad!... | 6 | 3 |
2 | First up: iPad Design Headaches (2 Tablets, Ca... | 6 | 2 |
3 | #SXSW: Mint Talks Mobile App Development Chall... | 9 | 2 |
4 | ÛÏ@mention Apple store downtown Austin open t... | 9 | 2 |
... | ... | ... | ... |
632 | Bet on a GoogleBuzz-like #fail. People don't c... | 9 | 0 |
633 | RT > @mention Guy gets tattoo at SXSW so he... | 9 | 2 |
634 | #austinites #sxsw and check it out on #iphone ... | 9 | 2 |
635 | New @mention for iPhone+Android.. No more serv... | 0 | 3 |
636 | Why isn't news industry spending more R&D?... | 9 | 2 |
637 rows × 3 columns
test_df
| Product_Description | Product_Type
---|---|---
0 | RT @mention Going to #SXSW? The new iPhone gui... | 7 |
1 | RT @mention 95% of iPhone and Droid apps have ... | 9 |
2 | RT @mention Thank you to @mention for letting ... | 9 |
3 | #Thanks @mention we're lovin' the @mention app... | 7 |
4 | At #sxsw? @mention / @mention wanna buy you a ... | 9 |
... | ... | ... |
2723 | RT @mention eww and LOL. RT @mention Just saw ... | 9 |
2724 | Free 22 track #sxsw sampler album on iTunes. #... | 9 |
2725 | Setting up for the Google #gsdm #sxsw party. ... | 3 |
2726 | RT @mention #SXSW Come see Bitbop in Austin #g... | 9 |
2727 | So many Google products. isn't it time to tra... | 5 |
2728 rows × 2 columns
What happens if we ignore all the non-text features?¶
First of all, let’s try ignoring all the non-text features. We will use the TextPrediction model in AutoGluon to train a predictor with the text data only. Internally, it uses the ELECTRA-small model as the backbone. As we will see, the result is not very good.
predictor_text_only = TextPrediction.fit(train_df[['Product_Description', 'Sentiment']],
label=label,
time_limits=None,
ngpus_per_trial=1,
hyperparameters='default_no_hpo',
eval_metric='accuracy',
stopping_metric='accuracy',
output_directory='ag_text_only')
2021-02-23 19:24:49,386 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/ag_text_prediction.log
All Logs will be saved to ag_text_only/ag_text_prediction.log
2021-02-23 19:24:49,404 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
Train Dataset:
2021-02-23 19:24:49,405 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=4581/0
length, min/avg/max=11/104.81707050862258/170
)
- Categorical(
name="Sentiment"
#total/missing=4581/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[87, 280, 2721, 1493]
)
2021-02-23 19:24:49,406 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
Tuning Dataset:
2021-02-23 19:24:49,407 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=1146/0
length, min/avg/max=29/104.95986038394415/178
)
- Categorical(
name="Sentiment"
#total/missing=1146/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[13, 79, 667, 387]
)
WARNING: changing multiprocessing start method to forkserver
2021-02-23 19:24:49,415 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/main.log
All Logs will be saved to ag_text_only/main.log
0%| | 0/1 [00:00<?, ?it/s]
2021-02-23 19:26:08,587 - autogluon.text.text_prediction.text_prediction - INFO - Results=
Results=
2021-02-23 19:26:08,589 - autogluon.text.text_prediction.text_prediction - INFO - Best_config={'search_space▁optimization.lr': 5e-05}
Best_config={'search_space▁optimization.lr': 5e-05}
(task:0) 2021-02-23 19:24:52,491 - root - INFO - All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/ag_text_only/task0/training.log
2021-02-23 19:24:52,491 - root - INFO - learning:
early_stopping_patience: 10
log_metrics: auto
stop_metric: auto
valid_ratio: 0.15
misc:
exp_dir: /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/ag_text_only/task0
seed: 123
model:
backbone:
name: google_electra_small
network:
agg_net:
activation: tanh
agg_type: concat
data_dropout: False
dropout: 0.1
feature_proj_num_layers: -1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 256
norm_eps: 1e-05
normalization: layer_norm
out_proj_num_layers: 0
categorical_net:
activation: leaky
data_dropout: False
dropout: 0.1
emb_units: 32
initializer:
bias: ['zeros']
embed: ['xavier', 'gaussian', 'in', 1.0]
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 64
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
feature_units: -1
initializer:
bias: ['zeros']
weight: ['truncnorm', 0, 0.02]
numerical_net:
activation: leaky
data_dropout: False
dropout: 0.1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
input_centering: False
mid_units: 128
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
text_net:
pool_type: cls
use_segment_id: True
preprocess:
max_length: 128
merge_text: True
optimization:
batch_size: 32
begin_lr: 0.0
final_lr: 0.0
layerwise_lr_decay: 0.8
log_frequency: 0.1
lr: 5e-05
lr_scheduler: triangular
max_grad_norm: 1.0
model_average: 5
num_train_epochs: 4
optimizer: adamw
optimizer_params: [('beta1', 0.9), ('beta2', 0.999), ('epsilon', 1e-06), ('correct_bias', False)]
per_device_batch_size: 16
val_batch_size_mult: 2
valid_frequency: 0.1
warmup_portion: 0.1
wd: 0.01
version: 1
2021-02-23 19:24:52,645 - root - INFO - Process training set...
2021-02-23 19:24:56,192 - root - INFO - Done!
2021-02-23 19:24:56,192 - root - INFO - Process dev set...
2021-02-23 19:24:58,906 - root - INFO - Done!
2021-02-23 19:25:04,337 - root - INFO - #Total Params/Fixed Params=13484036/0
2021-02-23 19:25:04,352 - root - INFO - Using gradient accumulation. Global batch size = 32
2021-02-23 19:25:06,689 - root - INFO - [Iter 15/572, Epoch 0] train loss=8.3855e-01, gnorm=1.2309e+01, lr=1.3158e-05, #samples processed=720, #sample per second=314.10
2021-02-23 19:25:07,481 - root - INFO - [Iter 15/572, Epoch 0] valid accuracy=5.8028e-01, log_loss=1.1171e+00, accuracy=5.8028e-01, time spent=0.715s, total_time=0.05min
2021-02-23 19:25:08,990 - root - INFO - [Iter 30/572, Epoch 0] train loss=6.4969e-01, gnorm=5.2289e+00, lr=2.6316e-05, #samples processed=720, #sample per second=312.86
2021-02-23 19:25:09,823 - root - INFO - [Iter 30/572, Epoch 0] valid accuracy=5.8115e-01, log_loss=9.4872e-01, accuracy=5.8115e-01, time spent=0.695s, total_time=0.09min
2021-02-23 19:25:11,273 - root - INFO - [Iter 45/572, Epoch 0] train loss=6.7008e-01, gnorm=7.3572e+00, lr=3.9474e-05, #samples processed=720, #sample per second=315.50
2021-02-23 19:25:12,108 - root - INFO - [Iter 45/572, Epoch 0] valid accuracy=6.0384e-01, log_loss=8.8923e-01, accuracy=6.0384e-01, time spent=0.703s, total_time=0.13min
2021-02-23 19:25:13,588 - root - INFO - [Iter 60/572, Epoch 0] train loss=6.6470e-01, gnorm=4.5690e+00, lr=4.9709e-05, #samples processed=720, #sample per second=311.03
2021-02-23 19:25:14,416 - root - INFO - [Iter 60/572, Epoch 0] valid accuracy=6.2391e-01, log_loss=9.0283e-01, accuracy=6.2391e-01, time spent=0.696s, total_time=0.17min
2021-02-23 19:25:15,839 - root - INFO - [Iter 75/572, Epoch 0] train loss=6.2934e-01, gnorm=4.4687e+00, lr=4.8252e-05, #samples processed=720, #sample per second=319.82
2021-02-23 19:25:16,550 - root - INFO - [Iter 75/572, Epoch 0] valid accuracy=6.1257e-01, log_loss=8.9348e-01, accuracy=6.1257e-01, time spent=0.711s, total_time=0.20min
2021-02-23 19:25:17,929 - root - INFO - [Iter 90/572, Epoch 0] train loss=6.5662e-01, gnorm=5.8075e+00, lr=4.6796e-05, #samples processed=720, #sample per second=344.50
2021-02-23 19:25:18,641 - root - INFO - [Iter 90/572, Epoch 0] valid accuracy=6.1431e-01, log_loss=8.4047e-01, accuracy=6.1431e-01, time spent=0.711s, total_time=0.24min
2021-02-23 19:25:20,112 - root - INFO - [Iter 105/572, Epoch 0] train loss=6.1765e-01, gnorm=4.4335e+00, lr=4.5340e-05, #samples processed=720, #sample per second=329.88
2021-02-23 19:25:20,960 - root - INFO - [Iter 105/572, Epoch 0] valid accuracy=6.3264e-01, log_loss=8.3699e-01, accuracy=6.3264e-01, time spent=0.705s, total_time=0.28min
2021-02-23 19:25:22,380 - root - INFO - [Iter 120/572, Epoch 0] train loss=5.8680e-01, gnorm=6.0388e+00, lr=4.3883e-05, #samples processed=720, #sample per second=317.53
2021-02-23 19:25:23,089 - root - INFO - [Iter 120/572, Epoch 0] valid accuracy=6.1082e-01, log_loss=8.6694e-01, accuracy=6.1082e-01, time spent=0.709s, total_time=0.31min
2021-02-23 19:25:24,448 - root - INFO - [Iter 135/572, Epoch 0] train loss=6.5235e-01, gnorm=5.0395e+00, lr=4.2427e-05, #samples processed=720, #sample per second=348.19
2021-02-23 19:25:25,287 - root - INFO - [Iter 135/572, Epoch 0] valid accuracy=6.5096e-01, log_loss=8.1716e-01, accuracy=6.5096e-01, time spent=0.709s, total_time=0.35min
2021-02-23 19:25:26,702 - root - INFO - [Iter 150/572, Epoch 1] train loss=6.0744e-01, gnorm=8.6245e+00, lr=4.0971e-05, #samples processed=698, #sample per second=309.71
2021-02-23 19:25:27,411 - root - INFO - [Iter 150/572, Epoch 1] valid accuracy=6.4834e-01, log_loss=8.0773e-01, accuracy=6.4834e-01, time spent=0.709s, total_time=0.38min
2021-02-23 19:25:28,826 - root - INFO - [Iter 165/572, Epoch 1] train loss=5.5537e-01, gnorm=9.9330e+00, lr=3.9515e-05, #samples processed=720, #sample per second=338.93
2021-02-23 19:25:29,663 - root - INFO - [Iter 165/572, Epoch 1] valid accuracy=6.5445e-01, log_loss=8.1417e-01, accuracy=6.5445e-01, time spent=0.704s, total_time=0.42min
2021-02-23 19:25:31,099 - root - INFO - [Iter 180/572, Epoch 1] train loss=5.6724e-01, gnorm=4.1054e+00, lr=3.8058e-05, #samples processed=720, #sample per second=316.83
2021-02-23 19:25:31,810 - root - INFO - [Iter 180/572, Epoch 1] valid accuracy=6.5009e-01, log_loss=8.0914e-01, accuracy=6.5009e-01, time spent=0.711s, total_time=0.46min
2021-02-23 19:25:33,226 - root - INFO - [Iter 195/572, Epoch 1] train loss=5.9127e-01, gnorm=1.0215e+01, lr=3.6602e-05, #samples processed=720, #sample per second=338.49
2021-02-23 19:25:33,932 - root - INFO - [Iter 195/572, Epoch 1] valid accuracy=6.5271e-01, log_loss=7.9851e-01, accuracy=6.5271e-01, time spent=0.706s, total_time=0.49min
2021-02-23 19:25:35,342 - root - INFO - [Iter 210/572, Epoch 1] train loss=5.3871e-01, gnorm=4.3828e+00, lr=3.5146e-05, #samples processed=720, #sample per second=340.33
2021-02-23 19:25:36,061 - root - INFO - [Iter 210/572, Epoch 1] valid accuracy=6.3874e-01, log_loss=8.1257e-01, accuracy=6.3874e-01, time spent=0.718s, total_time=0.53min
2021-02-23 19:25:37,499 - root - INFO - [Iter 225/572, Epoch 1] train loss=5.3807e-01, gnorm=5.2270e+00, lr=3.3689e-05, #samples processed=720, #sample per second=333.78
2021-02-23 19:25:38,217 - root - INFO - [Iter 225/572, Epoch 1] valid accuracy=6.4660e-01, log_loss=7.9136e-01, accuracy=6.4660e-01, time spent=0.717s, total_time=0.56min
2021-02-23 19:25:39,636 - root - INFO - [Iter 240/572, Epoch 1] train loss=5.8410e-01, gnorm=6.8311e+00, lr=3.2233e-05, #samples processed=720, #sample per second=336.94
2021-02-23 19:25:40,466 - root - INFO - [Iter 240/572, Epoch 1] valid accuracy=6.6841e-01, log_loss=7.7052e-01, accuracy=6.6841e-01, time spent=0.700s, total_time=0.60min
2021-02-23 19:25:41,844 - root - INFO - [Iter 255/572, Epoch 1] train loss=5.3738e-01, gnorm=5.7739e+00, lr=3.0777e-05, #samples processed=720, #sample per second=326.23
2021-02-23 19:25:42,544 - root - INFO - [Iter 255/572, Epoch 1] valid accuracy=6.6754e-01, log_loss=7.8309e-01, accuracy=6.6754e-01, time spent=0.700s, total_time=0.64min
2021-02-23 19:25:43,952 - root - INFO - [Iter 270/572, Epoch 1] train loss=5.0632e-01, gnorm=5.0476e+00, lr=2.9320e-05, #samples processed=720, #sample per second=341.52
2021-02-23 19:25:44,676 - root - INFO - [Iter 270/572, Epoch 1] valid accuracy=6.5794e-01, log_loss=7.8638e-01, accuracy=6.5794e-01, time spent=0.724s, total_time=0.67min
2021-02-23 19:25:46,088 - root - INFO - [Iter 285/572, Epoch 1] train loss=5.3142e-01, gnorm=5.6990e+00, lr=2.7864e-05, #samples processed=720, #sample per second=337.05
2021-02-23 19:25:46,932 - root - INFO - [Iter 285/572, Epoch 1] valid accuracy=6.7016e-01, log_loss=7.8180e-01, accuracy=6.7016e-01, time spent=0.709s, total_time=0.71min
2021-02-23 19:25:48,345 - root - INFO - [Iter 300/572, Epoch 2] train loss=5.3040e-01, gnorm=9.8988e+00, lr=2.6408e-05, #samples processed=709, #sample per second=314.15
2021-02-23 19:25:49,061 - root - INFO - [Iter 300/572, Epoch 2] valid accuracy=6.6754e-01, log_loss=7.6951e-01, accuracy=6.6754e-01, time spent=0.715s, total_time=0.74min
2021-02-23 19:25:50,485 - root - INFO - [Iter 315/572, Epoch 2] train loss=5.1332e-01, gnorm=1.2150e+01, lr=2.4951e-05, #samples processed=720, #sample per second=336.58
2021-02-23 19:25:51,199 - root - INFO - [Iter 315/572, Epoch 2] valid accuracy=6.5620e-01, log_loss=7.7306e-01, accuracy=6.5620e-01, time spent=0.713s, total_time=0.78min
2021-02-23 19:25:52,606 - root - INFO - [Iter 330/572, Epoch 2] train loss=5.3934e-01, gnorm=6.6843e+00, lr=2.3495e-05, #samples processed=720, #sample per second=339.58
2021-02-23 19:25:53,322 - root - INFO - [Iter 330/572, Epoch 2] valid accuracy=6.4572e-01, log_loss=7.8320e-01, accuracy=6.4572e-01, time spent=0.716s, total_time=0.82min
2021-02-23 19:25:54,722 - root - INFO - [Iter 345/572, Epoch 2] train loss=4.8354e-01, gnorm=6.3392e+00, lr=2.2039e-05, #samples processed=720, #sample per second=340.32
2021-02-23 19:25:55,428 - root - INFO - [Iter 345/572, Epoch 2] valid accuracy=6.4834e-01, log_loss=8.2002e-01, accuracy=6.4834e-01, time spent=0.706s, total_time=0.85min
2021-02-23 19:25:56,849 - root - INFO - [Iter 360/572, Epoch 2] train loss=5.3423e-01, gnorm=5.4433e+00, lr=2.0583e-05, #samples processed=720, #sample per second=338.53
2021-02-23 19:25:57,560 - root - INFO - [Iter 360/572, Epoch 2] valid accuracy=6.5881e-01, log_loss=7.6452e-01, accuracy=6.5881e-01, time spent=0.711s, total_time=0.89min
2021-02-23 19:25:58,981 - root - INFO - [Iter 375/572, Epoch 2] train loss=5.3697e-01, gnorm=6.2103e+00, lr=1.9126e-05, #samples processed=720, #sample per second=337.70
2021-02-23 19:25:59,709 - root - INFO - [Iter 375/572, Epoch 2] valid accuracy=6.2042e-01, log_loss=8.2208e-01, accuracy=6.2042e-01, time spent=0.727s, total_time=0.92min
2021-02-23 19:26:01,114 - root - INFO - [Iter 390/572, Epoch 2] train loss=5.7040e-01, gnorm=6.7707e+00, lr=1.7670e-05, #samples processed=720, #sample per second=337.59
2021-02-23 19:26:01,836 - root - INFO - [Iter 390/572, Epoch 2] valid accuracy=6.5620e-01, log_loss=7.8493e-01, accuracy=6.5620e-01, time spent=0.722s, total_time=0.96min
2021-02-23 19:26:03,272 - root - INFO - [Iter 405/572, Epoch 2] train loss=5.2676e-01, gnorm=8.0785e+00, lr=1.6214e-05, #samples processed=720, #sample per second=333.67
2021-02-23 19:26:03,999 - root - INFO - [Iter 405/572, Epoch 2] valid accuracy=6.5096e-01, log_loss=7.7680e-01, accuracy=6.5096e-01, time spent=0.726s, total_time=0.99min
2021-02-23 19:26:05,402 - root - INFO - [Iter 420/572, Epoch 2] train loss=5.1999e-01, gnorm=1.1199e+01, lr=1.4757e-05, #samples processed=720, #sample per second=338.01
2021-02-23 19:26:06,132 - root - INFO - [Iter 420/572, Epoch 2] valid accuracy=6.6405e-01, log_loss=7.6699e-01, accuracy=6.6405e-01, time spent=0.730s, total_time=1.03min
2021-02-23 19:26:07,538 - root - INFO - [Iter 435/572, Epoch 3] train loss=5.4638e-01, gnorm=9.2815e+00, lr=1.3301e-05, #samples processed=698, #sample per second=326.79
2021-02-23 19:26:08,272 - root - INFO - [Iter 435/572, Epoch 3] valid accuracy=6.6143e-01, log_loss=7.6100e-01, accuracy=6.6143e-01, time spent=0.734s, total_time=1.06min
2021-02-23 19:26:08,276 - root - INFO - Early stopping patience reached!
print(predictor_text_only.evaluate(dev_df[['Product_Description', 'Sentiment']], metrics='accuracy'))
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
{'accuracy': 0.6671899529042387}
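Beyond the single accuracy number, we can inspect per-row predictions of the text-only predictor for error analysis. A small sketch, assuming the predictor’s standard predict() method:
# Predict dev-set sentiment from the text column alone and compare with the labels
dev_pred_text_only = predictor_text_only.predict(dev_df[['Product_Description']])
comparison_df = pd.DataFrame({'prediction': np.asarray(dev_pred_text_only),
                              'ground_truth': dev_df[label].values})
print(comparison_df.head())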
Model 1: Baseline with N-Gram + TF-IDF¶
Our first baseline directly calls AutoGluon’s TabularPredictor to train a predictor. TabularPredictor generates n-gram and TF-IDF based features for the text column and considers the text and categorical columns simultaneously.
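Before fitting, to get a feel for what n-gram/TF-IDF featurization of the text column produces, here is a rough standalone sketch using scikit-learn’s TfidfVectorizer. AutoGluon’s internal TextNgramFeatureGenerator uses its own settings and vocabulary pruning, so treat this only as an illustration:
from sklearn.feature_extraction.text import TfidfVectorizer

# Unigram + bigram TF-IDF features over the review text (illustrative settings only)
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=300)
text_features = vectorizer.fit_transform(train_df['Product_Description'])
print(text_features.shape)  # (number of rows, number of n-gram features)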
predictor_model1 = TabularPredictor(label=label, eval_metric='accuracy', path='model1').fit(train_df)
Beginning AutoGluon training ...
AutoGluon will save models to "model1/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
NumExpr defaulting to 8 threads.
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 14382.9 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 354 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 355 | ['__nlp__.10', '__nlp__.11', '__nlp__.6th', '__nlp__.about', '__nlp__.all', ...]
2.1s = Fit runtime
2 features in original data used to generate 394 features in processed data.
Train Data (Processed) Memory Usage: 2.31 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.16s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8726 = Validation accuracy score
4.1s = Training runtime
0.03s = Validation runtime
Fitting model: NeuralNetFastAI ...
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
0.8482 = Validation accuracy score
12.06s = Training runtime
0.36s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini ...
0.8709 = Validation accuracy score
1.03s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8709 = Validation accuracy score
1.04s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8464 = Validation accuracy score
1.15s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8464 = Validation accuracy score
1.15s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8831 = Validation accuracy score
1.08s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8534 = Validation accuracy score
1.19s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
1.04s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8778 = Validation accuracy score
1.84s = Training runtime
0.02s = Validation runtime
Fitting model: LightGBMLarge ...
0.8813 = Validation accuracy score
3.45s = Training runtime
0.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8883 = Validation accuracy score
0.37s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 44.09s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model1/")
predictor_model1.leaderboard(dev_df, silent=True)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.890110 | 0.888307 | 10.636745 | 0.579389 | 20.212959 | 0.005916 | 0.000465 | 0.374030 | 1 | True | 14 |
1 | LightGBMLarge | 0.886970 | 0.881326 | 0.015931 | 0.006296 | 3.445040 | 0.015931 | 0.006296 | 3.445040 | 0 | True | 13 |
2 | CatBoost | 0.886970 | 0.872600 | 0.018929 | 0.006263 | 1.039262 | 0.018929 | 0.006263 | 1.039262 | 0 | True | 11 |
3 | RandomForestGini | 0.886970 | 0.870855 | 0.115808 | 0.077118 | 1.025226 | 0.115808 | 0.077118 | 1.025226 | 0 | True | 5 |
4 | XGBoost | 0.885400 | 0.877836 | 0.079946 | 0.019477 | 1.838979 | 0.079946 | 0.019477 | 1.838979 | 0 | True | 12 |
5 | RandomForestEntr | 0.885400 | 0.870855 | 0.118663 | 0.075674 | 1.039675 | 0.118663 | 0.075674 | 1.039675 | 0 | True | 6 |
6 | KNeighborsUnif | 0.883830 | 0.853403 | 0.032106 | 0.019376 | 0.019685 | 0.032106 | 0.019376 | 0.019685 | 0 | True | 3 |
7 | KNeighborsDist | 0.883830 | 0.853403 | 0.041204 | 0.019170 | 0.019528 | 0.041204 | 0.019170 | 0.019528 | 0 | True | 4 |
8 | LightGBM | 0.882261 | 0.883072 | 0.011820 | 0.006165 | 1.076681 | 0.011820 | 0.006165 | 1.076681 | 0 | True | 9 |
9 | NeuralNetMXNet | 0.877551 | 0.872600 | 0.046874 | 0.034172 | 4.098654 | 0.046874 | 0.034172 | 4.098654 | 0 | True | 1 |
10 | LightGBMXT | 0.869702 | 0.853403 | 0.013507 | 0.008016 | 1.188565 | 0.013507 | 0.008016 | 1.188565 | 0 | True | 10 |
11 | ExtraTreesEntr | 0.868132 | 0.846422 | 0.150189 | 0.080109 | 1.153447 | 0.150189 | 0.080109 | 1.153447 | 0 | True | 8 |
12 | ExtraTreesGini | 0.866562 | 0.846422 | 0.175136 | 0.079671 | 1.148813 | 0.175136 | 0.079671 | 1.148813 | 0 | True | 7 |
13 | NeuralNetFastAI | 0.854003 | 0.848168 | 10.244841 | 0.364427 | 12.060060 | 10.244841 | 0.364427 | 12.060060 | 0 | True | 2 |
We find that using the product type (a categorical column) is quite essential for good performance on this task: the accuracy is much higher than that of the model trained on the text column only.
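As a sanity check, we can also score this predictor on the held-out dev set directly with evaluate(), which uses the predictor’s configured eval_metric (accuracy here); the number should match the WeightedEnsemble_L1 row of the leaderboard above.
# Score Model 1 on the dev set with the predictor's eval_metric (accuracy)
print(predictor_model1.evaluate(dev_df))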
Model 2: Extract Text Embedding and Use TabularPredictor¶
Our second attempt at combining text and the other features is to use the trained TextPrediction model to extract embeddings and then use TabularPredictor to build a predictor on top of the text embeddings. The AutoGluon TextPrediction model offers the extract_embedding() functionality (for more details, go to Extract Embeddings), so we are able to build a two-stage model. In the first stage, we use the text-only model to extract sentence embeddings. In the second stage, we use TabularPredictor to fit the final model on the embeddings joined with the original features.
train_sentence_embeddings = predictor_text_only.extract_embedding(train_df)
dev_sentence_embeddings = predictor_text_only.extract_embedding(dev_df)
print(train_sentence_embeddings)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/text/src/autogluon/text/text_prediction/dataset.py:321: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col_name] = df[col_name].fillna('').apply(str)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
[[-0.061683 0.789052 -0.614252 -0.383968 ... -0.202426 1.144868 0.039427 -0.13562 ]
[-0.269277 0.177113 -0.197375 -0.172229 ... -0.584261 0.625235 -0.355088 0.211777]
[-0.83451 0.264692 -0.687199 0.234191 ... -0.605356 0.332709 0.029832 -0.160492]
[-0.271491 0.149054 -0.506492 -0.09476 ... -0.809414 0.29643 0.31992 -0.096194]
...
[-0.054611 -0.060668 -0.49929 -0.170906 ... -0.284143 0.805138 0.430891 -0.191869]
[-0.277809 0.150503 -0.625322 -0.241075 ... -0.525608 0.909708 -0.124487 -0.031551]
[-0.64508 -0.07616 -0.567146 0.192171 ... -1.166889 0.589877 0.242167 -0.549045]
[-0.837447 0.347837 -0.525436 -0.440289 ... -0.742066 0.565927 -0.054493 -0.411046]]
merged_train_data = train_df.join(pd.DataFrame(train_sentence_embeddings))
merged_dev_data = dev_df.join(pd.DataFrame(dev_sentence_embeddings))
print(merged_train_data)
                                    Product_Description  Product_Type
0     Just heard that Apple is opening a store in do...             2
1         Tristan H, apture: being fast & iterative ...             9
2     Hey, you lucky dogs at #SXSW with iPads -- che...             6
3     RT @mention THIS was the best thing I saw at #...             9
4     Apple is opening temp retail store in Austin t...             2
...                                                 ...           ...
5722  RT @mention At #SXSW and want to win an iPad? ...             9
5723  RT @mention I mean, sliced bread is great. But...             3
5724  Apple cited as the opposite of crowdsourcing -...             2
5725  Good CNN article on why #SXSW is important to ...             7
5726  ÛÏ@mention Google to Launch Major New Social ...             3

      Sentiment         0         1         2         3         4         5
0             3 -0.061683  0.789052 -0.614252 -0.383968  0.794183 -0.581301
1             2 -0.269277  0.177113 -0.197375 -0.172229  0.547932 -0.265157
2             3 -0.834510  0.264692 -0.687199  0.234191  1.018778 -0.689753
3             2 -0.271491  0.149054 -0.506492 -0.094760  0.644704 -0.521082
4             3  0.180634  0.237529 -0.668062 -0.119891  0.387544 -0.314172
...         ...       ...       ...       ...       ...       ...       ...
5722          2 -0.771102  0.284567 -0.285301 -0.168485  0.645094  0.036831
5723          3 -0.054611 -0.060668 -0.499290 -0.170906 -0.367915  0.331775
5724          1 -0.277809  0.150503 -0.625322 -0.241075  0.157916  0.060280
5725          3 -0.645080 -0.076160 -0.567146  0.192171  0.524227 -0.318997
5726          3 -0.837447  0.347837 -0.525436 -0.440289  0.857342 -0.283967

             6  ...       246       247       248       249       250
0     0.919014  ... -0.357193  0.390834  0.833298 -0.115630  0.786055
1     1.170505  ... -0.212880  0.123253  0.668844 -0.826962  0.772176
2     1.244796  ... -0.300159  0.729561  0.551330 -0.400327  0.671096
3     1.293411  ... -0.233223  0.269830  0.657665 -0.285630  0.512562
4     1.199535  ... -0.490884  0.116289  0.888642 -0.219426  0.773025
...        ...  ...       ...       ...       ...       ...       ...
5722  1.063322  ... -0.032328  0.582992  0.674876 -0.196406  0.172549
5723  0.314198  ... -0.639878 -0.178192  0.481809 -0.700696 -0.039856
5724  0.656375  ... -0.459748 -0.046848  0.820173 -0.563527  0.208965
5725  1.467807  ... -0.601973  0.303506  0.423291 -0.275483  0.578006
5726  1.174435  ... -0.239029  0.465101  0.134878 -0.096706  0.547499

           251       252       253       254       255
0     0.243626 -0.202426  1.144868  0.039427 -0.135620
1    -0.053033 -0.584261  0.625235 -0.355088  0.211777
2     0.120588 -0.605356  0.332709  0.029832 -0.160492
3     0.320794 -0.809414  0.296430  0.319920 -0.096194
4     0.546610 -0.479898  1.102872  0.085935 -0.345065
...        ...       ...       ...       ...       ...
5722  0.049309 -0.042409  0.503684 -0.348889 -0.508032
5723  1.003379 -0.284143  0.805138  0.430891 -0.191869
5724  0.631439 -0.525608  0.909708 -0.124487 -0.031551
5725  0.421409 -1.166889  0.589877  0.242167 -0.549045
5726  0.293577 -0.742066  0.565927 -0.054493 -0.411046

[5727 rows x 259 columns]
predictor_model2 = TabularPredictor(label=label, eval_metric='accuracy', path='model2').fit(merged_train_data)
Beginning AutoGluon training ...
AutoGluon will save models to "model2/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 258
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13929.9 MB
Train Data (Original) Memory Usage: 6.87 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 303 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 256 | ['0', '1', '2', '3', '4', ...]
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 256 | ['0', '1', '2', '3', '4', ...]
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 304 | ['__nlp__.11', '__nlp__.6th', '__nlp__.about', '__nlp__.all', '__nlp__.amp', ...]
2.4s = Fit runtime
258 features in original data used to generate 599 features in processed data.
Train Data (Processed) Memory Usage: 7.89 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 2.48s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8918 = Validation accuracy score
6.3s = Training runtime
0.04s = Validation runtime
Fitting model: NeuralNetFastAI ...
0.8534 = Validation accuracy score
18.82s = Training runtime
0.52s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8551 = Validation accuracy score
0.02s = Training runtime
0.13s = Validation runtime
Fitting model: KNeighborsDist ...
0.8586 = Validation accuracy score
0.03s = Training runtime
0.12s = Validation runtime
Fitting model: RandomForestGini ...
0.8691 = Validation accuracy score
2.91s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8551 = Validation accuracy score
5.16s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8168 = Validation accuracy score
1.25s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8028 = Validation accuracy score
1.31s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8935 = Validation accuracy score
9.69s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8726 = Validation accuracy score
9.19s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8883 = Validation accuracy score
19.0s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8935 = Validation accuracy score
39.02s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBMLarge ...
0.8935 = Validation accuracy score
57.6s = Training runtime
0.02s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8988 = Validation accuracy score
0.38s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 186.44s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model2/")
predictor_model2.leaderboard(merged_dev_data, silent=True)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.893250 | 0.898778 | 10.744249 | 0.715136 | 98.361377 | 0.005639 | 0.000464 | 0.375998 | 1 | True | 14 |
1 | LightGBM | 0.891680 | 0.893543 | 0.029830 | 0.007839 | 9.685987 | 0.029830 | 0.007839 | 9.685987 | 0 | True | 9 |
2 | CatBoost | 0.886970 | 0.888307 | 0.028248 | 0.012147 | 19.002439 | 0.028248 | 0.012147 | 19.002439 | 0 | True | 11 |
3 | NeuralNetMXNet | 0.886970 | 0.891798 | 0.053362 | 0.042966 | 6.302544 | 0.053362 | 0.042966 | 6.302544 | 0 | True | 1 |
4 | LightGBMLarge | 0.886970 | 0.893543 | 0.059007 | 0.016558 | 57.602369 | 0.059007 | 0.016558 | 57.602369 | 0 | True | 13 |
5 | XGBoost | 0.886970 | 0.893543 | 0.160011 | 0.054713 | 39.018259 | 0.160011 | 0.054713 | 39.018259 | 0 | True | 12 |
6 | LightGBMXT | 0.872841 | 0.872600 | 0.018458 | 0.009431 | 9.185145 | 0.018458 | 0.009431 | 9.185145 | 0 | True | 10 |
7 | KNeighborsUnif | 0.855573 | 0.855148 | 0.153559 | 0.128035 | 0.021495 | 0.153559 | 0.128035 | 0.021495 | 0 | True | 3 |
8 | KNeighborsDist | 0.854003 | 0.858639 | 0.123584 | 0.121682 | 0.030962 | 0.123584 | 0.121682 | 0.030962 | 0 | True | 4 |
9 | NeuralNetFastAI | 0.850863 | 0.853403 | 10.367347 | 0.518267 | 18.815119 | 10.367347 | 0.518267 | 18.815119 | 0 | True | 2 |
10 | RandomForestGini | 0.830455 | 0.869110 | 0.101238 | 0.078978 | 2.912344 | 0.101238 | 0.078978 | 2.912344 | 0 | True | 5 |
11 | RandomForestEntr | 0.808477 | 0.855148 | 0.099813 | 0.078739 | 5.161032 | 0.099813 | 0.078739 | 5.161032 | 0 | True | 6 |
12 | ExtraTreesGini | 0.800628 | 0.816754 | 0.150204 | 0.082309 | 1.251960 | 0.150204 | 0.082309 | 1.251960 | 0 | True | 7 |
13 | ExtraTreesEntr | 0.778650 | 0.802792 | 0.126574 | 0.082750 | 1.306403 | 0.126574 | 0.082750 | 1.306403 | 0 | True | 8 |
The performance is better than that of the first model.
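To apply this two-stage model to the unlabeled test set, we would repeat the same embedding extraction and join for test_df before predicting. A short sketch following the pattern above:
# Extract embeddings for the test reviews with the same text-only predictor,
# join them to the test features, and predict with the Model-2 TabularPredictor
test_sentence_embeddings = predictor_text_only.extract_embedding(test_df)
merged_test_data = test_df.join(pd.DataFrame(test_sentence_embeddings))
test_pred_model2 = predictor_model2.predict(merged_test_data)
print(test_pred_model2[:5])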
Model 3: Use the Neural Network in AutoGluon-Text in Tabular Weighted Ensemble¶
Another option is to directly include the neural network from AutoGluon-Text as one candidate model of TabularPredictor. We can do that by changing the hyperparameters. Note that for the purpose of this tutorial we are setting the hyperparameters manually; good pre-configurations will be released soon.
tabular_multimodel_hparam_v1 = {
'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
'CAT': {},
'TEXT_NN_V1': {},
}
predictor_model3 = TabularPredictor(label=label, eval_metric='accuracy', path='model3').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v1
)
Beginning AutoGluon training ...
AutoGluon will save models to "model3/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13504.55 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 271 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 272 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.1s = Fit runtime
2 features in original data used to generate 312 features in processed data.
Train Data (Processed) Memory Usage: 2.8 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.13s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: LightGBM ...
0.8796 = Validation accuracy score
1.02s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8586 = Validation accuracy score
1.22s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.95s = Training runtime
0.02s = Validation runtime
Fitting model: TextNeuralNetV1 ...
All Logs will be saved to model3/models/TextNeuralNetV1/TextNeuralNetV1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.8918 = Validation accuracy score
90.19s = Training runtime
0.65s = Validation runtime
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Fitting model: WeightedEnsemble_L1 ...
0.8935 = Validation accuracy score
0.15s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 97.4s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model3/")
predictor_model3.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.896389 | 0.893543 | 0.999798 | 0.672808 | 92.509537 | 0.009143 | 0.000531 | 0.149596 | 1 | True | 5 |
1 | TextNeuralNetV1 | 0.888540 | 0.891798 | 0.960964 | 0.649250 | 90.191949 | 0.960964 | 0.649250 | 90.191949 | 0 | True | 4 |
2 | LightGBM | 0.886970 | 0.879581 | 0.008237 | 0.006089 | 1.020931 | 0.008237 | 0.006089 | 1.020931 | 0 | True | 1 |
3 | CatBoost | 0.886970 | 0.872600 | 0.017194 | 0.015831 | 0.948771 | 0.017194 | 0.015831 | 0.948771 | 0 | True | 3 |
4 | LightGBMXT | 0.868132 | 0.858639 | 0.012497 | 0.007196 | 1.219221 | 0.012497 | 0.007196 | 1.219221 | 0 | True | 2 |
Model 4: K-Fold Bagging and Stack Ensemble¶
A more advanced strategy is to use 5-fold bagging together with stack ensembling. This is expected to improve the final performance.
predictor_model4 = TabularPredictor(label=label, eval_metric='accuracy', path='model4').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v1, num_bag_folds=5, num_stack_levels=1
)
Beginning AutoGluon training ...
AutoGluon will save models to "model4/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13421.11 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 265 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 266 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.2s = Fit runtime
2 features in original data used to generate 306 features in processed data.
Train Data (Processed) Memory Usage: 2.76 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Fitting model: LightGBM_BAG_L0 ...
0.8797 = Validation accuracy score
8.12s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBMXT_BAG_L0 ...
0.8598 = Validation accuracy score
7.38s = Training runtime
0.07s = Validation runtime
Fitting model: CatBoost_BAG_L0 ...
0.8745 = Validation accuracy score
4.73s = Training runtime
0.08s = Validation runtime
Fitting model: TextNeuralNetV1_BAG_L0 ...
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F1/S1F1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F2/S1F2/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F3/S1F3/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F4/S1F4/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F5/S1F5/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.883 = Validation accuracy score
503.53s = Training runtime
35.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.883 = Validation accuracy score
0.46s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
0.8846 = Validation accuracy score
10.14s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
0.8809 = Validation accuracy score
9.59s = Training runtime
0.07s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
0.8841 = Validation accuracy score
11.96s = Training runtime
0.09s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.8846 = Validation accuracy score
0.4s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 594.87s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model4/")
predictor_model4.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | TextNeuralNetV1_BAG_L0 | 0.897959 | 0.883010 | 4.781237 | 35.009925 | 503.525820 | 4.781237 | 35.009925 | 503.525820 | 0 | True | 4 |
1 | WeightedEnsemble_L1 | 0.897959 | 0.883010 | 4.783215 | 35.010866 | 503.988360 | 0.001978 | 0.000941 | 0.462540 | 1 | True | 5 |
2 | CatBoost_BAG_L1 | 0.896389 | 0.884058 | 5.075670 | 35.320508 | 535.708009 | 0.046778 | 0.094063 | 11.955566 | 1 | True | 8 |
3 | LightGBM_BAG_L1 | 0.893250 | 0.884582 | 5.080883 | 35.270539 | 533.896461 | 0.051992 | 0.044093 | 10.144018 | 1 | True | 6 |
4 | WeightedEnsemble_L2 | 0.893250 | 0.884582 | 5.082560 | 35.271442 | 534.292905 | 0.001677 | 0.000904 | 0.396444 | 2 | True | 9 |
5 | LightGBMXT_BAG_L1 | 0.893250 | 0.880915 | 5.109002 | 35.294523 | 533.344140 | 0.080111 | 0.068078 | 9.591697 | 1 | True | 7 |
6 | CatBoost_BAG_L0 | 0.886970 | 0.874454 | 0.042751 | 0.083205 | 4.729065 | 0.042751 | 0.083205 | 4.729065 | 0 | True | 3 |
7 | LightGBM_BAG_L0 | 0.886970 | 0.879693 | 0.118651 | 0.068003 | 8.121579 | 0.118651 | 0.068003 | 8.121579 | 0 | True | 1 |
8 | LightGBMXT_BAG_L0 | 0.877551 | 0.859787 | 0.086252 | 0.065313 | 7.375978 | 0.086252 | 0.065313 | 7.375978 | 0 | True | 2 |
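Because the hackathon test set has no Sentiment column, a natural last step is to predict on it with the bagged/stacked predictor and save the result. This is a minimal sketch of writing a submission-style CSV; the file name and column layout are illustrative, not the hackathon’s required format.
# Predict the sentiment class for each test row and save a simple CSV
test_pred_model4 = predictor_model4.predict(test_df)
submission_df = pd.DataFrame({'Sentiment': test_pred_model4})
submission_df.to_csv('model4_submission.csv', index=False)
print(submission_df.head())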
Model 5: Multimodal embedding + TabularPredictor¶
Since the neural network in TextPrediction can directly handle multimodal data, we can first fit a model with TextPrediction on all the features and then use it as an embedding extractor. This can be viewed as an improved version of Model 2.
predictor_text_multimodal = TextPrediction.fit(train_df,
label=label,
time_limits=None,
eval_metric='accuracy',
stopping_metric='accuracy',
hyperparameters='default_no_hpo',
output_directory='predictor_text_multimodal')
train_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(train_df)
dev_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(dev_df)
# Join the multimodal embeddings with the original features, as in Model 2
merged_train_data = train_df.join(pd.DataFrame(train_sentence_multimodal_embeddings))
merged_dev_data = dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings))
predictor_model5 = TabularPredictor(label=label, eval_metric='accuracy', path='model5').fit(merged_train_data)
2021-02-23 19:42:14,695 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/ag_text_prediction.log
2021-02-23 19:42:14,715 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
2021-02-23 19:42:14,716 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=4581/0
length, min/avg/max=11/104.7640253219821/178
)
- Categorical(
name="Product_Type"
#total/missing=4581/0
num_class (total/non_special)=10/10
categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
freq=[38, 43, 336, 218, 15, 152, 474, 233, 142, 2930]
)
- Categorical(
name="Sentiment"
#total/missing=4581/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[75, 276, 2717, 1513]
)
2021-02-23 19:42:14,717 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
2021-02-23 19:42:14,718 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=1146/0
length, min/avg/max=32/105.1719022687609/159
)
- Categorical(
name="Product_Type"
#total/missing=1146/0
num_class (total/non_special)=10/10
categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
freq=[8, 13, 75, 55, 2, 42, 123, 62, 36, 730]
)
- Categorical(
name="Sentiment"
#total/missing=1146/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[25, 83, 671, 367]
)
Label columns=['Sentiment'], Feature columns=['Product_Description', 'Product_Type'], Problem types=['classification'], Label shapes=[4]
Eval Metric=accuracy, Stop Metric=accuracy, Log Metrics=['acc', 'log_loss', 'accuracy']
2021-02-23 19:42:14,722 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
2021-02-23 19:43:58,181 - autogluon.text.text_prediction.text_prediction - INFO - Results=
2021-02-23 19:43:58,183 - autogluon.text.text_prediction.text_prediction - INFO - Best_config={'search_space▁optimization.lr': 5e-05}
(task:7) 2021-02-23 19:42:16,860 - root - INFO - All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/predictor_text_multimodal/task7/training.log
2021-02-23 19:42:16,860 - root - INFO - learning:
early_stopping_patience: 10
log_metrics: auto
stop_metric: auto
valid_ratio: 0.15
misc:
exp_dir: /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/predictor_text_multimodal/task7
seed: 123
model:
backbone:
name: google_electra_small
network:
agg_net:
activation: tanh
agg_type: concat
data_dropout: False
dropout: 0.1
feature_proj_num_layers: -1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 256
norm_eps: 1e-05
normalization: layer_norm
out_proj_num_layers: 0
categorical_net:
activation: leaky
data_dropout: False
dropout: 0.1
emb_units: 32
initializer:
bias: ['zeros']
embed: ['xavier', 'gaussian', 'in', 1.0]
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 64
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
feature_units: -1
initializer:
bias: ['zeros']
weight: ['truncnorm', 0, 0.02]
numerical_net:
activation: leaky
data_dropout: False
dropout: 0.1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
input_centering: False
mid_units: 128
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
text_net:
pool_type: cls
use_segment_id: True
preprocess:
max_length: 128
merge_text: True
optimization:
batch_size: 32
begin_lr: 0.0
final_lr: 0.0
layerwise_lr_decay: 0.8
log_frequency: 0.1
lr: 5e-05
lr_scheduler: triangular
max_grad_norm: 1.0
model_average: 5
num_train_epochs: 4
optimizer: adamw
optimizer_params: [('beta1', 0.9), ('beta2', 0.999), ('epsilon', 1e-06), ('correct_bias', False)]
per_device_batch_size: 16
val_batch_size_mult: 2
valid_frequency: 0.1
warmup_portion: 0.1
wd: 0.01
version: 1
2021-02-23 19:42:17,008 - root - INFO - Process training set...
2021-02-23 19:42:20,561 - root - INFO - Done!
2021-02-23 19:42:20,563 - root - INFO - Process dev set...
2021-02-23 19:42:23,413 - root - INFO - Done!
2021-02-23 19:42:28,728 - root - INFO - #Total Params/Fixed Params=13504196/0
2021-02-23 19:42:28,744 - root - INFO - Using gradient accumulation. Global batch size = 32
2021-02-23 19:42:31,227 - root - INFO - [Iter 15/572, Epoch 0] train loss=7.6880e-01, gnorm=4.5836e+00, lr=1.3158e-05, #samples processed=720, #sample per second=294.59
2021-02-23 19:42:32,048 - root - INFO - [Iter 15/572, Epoch 0] valid accuracy=6.9110e-01, log_loss=8.6215e-01, accuracy=6.9110e-01, time spent=0.740s, total_time=0.05min
2021-02-23 19:42:33,539 - root - INFO - [Iter 30/572, Epoch 0] train loss=5.2239e-01, gnorm=4.9510e+00, lr=2.6316e-05, #samples processed=720, #sample per second=311.42
2021-02-23 19:42:34,447 - root - INFO - [Iter 30/572, Epoch 0] valid accuracy=8.2373e-01, log_loss=6.7741e-01, accuracy=8.2373e-01, time spent=0.736s, total_time=0.09min
2021-02-23 19:42:35,948 - root - INFO - [Iter 45/572, Epoch 0] train loss=4.0803e-01, gnorm=2.5185e+00, lr=3.9474e-05, #samples processed=720, #sample per second=298.88
2021-02-23 19:42:36,801 - root - INFO - [Iter 45/572, Epoch 0] valid accuracy=8.5689e-01, log_loss=5.3321e-01, accuracy=8.5689e-01, time spent=0.734s, total_time=0.13min
2021-02-23 19:42:38,372 - root - INFO - [Iter 60/572, Epoch 0] train loss=3.7348e-01, gnorm=3.0502e+00, lr=4.9709e-05, #samples processed=720, #sample per second=297.09
2021-02-23 19:42:39,249 - root - INFO - [Iter 60/572, Epoch 0] valid accuracy=8.5689e-01, log_loss=4.9822e-01, accuracy=8.5689e-01, time spent=0.740s, total_time=0.17min
2021-02-23 19:42:40,683 - root - INFO - [Iter 75/572, Epoch 0] train loss=3.2895e-01, gnorm=3.5754e+00, lr=4.8252e-05, #samples processed=720, #sample per second=311.56
2021-02-23 19:42:41,602 - root - INFO - [Iter 75/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=5.3280e-01, accuracy=8.6126e-01, time spent=0.744s, total_time=0.21min
2021-02-23 19:42:43,136 - root - INFO - [Iter 90/572, Epoch 0] train loss=2.7651e-01, gnorm=2.3192e+00, lr=4.6796e-05, #samples processed=720, #sample per second=293.54
2021-02-23 19:42:43,881 - root - INFO - [Iter 90/572, Epoch 0] valid accuracy=8.5777e-01, log_loss=4.9850e-01, accuracy=8.5777e-01, time spent=0.744s, total_time=0.25min
2021-02-23 19:42:45,342 - root - INFO - [Iter 105/572, Epoch 0] train loss=3.2964e-01, gnorm=2.4756e+00, lr=4.5340e-05, #samples processed=720, #sample per second=326.43
2021-02-23 19:42:46,231 - root - INFO - [Iter 105/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=4.8262e-01, accuracy=8.6126e-01, time spent=0.748s, total_time=0.29min
2021-02-23 19:42:47,723 - root - INFO - [Iter 120/572, Epoch 0] train loss=2.7775e-01, gnorm=3.3851e+00, lr=4.3883e-05, #samples processed=720, #sample per second=302.40
2021-02-23 19:42:48,593 - root - INFO - [Iter 120/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=4.9458e-01, accuracy=8.6126e-01, time spent=0.747s, total_time=0.33min
2021-02-23 19:42:50,058 - root - INFO - [Iter 135/572, Epoch 0] train loss=3.5077e-01, gnorm=5.2572e+00, lr=4.2427e-05, #samples processed=720, #sample per second=308.41
2021-02-23 19:42:50,945 - root - INFO - [Iter 135/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=5.0231e-01, accuracy=8.6126e-01, time spent=0.746s, total_time=0.37min
2021-02-23 19:42:52,411 - root - INFO - [Iter 150/572, Epoch 1] train loss=3.0280e-01, gnorm=2.9394e+00, lr=4.0971e-05, #samples processed=698, #sample per second=296.61
2021-02-23 19:42:53,289 - root - INFO - [Iter 150/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=4.6663e-01, accuracy=8.6126e-01, time spent=0.741s, total_time=0.41min
2021-02-23 19:42:54,739 - root - INFO - [Iter 165/572, Epoch 1] train loss=3.3440e-01, gnorm=1.9468e+00, lr=3.9515e-05, #samples processed=720, #sample per second=309.34
2021-02-23 19:42:55,618 - root - INFO - [Iter 165/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=4.7071e-01, accuracy=8.6126e-01, time spent=0.743s, total_time=0.45min
2021-02-23 19:42:57,103 - root - INFO - [Iter 180/572, Epoch 1] train loss=3.1203e-01, gnorm=2.5760e+00, lr=3.8058e-05, #samples processed=720, #sample per second=304.62
2021-02-23 19:42:58,016 - root - INFO - [Iter 180/572, Epoch 1] valid accuracy=8.6475e-01, log_loss=4.5359e-01, accuracy=8.6475e-01, time spent=0.770s, total_time=0.49min
2021-02-23 19:42:59,492 - root - INFO - [Iter 195/572, Epoch 1] train loss=3.1515e-01, gnorm=3.4280e+00, lr=3.6602e-05, #samples processed=720, #sample per second=301.42
2021-02-23 19:43:00,239 - root - INFO - [Iter 195/572, Epoch 1] valid accuracy=8.6300e-01, log_loss=4.7073e-01, accuracy=8.6300e-01, time spent=0.747s, total_time=0.52min
2021-02-23 19:43:01,687 - root - INFO - [Iter 210/572, Epoch 1] train loss=3.1073e-01, gnorm=3.3016e+00, lr=3.5146e-05, #samples processed=720, #sample per second=328.04
2021-02-23 19:43:02,578 - root - INFO - [Iter 210/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.5557e-01, accuracy=8.6562e-01, time spent=0.757s, total_time=0.56min
2021-02-23 19:43:04,062 - root - INFO - [Iter 225/572, Epoch 1] train loss=2.7208e-01, gnorm=4.7893e+00, lr=3.3689e-05, #samples processed=720, #sample per second=303.15
2021-02-23 19:43:04,950 - root - INFO - [Iter 225/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.6826e-01, accuracy=8.6562e-01, time spent=0.750s, total_time=0.60min
2021-02-23 19:43:06,403 - root - INFO - [Iter 240/572, Epoch 1] train loss=2.8616e-01, gnorm=2.7858e+00, lr=3.2233e-05, #samples processed=720, #sample per second=307.56
2021-02-23 19:43:07,296 - root - INFO - [Iter 240/572, Epoch 1] valid accuracy=8.6736e-01, log_loss=4.4909e-01, accuracy=8.6736e-01, time spent=0.755s, total_time=0.64min
2021-02-23 19:43:08,790 - root - INFO - [Iter 255/572, Epoch 1] train loss=2.3186e-01, gnorm=1.3332e+00, lr=3.0777e-05, #samples processed=720, #sample per second=301.64
2021-02-23 19:43:09,552 - root - INFO - [Iter 255/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=5.0038e-01, accuracy=8.6126e-01, time spent=0.761s, total_time=0.68min
2021-02-23 19:43:11,041 - root - INFO - [Iter 270/572, Epoch 1] train loss=2.5154e-01, gnorm=3.5720e+00, lr=2.9320e-05, #samples processed=720, #sample per second=320.01
2021-02-23 19:43:11,801 - root - INFO - [Iter 270/572, Epoch 1] valid accuracy=8.6649e-01, log_loss=4.6431e-01, accuracy=8.6649e-01, time spent=0.761s, total_time=0.72min
2021-02-23 19:43:13,268 - root - INFO - [Iter 285/572, Epoch 1] train loss=2.8520e-01, gnorm=3.8546e+00, lr=2.7864e-05, #samples processed=720, #sample per second=323.30
2021-02-23 19:43:14,027 - root - INFO - [Iter 285/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.6998e-01, accuracy=8.6562e-01, time spent=0.758s, total_time=0.75min
2021-02-23 19:43:15,491 - root - INFO - [Iter 300/572, Epoch 2] train loss=3.3013e-01, gnorm=3.5853e+00, lr=2.6408e-05, #samples processed=709, #sample per second=318.87
2021-02-23 19:43:16,396 - root - INFO - [Iter 300/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.4177e-01, accuracy=8.6911e-01, time spent=0.759s, total_time=0.79min
2021-02-23 19:43:17,867 - root - INFO - [Iter 315/572, Epoch 2] train loss=2.7419e-01, gnorm=3.3656e+00, lr=2.4951e-05, #samples processed=720, #sample per second=303.16
2021-02-23 19:43:18,608 - root - INFO - [Iter 315/572, Epoch 2] valid accuracy=8.6300e-01, log_loss=4.9422e-01, accuracy=8.6300e-01, time spent=0.741s, total_time=0.83min
2021-02-23 19:43:20,083 - root - INFO - [Iter 330/572, Epoch 2] train loss=2.5470e-01, gnorm=2.7021e+00, lr=2.3495e-05, #samples processed=720, #sample per second=324.87
2021-02-23 19:43:20,839 - root - INFO - [Iter 330/572, Epoch 2] valid accuracy=8.6824e-01, log_loss=4.5256e-01, accuracy=8.6824e-01, time spent=0.756s, total_time=0.87min
2021-02-23 19:43:22,290 - root - INFO - [Iter 345/572, Epoch 2] train loss=2.8035e-01, gnorm=3.6649e+00, lr=2.2039e-05, #samples processed=720, #sample per second=326.20
2021-02-23 19:43:23,049 - root - INFO - [Iter 345/572, Epoch 2] valid accuracy=8.6736e-01, log_loss=4.4729e-01, accuracy=8.6736e-01, time spent=0.758s, total_time=0.90min
2021-02-23 19:43:24,557 - root - INFO - [Iter 360/572, Epoch 2] train loss=2.5612e-01, gnorm=3.1314e+00, lr=2.0583e-05, #samples processed=720, #sample per second=317.63
2021-02-23 19:43:25,339 - root - INFO - [Iter 360/572, Epoch 2] valid accuracy=8.6475e-01, log_loss=4.6779e-01, accuracy=8.6475e-01, time spent=0.781s, total_time=0.94min
2021-02-23 19:43:26,821 - root - INFO - [Iter 375/572, Epoch 2] train loss=2.2449e-01, gnorm=5.5644e+00, lr=1.9126e-05, #samples processed=720, #sample per second=318.10
2021-02-23 19:43:27,578 - root - INFO - [Iter 375/572, Epoch 2] valid accuracy=8.6824e-01, log_loss=4.4857e-01, accuracy=8.6824e-01, time spent=0.757s, total_time=0.98min
2021-02-23 19:43:29,074 - root - INFO - [Iter 390/572, Epoch 2] train loss=2.8870e-01, gnorm=4.0543e+00, lr=1.7670e-05, #samples processed=720, #sample per second=319.61
2021-02-23 19:43:29,829 - root - INFO - [Iter 390/572, Epoch 2] valid accuracy=8.6649e-01, log_loss=4.6483e-01, accuracy=8.6649e-01, time spent=0.755s, total_time=1.02min
2021-02-23 19:43:31,234 - root - INFO - [Iter 405/572, Epoch 2] train loss=2.4080e-01, gnorm=5.0446e+00, lr=1.6214e-05, #samples processed=720, #sample per second=333.41
2021-02-23 19:43:32,136 - root - INFO - [Iter 405/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.3994e-01, accuracy=8.6911e-01, time spent=0.764s, total_time=1.06min
2021-02-23 19:43:33,652 - root - INFO - [Iter 420/572, Epoch 2] train loss=2.6520e-01, gnorm=4.7820e+00, lr=1.4757e-05, #samples processed=720, #sample per second=297.77
2021-02-23 19:43:34,531 - root - INFO - [Iter 420/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.4517e-01, accuracy=8.6911e-01, time spent=0.747s, total_time=1.10min
2021-02-23 19:43:35,988 - root - INFO - [Iter 435/572, Epoch 3] train loss=2.7997e-01, gnorm=2.2674e+00, lr=1.3301e-05, #samples processed=698, #sample per second=298.79
2021-02-23 19:43:36,768 - root - INFO - [Iter 435/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.5767e-01, accuracy=8.6824e-01, time spent=0.780s, total_time=1.13min
2021-02-23 19:43:38,196 - root - INFO - [Iter 450/572, Epoch 3] train loss=2.6275e-01, gnorm=1.0901e+01, lr=1.1845e-05, #samples processed=720, #sample per second=326.10
2021-02-23 19:43:38,967 - root - INFO - [Iter 450/572, Epoch 3] valid accuracy=8.6736e-01, log_loss=4.6367e-01, accuracy=8.6736e-01, time spent=0.771s, total_time=1.17min
2021-02-23 19:43:40,466 - root - INFO - [Iter 465/572, Epoch 3] train loss=2.9477e-01, gnorm=6.8845e+00, lr=1.0388e-05, #samples processed=720, #sample per second=317.20
2021-02-23 19:43:41,352 - root - INFO - [Iter 465/572, Epoch 3] valid accuracy=8.6998e-01, log_loss=4.3869e-01, accuracy=8.6998e-01, time spent=0.754s, total_time=1.21min
2021-02-23 19:43:42,808 - root - INFO - [Iter 480/572, Epoch 3] train loss=2.5355e-01, gnorm=2.8429e+00, lr=8.9320e-06, #samples processed=720, #sample per second=307.44
2021-02-23 19:43:43,567 - root - INFO - [Iter 480/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5120e-01, accuracy=8.6911e-01, time spent=0.758s, total_time=1.25min
2021-02-23 19:43:45,077 - root - INFO - [Iter 495/572, Epoch 3] train loss=2.4979e-01, gnorm=4.2404e+00, lr=7.4757e-06, #samples processed=720, #sample per second=317.39
2021-02-23 19:43:45,833 - root - INFO - [Iter 495/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4716e-01, accuracy=8.6824e-01, time spent=0.756s, total_time=1.28min
2021-02-23 19:43:47,312 - root - INFO - [Iter 510/572, Epoch 3] train loss=2.6619e-01, gnorm=3.3240e+00, lr=6.0194e-06, #samples processed=720, #sample per second=322.11
2021-02-23 19:43:48,061 - root - INFO - [Iter 510/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.4639e-01, accuracy=8.6911e-01, time spent=0.748s, total_time=1.32min
2021-02-23 19:43:49,506 - root - INFO - [Iter 525/572, Epoch 3] train loss=2.0779e-01, gnorm=2.5354e+00, lr=4.5631e-06, #samples processed=720, #sample per second=328.20
2021-02-23 19:43:50,264 - root - INFO - [Iter 525/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5519e-01, accuracy=8.6911e-01, time spent=0.758s, total_time=1.36min
2021-02-23 19:43:51,690 - root - INFO - [Iter 540/572, Epoch 3] train loss=2.6498e-01, gnorm=3.6586e+00, lr=3.1068e-06, #samples processed=720, #sample per second=329.82
2021-02-23 19:43:52,433 - root - INFO - [Iter 540/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5044e-01, accuracy=8.6911e-01, time spent=0.743s, total_time=1.39min
2021-02-23 19:43:53,926 - root - INFO - [Iter 555/572, Epoch 3] train loss=2.9774e-01, gnorm=4.7774e+00, lr=1.6505e-06, #samples processed=720, #sample per second=321.93
2021-02-23 19:43:54,681 - root - INFO - [Iter 555/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4645e-01, accuracy=8.6824e-01, time spent=0.754s, total_time=1.43min
2021-02-23 19:43:56,197 - root - INFO - [Iter 570/572, Epoch 3] train loss=2.0356e-01, gnorm=2.9091e+00, lr=1.9417e-07, #samples processed=720, #sample per second=317.11
2021-02-23 19:43:56,947 - root - INFO - [Iter 570/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4582e-01, accuracy=8.6824e-01, time spent=0.749s, total_time=1.47min
2021-02-23 19:43:57,899 - root - INFO - [Iter 572/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4583e-01, accuracy=8.6824e-01, time spent=0.760s, total_time=1.49min
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/text/src/autogluon/text/text_prediction/dataset.py:321: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col_name] = df[col_name].fillna('').apply(str)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Beginning AutoGluon training ...
AutoGluon will save models to "model5/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13323.89 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 259 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 260 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
2.1s = Fit runtime
2 features in original data used to generate 299 features in processed data.
Train Data (Processed) Memory Usage: 1.77 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.17s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8743 = Validation accuracy score
5.12s = Training runtime
0.04s = Validation runtime
Fitting model: NeuralNetFastAI ...
0.8499 = Validation accuracy score
17.51s = Training runtime
0.28s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini ...
0.8796 = Validation accuracy score
0.96s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8761 = Validation accuracy score
1.0s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8534 = Validation accuracy score
1.06s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8499 = Validation accuracy score
1.08s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8778 = Validation accuracy score
1.31s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8586 = Validation accuracy score
1.28s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.9s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8778 = Validation accuracy score
2.42s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMLarge ...
0.8848 = Validation accuracy score
4.02s = Training runtime
0.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8901 = Validation accuracy score
0.4s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 51.48s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model5/")
predictor_model5.leaderboard(dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings)), silent=True)
| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CatBoost | 0.886970 | 0.872600 | 0.014327 | 0.005457 | 0.895682 | 0.014327 | 0.005457 | 0.895682 | 0 | True | 11 |
| 1 | RandomForestGini | 0.886970 | 0.879581 | 0.120903 | 0.079546 | 0.959055 | 0.120903 | 0.079546 | 0.959055 | 0 | True | 5 |
| 2 | WeightedEnsemble_L1 | 0.886970 | 0.890052 | 10.760290 | 0.648267 | 29.524343 | 0.006635 | 0.000556 | 0.402977 | 1 | True | 14 |
| 3 | LightGBM | 0.885400 | 0.877836 | 0.012522 | 0.006084 | 1.307731 | 0.012522 | 0.006084 | 1.307731 | 0 | True | 9 |
| 4 | NeuralNetMXNet | 0.885400 | 0.874346 | 0.043936 | 0.037668 | 5.120788 | 0.043936 | 0.037668 | 5.120788 | 0 | True | 1 |
| 5 | LightGBMLarge | 0.883830 | 0.884817 | 0.028177 | 0.007567 | 4.015435 | 0.028177 | 0.007567 | 4.015435 | 0 | True | 13 |
| 6 | KNeighborsUnif | 0.883830 | 0.853403 | 0.031159 | 0.019027 | 0.017875 | 0.031159 | 0.019027 | 0.017875 | 0 | True | 3 |
| 7 | KNeighborsDist | 0.883830 | 0.853403 | 0.040395 | 0.017965 | 0.017411 | 0.040395 | 0.017965 | 0.017411 | 0 | True | 4 |
| 8 | XGBoost | 0.883830 | 0.877836 | 0.113737 | 0.011080 | 2.421276 | 0.113737 | 0.011080 | 2.421276 | 0 | True | 12 |
| 9 | RandomForestEntr | 0.883830 | 0.876091 | 0.119312 | 0.080413 | 0.995966 | 0.119312 | 0.080413 | 0.995966 | 0 | True | 6 |
| 10 | ExtraTreesGini | 0.875981 | 0.853403 | 0.177299 | 0.080179 | 1.055089 | 0.177299 | 0.080179 | 1.055089 | 0 | True | 7 |
| 11 | NeuralNetFastAI | 0.874411 | 0.849913 | 10.059017 | 0.280836 | 17.511071 | 10.059017 | 0.280836 | 17.511071 | 0 | True | 2 |
| 12 | LightGBMXT | 0.871272 | 0.858639 | 0.012264 | 0.006685 | 1.283650 | 0.012264 | 0.006685 | 1.283650 | 0 | True | 10 |
| 13 | ExtraTreesEntr | 0.869702 | 0.849913 | 0.169439 | 0.082980 | 1.080275 | 0.169439 | 0.082980 | 1.080275 | 0 | True | 8 |
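Note that Model 5's tabular models are trained on the raw features joined with the multimodal embeddings, so any data we predict on must go through the same embedding-extraction step. The following is a minimal illustrative sketch (not part of the original run) that reuses the development-set embeddings extracted above; for genuinely new data you would first call predictor_text_multimodal.extract_embedding on it.
# Sketch: predict with Model 5. The embedding columns produced by the
# TextPrediction model must be joined to the raw features, exactly as was
# done for training and for the leaderboard call above.
dev_features_with_emb = dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings))
dev_pred_model5 = predictor_model5.predict(dev_features_with_emb)
print(dev_pred_model5.head())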
Model 6: Use a larger backbone¶
Now, we switch to a larger backbone: ELECTRA-base. We will see that performance improves with the larger backbone, but note that training takes longer and inference becomes more expensive.
from autogluon.text.text_prediction.text_prediction import ag_text_prediction_params
from autogluon.tabular.configs.hyperparameter_configs import get_hyperparameter_config
import copy
text_nn_params = ag_text_prediction_params.create('default_electra_base_no_hpo')
tabular_multimodel_hparam_v2 = {
'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
'CAT': {},
'TEXT_NN_V1': text_nn_params,
}
predictor_model6 = TabularPredictor(label=label, eval_metric='accuracy', path='model6').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v2
)
Beginning AutoGluon training ...
AutoGluon will save models to "model6/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13221.79 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 256 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 257 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.2s = Fit runtime
2 features in original data used to generate 297 features in processed data.
Train Data (Processed) Memory Usage: 2.71 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.21s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: LightGBM ...
0.8778 = Validation accuracy score
1.16s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8569 = Validation accuracy score
1.37s = Training runtime
0.02s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.9s = Training runtime
0.02s = Validation runtime
Fitting model: TextNeuralNetV1 ...
All Logs will be saved to model6/models/TextNeuralNetV1/TextNeuralNetV1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.9162 = Validation accuracy score
319.17s = Training runtime
2.26s = Validation runtime
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Fitting model: WeightedEnsemble_L1 ...
0.9162 = Validation accuracy score
0.14s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 331.01s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model6/")
predictor_model6.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TextNeuralNetV1 | 0.899529 | 0.916230 | 3.329292 | 2.263031 | 319.174685 | 3.329292 | 2.263031 | 319.174685 | 0 | True | 4 |
| 1 | WeightedEnsemble_L1 | 0.899529 | 0.916230 | 3.337787 | 2.263554 | 319.317756 | 0.008495 | 0.000524 | 0.143071 | 1 | True | 5 |
| 2 | CatBoost | 0.886970 | 0.872600 | 0.015280 | 0.015155 | 0.900599 | 0.015280 | 0.015155 | 0.900599 | 0 | True | 3 |
| 3 | LightGBM | 0.885400 | 0.877836 | 0.007218 | 0.006019 | 1.159134 | 0.007218 | 0.006019 | 1.159134 | 0 | True | 1 |
| 4 | LightGBMXT | 0.868132 | 0.856894 | 0.013989 | 0.017886 | 1.365870 | 0.013989 | 0.017886 | 1.365870 | 0 | True | 2 |
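Unlike Model 5, Model 6 consumes the raw text and categorical columns directly, so no embedding extraction is needed at prediction time. A minimal illustrative sketch (not part of the original notebook) on the development set:
# Sketch: Model 6 takes Product_Description and Product_Type as-is,
# so we can call predict / predict_proba on new rows directly.
dev_pred_model6 = predictor_model6.predict(dev_df)
dev_proba_model6 = predictor_model6.predict_proba(dev_df)
print(dev_pred_model6.head())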
Major Takeaways¶
After performing these comparisons, we have the following takeaways:
The multimodal text neural network used in TextPrediction is a good choice for tabular data with text and categorical features.
K-fold bagging / stacking and weighted ensembling are helpful.
A larger backbone improves performance. This aligns with the observations in recent papers, e.g., Scaling Laws for Autoregressive Generative Modeling.
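For reference, the second takeaway can be acted on directly when fitting a TabularPredictor. Below is a minimal illustrative sketch (not from the original notebook); the fold and stack-level counts are only examples, and the num_bag_folds / num_stack_levels argument names follow the AutoGluon 0.1 TabularPredictor.fit API.
# Sketch: request 5-fold bagging of every base model plus one stacking layer;
# AutoGluon adds WeightedEnsemble models at each level automatically.
predictor_bagged = TabularPredictor(label=label, eval_metric='accuracy', path='model_bagged').fit(
    train_df,
    hyperparameters=tabular_multimodel_hparam_v2,
    num_bag_folds=5,
    num_stack_levels=1,
)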