Explore Models for Data Tables with Text and Categorical Features¶
We introduce how to use AutoGluon to deal with tabular data that involves text and categorical features. This type of data, i.e., data that contains text alongside other features, is prevalent in real-world applications. For example, when building a sentiment analysis model for users’ tweets, we can use not only the raw text of each tweet but also other features such as the tweet’s topic and the user’s profile. In the following, we investigate different ways to ensemble the state-of-the-art (pretrained) language models in AutoGluon TextPrediction with the other models used in AutoGluon’s TabularPredictor. For more details about the inner workings of the neural network architecture used in AutoGluon TextPrediction, you may refer to the section “What’s happening inside?” in Text Prediction - Heterogeneous Data Types.
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pprint
import random
from autogluon.text import TextPrediction
from autogluon.tabular import TabularPredictor
import mxnet as mx
np.random.seed(123)
random.seed(123)
mx.random.seed(123)
Product Sentiment Analysis Dataset¶
In the following, we use the product sentiment analysis dataset from this MachineHack hackathon. The goal of the task is to predict a user’s sentiment towards a product, given the raw text of the review and the product’s type, e.g., Tablet, Mobile, etc. We have split the original training data into 90% for training and 10% for development.
!mkdir -p product_sentiment_machine_hack
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv -O product_sentiment_machine_hack/train.csv
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv -O product_sentiment_machine_hack/dev.csv
!wget https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv -O product_sentiment_machine_hack/test.csv
--2021-02-23 19:24:46-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/train.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 689486 (673K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/train.csv’
product_sentiment_m 100%[===================>] 673.33K 2.09MB/s in 0.3s
2021-02-23 19:24:46 (2.09 MB/s) - ‘product_sentiment_machine_hack/train.csv’ saved [689486/689486]
--2021-02-23 19:24:47-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/dev.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75517 (74K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/dev.csv’
product_sentiment_m 100%[===================>] 73.75K --.-KB/s in 0.1s
2021-02-23 19:24:48 (508 KB/s) - ‘product_sentiment_machine_hack/dev.csv’ saved [75517/75517]
--2021-02-23 19:24:48-- https://autogluon-text-data.s3.amazonaws.com/multimodal_text/machine_hack_product_sentiment/test.csv
Resolving autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)... 52.216.101.195
Connecting to autogluon-text-data.s3.amazonaws.com (autogluon-text-data.s3.amazonaws.com)|52.216.101.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 312194 (305K) [text/csv]
Saving to: ‘product_sentiment_machine_hack/test.csv’
product_sentiment_m 100%[===================>] 304.88K 1.18MB/s in 0.3s
2021-02-23 19:24:49 (1.18 MB/s) - ‘product_sentiment_machine_hack/test.csv’ saved [312194/312194]
feature_columns = ['Product_Description', 'Product_Type']
label = 'Sentiment'
train_df = pd.read_csv('product_sentiment_machine_hack/train.csv')
dev_df = pd.read_csv('product_sentiment_machine_hack/dev.csv')
test_df = pd.read_csv('product_sentiment_machine_hack/test.csv')
train_df = train_df[feature_columns + [label]]
dev_df = dev_df[feature_columns + [label]]
test_df = test_df[feature_columns]
print('Number of training samples:', len(train_df))
print('Number of dev samples:', len(dev_df))
print('Number of test samples:', len(test_df))
Number of training samples: 5727
Number of dev samples: 637
Number of test samples: 2728
There are two features in the dataset: the user’s review of the product and the product’s type. There are four sentiment classes, and the train/dev split was created with stratified sampling so that the class proportions are preserved in both sets.
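For reference, a stratified 90/10 split like this one can be reproduced with scikit-learn’s train_test_split. The snippet below is only a minimal sketch: it recombines the two downloaded files to mimic the original (unsplit) training data, so the exact rows will differ from the split shipped with the dataset.
from sklearn.model_selection import train_test_split

# Recombine train and dev to mimic the original, unsplit training data
full_df = pd.concat([train_df, dev_df], ignore_index=True)
train_split, dev_split = train_test_split(
    full_df,
    test_size=0.1,                 # hold out 10% for development
    stratify=full_df[label],       # preserve the class proportions in both splits
    random_state=123)
print(len(train_split), len(dev_split))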
train_df
| Product_Description | Product_Type | Sentiment
---|---|---|---
0 | Just heard that Apple is opening a store in do... | 2 | 3 |
1 | Tristan H, apture: being fast & iterative ... | 9 | 2 |
2 | Hey, you lucky dogs at #SXSW with iPads -- che... | 6 | 3 |
3 | RT @mention THIS was the best thing I saw at #... | 9 | 2 |
4 | Apple is opening temp retail store in Austin t... | 2 | 3 |
... | ... | ... | ... |
5722 | RT @mention At #SXSW and want to win an iPad? ... | 9 | 2 |
5723 | RT @mention I mean, sliced bread is great. But... | 3 | 3 |
5724 | Apple cited as the opposite of crowdsourcing -... | 2 | 1 |
5725 | Good CNN article on why #SXSW is important to ... | 7 | 3 |
5726 | ÛÏ@mention Google to Launch Major New Social ... | 3 | 3 |
5727 rows × 3 columns
dev_df
| Product_Description | Product_Type | Sentiment
---|---|---|---
0 | Do it. RT @mention Come party w/ Google tonigh... | 3 | 3 |
1 | Line for iPads at #SXSW. Doesn't look too bad!... | 6 | 3 |
2 | First up: iPad Design Headaches (2 Tablets, Ca... | 6 | 2 |
3 | #SXSW: Mint Talks Mobile App Development Chall... | 9 | 2 |
4 | ÛÏ@mention Apple store downtown Austin open t... | 9 | 2 |
... | ... | ... | ... |
632 | Bet on a GoogleBuzz-like #fail. People don't c... | 9 | 0 |
633 | RT > @mention Guy gets tattoo at SXSW so he... | 9 | 2 |
634 | #austinites #sxsw and check it out on #iphone ... | 9 | 2 |
635 | New @mention for iPhone+Android.. No more serv... | 0 | 3 |
636 | Why isn't news industry spending more R&D?... | 9 | 2 |
637 rows × 3 columns
test_df
| Product_Description | Product_Type
---|---|---
0 | RT @mention Going to #SXSW? The new iPhone gui... | 7 |
1 | RT @mention 95% of iPhone and Droid apps have ... | 9 |
2 | RT @mention Thank you to @mention for letting ... | 9 |
3 | #Thanks @mention we're lovin' the @mention app... | 7 |
4 | At #sxsw? @mention / @mention wanna buy you a ... | 9 |
... | ... | ... |
2723 | RT @mention eww and LOL. RT @mention Just saw ... | 9 |
2724 | Free 22 track #sxsw sampler album on iTunes. #... | 9 |
2725 | Setting up for the Google #gsdm #sxsw party. ... | 3 |
2726 | RT @mention #SXSW Come see Bitbop in Austin #g... | 9 |
2727 | So many Google products. isn't it time to tra... | 5 |
2728 rows × 2 columns
What happens if we ignore all the non-text features?¶
First of all, let’s try ignoring all the non-text features. We will use the TextPrediction model in AutoGluon to train a predictor with the text data only. Internally, it uses the ELECTRA-small model as the backbone. As we will see, the result is not very good.
predictor_text_only = TextPrediction.fit(train_df[['Product_Description', 'Sentiment']],
label=label,
time_limits=None,
ngpus_per_trial=1,
hyperparameters='default_no_hpo',
eval_metric='accuracy',
stopping_metric='accuracy',
output_directory='ag_text_only')
2021-02-23 19:24:49,386 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/ag_text_prediction.log
All Logs will be saved to ag_text_only/ag_text_prediction.log
2021-02-23 19:24:49,404 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
Train Dataset:
2021-02-23 19:24:49,405 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=4581/0
length, min/avg/max=11/104.81707050862258/170
)
- Categorical(
name="Sentiment"
#total/missing=4581/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[87, 280, 2721, 1493]
)
2021-02-23 19:24:49,406 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
Tuning Dataset:
2021-02-23 19:24:49,407 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=1146/0
length, min/avg/max=29/104.95986038394415/178
)
- Categorical(
name="Sentiment"
#total/missing=1146/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[13, 79, 667, 387]
)
WARNING: changing multiprocessing start method to forkserver
2021-02-23 19:24:49,415 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ag_text_only/main.log
All Logs will be saved to ag_text_only/main.log
0%| | 0/1 [00:00<?, ?it/s]
2021-02-23 19:26:08,587 - autogluon.text.text_prediction.text_prediction - INFO - Results=
Results=
2021-02-23 19:26:08,589 - autogluon.text.text_prediction.text_prediction - INFO - Best_config={'search_space▁optimization.lr': 5e-05}
Best_config={'search_space▁optimization.lr': 5e-05}
(task:0) 2021-02-23 19:24:52,491 - root - INFO - All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/ag_text_only/task0/training.log
2021-02-23 19:24:52,491 - root - INFO - learning:
early_stopping_patience: 10
log_metrics: auto
stop_metric: auto
valid_ratio: 0.15
misc:
exp_dir: /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/ag_text_only/task0
seed: 123
model:
backbone:
name: google_electra_small
network:
agg_net:
activation: tanh
agg_type: concat
data_dropout: False
dropout: 0.1
feature_proj_num_layers: -1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 256
norm_eps: 1e-05
normalization: layer_norm
out_proj_num_layers: 0
categorical_net:
activation: leaky
data_dropout: False
dropout: 0.1
emb_units: 32
initializer:
bias: ['zeros']
embed: ['xavier', 'gaussian', 'in', 1.0]
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 64
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
feature_units: -1
initializer:
bias: ['zeros']
weight: ['truncnorm', 0, 0.02]
numerical_net:
activation: leaky
data_dropout: False
dropout: 0.1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
input_centering: False
mid_units: 128
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
text_net:
pool_type: cls
use_segment_id: True
preprocess:
max_length: 128
merge_text: True
optimization:
batch_size: 32
begin_lr: 0.0
final_lr: 0.0
layerwise_lr_decay: 0.8
log_frequency: 0.1
lr: 5e-05
lr_scheduler: triangular
max_grad_norm: 1.0
model_average: 5
num_train_epochs: 4
optimizer: adamw
optimizer_params: [('beta1', 0.9), ('beta2', 0.999), ('epsilon', 1e-06), ('correct_bias', False)]
per_device_batch_size: 16
val_batch_size_mult: 2
valid_frequency: 0.1
warmup_portion: 0.1
wd: 0.01
version: 1
2021-02-23 19:24:52,645 - root - INFO - Process training set...
2021-02-23 19:24:56,192 - root - INFO - Done!
2021-02-23 19:24:56,192 - root - INFO - Process dev set...
2021-02-23 19:24:58,906 - root - INFO - Done!
2021-02-23 19:25:04,337 - root - INFO - #Total Params/Fixed Params=13484036/0
2021-02-23 19:25:04,352 - root - INFO - Using gradient accumulation. Global batch size = 32
2021-02-23 19:25:06,689 - root - INFO - [Iter 15/572, Epoch 0] train loss=8.3855e-01, gnorm=1.2309e+01, lr=1.3158e-05, #samples processed=720, #sample per second=314.10
2021-02-23 19:25:07,481 - root - INFO - [Iter 15/572, Epoch 0] valid accuracy=5.8028e-01, log_loss=1.1171e+00, accuracy=5.8028e-01, time spent=0.715s, total_time=0.05min
2021-02-23 19:25:08,990 - root - INFO - [Iter 30/572, Epoch 0] train loss=6.4969e-01, gnorm=5.2289e+00, lr=2.6316e-05, #samples processed=720, #sample per second=312.86
2021-02-23 19:25:09,823 - root - INFO - [Iter 30/572, Epoch 0] valid accuracy=5.8115e-01, log_loss=9.4872e-01, accuracy=5.8115e-01, time spent=0.695s, total_time=0.09min
2021-02-23 19:25:11,273 - root - INFO - [Iter 45/572, Epoch 0] train loss=6.7008e-01, gnorm=7.3572e+00, lr=3.9474e-05, #samples processed=720, #sample per second=315.50
2021-02-23 19:25:12,108 - root - INFO - [Iter 45/572, Epoch 0] valid accuracy=6.0384e-01, log_loss=8.8923e-01, accuracy=6.0384e-01, time spent=0.703s, total_time=0.13min
2021-02-23 19:25:13,588 - root - INFO - [Iter 60/572, Epoch 0] train loss=6.6470e-01, gnorm=4.5690e+00, lr=4.9709e-05, #samples processed=720, #sample per second=311.03
2021-02-23 19:25:14,416 - root - INFO - [Iter 60/572, Epoch 0] valid accuracy=6.2391e-01, log_loss=9.0283e-01, accuracy=6.2391e-01, time spent=0.696s, total_time=0.17min
2021-02-23 19:25:15,839 - root - INFO - [Iter 75/572, Epoch 0] train loss=6.2934e-01, gnorm=4.4687e+00, lr=4.8252e-05, #samples processed=720, #sample per second=319.82
2021-02-23 19:25:16,550 - root - INFO - [Iter 75/572, Epoch 0] valid accuracy=6.1257e-01, log_loss=8.9348e-01, accuracy=6.1257e-01, time spent=0.711s, total_time=0.20min
2021-02-23 19:25:17,929 - root - INFO - [Iter 90/572, Epoch 0] train loss=6.5662e-01, gnorm=5.8075e+00, lr=4.6796e-05, #samples processed=720, #sample per second=344.50
2021-02-23 19:25:18,641 - root - INFO - [Iter 90/572, Epoch 0] valid accuracy=6.1431e-01, log_loss=8.4047e-01, accuracy=6.1431e-01, time spent=0.711s, total_time=0.24min
2021-02-23 19:25:20,112 - root - INFO - [Iter 105/572, Epoch 0] train loss=6.1765e-01, gnorm=4.4335e+00, lr=4.5340e-05, #samples processed=720, #sample per second=329.88
2021-02-23 19:25:20,960 - root - INFO - [Iter 105/572, Epoch 0] valid accuracy=6.3264e-01, log_loss=8.3699e-01, accuracy=6.3264e-01, time spent=0.705s, total_time=0.28min
2021-02-23 19:25:22,380 - root - INFO - [Iter 120/572, Epoch 0] train loss=5.8680e-01, gnorm=6.0388e+00, lr=4.3883e-05, #samples processed=720, #sample per second=317.53
2021-02-23 19:25:23,089 - root - INFO - [Iter 120/572, Epoch 0] valid accuracy=6.1082e-01, log_loss=8.6694e-01, accuracy=6.1082e-01, time spent=0.709s, total_time=0.31min
2021-02-23 19:25:24,448 - root - INFO - [Iter 135/572, Epoch 0] train loss=6.5235e-01, gnorm=5.0395e+00, lr=4.2427e-05, #samples processed=720, #sample per second=348.19
2021-02-23 19:25:25,287 - root - INFO - [Iter 135/572, Epoch 0] valid accuracy=6.5096e-01, log_loss=8.1716e-01, accuracy=6.5096e-01, time spent=0.709s, total_time=0.35min
2021-02-23 19:25:26,702 - root - INFO - [Iter 150/572, Epoch 1] train loss=6.0744e-01, gnorm=8.6245e+00, lr=4.0971e-05, #samples processed=698, #sample per second=309.71
2021-02-23 19:25:27,411 - root - INFO - [Iter 150/572, Epoch 1] valid accuracy=6.4834e-01, log_loss=8.0773e-01, accuracy=6.4834e-01, time spent=0.709s, total_time=0.38min
2021-02-23 19:25:28,826 - root - INFO - [Iter 165/572, Epoch 1] train loss=5.5537e-01, gnorm=9.9330e+00, lr=3.9515e-05, #samples processed=720, #sample per second=338.93
2021-02-23 19:25:29,663 - root - INFO - [Iter 165/572, Epoch 1] valid accuracy=6.5445e-01, log_loss=8.1417e-01, accuracy=6.5445e-01, time spent=0.704s, total_time=0.42min
2021-02-23 19:25:31,099 - root - INFO - [Iter 180/572, Epoch 1] train loss=5.6724e-01, gnorm=4.1054e+00, lr=3.8058e-05, #samples processed=720, #sample per second=316.83
2021-02-23 19:25:31,810 - root - INFO - [Iter 180/572, Epoch 1] valid accuracy=6.5009e-01, log_loss=8.0914e-01, accuracy=6.5009e-01, time spent=0.711s, total_time=0.46min
2021-02-23 19:25:33,226 - root - INFO - [Iter 195/572, Epoch 1] train loss=5.9127e-01, gnorm=1.0215e+01, lr=3.6602e-05, #samples processed=720, #sample per second=338.49
2021-02-23 19:25:33,932 - root - INFO - [Iter 195/572, Epoch 1] valid accuracy=6.5271e-01, log_loss=7.9851e-01, accuracy=6.5271e-01, time spent=0.706s, total_time=0.49min
2021-02-23 19:25:35,342 - root - INFO - [Iter 210/572, Epoch 1] train loss=5.3871e-01, gnorm=4.3828e+00, lr=3.5146e-05, #samples processed=720, #sample per second=340.33
2021-02-23 19:25:36,061 - root - INFO - [Iter 210/572, Epoch 1] valid accuracy=6.3874e-01, log_loss=8.1257e-01, accuracy=6.3874e-01, time spent=0.718s, total_time=0.53min
2021-02-23 19:25:37,499 - root - INFO - [Iter 225/572, Epoch 1] train loss=5.3807e-01, gnorm=5.2270e+00, lr=3.3689e-05, #samples processed=720, #sample per second=333.78
2021-02-23 19:25:38,217 - root - INFO - [Iter 225/572, Epoch 1] valid accuracy=6.4660e-01, log_loss=7.9136e-01, accuracy=6.4660e-01, time spent=0.717s, total_time=0.56min
2021-02-23 19:25:39,636 - root - INFO - [Iter 240/572, Epoch 1] train loss=5.8410e-01, gnorm=6.8311e+00, lr=3.2233e-05, #samples processed=720, #sample per second=336.94
2021-02-23 19:25:40,466 - root - INFO - [Iter 240/572, Epoch 1] valid accuracy=6.6841e-01, log_loss=7.7052e-01, accuracy=6.6841e-01, time spent=0.700s, total_time=0.60min
2021-02-23 19:25:41,844 - root - INFO - [Iter 255/572, Epoch 1] train loss=5.3738e-01, gnorm=5.7739e+00, lr=3.0777e-05, #samples processed=720, #sample per second=326.23
2021-02-23 19:25:42,544 - root - INFO - [Iter 255/572, Epoch 1] valid accuracy=6.6754e-01, log_loss=7.8309e-01, accuracy=6.6754e-01, time spent=0.700s, total_time=0.64min
2021-02-23 19:25:43,952 - root - INFO - [Iter 270/572, Epoch 1] train loss=5.0632e-01, gnorm=5.0476e+00, lr=2.9320e-05, #samples processed=720, #sample per second=341.52
2021-02-23 19:25:44,676 - root - INFO - [Iter 270/572, Epoch 1] valid accuracy=6.5794e-01, log_loss=7.8638e-01, accuracy=6.5794e-01, time spent=0.724s, total_time=0.67min
2021-02-23 19:25:46,088 - root - INFO - [Iter 285/572, Epoch 1] train loss=5.3142e-01, gnorm=5.6990e+00, lr=2.7864e-05, #samples processed=720, #sample per second=337.05
2021-02-23 19:25:46,932 - root - INFO - [Iter 285/572, Epoch 1] valid accuracy=6.7016e-01, log_loss=7.8180e-01, accuracy=6.7016e-01, time spent=0.709s, total_time=0.71min
2021-02-23 19:25:48,345 - root - INFO - [Iter 300/572, Epoch 2] train loss=5.3040e-01, gnorm=9.8988e+00, lr=2.6408e-05, #samples processed=709, #sample per second=314.15
2021-02-23 19:25:49,061 - root - INFO - [Iter 300/572, Epoch 2] valid accuracy=6.6754e-01, log_loss=7.6951e-01, accuracy=6.6754e-01, time spent=0.715s, total_time=0.74min
2021-02-23 19:25:50,485 - root - INFO - [Iter 315/572, Epoch 2] train loss=5.1332e-01, gnorm=1.2150e+01, lr=2.4951e-05, #samples processed=720, #sample per second=336.58
2021-02-23 19:25:51,199 - root - INFO - [Iter 315/572, Epoch 2] valid accuracy=6.5620e-01, log_loss=7.7306e-01, accuracy=6.5620e-01, time spent=0.713s, total_time=0.78min
2021-02-23 19:25:52,606 - root - INFO - [Iter 330/572, Epoch 2] train loss=5.3934e-01, gnorm=6.6843e+00, lr=2.3495e-05, #samples processed=720, #sample per second=339.58
2021-02-23 19:25:53,322 - root - INFO - [Iter 330/572, Epoch 2] valid accuracy=6.4572e-01, log_loss=7.8320e-01, accuracy=6.4572e-01, time spent=0.716s, total_time=0.82min
2021-02-23 19:25:54,722 - root - INFO - [Iter 345/572, Epoch 2] train loss=4.8354e-01, gnorm=6.3392e+00, lr=2.2039e-05, #samples processed=720, #sample per second=340.32
2021-02-23 19:25:55,428 - root - INFO - [Iter 345/572, Epoch 2] valid accuracy=6.4834e-01, log_loss=8.2002e-01, accuracy=6.4834e-01, time spent=0.706s, total_time=0.85min
2021-02-23 19:25:56,849 - root - INFO - [Iter 360/572, Epoch 2] train loss=5.3423e-01, gnorm=5.4433e+00, lr=2.0583e-05, #samples processed=720, #sample per second=338.53
2021-02-23 19:25:57,560 - root - INFO - [Iter 360/572, Epoch 2] valid accuracy=6.5881e-01, log_loss=7.6452e-01, accuracy=6.5881e-01, time spent=0.711s, total_time=0.89min
2021-02-23 19:25:58,981 - root - INFO - [Iter 375/572, Epoch 2] train loss=5.3697e-01, gnorm=6.2103e+00, lr=1.9126e-05, #samples processed=720, #sample per second=337.70
2021-02-23 19:25:59,709 - root - INFO - [Iter 375/572, Epoch 2] valid accuracy=6.2042e-01, log_loss=8.2208e-01, accuracy=6.2042e-01, time spent=0.727s, total_time=0.92min
2021-02-23 19:26:01,114 - root - INFO - [Iter 390/572, Epoch 2] train loss=5.7040e-01, gnorm=6.7707e+00, lr=1.7670e-05, #samples processed=720, #sample per second=337.59
2021-02-23 19:26:01,836 - root - INFO - [Iter 390/572, Epoch 2] valid accuracy=6.5620e-01, log_loss=7.8493e-01, accuracy=6.5620e-01, time spent=0.722s, total_time=0.96min
2021-02-23 19:26:03,272 - root - INFO - [Iter 405/572, Epoch 2] train loss=5.2676e-01, gnorm=8.0785e+00, lr=1.6214e-05, #samples processed=720, #sample per second=333.67
2021-02-23 19:26:03,999 - root - INFO - [Iter 405/572, Epoch 2] valid accuracy=6.5096e-01, log_loss=7.7680e-01, accuracy=6.5096e-01, time spent=0.726s, total_time=0.99min
2021-02-23 19:26:05,402 - root - INFO - [Iter 420/572, Epoch 2] train loss=5.1999e-01, gnorm=1.1199e+01, lr=1.4757e-05, #samples processed=720, #sample per second=338.01
2021-02-23 19:26:06,132 - root - INFO - [Iter 420/572, Epoch 2] valid accuracy=6.6405e-01, log_loss=7.6699e-01, accuracy=6.6405e-01, time spent=0.730s, total_time=1.03min
2021-02-23 19:26:07,538 - root - INFO - [Iter 435/572, Epoch 3] train loss=5.4638e-01, gnorm=9.2815e+00, lr=1.3301e-05, #samples processed=698, #sample per second=326.79
2021-02-23 19:26:08,272 - root - INFO - [Iter 435/572, Epoch 3] valid accuracy=6.6143e-01, log_loss=7.6100e-01, accuracy=6.6143e-01, time spent=0.734s, total_time=1.06min
2021-02-23 19:26:08,276 - root - INFO - Early stopping patience reached!
print(predictor_text_only.evaluate(dev_df[['Product_Description', 'Sentiment']], metrics='accuracy'))
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
{'accuracy': 0.6671899529042387}
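Beyond the single accuracy number, we can inspect per-row predictions of the text-only predictor for error analysis. A small sketch, assuming the predictor’s standard predict() method:
# Predict dev-set sentiment from the text column alone and compare with the labels
dev_pred_text_only = predictor_text_only.predict(dev_df[['Product_Description']])
comparison_df = pd.DataFrame({'prediction': np.asarray(dev_pred_text_only),
                              'ground_truth': dev_df[label].values})
print(comparison_df.head())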
Model 1: Baseline with N-Gram + TF-IDF¶
Our first baseline directly calls AutoGluon’s TabularPredictor to train a predictor. TabularPredictor generates n-gram and TF-IDF based features for the text column and considers the text and categorical columns simultaneously.
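Before fitting, to get a feel for what n-gram/TF-IDF featurization of the text column produces, here is a rough standalone sketch using scikit-learn’s TfidfVectorizer. AutoGluon’s internal TextNgramFeatureGenerator uses its own settings and vocabulary pruning, so treat this only as an illustration:
from sklearn.feature_extraction.text import TfidfVectorizer

# Unigram + bigram TF-IDF features over the review text (illustrative settings only)
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=300)
text_features = vectorizer.fit_transform(train_df['Product_Description'])
print(text_features.shape)  # (number of rows, number of n-gram features)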
predictor_model1 = TabularPredictor(label=label, eval_metric='accuracy', path='model1').fit(train_df)
Beginning AutoGluon training ...
AutoGluon will save models to "model1/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
NumExpr defaulting to 8 threads.
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 14382.9 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 354 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 355 | ['__nlp__.10', '__nlp__.11', '__nlp__.6th', '__nlp__.about', '__nlp__.all', ...]
2.1s = Fit runtime
2 features in original data used to generate 394 features in processed data.
Train Data (Processed) Memory Usage: 2.31 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.16s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8726 = Validation accuracy score
4.1s = Training runtime
0.03s = Validation runtime
Fitting model: NeuralNetFastAI ...
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: The NVIDIA driver on your system is too old (found version 10010). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver. (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
return torch._C._cuda_getDeviceCount() > 0
0.8482 = Validation accuracy score
12.06s = Training runtime
0.36s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini ...
0.8709 = Validation accuracy score
1.03s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8709 = Validation accuracy score
1.04s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8464 = Validation accuracy score
1.15s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8464 = Validation accuracy score
1.15s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8831 = Validation accuracy score
1.08s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8534 = Validation accuracy score
1.19s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
1.04s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8778 = Validation accuracy score
1.84s = Training runtime
0.02s = Validation runtime
Fitting model: LightGBMLarge ...
0.8813 = Validation accuracy score
3.45s = Training runtime
0.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8883 = Validation accuracy score
0.37s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 44.09s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model1/")
predictor_model1.leaderboard(dev_df, silent=True)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.890110 | 0.888307 | 10.636745 | 0.579389 | 20.212959 | 0.005916 | 0.000465 | 0.374030 | 1 | True | 14 |
1 | LightGBMLarge | 0.886970 | 0.881326 | 0.015931 | 0.006296 | 3.445040 | 0.015931 | 0.006296 | 3.445040 | 0 | True | 13 |
2 | CatBoost | 0.886970 | 0.872600 | 0.018929 | 0.006263 | 1.039262 | 0.018929 | 0.006263 | 1.039262 | 0 | True | 11 |
3 | RandomForestGini | 0.886970 | 0.870855 | 0.115808 | 0.077118 | 1.025226 | 0.115808 | 0.077118 | 1.025226 | 0 | True | 5 |
4 | XGBoost | 0.885400 | 0.877836 | 0.079946 | 0.019477 | 1.838979 | 0.079946 | 0.019477 | 1.838979 | 0 | True | 12 |
5 | RandomForestEntr | 0.885400 | 0.870855 | 0.118663 | 0.075674 | 1.039675 | 0.118663 | 0.075674 | 1.039675 | 0 | True | 6 |
6 | KNeighborsUnif | 0.883830 | 0.853403 | 0.032106 | 0.019376 | 0.019685 | 0.032106 | 0.019376 | 0.019685 | 0 | True | 3 |
7 | KNeighborsDist | 0.883830 | 0.853403 | 0.041204 | 0.019170 | 0.019528 | 0.041204 | 0.019170 | 0.019528 | 0 | True | 4 |
8 | LightGBM | 0.882261 | 0.883072 | 0.011820 | 0.006165 | 1.076681 | 0.011820 | 0.006165 | 1.076681 | 0 | True | 9 |
9 | NeuralNetMXNet | 0.877551 | 0.872600 | 0.046874 | 0.034172 | 4.098654 | 0.046874 | 0.034172 | 4.098654 | 0 | True | 1 |
10 | LightGBMXT | 0.869702 | 0.853403 | 0.013507 | 0.008016 | 1.188565 | 0.013507 | 0.008016 | 1.188565 | 0 | True | 10 |
11 | ExtraTreesEntr | 0.868132 | 0.846422 | 0.150189 | 0.080109 | 1.153447 | 0.150189 | 0.080109 | 1.153447 | 0 | True | 8 |
12 | ExtraTreesGini | 0.866562 | 0.846422 | 0.175136 | 0.079671 | 1.148813 | 0.175136 | 0.079671 | 1.148813 | 0 | True | 7 |
13 | NeuralNetFastAI | 0.854003 | 0.848168 | 10.244841 | 0.364427 | 12.060060 | 10.244841 | 0.364427 | 12.060060 | 0 | True | 2 |
We find that using the product type (a categorical column) is quite essential for good performance on this task: the accuracy is much higher than that of the model trained on the text column only.
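As a sanity check, we can also score this predictor on the held-out dev set directly with evaluate(), which uses the predictor’s configured eval_metric (accuracy here); the number should match the WeightedEnsemble_L1 row of the leaderboard above.
# Score Model 1 on the dev set with the predictor's eval_metric (accuracy)
print(predictor_model1.evaluate(dev_df))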
Model 2: Extract Text Embedding and Use TabularPredictor¶
Our second attempt at combining text and the other features is to use the trained TextPrediction model to extract embeddings and then use TabularPredictor to build a predictor on top of the text embeddings. The AutoGluon TextPrediction model offers the extract_embedding() functionality (for more details, go to Extract Embeddings), so we are able to build a two-stage model. In the first stage, we use the text-only model to extract sentence embeddings. In the second stage, we use TabularPredictor to fit the final model on the embeddings joined with the original features.
train_sentence_embeddings = predictor_text_only.extract_embedding(train_df)
dev_sentence_embeddings = predictor_text_only.extract_embedding(dev_df)
print(train_sentence_embeddings)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/text/src/autogluon/text/text_prediction/dataset.py:321: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col_name] = df[col_name].fillna('').apply(str)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
[[-0.061683 0.789052 -0.614252 -0.383968 ... -0.202426 1.144868 0.039427 -0.13562 ]
[-0.269277 0.177113 -0.197375 -0.172229 ... -0.584261 0.625235 -0.355088 0.211777]
[-0.83451 0.264692 -0.687199 0.234191 ... -0.605356 0.332709 0.029832 -0.160492]
[-0.271491 0.149054 -0.506492 -0.09476 ... -0.809414 0.29643 0.31992 -0.096194]
...
[-0.054611 -0.060668 -0.49929 -0.170906 ... -0.284143 0.805138 0.430891 -0.191869]
[-0.277809 0.150503 -0.625322 -0.241075 ... -0.525608 0.909708 -0.124487 -0.031551]
[-0.64508 -0.07616 -0.567146 0.192171 ... -1.166889 0.589877 0.242167 -0.549045]
[-0.837447 0.347837 -0.525436 -0.440289 ... -0.742066 0.565927 -0.054493 -0.411046]]
merged_train_data = train_df.join(pd.DataFrame(train_sentence_embeddings))
merged_dev_data = dev_df.join(pd.DataFrame(dev_sentence_embeddings))
print(merged_train_data)
                                    Product_Description  Product_Type
0     Just heard that Apple is opening a store in do...             2
1         Tristan H, apture: being fast & iterative ...             9
2     Hey, you lucky dogs at #SXSW with iPads -- che...             6
3     RT @mention THIS was the best thing I saw at #...             9
4     Apple is opening temp retail store in Austin t...             2
...                                                 ...           ...
5722  RT @mention At #SXSW and want to win an iPad? ...             9
5723  RT @mention I mean, sliced bread is great. But...             3
5724  Apple cited as the opposite of crowdsourcing -...             2
5725  Good CNN article on why #SXSW is important to ...             7
5726  ÛÏ@mention Google to Launch Major New Social ...             3

      Sentiment         0         1         2         3         4         5
0             3 -0.061683  0.789052 -0.614252 -0.383968  0.794183 -0.581301
1             2 -0.269277  0.177113 -0.197375 -0.172229  0.547932 -0.265157
2             3 -0.834510  0.264692 -0.687199  0.234191  1.018778 -0.689753
3             2 -0.271491  0.149054 -0.506492 -0.094760  0.644704 -0.521082
4             3  0.180634  0.237529 -0.668062 -0.119891  0.387544 -0.314172
...         ...       ...       ...       ...       ...       ...       ...
5722          2 -0.771102  0.284567 -0.285301 -0.168485  0.645094  0.036831
5723          3 -0.054611 -0.060668 -0.499290 -0.170906 -0.367915  0.331775
5724          1 -0.277809  0.150503 -0.625322 -0.241075  0.157916  0.060280
5725          3 -0.645080 -0.076160 -0.567146  0.192171  0.524227 -0.318997
5726          3 -0.837447  0.347837 -0.525436 -0.440289  0.857342 -0.283967

             6  ...       246       247       248       249       250
0     0.919014  ... -0.357193  0.390834  0.833298 -0.115630  0.786055
1     1.170505  ... -0.212880  0.123253  0.668844 -0.826962  0.772176
2     1.244796  ... -0.300159  0.729561  0.551330 -0.400327  0.671096
3     1.293411  ... -0.233223  0.269830  0.657665 -0.285630  0.512562
4     1.199535  ... -0.490884  0.116289  0.888642 -0.219426  0.773025
...        ...  ...       ...       ...       ...       ...       ...
5722  1.063322  ... -0.032328  0.582992  0.674876 -0.196406  0.172549
5723  0.314198  ... -0.639878 -0.178192  0.481809 -0.700696 -0.039856
5724  0.656375  ... -0.459748 -0.046848  0.820173 -0.563527  0.208965
5725  1.467807  ... -0.601973  0.303506  0.423291 -0.275483  0.578006
5726  1.174435  ... -0.239029  0.465101  0.134878 -0.096706  0.547499

           251       252       253       254       255
0     0.243626 -0.202426  1.144868  0.039427 -0.135620
1    -0.053033 -0.584261  0.625235 -0.355088  0.211777
2     0.120588 -0.605356  0.332709  0.029832 -0.160492
3     0.320794 -0.809414  0.296430  0.319920 -0.096194
4     0.546610 -0.479898  1.102872  0.085935 -0.345065
...        ...       ...       ...       ...       ...
5722  0.049309 -0.042409  0.503684 -0.348889 -0.508032
5723  1.003379 -0.284143  0.805138  0.430891 -0.191869
5724  0.631439 -0.525608  0.909708 -0.124487 -0.031551
5725  0.421409 -1.166889  0.589877  0.242167 -0.549045
5726  0.293577 -0.742066  0.565927 -0.054493 -0.411046

[5727 rows x 259 columns]
predictor_model2 = TabularPredictor(label=label, eval_metric='accuracy', path='model2').fit(merged_train_data)
Beginning AutoGluon training ...
AutoGluon will save models to "model2/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 258
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13929.9 MB
Train Data (Original) Memory Usage: 6.87 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 303 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 256 | ['0', '1', '2', '3', '4', ...]
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 256 | ['0', '1', '2', '3', '4', ...]
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 304 | ['__nlp__.11', '__nlp__.6th', '__nlp__.about', '__nlp__.all', '__nlp__.amp', ...]
2.4s = Fit runtime
258 features in original data used to generate 599 features in processed data.
Train Data (Processed) Memory Usage: 7.89 MB (0.1% of available memory)
Data preprocessing and feature engineering runtime = 2.48s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8918 = Validation accuracy score
6.3s = Training runtime
0.04s = Validation runtime
Fitting model: NeuralNetFastAI ...
0.8534 = Validation accuracy score
18.82s = Training runtime
0.52s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8551 = Validation accuracy score
0.02s = Training runtime
0.13s = Validation runtime
Fitting model: KNeighborsDist ...
0.8586 = Validation accuracy score
0.03s = Training runtime
0.12s = Validation runtime
Fitting model: RandomForestGini ...
0.8691 = Validation accuracy score
2.91s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8551 = Validation accuracy score
5.16s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8168 = Validation accuracy score
1.25s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8028 = Validation accuracy score
1.31s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8935 = Validation accuracy score
9.69s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8726 = Validation accuracy score
9.19s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8883 = Validation accuracy score
19.0s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8935 = Validation accuracy score
39.02s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBMLarge ...
0.8935 = Validation accuracy score
57.6s = Training runtime
0.02s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8988 = Validation accuracy score
0.38s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 186.44s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model2/")
predictor_model2.leaderboard(merged_dev_data, silent=True)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.893250 | 0.898778 | 10.744249 | 0.715136 | 98.361377 | 0.005639 | 0.000464 | 0.375998 | 1 | True | 14 |
1 | LightGBM | 0.891680 | 0.893543 | 0.029830 | 0.007839 | 9.685987 | 0.029830 | 0.007839 | 9.685987 | 0 | True | 9 |
2 | CatBoost | 0.886970 | 0.888307 | 0.028248 | 0.012147 | 19.002439 | 0.028248 | 0.012147 | 19.002439 | 0 | True | 11 |
3 | NeuralNetMXNet | 0.886970 | 0.891798 | 0.053362 | 0.042966 | 6.302544 | 0.053362 | 0.042966 | 6.302544 | 0 | True | 1 |
4 | LightGBMLarge | 0.886970 | 0.893543 | 0.059007 | 0.016558 | 57.602369 | 0.059007 | 0.016558 | 57.602369 | 0 | True | 13 |
5 | XGBoost | 0.886970 | 0.893543 | 0.160011 | 0.054713 | 39.018259 | 0.160011 | 0.054713 | 39.018259 | 0 | True | 12 |
6 | LightGBMXT | 0.872841 | 0.872600 | 0.018458 | 0.009431 | 9.185145 | 0.018458 | 0.009431 | 9.185145 | 0 | True | 10 |
7 | KNeighborsUnif | 0.855573 | 0.855148 | 0.153559 | 0.128035 | 0.021495 | 0.153559 | 0.128035 | 0.021495 | 0 | True | 3 |
8 | KNeighborsDist | 0.854003 | 0.858639 | 0.123584 | 0.121682 | 0.030962 | 0.123584 | 0.121682 | 0.030962 | 0 | True | 4 |
9 | NeuralNetFastAI | 0.850863 | 0.853403 | 10.367347 | 0.518267 | 18.815119 | 10.367347 | 0.518267 | 18.815119 | 0 | True | 2 |
10 | RandomForestGini | 0.830455 | 0.869110 | 0.101238 | 0.078978 | 2.912344 | 0.101238 | 0.078978 | 2.912344 | 0 | True | 5 |
11 | RandomForestEntr | 0.808477 | 0.855148 | 0.099813 | 0.078739 | 5.161032 | 0.099813 | 0.078739 | 5.161032 | 0 | True | 6 |
12 | ExtraTreesGini | 0.800628 | 0.816754 | 0.150204 | 0.082309 | 1.251960 | 0.150204 | 0.082309 | 1.251960 | 0 | True | 7 |
13 | ExtraTreesEntr | 0.778650 | 0.802792 | 0.126574 | 0.082750 | 1.306403 | 0.126574 | 0.082750 | 1.306403 | 0 | True | 8 |
The performance is better than that of the first model.
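To apply this two-stage model to the unlabeled test set, we would repeat the same embedding extraction and join for test_df before predicting. A short sketch following the pattern above:
# Extract embeddings for the test reviews with the same text-only predictor,
# join them to the test features, and predict with the Model-2 TabularPredictor
test_sentence_embeddings = predictor_text_only.extract_embedding(test_df)
merged_test_data = test_df.join(pd.DataFrame(test_sentence_embeddings))
test_pred_model2 = predictor_model2.predict(merged_test_data)
print(test_pred_model2[:5])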
Model 3: Use the Neural Network in AutoGluon-Text in Tabular Weighted Ensemble¶
Another option is to directly include the neural network from AutoGluon-Text as one candidate model of TabularPredictor. We can do that by changing the hyperparameters. Note that for the purpose of this tutorial we are setting the hyperparameters manually; good pre-configurations will be released soon.
tabular_multimodel_hparam_v1 = {
'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
'CAT': {},
'TEXT_NN_V1': {},
}
predictor_model3 = TabularPredictor(label=label, eval_metric='accuracy', path='model3').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v1
)
Beginning AutoGluon training ...
AutoGluon will save models to "model3/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13504.55 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 271 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 272 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.1s = Fit runtime
2 features in original data used to generate 312 features in processed data.
Train Data (Processed) Memory Usage: 2.8 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.13s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: LightGBM ...
0.8796 = Validation accuracy score
1.02s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8586 = Validation accuracy score
1.22s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.95s = Training runtime
0.02s = Validation runtime
Fitting model: TextNeuralNetV1 ...
All Logs will be saved to model3/models/TextNeuralNetV1/TextNeuralNetV1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.8918 = Validation accuracy score
90.19s = Training runtime
0.65s = Validation runtime
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Fitting model: WeightedEnsemble_L1 ...
0.8935 = Validation accuracy score
0.15s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 97.4s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model3/")
predictor_model3.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | WeightedEnsemble_L1 | 0.896389 | 0.893543 | 0.999798 | 0.672808 | 92.509537 | 0.009143 | 0.000531 | 0.149596 | 1 | True | 5 |
1 | TextNeuralNetV1 | 0.888540 | 0.891798 | 0.960964 | 0.649250 | 90.191949 | 0.960964 | 0.649250 | 90.191949 | 0 | True | 4 |
2 | LightGBM | 0.886970 | 0.879581 | 0.008237 | 0.006089 | 1.020931 | 0.008237 | 0.006089 | 1.020931 | 0 | True | 1 |
3 | CatBoost | 0.886970 | 0.872600 | 0.017194 | 0.015831 | 0.948771 | 0.017194 | 0.015831 | 0.948771 | 0 | True | 3 |
4 | LightGBMXT | 0.868132 | 0.858639 | 0.012497 | 0.007196 | 1.219221 | 0.012497 | 0.007196 | 1.219221 | 0 | True | 2 |
Model 4: K-Fold Bagging and Stack Ensemble¶
A more advanced strategy is to use 5-fold bagging together with stack ensembling. This is expected to improve the final performance.
predictor_model4 = TabularPredictor(label=label, eval_metric='accuracy', path='model4').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v1, num_bag_folds=5, num_stack_levels=1
)
Beginning AutoGluon training ...
AutoGluon will save models to "model4/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13421.11 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 265 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 266 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.2s = Fit runtime
2 features in original data used to generate 306 features in processed data.
Train Data (Processed) Memory Usage: 2.76 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.19s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Fitting model: LightGBM_BAG_L0 ...
0.8797 = Validation accuracy score
8.12s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBMXT_BAG_L0 ...
0.8598 = Validation accuracy score
7.38s = Training runtime
0.07s = Validation runtime
Fitting model: CatBoost_BAG_L0 ...
0.8745 = Validation accuracy score
4.73s = Training runtime
0.08s = Validation runtime
Fitting model: TextNeuralNetV1_BAG_L0 ...
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F1/S1F1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F2/S1F2/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F3/S1F3/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F4/S1F4/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
All Logs will be saved to model4/models/TextNeuralNetV1_BAG_L0/S1F5/S1F5/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
0%| | 0/1 [00:00<?, ?it/s]
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.883 = Validation accuracy score
503.53s = Training runtime
35.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.883 = Validation accuracy score
0.46s = Training runtime
0.0s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
0.8846 = Validation accuracy score
10.14s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
0.8809 = Validation accuracy score
9.59s = Training runtime
0.07s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
0.8841 = Validation accuracy score
11.96s = Training runtime
0.09s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.8846 = Validation accuracy score
0.4s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 594.87s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model4/")
predictor_model4.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order
---|---|---|---|---|---|---|---|---|---|---|---|---
0 | TextNeuralNetV1_BAG_L0 | 0.897959 | 0.883010 | 4.781237 | 35.009925 | 503.525820 | 4.781237 | 35.009925 | 503.525820 | 0 | True | 4 |
1 | WeightedEnsemble_L1 | 0.897959 | 0.883010 | 4.783215 | 35.010866 | 503.988360 | 0.001978 | 0.000941 | 0.462540 | 1 | True | 5 |
2 | CatBoost_BAG_L1 | 0.896389 | 0.884058 | 5.075670 | 35.320508 | 535.708009 | 0.046778 | 0.094063 | 11.955566 | 1 | True | 8 |
3 | LightGBM_BAG_L1 | 0.893250 | 0.884582 | 5.080883 | 35.270539 | 533.896461 | 0.051992 | 0.044093 | 10.144018 | 1 | True | 6 |
4 | WeightedEnsemble_L2 | 0.893250 | 0.884582 | 5.082560 | 35.271442 | 534.292905 | 0.001677 | 0.000904 | 0.396444 | 2 | True | 9 |
5 | LightGBMXT_BAG_L1 | 0.893250 | 0.880915 | 5.109002 | 35.294523 | 533.344140 | 0.080111 | 0.068078 | 9.591697 | 1 | True | 7 |
6 | CatBoost_BAG_L0 | 0.886970 | 0.874454 | 0.042751 | 0.083205 | 4.729065 | 0.042751 | 0.083205 | 4.729065 | 0 | True | 3 |
7 | LightGBM_BAG_L0 | 0.886970 | 0.879693 | 0.118651 | 0.068003 | 8.121579 | 0.118651 | 0.068003 | 8.121579 | 0 | True | 1 |
8 | LightGBMXT_BAG_L0 | 0.877551 | 0.859787 | 0.086252 | 0.065313 | 7.375978 | 0.086252 | 0.065313 | 7.375978 | 0 | True | 2 |
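Because the hackathon test set has no Sentiment column, a natural last step is to predict on it with the bagged/stacked predictor and save the result. This is a minimal sketch of writing a submission-style CSV; the file name and column layout are illustrative, not the hackathon’s required format.
# Predict the sentiment class for each test row and save a simple CSV
test_pred_model4 = predictor_model4.predict(test_df)
submission_df = pd.DataFrame({'Sentiment': test_pred_model4})
submission_df.to_csv('model4_submission.csv', index=False)
print(submission_df.head())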
Model 5: Multimodal embedding + TabularPredictor¶
Since the neural network in TextPrediction can directly handle multimodal data, we can first fit a model with TextPrediction on all the features and then use it as an embedding extractor. This can be viewed as an improved version of Model 2.
predictor_text_multimodal = TextPrediction.fit(train_df,
label=label,
time_limits=None,
eval_metric='accuracy',
stopping_metric='accuracy',
hyperparameters='default_no_hpo',
output_directory='predictor_text_multimodal')
train_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(train_df)
dev_sentence_multimodal_embeddings = predictor_text_multimodal.extract_embedding(dev_df)
# Join the multimodal embeddings with the original features, as in Model 2
merged_train_data = train_df.join(pd.DataFrame(train_sentence_multimodal_embeddings))
merged_dev_data = dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings))
predictor_model5 = TabularPredictor(label=label, eval_metric='accuracy', path='model5').fit(merged_train_data)
2021-02-23 19:42:14,695 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/ag_text_prediction.log
2021-02-23 19:42:14,715 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
2021-02-23 19:42:14,716 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=4581/0
length, min/avg/max=11/104.7640253219821/178
)
- Categorical(
name="Product_Type"
#total/missing=4581/0
num_class (total/non_special)=10/10
categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
freq=[38, 43, 336, 218, 15, 152, 474, 233, 142, 2930]
)
- Categorical(
name="Sentiment"
#total/missing=4581/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[75, 276, 2717, 1513]
)
2021-02-23 19:42:14,717 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
2021-02-23 19:42:14,718 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
- Text(
name="Product_Description"
#total/missing=1146/0
length, min/avg/max=32/105.1719022687609/159
)
- Categorical(
name="Product_Type"
#total/missing=1146/0
num_class (total/non_special)=10/10
categories=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
freq=[8, 13, 75, 55, 2, 42, 123, 62, 36, 730]
)
- Categorical(
name="Sentiment"
#total/missing=1146/0
num_class (total/non_special)=4/4
categories=[0, 1, 2, 3]
freq=[25, 83, 671, 367]
)
Label columns=['Sentiment'], Feature columns=['Product_Description', 'Product_Type'], Problem types=['classification'], Label shapes=[4]
Eval Metric=accuracy, Stop Metric=accuracy, Log Metrics=['acc', 'log_loss', 'accuracy']
2021-02-23 19:42:14,722 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to predictor_text_multimodal/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
2021-02-23 19:43:58,181 - autogluon.text.text_prediction.text_prediction - INFO - Results=
2021-02-23 19:43:58,183 - autogluon.text.text_prediction.text_prediction - INFO - Best_config={'search_space▁optimization.lr': 5e-05}
(task:7) 2021-02-23 19:42:16,860 - root - INFO - All Logs will be saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/predictor_text_multimodal/task7/training.log
2021-02-23 19:42:16,860 - root - INFO - learning:
early_stopping_patience: 10
log_metrics: auto
stop_metric: auto
valid_ratio: 0.15
misc:
exp_dir: /var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/docs/_build/eval/tutorials/tabular_prediction/predictor_text_multimodal/task7
seed: 123
model:
backbone:
name: google_electra_small
network:
agg_net:
activation: tanh
agg_type: concat
data_dropout: False
dropout: 0.1
feature_proj_num_layers: -1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 256
norm_eps: 1e-05
normalization: layer_norm
out_proj_num_layers: 0
categorical_net:
activation: leaky
data_dropout: False
dropout: 0.1
emb_units: 32
initializer:
bias: ['zeros']
embed: ['xavier', 'gaussian', 'in', 1.0]
weight: ['xavier', 'uniform', 'avg', 3.0]
mid_units: 64
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
feature_units: -1
initializer:
bias: ['zeros']
weight: ['truncnorm', 0, 0.02]
numerical_net:
activation: leaky
data_dropout: False
dropout: 0.1
initializer:
bias: ['zeros']
weight: ['xavier', 'uniform', 'avg', 3.0]
input_centering: False
mid_units: 128
norm_eps: 1e-05
normalization: layer_norm
num_layers: 1
text_net:
pool_type: cls
use_segment_id: True
preprocess:
max_length: 128
merge_text: True
optimization:
batch_size: 32
begin_lr: 0.0
final_lr: 0.0
layerwise_lr_decay: 0.8
log_frequency: 0.1
lr: 5e-05
lr_scheduler: triangular
max_grad_norm: 1.0
model_average: 5
num_train_epochs: 4
optimizer: adamw
optimizer_params: [('beta1', 0.9), ('beta2', 0.999), ('epsilon', 1e-06), ('correct_bias', False)]
per_device_batch_size: 16
val_batch_size_mult: 2
valid_frequency: 0.1
warmup_portion: 0.1
wd: 0.01
version: 1
2021-02-23 19:42:17,008 - root - INFO - Process training set...
2021-02-23 19:42:20,561 - root - INFO - Done!
2021-02-23 19:42:20,563 - root - INFO - Process dev set...
2021-02-23 19:42:23,413 - root - INFO - Done!
2021-02-23 19:42:28,728 - root - INFO - #Total Params/Fixed Params=13504196/0
2021-02-23 19:42:28,744 - root - INFO - Using gradient accumulation. Global batch size = 32
2021-02-23 19:42:31,227 - root - INFO - [Iter 15/572, Epoch 0] train loss=7.6880e-01, gnorm=4.5836e+00, lr=1.3158e-05, #samples processed=720, #sample per second=294.59
2021-02-23 19:42:32,048 - root - INFO - [Iter 15/572, Epoch 0] valid accuracy=6.9110e-01, log_loss=8.6215e-01, accuracy=6.9110e-01, time spent=0.740s, total_time=0.05min
2021-02-23 19:42:33,539 - root - INFO - [Iter 30/572, Epoch 0] train loss=5.2239e-01, gnorm=4.9510e+00, lr=2.6316e-05, #samples processed=720, #sample per second=311.42
2021-02-23 19:42:34,447 - root - INFO - [Iter 30/572, Epoch 0] valid accuracy=8.2373e-01, log_loss=6.7741e-01, accuracy=8.2373e-01, time spent=0.736s, total_time=0.09min
2021-02-23 19:42:35,948 - root - INFO - [Iter 45/572, Epoch 0] train loss=4.0803e-01, gnorm=2.5185e+00, lr=3.9474e-05, #samples processed=720, #sample per second=298.88
2021-02-23 19:42:36,801 - root - INFO - [Iter 45/572, Epoch 0] valid accuracy=8.5689e-01, log_loss=5.3321e-01, accuracy=8.5689e-01, time spent=0.734s, total_time=0.13min
2021-02-23 19:42:38,372 - root - INFO - [Iter 60/572, Epoch 0] train loss=3.7348e-01, gnorm=3.0502e+00, lr=4.9709e-05, #samples processed=720, #sample per second=297.09
2021-02-23 19:42:39,249 - root - INFO - [Iter 60/572, Epoch 0] valid accuracy=8.5689e-01, log_loss=4.9822e-01, accuracy=8.5689e-01, time spent=0.740s, total_time=0.17min
2021-02-23 19:42:40,683 - root - INFO - [Iter 75/572, Epoch 0] train loss=3.2895e-01, gnorm=3.5754e+00, lr=4.8252e-05, #samples processed=720, #sample per second=311.56
2021-02-23 19:42:41,602 - root - INFO - [Iter 75/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=5.3280e-01, accuracy=8.6126e-01, time spent=0.744s, total_time=0.21min
2021-02-23 19:42:43,136 - root - INFO - [Iter 90/572, Epoch 0] train loss=2.7651e-01, gnorm=2.3192e+00, lr=4.6796e-05, #samples processed=720, #sample per second=293.54
2021-02-23 19:42:43,881 - root - INFO - [Iter 90/572, Epoch 0] valid accuracy=8.5777e-01, log_loss=4.9850e-01, accuracy=8.5777e-01, time spent=0.744s, total_time=0.25min
2021-02-23 19:42:45,342 - root - INFO - [Iter 105/572, Epoch 0] train loss=3.2964e-01, gnorm=2.4756e+00, lr=4.5340e-05, #samples processed=720, #sample per second=326.43
2021-02-23 19:42:46,231 - root - INFO - [Iter 105/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=4.8262e-01, accuracy=8.6126e-01, time spent=0.748s, total_time=0.29min
2021-02-23 19:42:47,723 - root - INFO - [Iter 120/572, Epoch 0] train loss=2.7775e-01, gnorm=3.3851e+00, lr=4.3883e-05, #samples processed=720, #sample per second=302.40
2021-02-23 19:42:48,593 - root - INFO - [Iter 120/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=4.9458e-01, accuracy=8.6126e-01, time spent=0.747s, total_time=0.33min
2021-02-23 19:42:50,058 - root - INFO - [Iter 135/572, Epoch 0] train loss=3.5077e-01, gnorm=5.2572e+00, lr=4.2427e-05, #samples processed=720, #sample per second=308.41
2021-02-23 19:42:50,945 - root - INFO - [Iter 135/572, Epoch 0] valid accuracy=8.6126e-01, log_loss=5.0231e-01, accuracy=8.6126e-01, time spent=0.746s, total_time=0.37min
2021-02-23 19:42:52,411 - root - INFO - [Iter 150/572, Epoch 1] train loss=3.0280e-01, gnorm=2.9394e+00, lr=4.0971e-05, #samples processed=698, #sample per second=296.61
2021-02-23 19:42:53,289 - root - INFO - [Iter 150/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=4.6663e-01, accuracy=8.6126e-01, time spent=0.741s, total_time=0.41min
2021-02-23 19:42:54,739 - root - INFO - [Iter 165/572, Epoch 1] train loss=3.3440e-01, gnorm=1.9468e+00, lr=3.9515e-05, #samples processed=720, #sample per second=309.34
2021-02-23 19:42:55,618 - root - INFO - [Iter 165/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=4.7071e-01, accuracy=8.6126e-01, time spent=0.743s, total_time=0.45min
2021-02-23 19:42:57,103 - root - INFO - [Iter 180/572, Epoch 1] train loss=3.1203e-01, gnorm=2.5760e+00, lr=3.8058e-05, #samples processed=720, #sample per second=304.62
2021-02-23 19:42:58,016 - root - INFO - [Iter 180/572, Epoch 1] valid accuracy=8.6475e-01, log_loss=4.5359e-01, accuracy=8.6475e-01, time spent=0.770s, total_time=0.49min
2021-02-23 19:42:59,492 - root - INFO - [Iter 195/572, Epoch 1] train loss=3.1515e-01, gnorm=3.4280e+00, lr=3.6602e-05, #samples processed=720, #sample per second=301.42
2021-02-23 19:43:00,239 - root - INFO - [Iter 195/572, Epoch 1] valid accuracy=8.6300e-01, log_loss=4.7073e-01, accuracy=8.6300e-01, time spent=0.747s, total_time=0.52min
2021-02-23 19:43:01,687 - root - INFO - [Iter 210/572, Epoch 1] train loss=3.1073e-01, gnorm=3.3016e+00, lr=3.5146e-05, #samples processed=720, #sample per second=328.04
2021-02-23 19:43:02,578 - root - INFO - [Iter 210/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.5557e-01, accuracy=8.6562e-01, time spent=0.757s, total_time=0.56min
2021-02-23 19:43:04,062 - root - INFO - [Iter 225/572, Epoch 1] train loss=2.7208e-01, gnorm=4.7893e+00, lr=3.3689e-05, #samples processed=720, #sample per second=303.15
2021-02-23 19:43:04,950 - root - INFO - [Iter 225/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.6826e-01, accuracy=8.6562e-01, time spent=0.750s, total_time=0.60min
2021-02-23 19:43:06,403 - root - INFO - [Iter 240/572, Epoch 1] train loss=2.8616e-01, gnorm=2.7858e+00, lr=3.2233e-05, #samples processed=720, #sample per second=307.56
2021-02-23 19:43:07,296 - root - INFO - [Iter 240/572, Epoch 1] valid accuracy=8.6736e-01, log_loss=4.4909e-01, accuracy=8.6736e-01, time spent=0.755s, total_time=0.64min
2021-02-23 19:43:08,790 - root - INFO - [Iter 255/572, Epoch 1] train loss=2.3186e-01, gnorm=1.3332e+00, lr=3.0777e-05, #samples processed=720, #sample per second=301.64
2021-02-23 19:43:09,552 - root - INFO - [Iter 255/572, Epoch 1] valid accuracy=8.6126e-01, log_loss=5.0038e-01, accuracy=8.6126e-01, time spent=0.761s, total_time=0.68min
2021-02-23 19:43:11,041 - root - INFO - [Iter 270/572, Epoch 1] train loss=2.5154e-01, gnorm=3.5720e+00, lr=2.9320e-05, #samples processed=720, #sample per second=320.01
2021-02-23 19:43:11,801 - root - INFO - [Iter 270/572, Epoch 1] valid accuracy=8.6649e-01, log_loss=4.6431e-01, accuracy=8.6649e-01, time spent=0.761s, total_time=0.72min
2021-02-23 19:43:13,268 - root - INFO - [Iter 285/572, Epoch 1] train loss=2.8520e-01, gnorm=3.8546e+00, lr=2.7864e-05, #samples processed=720, #sample per second=323.30
2021-02-23 19:43:14,027 - root - INFO - [Iter 285/572, Epoch 1] valid accuracy=8.6562e-01, log_loss=4.6998e-01, accuracy=8.6562e-01, time spent=0.758s, total_time=0.75min
2021-02-23 19:43:15,491 - root - INFO - [Iter 300/572, Epoch 2] train loss=3.3013e-01, gnorm=3.5853e+00, lr=2.6408e-05, #samples processed=709, #sample per second=318.87
2021-02-23 19:43:16,396 - root - INFO - [Iter 300/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.4177e-01, accuracy=8.6911e-01, time spent=0.759s, total_time=0.79min
2021-02-23 19:43:17,867 - root - INFO - [Iter 315/572, Epoch 2] train loss=2.7419e-01, gnorm=3.3656e+00, lr=2.4951e-05, #samples processed=720, #sample per second=303.16
2021-02-23 19:43:18,608 - root - INFO - [Iter 315/572, Epoch 2] valid accuracy=8.6300e-01, log_loss=4.9422e-01, accuracy=8.6300e-01, time spent=0.741s, total_time=0.83min
2021-02-23 19:43:20,083 - root - INFO - [Iter 330/572, Epoch 2] train loss=2.5470e-01, gnorm=2.7021e+00, lr=2.3495e-05, #samples processed=720, #sample per second=324.87
2021-02-23 19:43:20,839 - root - INFO - [Iter 330/572, Epoch 2] valid accuracy=8.6824e-01, log_loss=4.5256e-01, accuracy=8.6824e-01, time spent=0.756s, total_time=0.87min
2021-02-23 19:43:22,290 - root - INFO - [Iter 345/572, Epoch 2] train loss=2.8035e-01, gnorm=3.6649e+00, lr=2.2039e-05, #samples processed=720, #sample per second=326.20
2021-02-23 19:43:23,049 - root - INFO - [Iter 345/572, Epoch 2] valid accuracy=8.6736e-01, log_loss=4.4729e-01, accuracy=8.6736e-01, time spent=0.758s, total_time=0.90min
2021-02-23 19:43:24,557 - root - INFO - [Iter 360/572, Epoch 2] train loss=2.5612e-01, gnorm=3.1314e+00, lr=2.0583e-05, #samples processed=720, #sample per second=317.63
2021-02-23 19:43:25,339 - root - INFO - [Iter 360/572, Epoch 2] valid accuracy=8.6475e-01, log_loss=4.6779e-01, accuracy=8.6475e-01, time spent=0.781s, total_time=0.94min
2021-02-23 19:43:26,821 - root - INFO - [Iter 375/572, Epoch 2] train loss=2.2449e-01, gnorm=5.5644e+00, lr=1.9126e-05, #samples processed=720, #sample per second=318.10
2021-02-23 19:43:27,578 - root - INFO - [Iter 375/572, Epoch 2] valid accuracy=8.6824e-01, log_loss=4.4857e-01, accuracy=8.6824e-01, time spent=0.757s, total_time=0.98min
2021-02-23 19:43:29,074 - root - INFO - [Iter 390/572, Epoch 2] train loss=2.8870e-01, gnorm=4.0543e+00, lr=1.7670e-05, #samples processed=720, #sample per second=319.61
2021-02-23 19:43:29,829 - root - INFO - [Iter 390/572, Epoch 2] valid accuracy=8.6649e-01, log_loss=4.6483e-01, accuracy=8.6649e-01, time spent=0.755s, total_time=1.02min
2021-02-23 19:43:31,234 - root - INFO - [Iter 405/572, Epoch 2] train loss=2.4080e-01, gnorm=5.0446e+00, lr=1.6214e-05, #samples processed=720, #sample per second=333.41
2021-02-23 19:43:32,136 - root - INFO - [Iter 405/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.3994e-01, accuracy=8.6911e-01, time spent=0.764s, total_time=1.06min
2021-02-23 19:43:33,652 - root - INFO - [Iter 420/572, Epoch 2] train loss=2.6520e-01, gnorm=4.7820e+00, lr=1.4757e-05, #samples processed=720, #sample per second=297.77
2021-02-23 19:43:34,531 - root - INFO - [Iter 420/572, Epoch 2] valid accuracy=8.6911e-01, log_loss=4.4517e-01, accuracy=8.6911e-01, time spent=0.747s, total_time=1.10min
2021-02-23 19:43:35,988 - root - INFO - [Iter 435/572, Epoch 3] train loss=2.7997e-01, gnorm=2.2674e+00, lr=1.3301e-05, #samples processed=698, #sample per second=298.79
2021-02-23 19:43:36,768 - root - INFO - [Iter 435/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.5767e-01, accuracy=8.6824e-01, time spent=0.780s, total_time=1.13min
2021-02-23 19:43:38,196 - root - INFO - [Iter 450/572, Epoch 3] train loss=2.6275e-01, gnorm=1.0901e+01, lr=1.1845e-05, #samples processed=720, #sample per second=326.10
2021-02-23 19:43:38,967 - root - INFO - [Iter 450/572, Epoch 3] valid accuracy=8.6736e-01, log_loss=4.6367e-01, accuracy=8.6736e-01, time spent=0.771s, total_time=1.17min
2021-02-23 19:43:40,466 - root - INFO - [Iter 465/572, Epoch 3] train loss=2.9477e-01, gnorm=6.8845e+00, lr=1.0388e-05, #samples processed=720, #sample per second=317.20
2021-02-23 19:43:41,352 - root - INFO - [Iter 465/572, Epoch 3] valid accuracy=8.6998e-01, log_loss=4.3869e-01, accuracy=8.6998e-01, time spent=0.754s, total_time=1.21min
2021-02-23 19:43:42,808 - root - INFO - [Iter 480/572, Epoch 3] train loss=2.5355e-01, gnorm=2.8429e+00, lr=8.9320e-06, #samples processed=720, #sample per second=307.44
2021-02-23 19:43:43,567 - root - INFO - [Iter 480/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5120e-01, accuracy=8.6911e-01, time spent=0.758s, total_time=1.25min
2021-02-23 19:43:45,077 - root - INFO - [Iter 495/572, Epoch 3] train loss=2.4979e-01, gnorm=4.2404e+00, lr=7.4757e-06, #samples processed=720, #sample per second=317.39
2021-02-23 19:43:45,833 - root - INFO - [Iter 495/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4716e-01, accuracy=8.6824e-01, time spent=0.756s, total_time=1.28min
2021-02-23 19:43:47,312 - root - INFO - [Iter 510/572, Epoch 3] train loss=2.6619e-01, gnorm=3.3240e+00, lr=6.0194e-06, #samples processed=720, #sample per second=322.11
2021-02-23 19:43:48,061 - root - INFO - [Iter 510/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.4639e-01, accuracy=8.6911e-01, time spent=0.748s, total_time=1.32min
2021-02-23 19:43:49,506 - root - INFO - [Iter 525/572, Epoch 3] train loss=2.0779e-01, gnorm=2.5354e+00, lr=4.5631e-06, #samples processed=720, #sample per second=328.20
2021-02-23 19:43:50,264 - root - INFO - [Iter 525/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5519e-01, accuracy=8.6911e-01, time spent=0.758s, total_time=1.36min
2021-02-23 19:43:51,690 - root - INFO - [Iter 540/572, Epoch 3] train loss=2.6498e-01, gnorm=3.6586e+00, lr=3.1068e-06, #samples processed=720, #sample per second=329.82
2021-02-23 19:43:52,433 - root - INFO - [Iter 540/572, Epoch 3] valid accuracy=8.6911e-01, log_loss=4.5044e-01, accuracy=8.6911e-01, time spent=0.743s, total_time=1.39min
2021-02-23 19:43:53,926 - root - INFO - [Iter 555/572, Epoch 3] train loss=2.9774e-01, gnorm=4.7774e+00, lr=1.6505e-06, #samples processed=720, #sample per second=321.93
2021-02-23 19:43:54,681 - root - INFO - [Iter 555/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4645e-01, accuracy=8.6824e-01, time spent=0.754s, total_time=1.43min
2021-02-23 19:43:56,197 - root - INFO - [Iter 570/572, Epoch 3] train loss=2.0356e-01, gnorm=2.9091e+00, lr=1.9417e-07, #samples processed=720, #sample per second=317.11
2021-02-23 19:43:56,947 - root - INFO - [Iter 570/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4582e-01, accuracy=8.6824e-01, time spent=0.749s, total_time=1.47min
2021-02-23 19:43:57,899 - root - INFO - [Iter 572/572, Epoch 3] valid accuracy=8.6824e-01, log_loss=4.4583e-01, accuracy=8.6824e-01, time spent=0.760s, total_time=1.49min
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/text/src/autogluon/text/text_prediction/dataset.py:321: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df[col_name] = df[col_name].fillna('').apply(str)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Beginning AutoGluon training ...
AutoGluon will save models to "model5/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13323.89 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 259 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 260 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
2.1s = Fit runtime
2 features in original data used to generate 299 features in processed data.
Train Data (Processed) Memory Usage: 1.77 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.17s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: NeuralNetMXNet ...
0.8743 = Validation accuracy score
5.12s = Training runtime
0.04s = Validation runtime
Fitting model: NeuralNetFastAI ...
0.8499 = Validation accuracy score
17.51s = Training runtime
0.28s = Validation runtime
Fitting model: KNeighborsUnif ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: KNeighborsDist ...
0.8534 = Validation accuracy score
0.02s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini ...
0.8796 = Validation accuracy score
0.96s = Training runtime
0.08s = Validation runtime
Fitting model: RandomForestEntr ...
0.8761 = Validation accuracy score
1.0s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesGini ...
0.8534 = Validation accuracy score
1.06s = Training runtime
0.08s = Validation runtime
Fitting model: ExtraTreesEntr ...
0.8499 = Validation accuracy score
1.08s = Training runtime
0.08s = Validation runtime
Fitting model: LightGBM ...
0.8778 = Validation accuracy score
1.31s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8586 = Validation accuracy score
1.28s = Training runtime
0.01s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.9s = Training runtime
0.01s = Validation runtime
Fitting model: XGBoost ...
0.8778 = Validation accuracy score
2.42s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMLarge ...
0.8848 = Validation accuracy score
4.02s = Training runtime
0.01s = Validation runtime
Fitting model: WeightedEnsemble_L1 ...
0.8901 = Validation accuracy score
0.4s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 51.48s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model5/")
predictor_model5.leaderboard(dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings)), silent=True)
| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CatBoost | 0.886970 | 0.872600 | 0.014327 | 0.005457 | 0.895682 | 0.014327 | 0.005457 | 0.895682 | 0 | True | 11 |
| 1 | RandomForestGini | 0.886970 | 0.879581 | 0.120903 | 0.079546 | 0.959055 | 0.120903 | 0.079546 | 0.959055 | 0 | True | 5 |
| 2 | WeightedEnsemble_L1 | 0.886970 | 0.890052 | 10.760290 | 0.648267 | 29.524343 | 0.006635 | 0.000556 | 0.402977 | 1 | True | 14 |
| 3 | LightGBM | 0.885400 | 0.877836 | 0.012522 | 0.006084 | 1.307731 | 0.012522 | 0.006084 | 1.307731 | 0 | True | 9 |
| 4 | NeuralNetMXNet | 0.885400 | 0.874346 | 0.043936 | 0.037668 | 5.120788 | 0.043936 | 0.037668 | 5.120788 | 0 | True | 1 |
| 5 | LightGBMLarge | 0.883830 | 0.884817 | 0.028177 | 0.007567 | 4.015435 | 0.028177 | 0.007567 | 4.015435 | 0 | True | 13 |
| 6 | KNeighborsUnif | 0.883830 | 0.853403 | 0.031159 | 0.019027 | 0.017875 | 0.031159 | 0.019027 | 0.017875 | 0 | True | 3 |
| 7 | KNeighborsDist | 0.883830 | 0.853403 | 0.040395 | 0.017965 | 0.017411 | 0.040395 | 0.017965 | 0.017411 | 0 | True | 4 |
| 8 | XGBoost | 0.883830 | 0.877836 | 0.113737 | 0.011080 | 2.421276 | 0.113737 | 0.011080 | 2.421276 | 0 | True | 12 |
| 9 | RandomForestEntr | 0.883830 | 0.876091 | 0.119312 | 0.080413 | 0.995966 | 0.119312 | 0.080413 | 0.995966 | 0 | True | 6 |
| 10 | ExtraTreesGini | 0.875981 | 0.853403 | 0.177299 | 0.080179 | 1.055089 | 0.177299 | 0.080179 | 1.055089 | 0 | True | 7 |
| 11 | NeuralNetFastAI | 0.874411 | 0.849913 | 10.059017 | 0.280836 | 17.511071 | 10.059017 | 0.280836 | 17.511071 | 0 | True | 2 |
| 12 | LightGBMXT | 0.871272 | 0.858639 | 0.012264 | 0.006685 | 1.283650 | 0.012264 | 0.006685 | 1.283650 | 0 | True | 10 |
| 13 | ExtraTreesEntr | 0.869702 | 0.849913 | 0.169439 | 0.082980 | 1.080275 | 0.169439 | 0.082980 | 1.080275 | 0 | True | 8 |
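Note that Model 5's tabular models are trained on the raw features joined with the multimodal embeddings, so any data we predict on must go through the same embedding-extraction step. The following is a minimal illustrative sketch (not part of the original run) that reuses the development-set embeddings extracted above; for genuinely new data you would first call predictor_text_multimodal.extract_embedding on it.
# Sketch: predict with Model 5. The embedding columns produced by the
# TextPrediction model must be joined to the raw features, exactly as was
# done for training and for the leaderboard call above.
dev_features_with_emb = dev_df.join(pd.DataFrame(dev_sentence_multimodal_embeddings))
dev_pred_model5 = predictor_model5.predict(dev_features_with_emb)
print(dev_pred_model5.head())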
Model 6: Use a larger backbone¶
Now, we switch to a larger backbone: ELECTRA-base. We will see that performance improves with the larger backbone, but note that training takes longer and inference becomes more expensive.
from autogluon.text.text_prediction.text_prediction import ag_text_prediction_params
from autogluon.tabular.configs.hyperparameter_configs import get_hyperparameter_config
import copy
text_nn_params = ag_text_prediction_params.create('default_electra_base_no_hpo')
tabular_multimodel_hparam_v2 = {
'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}],
'CAT': {},
'TEXT_NN_V1': text_nn_params,
}
predictor_model6 = TabularPredictor(label=label, eval_metric='accuracy', path='model6').fit(
train_df, hyperparameters=tabular_multimodel_hparam_v2
)
Beginning AutoGluon training ...
AutoGluon will save models to "model6/"
AutoGluon Version: 0.1.0b20210223
Train Data Rows: 5727
Train Data Columns: 2
Preprocessing data ...
AutoGluon infers your prediction problem is: 'multiclass' (because dtype of label-column == int, but few unique label-values observed).
4 unique label values: [3, 2, 1, 0]
If 'multiclass' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Train Data Class Count: 4
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 13221.79 MB
Train Data (Original) Memory Usage: 1.0 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting IdentityFeatureGenerator...
Fitting RenameFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['Product_Description']
CountVectorizer fit with vocabulary size = 725
Warning: Due to memory constraints, ngram feature count is being reduced. Allocate more memory to maximize model quality.
Reducing Vectorizer vocab size from 725 to 256 to avoid OOM error
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('object', ['text']) : 1 | ['Product_Description']
Types of features in processed data (raw dtype, special dtypes):
('int', []) : 1 | ['Product_Type']
('int', ['binned', 'text_special']) : 38 | ['Product_Description.char_count', 'Product_Description.word_count', 'Product_Description.capital_ratio', 'Product_Description.lower_ratio', 'Product_Description.digit_ratio', ...]
('int', ['text_ngram']) : 257 | ['__nlp__.about', '__nlp__.all', '__nlp__.amp', '__nlp__.an', '__nlp__.an ipad', ...]
('object', ['text']) : 1 | ['Product_Description_raw_text']
2.2s = Fit runtime
2 features in original data used to generate 297 features in processed data.
Train Data (Processed) Memory Usage: 2.71 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 2.21s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric argument of fit()
Automatically generating train/validation split with holdout_frac=0.1, Train Rows: 5154, Val Rows: 573
Fitting model: LightGBM ...
0.8778 = Validation accuracy score
1.16s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT ...
0.8569 = Validation accuracy score
1.37s = Training runtime
0.02s = Validation runtime
Fitting model: CatBoost ...
0.8726 = Validation accuracy score
0.9s = Training runtime
0.02s = Validation runtime
Fitting model: TextNeuralNetV1 ...
All Logs will be saved to model6/models/TextNeuralNetV1/TextNeuralNetV1/main.log
Starting Hyperparameter Tuning ... (num_trials=1)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
0.9162 = Validation accuracy score
319.17s = Training runtime
2.26s = Validation runtime
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
Fitting model: WeightedEnsemble_L1 ...
0.9162 = Validation accuracy score
0.14s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 331.01s ...
TabularPredictor saved. To load, use: TabularPredictor.load("model6/")
predictor_model6.leaderboard(dev_df, silent=True)
/var/lib/jenkins/workspace/workspace/autogluon-tutorial-tabular-v3/venv/lib/python3.8/site-packages/mxnet/gluon/block.py:995: UserWarning: The 3-th input to HybridBlock is not used by any computation. Is this intended?
self._build_cache(*args)
| | model | score_test | score_val | pred_time_test | pred_time_val | fit_time | pred_time_test_marginal | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | TextNeuralNetV1 | 0.899529 | 0.916230 | 3.329292 | 2.263031 | 319.174685 | 3.329292 | 2.263031 | 319.174685 | 0 | True | 4 |
| 1 | WeightedEnsemble_L1 | 0.899529 | 0.916230 | 3.337787 | 2.263554 | 319.317756 | 0.008495 | 0.000524 | 0.143071 | 1 | True | 5 |
| 2 | CatBoost | 0.886970 | 0.872600 | 0.015280 | 0.015155 | 0.900599 | 0.015280 | 0.015155 | 0.900599 | 0 | True | 3 |
| 3 | LightGBM | 0.885400 | 0.877836 | 0.007218 | 0.006019 | 1.159134 | 0.007218 | 0.006019 | 1.159134 | 0 | True | 1 |
| 4 | LightGBMXT | 0.868132 | 0.856894 | 0.013989 | 0.017886 | 1.365870 | 0.013989 | 0.017886 | 1.365870 | 0 | True | 2 |
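Unlike Model 5, Model 6 consumes the raw text and categorical columns directly, so no embedding extraction is needed at prediction time. A minimal illustrative sketch (not part of the original notebook) on the development set:
# Sketch: Model 6 takes Product_Description and Product_Type as-is,
# so we can call predict / predict_proba on new rows directly.
dev_pred_model6 = predictor_model6.predict(dev_df)
dev_proba_model6 = predictor_model6.predict_proba(dev_df)
print(dev_pred_model6.head())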
Major Takeaways¶
After performing these comparisons, we have the following takeaways:
The multimodal text neural network used in TextPrediction is a good choice for tabular data with text and categorical features.
K-fold bagging / stacking and weighted ensembling are helpful.
A larger backbone improves performance. This aligns with the observations in recent papers, e.g., Scaling Laws for Autoregressive Generative Modeling.
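For reference, the second takeaway can be acted on directly when fitting a TabularPredictor. Below is a minimal illustrative sketch (not from the original notebook); the fold and stack-level counts are only examples, and the num_bag_folds / num_stack_levels argument names follow the AutoGluon 0.1 TabularPredictor.fit API.
# Sketch: request 5-fold bagging of every base model plus one stacking layer;
# AutoGluon adds WeightedEnsemble models at each level automatically.
predictor_bagged = TabularPredictor(label=label, eval_metric='accuracy', path='model_bagged').fit(
    train_df,
    hyperparameters=tabular_multimodel_hparam_v2,
    num_bag_folds=5,
    num_stack_levels=1,
)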