Text Classification - Quick Start¶
Note: TextClassification is in preview mode and is not feature
complete. While the tutorial described below is functional, using
TextClassification on custom datasets is not yet supported. As an
alternative, text data can be passed to TabularPrediction in tabular
format, which has text feature support.
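For reference, here is a minimal sketch of that alternative, assuming the text and labels live in a pandas DataFrame; the column names text and label, as well as passing the DataFrame via df=, are assumptions and may differ in your setup.
import pandas as pd
from autogluon import TabularPrediction as tabular_task

# Hypothetical toy DataFrame with one text column and one label column.
train_df = pd.DataFrame({
    'text': ['this movie was great', 'the plot was terrible'],
    'label': [1, 0],
})

# Wrap the DataFrame as a tabular dataset and let TabularPrediction treat
# the text column as a text feature (sketch only; API details may vary).
train_data = tabular_task.Dataset(df=train_df)
tabular_predictor = tabular_task.fit(train_data=train_data, label='label')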
We adopt the task of Text Classification as a running example to illustrate basic usage of AutoGluon’s NLP capability.
The AutoGluon Text functionality depends on the
GluonNLP package. Thus, in order to
use AutoGluon-Text, you will need to install GluonNLP via
pip install gluonnlp==0.8.1
In this tutorial, we are using sentiment analysis as a text
classification example. We will load sentences and the corresponding
labels (sentiment) into AutoGluon and use this data to obtain a neural
network that can classify new sentences. Unlike traditional machine
learning, where we need to manually define the neural network and
specify its hyperparameters for training, a single call to AutoGluon’s
fit function will automatically train many models with different
hyperparameter configurations and return the best model.
We begin by specifying TextClassification as our task of interest:
import autogluon as ag
from autogluon import TextClassification as task
Create AutoGluon Dataset¶
We are using a subset of the Stanford Sentiment Treebank (SST). The original dataset consists of sentences from movie reviews and human annotations of their sentiment. The task is to classify whether a given sentence has positive or negative sentiment (binary classification).
dataset = task.Dataset(name='ToySST')
In the above call, we obtain the proper train/validation/test splits of the SST dataset.
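To verify what was loaded, you can inspect the dataset object directly; what gets printed, and which attributes it exposes, may vary between AutoGluon versions.
# Print a summary of the loaded dataset; the exact information shown
# depends on the AutoGluon version.
print(dataset)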
Use AutoGluon to fit Models¶
Now, we want to obtain a neural network classifier using AutoGluon. In the default configuration, rather than attempting to train complex models from scratch using our data, AutoGluon fine-tunes neural networks that have already been pretrained on large-scale text datasets such as Wikicorpus. Although our dataset involves entirely different text, lower-level language features captured in the representations of the pretrained network (such as word and phrase semantics) are likely to remain useful for our own text dataset.
While we primarily stick with default configurations in this Beginner
tutorial, the Advanced tutorial covers various options that you can
specify for greater control over the training process. With just a
single call to AutoGluon’s fit function, AutoGluon will train many
models with different hyperparameter configurations and return the best
model.
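As an illustrative sketch of such options, a more customized call might pass search spaces for individual hyperparameters to fit; the argument names lr and num_trials below are assumptions, not necessarily the documented API, and may differ between AutoGluon versions.
# Hedged sketch of a more customized fit call; argument names such as
# `lr` and `num_trials` are assumptions (see the Advanced tutorial for the
# options actually supported by your AutoGluon version).
predictor = task.fit(dataset,
                     lr=ag.space.Real(1e-5, 1e-4, log=True),  # search over learning rates
                     num_trials=2,                            # number of configurations to try
                     epochs=1,
                     time_limits=60)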
However, neural network training can be quite time-consuming. To ensure
quick runtimes, we tell AutoGluon to obey strict limits: epochs
specifies how much computational effort can be devoted to training any
single network, while time_limits (in seconds) specifies how much time
fit has to return a model (more precisely, new training runs are started
only as long as time_limits has not been reached). For demo purposes, we
specify only small values for time_limits and epochs:
predictor = task.fit(dataset, epochs=1, time_limits=30)
TextClassification is in preview mode. Please feel free to request new features in issues if it is not covered in the current implementation. If your dataset is in tabular format, you could also try out our TabularPrediction module.
scheduler_options: Key 'training_history_callback_delta_secs': Imputing default value 60
scheduler_options: Key 'delay_get_config': Imputing default value True
Starting Experiments
Num of Finished Tasks is 0
Time out (secs) is 30
scheduler: FIFOScheduler(
DistributedResourceManager{
(Remote: Remote REMOTE_ID: 0,
<Remote: 'inproc://172.31.45.231/23380/1' processes=1 threads=8, memory=33.24 GB>, Resource: NodeResourceManager(8 CPUs, 1 GPUs))
})
/var/lib/jenkins/miniconda3/envs/autogluon_docs/lib/python3.7/site-packages/mxnet/optimizer/optimizer.py:167: UserWarning: WARNING: New optimizer gluonnlp.optimizer.lamb.LAMB is overriding existing optimizer mxnet.optimizer.optimizer.LAMB
  Optimizer.opt_registry[name].__name__))
Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 27/27 [00:14<00:00, 1.83it/s]
100%|██████████| 1/1 [00:00<00:00, 5.97it/s]
validation metrics:accuracy:0.3750
Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 27/27 [00:15<00:00, 1.79it/s]
100%|██████████| 1/1 [00:00<00:00, 5.89it/s]
validation metrics:accuracy:0.6250
Within fit, the model with the best hyperparameter configuration is
selected based on its validation accuracy after being trained on the
data in the training split.
The best Top-1 accuracy achieved on the validation set is:
print('Top-1 val acc: %.3f' % predictor.results['best_reward'])
Top-1 val acc: 0.625
Within fit, this model is also finally fitted on our entire dataset
(i.e., merging training+validation) using the same optimal
hyperparameter configuration. The resulting model is considered the
final model to be applied to classify new text.
We now construct a test dataset in the same way as we did with the train
dataset, and then evaluate the final model produced by fit on
the test data:
test_acc = predictor.evaluate(dataset)
print('Top-1 test acc: %.3f' % test_acc)
Top-1 test acc: 0.625
Given an example sentence, we can easily use the final model to
predict the label (and the conditional class-probability):
sentence = 'I feel this is awesome!'
ind = predictor.predict(sentence)
print('The input sentence sentiment is classified as [%d].' % ind.asscalar())
The input sentence sentiment is classified as [1].
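The prose above also mentions the conditional class-probability. A hedged sketch of retrieving it is shown below; the method name predict_proba is an assumption and may differ between AutoGluon versions.
# Hedged sketch: assumes the predictor exposes a `predict_proba` method that
# returns the class probabilities for the input sentence.
proba = predictor.predict_proba(sentence)
print('Predicted class probabilities:', proba)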
The results object returned by fit contains summaries describing
various aspects of the training process. For example, we can inspect the
best hyperparameter configuration corresponding to the final model which
achieved the above (best) results:
print('The best configuration is:')
print(predictor.results['best_config'])
The best configuration is:
{'lr': 0.0001424912879717965, 'net▁choice': 0, 'pretrained_dataset▁choice': 0}
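Beyond best_reward and best_config, the results object may record additional entries (for example, total runtime or per-trial metadata); which keys are present depends on the AutoGluon version. A simple way to list everything it contains:
# Print every entry recorded in the results dict; keys other than
# 'best_reward' and 'best_config' depend on the AutoGluon version.
for key, value in predictor.results.items():
    print('%s: %s' % (key, value))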