.. _sec_textquick:

Text Classification - Quick Start
=================================

Note: ``TextClassification`` is in preview mode and is not feature
complete. While the tutorial described below is functional, using
``TextClassification`` on custom datasets is not yet supported. As an
alternative, text data can be passed to ``TabularPrediction`` in tabular
format, which has text-feature support (a brief sketch is given at the
end of this tutorial).

We adopt the task of text classification as a running example to
illustrate basic usage of AutoGluon's NLP capability.

The AutoGluon Text functionality depends on the `GluonNLP
<https://gluon-nlp.mxnet.io/>`__ package. Thus, in order to use
AutoGluon-Text, you will need to install GluonNLP via
``pip install gluonnlp==0.8.1``.

In this tutorial, we use sentiment analysis as a text classification
example. We will load sentences and their corresponding labels
(sentiments) into AutoGluon and use this data to obtain a neural network
that can classify new sentences. Unlike traditional machine learning,
where we need to define the neural network manually and specify the
hyperparameters of the training process, a single call to AutoGluon's
``fit`` function automatically trains many models under thousands of
different hyperparameter configurations and then returns the best one.

We begin by specifying ``TextClassification`` as our task of interest:

.. code:: python

    import autogluon as ag
    from autogluon import TextClassification as task

Create AutoGluon Dataset
------------------------

We use a subset of the Stanford Sentiment Treebank
(`SST <https://nlp.stanford.edu/sentiment/>`__). The original dataset
consists of sentences from movie reviews together with human annotations
of their sentiment. The task is to classify whether a given sentence has
positive or negative sentiment (binary classification).

.. code:: python

    dataset = task.Dataset(name='ToySST')

The above call gives us the proper train/validation/test split of the
SST dataset.

Use AutoGluon to fit Models
---------------------------

Now, we want to obtain a neural network classifier using AutoGluon. In
its default configuration, rather than attempting to train a complex
model from scratch on our data, AutoGluon fine-tunes neural networks
that have already been pretrained on a large-scale text corpus such as
Wikicorpus. Although our dataset involves entirely different text, the
lower-level language features captured in the representations of the
pretrained network are likely to remain useful for our own text dataset.

While we primarily stick with default configurations in this Beginner
tutorial, the Advanced tutorial covers various options that you can
specify for greater control over the training process.

With just a single call to AutoGluon's ``fit`` function, AutoGluon will
train many models with different hyperparameter configurations and
return the best model. However, neural network training can be quite
time-consuming. To ensure quick runtimes, we tell AutoGluon to obey
strict limits: ``epochs`` specifies how much computational effort can be
devoted to training any single network, while ``time_limits`` (in
seconds) specifies how much time ``fit`` has to return a model (more
precisely, new training runs are only started as long as ``time_limits``
has not yet been reached). For demo purposes, we specify only small
values for ``time_limits`` and ``epochs``:

.. code:: python

    predictor = task.fit(dataset, epochs=1, time_limits=30)

.. parsed-literal::
    :class: output

    `TextClassification` is in preview mode. Please feel free to request new features in issues if it is not covered in the current implementation. If your dataset is in tabular format, you could also try out our `TabularPrediction` module.
    scheduler_options: Key 'training_history_callback_delta_secs': Imputing default value 60
    scheduler_options: Key 'delay_get_config': Imputing default value True
    Starting Experiments
    Num of Finished Tasks is 0
    Time out (secs) is 30

.. parsed-literal::
    :class: output

    scheduler: FIFOScheduler(
    DistributedResourceManager{
      (Remote: Remote REMOTE_ID: 0,
       , Resource: NodeResourceManager(8 CPUs, 1 GPUs))
    })

.. parsed-literal::
    :class: output

    Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
    100%|██████████| 26/26 [00:14<00:00,  1.78it/s]
    100%|██████████| 1/1 [00:00<00:00,  5.28it/s]
    validation metrics:accuracy:0.6250
    Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
    100%|██████████| 26/26 [00:14<00:00,  1.78it/s]
    100%|██████████| 1/1 [00:00<00:00,  6.19it/s]
    validation metrics:accuracy:0.7500
    Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
    100%|██████████| 26/26 [00:14<00:00,  1.76it/s]
    100%|██████████| 1/1 [00:00<00:00,  5.86it/s]
    validation metrics:accuracy:0.6250
    Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
    100%|██████████| 26/26 [00:14<00:00,  1.74it/s]
    100%|██████████| 1/1 [00:00<00:00,  6.07it/s]
    validation metrics:accuracy:0.7500
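The limits above are deliberately tiny so that the demo finishes
quickly. On your own machine you would typically grant a larger budget;
the call below is a minimal sketch of what that might look like (the
particular numbers are purely illustrative, not a recommendation):

.. code:: python

    # Same fit call, but with a larger (illustrative) training budget:
    # up to 5 epochs per model and a 30-minute overall time limit.
    predictor_long = task.fit(dataset, epochs=5, time_limits=30*60)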
Within ``fit``, the model with the best hyperparameter configuration is
selected based on its validation accuracy after being trained on the
data in the training split. The best top-1 accuracy achieved on the
validation set is:

.. code:: python

    print('Top-1 val acc: %.3f' % predictor.results['best_reward'])

.. parsed-literal::
    :class: output

    Top-1 val acc: 0.750

Within ``fit``, this model is also finally fitted on our entire dataset
(i.e., merging the training and validation splits) using the same
optimal hyperparameter configuration. The resulting model is the final
model to be applied to classify new text.

We can ``evaluate`` the final model produced by ``fit`` on the test
split (recall that the dataset constructed above already contains the
proper train/validation/test split):

.. code:: python

    test_acc = predictor.evaluate(dataset)
    print('Top-1 test acc: %.3f' % test_acc)

.. parsed-literal::
    :class: output

    Top-1 test acc: 0.750

Given an example sentence, we can easily use the final model to
``predict`` the label (and the conditional class probability):

.. code:: python

    sentence = 'I feel this is awesome!'
    ind = predictor.predict(sentence)
    print('The input sentence sentiment is classified as [%d].' % ind.asscalar())

.. parsed-literal::
    :class: output

    The input sentence sentiment is classified as [1].
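If you have several new sentences to classify, one simple possibility is
to call ``predict`` on each of them in turn. The snippet below is a
minimal sketch under that assumption; the example sentences are made up
purely for illustration:

.. code:: python

    # Classify a few made-up sentences one at a time with the fitted predictor.
    sentences = ['The plot was predictable and dull.',
                 'A heartfelt and beautifully acted film.']
    for s in sentences:
        ind = predictor.predict(s)  # predicted class index, returned as an MXNet ndarray
        print('%s -> class [%d]' % (s, ind.asscalar()))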
The ``results`` object returned by ``fit`` contains summaries describing
various aspects of the training process. For example, we can inspect the
best hyperparameter configuration corresponding to the final model,
which achieved the above results:

.. code:: python

    print('The best configuration is:')
    print(predictor.results['best_config'])

.. parsed-literal::
    :class: output

    The best configuration is:
    {'lr': 4.442672673345054e-05, 'net.choice': 0, 'pretrained_dataset.choice': 0}
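Finally, as mentioned in the note at the top of this tutorial, custom
datasets are not yet supported by ``TextClassification``, but text data
can be handled by ``TabularPrediction`` when it is stored in tabular
form. The sketch below assumes a hypothetical CSV file ``train.csv``
containing a text column and a label column named ``label``; adjust the
file path and column name to match your own data:

.. code:: python

    from autogluon import TabularPrediction as tabular_task

    # Hypothetical CSV with one text column and one label column named 'label'.
    train_data = tabular_task.Dataset(file_path='train.csv')
    tabular_predictor = tabular_task.fit(train_data=train_data, label='label')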