.. _sec_textprediction_customization:

Text Prediction - Customized Hyperparameter Search
==================================================

This tutorial teaches you how to control the hyperparameter tuning process in ``TextPrediction`` by specifying:

- A custom search space of candidate hyperparameter values to consider.
- Which hyperparameter optimization algorithm should be used to actually search through this space.

.. code:: python

    import numpy as np
    import warnings
    warnings.filterwarnings('ignore')
    np.random.seed(123)

Paraphrase Identification
-------------------------

We consider a Paraphrase Identification task for illustration. Given a pair of sentences, the goal is to predict whether or not one sentence is a restatement of the other (a binary classification task). Here we train models on the Microsoft Research Paraphrase Corpus dataset. For quick demonstration, we will subsample the training data and only use 800 samples.

.. code:: python

    from autogluon.core.utils.loaders import load_pd

    train_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/mrpc/train.parquet')
    dev_data = load_pd.load('https://autogluon-text.s3-accelerate.amazonaws.com/glue/mrpc/dev.parquet')
    rand_idx = np.random.permutation(np.arange(len(train_data)))[:800]
    train_data = train_data.iloc[rand_idx]
    train_data.reset_index(inplace=True, drop=True)
    train_data.head(10)

.. list-table::
   :header-rows: 1

   * -
     - sentence1
     - sentence2
     - label
   * - 0
     - Altria shares fell 2.2 percent or 96 cents to ...
     - Its shares fell $ 9.61 to $ 50.26 , ranking as...
     - 1
   * - 1
     - One of the Commission 's publicly stated goals...
     - The Commission has publicly said one of its go...
     - 1
   * - 2
     - " I don 't think my brain is going to go dead ...
     - In a conference call yesterday , he said , " I...
     - 1
   * - 3
     - " This will put a severe crimp in our reserves...
     - " This is going to put a severe crimp in our r...
     - 1
   * - 4
     - The Dow Jones industrials climbed more than 14...
     - The Dow Jones industrials briefly surpassed th...
     - 1
   * - 5
     - Massachusetts is one of 12 states that does no...
     - Massachusetts is one of 12 states without the ...
     - 1
   * - 6
     - Mr. Geoghan had been living in a protective cu...
     - He had been in protective custody since being ...
     - 1
   * - 7
     - Since December 2002 , Evans has been the vice ...
     - Evans is also the vice-chairman of the Federal...
     - 1
   * - 8
     - Business groups and border cities have raised ...
     - Business groups and border cities have raised ...
     - 1
   * - 9
     - A member of the chart-topping collective So So...
     - A member of the rap group So Solid Crew threw ...
     - 1
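
The subsampling step above (a random permutation of row positions, truncated to the first 800) can be sketched without any third-party dependency. This is only an illustration: ``subsample_indices`` is a hypothetical helper, the row count of 1000 is made up, and Python's ``random`` module will not reproduce the exact permutation that ``np.random.permutation`` yields for the same seed.

```python
import random


def subsample_indices(num_rows, num_keep, seed):
    """Return `num_keep` distinct row positions drawn from range(num_rows).

    Hypothetical helper mirroring the permutation-then-slice idiom
    used in the tutorial; not part of AutoGluon.
    """
    rng = random.Random(seed)        # private generator, so the draw is repeatable
    positions = list(range(num_rows))
    rng.shuffle(positions)           # random permutation of all row positions
    return positions[:num_keep]      # keep only the first `num_keep` of them


# Assumed sizes for illustration only.
picked = subsample_indices(num_rows=1000, num_keep=800, seed=123)
print(len(picked), len(set(picked)))
```

Because the positions come from a permutation, no row is selected twice, and fixing the seed makes the subsample repeatable across runs.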
.. code:: python

    from autogluon_contrib_nlp.data.tokenizers import MosesTokenizer

    tokenizer = MosesTokenizer('en')  # just used to display sentences
    row_index = 2
    print('Paraphrase example:')
    print('Sentence1: ', tokenizer.decode(train_data['sentence1'][row_index].split()))
    print('Sentence2: ', tokenizer.decode(train_data['sentence2'][row_index].split()))
    print('Label: ', train_data['label'][row_index])

    row_index = 3
    print('\nNot Paraphrase example:')
    print('Sentence1:', tokenizer.decode(train_data['sentence1'][row_index].split()))
    print('Sentence2:', tokenizer.decode(train_data['sentence2'][row_index].split()))
    print('Label:', train_data['label'][row_index])

.. parsed-literal::
    :class: output

    Paraphrase example:
    Sentence1: "I don't think my brain is going to go dead this afternoon or next week," he said.
    Sentence2: In a conference call yesterday, he said, "I don't think that my brain is going to go dead this afternoon or next week."
    Label: 1

    Not Paraphrase example:
    Sentence1: "This will put a severe crimp in our reserves," O'Keefe said Friday during a roundtable discussion with reporters at NASA headquarters.
    Sentence2: "This is going to put a severe crimp in our reserves," O'Keefe said during a breakfast with reporters.
    Label: 1

Perform HPO over a Customized Search Space with Random Search
-------------------------------------------------------------

To control which hyperparameter values are considered during ``fit()``, we specify the ``hyperparameters`` argument. Rather than specifying a particular fixed value for a hyperparameter, we can specify a space of values to search over via ``ag.space``. We can also specify which HPO algorithm to use for the search via ``search_strategy`` (a simple random search is specified below). In this example, we search for good values of the following hyperparameters:

- warmup (``optimization.warmup_portion``)
- learning rate (``optimization.lr``)
- dropout before the first task-specific layer (``model.network.agg_net.data_dropout``)
- layer-wise learning rate decay (``optimization.layerwise_lr_decay``)
- size of the task-specific layers (``model.network.agg_net.mid_units``)

The number of training epochs (``optimization.num_train_epochs``) is held fixed at 4.

.. code:: python

    import autogluon.core as ag
    from autogluon.text import TextPrediction as task

    hyperparameters = {
        'models': {
            'BertForTextPredictionBasic': {
                'search_space': {
                    'model.network.agg_net.mid_units': ag.space.Int(32, 128),
                    'model.network.agg_net.data_dropout': ag.space.Categorical(False, True),
                    'optimization.num_train_epochs': 4,
                    'optimization.warmup_portion': ag.space.Real(0.1, 0.2),
                    'optimization.layerwise_lr_decay': ag.space.Real(0.8, 1.0),
                    'optimization.lr': ag.space.Real(1E-5, 1E-4)
                }
            },
        },
        'hpo_params': {
            'search_strategy': 'random'  # perform HPO via simple random search
        }
    }

We can now call ``fit()`` with hyperparameter tuning over our custom search space. Below, ``num_trials`` controls the maximal number of different hyperparameter configurations for which AutoGluon will train models (at most 2 configurations in this case). To achieve good performance in your applications, you should use larger values of ``num_trials``, which may identify superior hyperparameter values but will require longer runtimes.

.. code:: python

    predictor_mrpc = task.fit(train_data,
                              label='label',
                              hyperparameters=hyperparameters,
                              num_trials=2,  # increase this to achieve good performance in your applications
                              time_limits=60 * 2,
                              ngpus_per_trial=1,
                              seed=123,
                              output_directory='./ag_mrpc_random_search')

.. parsed-literal::
    :class: output

    2021-02-23 19:27:23,233 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_random_search/ag_text_prediction.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_random_search/ag_text_prediction.log
    2021-02-23 19:27:23,248 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Train Dataset:
    2021-02-23 19:27:23,249 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/116.909375/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/117.7421875/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[208, 432] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/116.909375/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/117.7421875/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[208, 432] )
    2021-02-23 19:27:23,250 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Tuning Dataset:
    2021-02-23 19:27:23,251 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=54/119.2625/195 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=50/118.3125/199 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[60, 100] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=54/119.2625/195 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=50/118.3125/199 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[60, 100] )
    WARNING:autogluon.core.utils.multiprocessing_utils:WARNING: changing multiprocessing start method to forkserver
    2021-02-23 19:27:23,258 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_random_search/main.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_random_search/main.log

.. parsed-literal::
    :class: output

    0%| | 0/2 [00:00

Here we specify **bayesopt** as the searcher.

.. code:: python

    hyperparameters['hpo_params'] = {
        'search_strategy': 'bayesopt'
    }

    predictor_mrpc_bo = task.fit(train_data,
                                 label='label',
                                 hyperparameters=hyperparameters,
                                 time_limits=60 * 2,
                                 num_trials=2,  # increase this to get good performance in your applications
                                 ngpus_per_trial=1,
                                 seed=123,
                                 output_directory='./ag_mrpc_custom_space_fifo_bo')

.. parsed-literal::
    :class: output

    2021-02-23 19:28:18,743 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_fifo_bo/ag_text_prediction.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_fifo_bo/ag_text_prediction.log
    2021-02-23 19:28:18,757 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Train Dataset:
    2021-02-23 19:28:18,758 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/115.9390625/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/116.503125/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[216, 424] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/115.9390625/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/116.503125/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[216, 424] )
    2021-02-23 19:28:18,759 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Tuning Dataset:
    2021-02-23 19:28:18,760 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=54/123.14375/195 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=59/123.26875/208 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[52, 108] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=54/123.14375/195 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=59/123.26875/208 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[52, 108] )
    2021-02-23 19:28:18,762 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_fifo_bo/main.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_fifo_bo/main.log

.. parsed-literal::
    :class: output

    0%| | 0/2 [00:00

Next, we use Hyperband for HPO. Hyperband will try multiple hyperparameter configurations simultaneously and will early stop training under poor configurations to free compute resources for exploring new hyperparameter configurations. It may be able to identify good hyperparameter values more quickly than other search strategies in your applications.

.. code:: python

    scheduler_options = {'max_t': 40}  # Maximal number of epochs for training the neural network
    hyperparameters['hpo_params'] = {
        'search_strategy': 'hyperband',
        'scheduler_options': scheduler_options
    }

.. code:: python

    predictor_mrpc_hyperband = task.fit(train_data,
                                        label='label',
                                        hyperparameters=hyperparameters,
                                        time_limits=60 * 2,
                                        ngpus_per_trial=1,
                                        seed=123,
                                        output_directory='./ag_mrpc_custom_space_hyperband')

.. parsed-literal::
    :class: output

    2021-02-23 19:29:17,426 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_hyperband/ag_text_prediction.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_hyperband/ag_text_prediction.log
    2021-02-23 19:29:17,440 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Train Dataset:
    2021-02-23 19:29:17,441 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=46/116.8921875/197 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/117.6984375/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[215, 425] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=46/116.8921875/197 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=42/117.6984375/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[215, 425] )
    2021-02-23 19:29:17,442 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Tuning Dataset:
    2021-02-23 19:29:17,443 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=44/119.33125/200 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=50/118.4875/207 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[53, 107] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=44/119.33125/200 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=50/118.4875/207 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[53, 107] )
    2021-02-23 19:29:17,446 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_hyperband/main.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_hyperband/main.log

.. parsed-literal::
    :class: output

    0%| | 0/3 [00:00

Here we specify **bayesopt_hyperband** as the searcher.

.. code:: python

    scheduler_options = {'max_t': 40}
    hyperparameters['hpo_params'] = {
        'search_strategy': 'bayesopt_hyperband',
        'scheduler_options': scheduler_options
    }

.. code:: python

    predictor_mrpc_bohb = task.fit(
        train_data,
        label='label',
        hyperparameters=hyperparameters,
        time_limits=60 * 2,
        ngpus_per_trial=1,
        seed=123,
        output_directory='./ag_mrpc_custom_space_bohb')

.. parsed-literal::
    :class: output

    2021-02-23 19:30:32,585 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_bohb/ag_text_prediction.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_bohb/ag_text_prediction.log
    2021-02-23 19:30:32,599 - autogluon.text.text_prediction.text_prediction - INFO - Train Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Train Dataset:
    2021-02-23 19:30:32,600 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/117.4984375/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=46/117.54375/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[206, 434] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=640/0 length, min/avg/max=44/117.4984375/200 )
    - Text( name="sentence2" #total/missing=640/0 length, min/avg/max=46/117.54375/210 )
    - Categorical( name="label" #total/missing=640/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[206, 434] )
    2021-02-23 19:30:32,601 - autogluon.text.text_prediction.text_prediction - INFO - Tuning Dataset:
    INFO:autogluon.text.text_prediction.text_prediction:Tuning Dataset:
    2021-02-23 19:30:32,602 - autogluon.text.text_prediction.text_prediction - INFO - Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=51/116.90625/193 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=42/119.10625/208 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[62, 98] )
    INFO:autogluon.text.text_prediction.text_prediction:Columns:
    - Text( name="sentence1" #total/missing=160/0 length, min/avg/max=51/116.90625/193 )
    - Text( name="sentence2" #total/missing=160/0 length, min/avg/max=42/119.10625/208 )
    - Categorical( name="label" #total/missing=160/0 num_class (total/non_special)=2/2 categories=[0, 1] freq=[62, 98] )
    2021-02-23 19:30:32,604 - autogluon.text.text_prediction.text_prediction - INFO - All Logs will be saved to ./ag_mrpc_custom_space_bohb/main.log
    INFO:autogluon.text.text_prediction.text_prediction:All Logs will be saved to ./ag_mrpc_custom_space_bohb/main.log

.. parsed-literal::
    :class: output

    0%| | 0/3 [00:00
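
To make the early-stopping idea behind Hyperband more concrete, the sketch below runs plain successive halving over randomly sampled configurations. It is a toy illustration under stated assumptions, not AutoGluon's scheduler: ``sample_config``, ``toy_score``, and ``successive_halving`` are hypothetical names, the score function is invented as a stand-in for validation accuracy, and only the sampling ranges and the cap of 40 epochs are taken from this tutorial. Hyperband itself runs several such halving rounds with different starting budgets.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration uniformly from ranges like those in the tutorial."""
    return {
        'lr': rng.uniform(1e-5, 1e-4),
        'warmup_portion': rng.uniform(0.1, 0.2),
        'layerwise_lr_decay': rng.uniform(0.8, 1.0),
    }


def toy_score(config, epochs):
    """Invented stand-in for validation accuracy after `epochs` epochs."""
    closeness = 1.0 - abs(math.log10(config['lr']) + 4.5)  # peaks at lr = 10**-4.5
    return closeness * (1.0 - 0.5 ** epochs)               # improves with more epochs


def successive_halving(num_configs, min_epochs, max_epochs, seed):
    """Repeatedly keep the better half of the configs and double their budget."""
    rng = random.Random(seed)
    survivors = [sample_config(rng) for _ in range(num_configs)]
    epochs = min_epochs
    history = []  # (epochs per config, number of configs evaluated) for each round
    while True:
        ranked = sorted(survivors, key=lambda c: toy_score(c, epochs), reverse=True)
        history.append((epochs, len(ranked)))
        if len(ranked) == 1 or epochs >= max_epochs:
            return ranked[0], history
        survivors = ranked[:max(1, len(ranked) // 2)]  # stop the worse half early
        epochs = min(epochs * 2, max_epochs)           # survivors train for longer


best, history = successive_halving(num_configs=8, min_epochs=1, max_epochs=40, seed=123)
print(history)
print(best)
```

Most configurations are evaluated only at a small budget, which is why this family of methods can cover more of the search space than training every configuration to completion in the same amount of time.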