Searchable Objects¶

When defining custom Python objects such as network architectures, or specialized optimizers, it may be hard to decide what values to set for all of their attributes. AutoGluon provides an API that allows you to instead specify a search space of possible values to consider for such attributes, within which the optimal value will be automatically searched for at runtime. This tutorial demonstrates how easy this is to do, without having to modify your existing code at all!

Example for Constructing a Network¶

This tutorial covers an example of selecting a neural network’s architecture as a hyperparameter optimization (HPO) task. If you are interested in efficient neural architecture search (NAS), please refer to this other tutorial instead: sec_proxyless_ .

CIFAR ResNet in GluonCV¶

GluonCV provides CIFARResNet, which allow user to specify how many layers at each stage. For example, we can construct a CIFAR ResNet with only 1 layer per stage:

from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1

layers = [1, 1, 1]
channels = [16, 16, 32, 64]
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)

/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/venv/lib/python3.7/site-packages/gluoncv/__init__.py:40: UserWarning: Both mxnet==1.7.0 and torch==1.9.0+cu102 are installed. You might encounter increased GPU memory footprint if both framework are used at the same time.
  warnings.warn(f'Both mxnet=={mx.__version__} and torch=={torch.__version__} are installed. '

We can visualize the network:

import autogluon.core as ag
from autogluon.vision.utils import plot_network

plot_network(net, (1, 3, 32, 32))

../../_images/output_object_d3e86d_3_0.svg

Searchable Network Architecture Using AutoGluon Object¶

autogluon.obj() enables customized search space to any user defined class. It can also be used within autogluon.Categorical() if you have multiple networks to choose from.

@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)

Create one network instance and print the configuration space:

mynet=MyCifarResNet()
print(mynet.cs)

Configuration space object:
  Hyperparameters:
    nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
    nstage2, Type: UniformInteger, Range: [2, 4], Default: 3

We can also overwrite existing search spaces:

mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)

Configuration space object:
  Hyperparameters:
    nstage2, Type: UniformInteger, Range: [5, 10], Default: 8

Decorate Existing Class¶

We can also use autogluon.obj() to easily decorate any existing classes. For example, if we want to search learning rate and weight decay for Adam optimizer, we only need to add a decorator:

from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass

Then we can create an instance:

myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True), wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)

Configuration space object:
  Hyperparameters:
    learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
    wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale

Launch Experiments Using AutoGluon Object¶

AutoGluon Object is compatible with Fit API in AutoGluon tasks, and also works with user-defined training scripts using autogluon.autogluon_register_args(). We can start fitting:

from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit('cifar10', hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1}, ngpus_per_trial=1)

time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
root.train.epochs    10 != 1
root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
root.train.rec_val   ~/.mxnet/datasets/imagenet/rec/val.rec != auto
root.train.lr        0.1 != 0.01
root.train.early_stop_patience -1 != 10
root.train.early_stop_baseline 0.0 != -inf
root.train.num_training_samples 1281167 != -1
root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
root.train.data_dir  ~/.mxnet/datasets/imagenet != auto
root.train.num_workers 4 != 0
root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
root.train.early_stop_max_value 1.0 != inf
root.train.batch_size 128 != 16
root.img_cls.model   resnet50_v1 != resnet50_v1b
root.valid.batch_size 128 != 16
root.valid.num_workers 4 != 0
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49] Speed: 103.952959 samples/sec   accuracy=0.183750       lr=0.010000
Epoch[0] Batch [99] Speed: 104.316909 samples/sec   accuracy=0.240625       lr=0.010000
Epoch[0] Batch [149]        Speed: 104.649139 samples/sec   accuracy=0.287083       lr=0.010000
Epoch[0] Batch [199]        Speed: 104.185913 samples/sec   accuracy=0.315937       lr=0.010000
Epoch[0] Batch [249]        Speed: 104.121571 samples/sec   accuracy=0.347500       lr=0.010000
Epoch[0] Batch [299]        Speed: 103.875297 samples/sec   accuracy=0.365208       lr=0.010000
Epoch[0] Batch [349]        Speed: 103.677087 samples/sec   accuracy=0.383750       lr=0.010000
Epoch[0] Batch [399]        Speed: 103.537669 samples/sec   accuracy=0.403125       lr=0.010000
Epoch[0] Batch [449]        Speed: 103.265421 samples/sec   accuracy=0.416528       lr=0.010000
Epoch[0] Batch [499]        Speed: 103.178163 samples/sec   accuracy=0.431000       lr=0.010000
Epoch[0] Batch [549]        Speed: 102.955286 samples/sec   accuracy=0.440795       lr=0.010000
Epoch[0] Batch [599]        Speed: 102.578884 samples/sec   accuracy=0.452396       lr=0.010000
Epoch[0] Batch [649]        Speed: 102.759640 samples/sec   accuracy=0.464519       lr=0.010000
Epoch[0] Batch [699]        Speed: 102.643537 samples/sec   accuracy=0.474196       lr=0.010000
Epoch[0] Batch [749]        Speed: 102.393588 samples/sec   accuracy=0.482000       lr=0.010000
Epoch[0] Batch [799]        Speed: 102.094896 samples/sec   accuracy=0.490234       lr=0.010000
Epoch[0] Batch [849]        Speed: 102.055022 samples/sec   accuracy=0.497794       lr=0.010000
Epoch[0] Batch [899]        Speed: 101.968473 samples/sec   accuracy=0.503958       lr=0.010000
Epoch[0] Batch [949]        Speed: 101.485773 samples/sec   accuracy=0.510066       lr=0.010000
Epoch[0] Batch [999]        Speed: 101.425954 samples/sec   accuracy=0.517000       lr=0.010000
Epoch[0] Batch [1049]       Speed: 101.131413 samples/sec   accuracy=0.522262       lr=0.010000
Epoch[0] Batch [1099]       Speed: 100.886963 samples/sec   accuracy=0.526875       lr=0.010000
Epoch[0] Batch [1149]       Speed: 100.628574 samples/sec   accuracy=0.531522       lr=0.010000
Epoch[0] Batch [1199]       Speed: 100.348758 samples/sec   accuracy=0.535990       lr=0.010000
Epoch[0] Batch [1249]       Speed: 99.841114 samples/sec    accuracy=0.540750       lr=0.010000
Epoch[0] Batch [1299]       Speed: 99.645259 samples/sec    accuracy=0.545481       lr=0.010000
Epoch[0] Batch [1349]       Speed: 99.262510 samples/sec    accuracy=0.550093       lr=0.010000
Epoch[0] Batch [1399]       Speed: 98.918671 samples/sec    accuracy=0.553616       lr=0.010000
Epoch[0] Batch [1449]       Speed: 98.446586 samples/sec    accuracy=0.557026       lr=0.010000
Epoch[0] Batch [1499]       Speed: 98.009802 samples/sec    accuracy=0.560917       lr=0.010000
Epoch[0] Batch [1549]       Speed: 99.073632 samples/sec    accuracy=0.564113       lr=0.010000
Epoch[0] Batch [1599]       Speed: 100.367971 samples/sec   accuracy=0.567656       lr=0.010000
Epoch[0] Batch [1649]       Speed: 101.205901 samples/sec   accuracy=0.570720       lr=0.010000
Epoch[0] Batch [1699]       Speed: 101.743144 samples/sec   accuracy=0.574301       lr=0.010000
Epoch[0] Batch [1749]       Speed: 101.937068 samples/sec   accuracy=0.576821       lr=0.010000
Epoch[0] Batch [1799]       Speed: 102.209711 samples/sec   accuracy=0.579514       lr=0.010000
Epoch[0] Batch [1849]       Speed: 102.416528 samples/sec   accuracy=0.582466       lr=0.010000
Epoch[0] Batch [1899]       Speed: 102.257414 samples/sec   accuracy=0.584704       lr=0.010000
Epoch[0] Batch [1949]       Speed: 102.646150 samples/sec   accuracy=0.587404       lr=0.010000
Epoch[0] Batch [1999]       Speed: 102.579837 samples/sec   accuracy=0.590187       lr=0.010000
Epoch[0] Batch [2049]       Speed: 102.623264 samples/sec   accuracy=0.592683       lr=0.010000
Epoch[0] Batch [2099]       Speed: 102.620665 samples/sec   accuracy=0.595357       lr=0.010000
Epoch[0] Batch [2149]       Speed: 102.554401 samples/sec   accuracy=0.597878       lr=0.010000
Epoch[0] Batch [2199]       Speed: 102.168652 samples/sec   accuracy=0.599659       lr=0.010000
Epoch[0] Batch [2249]       Speed: 102.354136 samples/sec   accuracy=0.601889       lr=0.010000
Epoch[0] Batch [2299]       Speed: 102.433636 samples/sec   accuracy=0.604266       lr=0.010000
Epoch[0] Batch [2349]       Speed: 102.441967 samples/sec   accuracy=0.605878       lr=0.010000
Epoch[0] Batch [2399]       Speed: 102.274417 samples/sec   accuracy=0.607474       lr=0.010000
Epoch[0] Batch [2449]       Speed: 102.193720 samples/sec   accuracy=0.609643       lr=0.010000
Epoch[0] Batch [2499]       Speed: 102.043651 samples/sec   accuracy=0.611125       lr=0.010000
Epoch[0] Batch [2549]       Speed: 101.977045 samples/sec   accuracy=0.612549       lr=0.010000
Epoch[0] Batch [2599]       Speed: 101.846221 samples/sec   accuracy=0.614423       lr=0.010000
Epoch[0] Batch [2649]       Speed: 101.715910 samples/sec   accuracy=0.616203       lr=0.010000
Epoch[0] Batch [2699]       Speed: 101.703692 samples/sec   accuracy=0.617639       lr=0.010000
Epoch[0] Batch [2749]       Speed: 101.350481 samples/sec   accuracy=0.619159       lr=0.010000
Epoch[0] Batch [2799]       Speed: 101.368196 samples/sec   accuracy=0.621116       lr=0.010000
Epoch[0] Batch [2849]       Speed: 101.020412 samples/sec   accuracy=0.622588       lr=0.010000
Epoch[0] Batch [2899]       Speed: 100.798044 samples/sec   accuracy=0.623750       lr=0.010000
Epoch[0] Batch [2949]       Speed: 100.704037 samples/sec   accuracy=0.625169       lr=0.010000
Epoch[0] Batch [2999]       Speed: 100.269781 samples/sec   accuracy=0.626979       lr=0.010000
Epoch[0] Batch [3049]       Speed: 100.168320 samples/sec   accuracy=0.628238       lr=0.010000
Epoch[0] Batch [3099]       Speed: 99.967839 samples/sec    accuracy=0.629496       lr=0.010000
Epoch[0] Batch [3149]       Speed: 99.598852 samples/sec    accuracy=0.630873       lr=0.010000
Epoch[0] Batch [3199]       Speed: 99.237980 samples/sec    accuracy=0.631934       lr=0.010000
Epoch[0] Batch [3249]       Speed: 99.026619 samples/sec    accuracy=0.633192       lr=0.010000
Epoch[0] Batch [3299]       Speed: 98.723223 samples/sec    accuracy=0.634659       lr=0.010000
Epoch[0] Batch [3349]       Speed: 98.060766 samples/sec    accuracy=0.635877       lr=0.010000
[Epoch 0] training: accuracy=0.636648
[Epoch 0] speed: 101 samples/sec    time cost: 531.942785
[Epoch 0] validation: top1=0.909167 top5=0.997667
[Epoch 0] Current best top-1: 0.909167 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438/.trial_0/best_checkpoint.pkl
Applying the state from the best checkpoint...
Finished, total runtime is 569.02 s
{ 'best_config': { 'batch_size': 16,
                   'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
),
                   'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>,
                   'dist_ip_addrs': None,
                   'early_stop_baseline': -inf,
                   'early_stop_max_value': inf,
                   'early_stop_patience': 10,
                   'epochs': 1,
                   'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>,
                   'final_fit': False,
                   'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438',
                   'lr': 0.01,
                   'model': 'resnet50_v1b',
                   'ngpus_per_trial': 1,
                   'nthreads_per_trial': 0,
                   'num_trials': 1,
                   'num_workers': 0,
                   'problem_type': 'multiclass',
                   'scheduler': 'local',
                   'search_strategy': 'random',
                   'searcher': 'random',
                   'seed': 139,
                   'time_limits': 7200,
                   'wall_clock_tick': 1624564107.043953},
  'total_time': 448.58130168914795,
  'train_acc': 0.6366481481481482,
  'valid_acc': 0.9091666666666667}

print(classifier.fit_summary())

{'train_acc': 0.6366481481481482, 'valid_acc': 0.9091666666666667, 'total_time': 448.58130168914795, 'best_config': {'model': 'resnet50_v1b', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 0, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 0, 'gpus': [0], 'seed': 139, 'final_fit': False, 'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>, 'wall_clock_tick': 1624564107.043953, 'problem_type': 'multiclass'}, 'fit_history': {'train_acc': 0.6366481481481482, 'valid_acc': 0.9091666666666667, 'total_time': 448.58130168914795, 'best_config': {'model': 'resnet50_v1b', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 0, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
    (2): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (3): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (1): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (2): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (3): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (4): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (5): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (6): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
      (7): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (4): HybridSequential(
      (0): CIFARBasicBlockV1(
        (body): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
          (2): Activation(relu)
          (3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
          (4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
        (downsample): HybridSequential(
          (0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
        )
      )
    )
    (5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
  )
  (output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 0, 'gpus': [0], 'seed': 139, 'final_fit': False, 'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>, 'wall_clock_tick': 1624564107.043953, 'problem_type': 'multiclass'}}}