Searchable Objects
When defining custom Python objects such as network architectures or specialized optimizers, it can be hard to decide what values to set for all of their attributes. AutoGluon provides an API that lets you instead specify a search space of possible values for such attributes; a good value is then searched for automatically within that space at runtime. This tutorial demonstrates how to do this with little or no modification to your existing code.
Example for Constructing a Network
This tutorial covers an example of selecting a neural network’s
architecture as a hyperparameter optimization (HPO) task. If you are
interested in efficient neural architecture search (NAS), please refer
to this other tutorial instead: sec_proxyless_.
CIFAR ResNet in GluonCV
GluonCV provides CIFARResNet, which allows the user to specify how many layers to use at each stage. For example, we can construct a CIFAR ResNet with only one layer per stage:
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1
layers = [1, 1, 1]
channels = [16, 16, 32, 64]
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)
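As a rough guide to what the `layers` list controls, the sketch below counts weighted layers under the assumption that each `CIFARBasicBlockV1` contributes two 3x3 convolutions, plus one stem convolution and one final dense layer. The helper name `cifar_resnet_depth` is ours for illustration and is not part of GluonCV.

```python
def cifar_resnet_depth(layers):
    """Approximate depth of a CIFAR ResNet built from basic blocks.

    Assumes two 3x3 convolutions per basic block, one stem convolution,
    and one final dense layer; 1x1 downsample convolutions are not counted.
    """
    return 2 * sum(layers) + 2


print(cifar_resnet_depth([1, 1, 1]))  # 8
print(cifar_resnet_depth([3, 3, 3]))  # 20
```

Under this counting, the one-block-per-stage network above is an 8-layer model, while three blocks per stage gives the familiar 20-layer configuration.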
We can visualize the network:
import autogluon.core as ag
from autogluon.vision.utils import plot_network
plot_network(net, (1, 3, 32, 32))
Searchable Network Architecture Using AutoGluon Object
autogluon.obj() attaches a customized search space to any user-defined
class. It can also be used within autogluon.Categorical() if
you have multiple networks to choose from.
@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)
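To make the size of this search space concrete, here is a small pure-Python sketch that enumerates every `layers` list the class above can produce. It assumes `ag.space.Int(2, 4)` denotes the inclusive integer range 2..4; the names `TOTAL_BLOCKS` and `candidates` are ours for illustration.

```python
from itertools import product

TOTAL_BLOCKS = 9  # the constant used in MyCifarResNet.__init__

# Assumption: Int(2, 4) covers the integers 2, 3 and 4 (both ends included).
candidates = [
    [n1, n2, TOTAL_BLOCKS - n1 - n2]
    for n1, n2 in product(range(2, 5), repeat=2)
]

for layers in candidates:
    print(layers)
```

This yields nine architectures. Because the third stage is derived from the other two, every candidate has nine blocks in total, with the third stage ranging from one to five blocks.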
Create one network instance and print the configuration space:
mynet = MyCifarResNet()
print(mynet.cs)
Configuration space object:
Hyperparameters:
nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
nstage2, Type: UniformInteger, Range: [2, 4], Default: 3
We can also overwrite existing search spaces:
mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)
Configuration space object:
Hyperparameters:
nstage2, Type: UniformInteger, Range: [5, 10], Default: 8
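When a constructor derives one value from the others, as `nstage3 = 9 - nstage1 - nstage2` does here, it may be worth checking that an overridden range still produces sensible derived values. The sketch below is plain Python and makes no AutoGluon calls; the helper `third_stage_depth` is a hypothetical name, and `Int(5, 10)` is assumed to be inclusive.

```python
TOTAL_BLOCKS = 9  # the constant used in MyCifarResNet.__init__


def third_stage_depth(nstage1, nstage2, total=TOTAL_BLOCKS):
    """Number of blocks left over for the third stage."""
    return total - nstage1 - nstage2


nstage1 = 1  # fixed value passed to MyCifarResNet above
for nstage2 in range(5, 11):  # assumed inclusive range of Int(5, 10)
    nstage3 = third_stage_depth(nstage1, nstage2)
    status = "ok" if nstage3 >= 1 else "invalid"
    print(nstage2, nstage3, status)

# Values of nstage2 that leave at least one block for the third stage.
valid_nstage2 = [n2 for n2 in range(5, 11) if third_stage_depth(nstage1, n2) >= 1]
print(valid_nstage2)  # [5, 6, 7]
```

With `nstage1=1`, only `nstage2` values of 5 to 7 leave a positive block count for the third stage, so a narrower range such as `Int(5, 7)` may be a safer choice if every stage is expected to contain at least one block.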
Decorate Existing Class
We can also use autogluon.obj() to decorate any existing
class. For example, to search over the learning rate and weight
decay of the Adam optimizer, we only need to add a decorator:
from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass
Then we can create an instance:
myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True), wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)
Configuration space object:
Hyperparameters:
learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale
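The defaults printed above are consistent with taking the midpoint of each range on a log scale, which is the geometric mean of the two bounds. The following sketch reproduces those numbers with the standard library only; the function name `log_uniform_midpoint` is ours, and this is an illustration of the arithmetic rather than the library's own code.

```python
import math


def log_uniform_midpoint(lower, upper):
    """Midpoint of [lower, upper] on a log scale, i.e. the geometric mean."""
    return math.exp((math.log(lower) + math.log(upper)) / 2)


print(round(log_uniform_midpoint(1e-2, 1e-1), 10))  # 0.0316227766
print(round(log_uniform_midpoint(1e-5, 1e-3), 10))  # 0.0001
```

Searching on a log scale spreads candidates evenly across orders of magnitude, which is usually what is wanted for quantities such as learning rate and weight decay.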
Launch Experiments Using AutoGluon Object
An AutoGluon object is compatible with the fit API of AutoGluon tasks, and also
works with user-defined training scripts that use
autogluon.autogluon_register_args(). We can start fitting:
from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit('cifar10', hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1}, ngpus_per_trial=1)
time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
root.train.epochs 10 != 1
root.train.rec_train ~/.mxnet/datasets/imagenet/rec/train.rec != auto
root.train.rec_val ~/.mxnet/datasets/imagenet/rec/val.rec != auto
root.train.lr 0.1 != 0.01
root.train.early_stop_patience -1 != 10
root.train.early_stop_baseline 0.0 != -inf
root.train.num_training_samples 1281167 != -1
root.train.rec_val_idx ~/.mxnet/datasets/imagenet/rec/val.idx != auto
root.train.data_dir ~/.mxnet/datasets/imagenet != auto
root.train.num_workers 4 != 0
root.train.rec_train_idx ~/.mxnet/datasets/imagenet/rec/train.idx != auto
root.train.early_stop_max_value 1.0 != inf
root.train.batch_size 128 != 16
root.img_cls.model resnet50_v1 != resnet50_v1b
root.valid.batch_size 128 != 16
root.valid.num_workers 4 != 0
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49] Speed: 103.952959 samples/sec accuracy=0.183750 lr=0.010000
Epoch[0] Batch [99] Speed: 104.316909 samples/sec accuracy=0.240625 lr=0.010000
Epoch[0] Batch [149] Speed: 104.649139 samples/sec accuracy=0.287083 lr=0.010000
Epoch[0] Batch [199] Speed: 104.185913 samples/sec accuracy=0.315937 lr=0.010000
Epoch[0] Batch [249] Speed: 104.121571 samples/sec accuracy=0.347500 lr=0.010000
Epoch[0] Batch [299] Speed: 103.875297 samples/sec accuracy=0.365208 lr=0.010000
Epoch[0] Batch [349] Speed: 103.677087 samples/sec accuracy=0.383750 lr=0.010000
Epoch[0] Batch [399] Speed: 103.537669 samples/sec accuracy=0.403125 lr=0.010000
Epoch[0] Batch [449] Speed: 103.265421 samples/sec accuracy=0.416528 lr=0.010000
Epoch[0] Batch [499] Speed: 103.178163 samples/sec accuracy=0.431000 lr=0.010000
Epoch[0] Batch [549] Speed: 102.955286 samples/sec accuracy=0.440795 lr=0.010000
Epoch[0] Batch [599] Speed: 102.578884 samples/sec accuracy=0.452396 lr=0.010000
Epoch[0] Batch [649] Speed: 102.759640 samples/sec accuracy=0.464519 lr=0.010000
Epoch[0] Batch [699] Speed: 102.643537 samples/sec accuracy=0.474196 lr=0.010000
Epoch[0] Batch [749] Speed: 102.393588 samples/sec accuracy=0.482000 lr=0.010000
Epoch[0] Batch [799] Speed: 102.094896 samples/sec accuracy=0.490234 lr=0.010000
Epoch[0] Batch [849] Speed: 102.055022 samples/sec accuracy=0.497794 lr=0.010000
Epoch[0] Batch [899] Speed: 101.968473 samples/sec accuracy=0.503958 lr=0.010000
Epoch[0] Batch [949] Speed: 101.485773 samples/sec accuracy=0.510066 lr=0.010000
Epoch[0] Batch [999] Speed: 101.425954 samples/sec accuracy=0.517000 lr=0.010000
Epoch[0] Batch [1049] Speed: 101.131413 samples/sec accuracy=0.522262 lr=0.010000
Epoch[0] Batch [1099] Speed: 100.886963 samples/sec accuracy=0.526875 lr=0.010000
Epoch[0] Batch [1149] Speed: 100.628574 samples/sec accuracy=0.531522 lr=0.010000
Epoch[0] Batch [1199] Speed: 100.348758 samples/sec accuracy=0.535990 lr=0.010000
Epoch[0] Batch [1249] Speed: 99.841114 samples/sec accuracy=0.540750 lr=0.010000
Epoch[0] Batch [1299] Speed: 99.645259 samples/sec accuracy=0.545481 lr=0.010000
Epoch[0] Batch [1349] Speed: 99.262510 samples/sec accuracy=0.550093 lr=0.010000
Epoch[0] Batch [1399] Speed: 98.918671 samples/sec accuracy=0.553616 lr=0.010000
Epoch[0] Batch [1449] Speed: 98.446586 samples/sec accuracy=0.557026 lr=0.010000
Epoch[0] Batch [1499] Speed: 98.009802 samples/sec accuracy=0.560917 lr=0.010000
Epoch[0] Batch [1549] Speed: 99.073632 samples/sec accuracy=0.564113 lr=0.010000
Epoch[0] Batch [1599] Speed: 100.367971 samples/sec accuracy=0.567656 lr=0.010000
Epoch[0] Batch [1649] Speed: 101.205901 samples/sec accuracy=0.570720 lr=0.010000
Epoch[0] Batch [1699] Speed: 101.743144 samples/sec accuracy=0.574301 lr=0.010000
Epoch[0] Batch [1749] Speed: 101.937068 samples/sec accuracy=0.576821 lr=0.010000
Epoch[0] Batch [1799] Speed: 102.209711 samples/sec accuracy=0.579514 lr=0.010000
Epoch[0] Batch [1849] Speed: 102.416528 samples/sec accuracy=0.582466 lr=0.010000
Epoch[0] Batch [1899] Speed: 102.257414 samples/sec accuracy=0.584704 lr=0.010000
Epoch[0] Batch [1949] Speed: 102.646150 samples/sec accuracy=0.587404 lr=0.010000
Epoch[0] Batch [1999] Speed: 102.579837 samples/sec accuracy=0.590187 lr=0.010000
Epoch[0] Batch [2049] Speed: 102.623264 samples/sec accuracy=0.592683 lr=0.010000
Epoch[0] Batch [2099] Speed: 102.620665 samples/sec accuracy=0.595357 lr=0.010000
Epoch[0] Batch [2149] Speed: 102.554401 samples/sec accuracy=0.597878 lr=0.010000
Epoch[0] Batch [2199] Speed: 102.168652 samples/sec accuracy=0.599659 lr=0.010000
Epoch[0] Batch [2249] Speed: 102.354136 samples/sec accuracy=0.601889 lr=0.010000
Epoch[0] Batch [2299] Speed: 102.433636 samples/sec accuracy=0.604266 lr=0.010000
Epoch[0] Batch [2349] Speed: 102.441967 samples/sec accuracy=0.605878 lr=0.010000
Epoch[0] Batch [2399] Speed: 102.274417 samples/sec accuracy=0.607474 lr=0.010000
Epoch[0] Batch [2449] Speed: 102.193720 samples/sec accuracy=0.609643 lr=0.010000
Epoch[0] Batch [2499] Speed: 102.043651 samples/sec accuracy=0.611125 lr=0.010000
Epoch[0] Batch [2549] Speed: 101.977045 samples/sec accuracy=0.612549 lr=0.010000
Epoch[0] Batch [2599] Speed: 101.846221 samples/sec accuracy=0.614423 lr=0.010000
Epoch[0] Batch [2649] Speed: 101.715910 samples/sec accuracy=0.616203 lr=0.010000
Epoch[0] Batch [2699] Speed: 101.703692 samples/sec accuracy=0.617639 lr=0.010000
Epoch[0] Batch [2749] Speed: 101.350481 samples/sec accuracy=0.619159 lr=0.010000
Epoch[0] Batch [2799] Speed: 101.368196 samples/sec accuracy=0.621116 lr=0.010000
Epoch[0] Batch [2849] Speed: 101.020412 samples/sec accuracy=0.622588 lr=0.010000
Epoch[0] Batch [2899] Speed: 100.798044 samples/sec accuracy=0.623750 lr=0.010000
Epoch[0] Batch [2949] Speed: 100.704037 samples/sec accuracy=0.625169 lr=0.010000
Epoch[0] Batch [2999] Speed: 100.269781 samples/sec accuracy=0.626979 lr=0.010000
Epoch[0] Batch [3049] Speed: 100.168320 samples/sec accuracy=0.628238 lr=0.010000
Epoch[0] Batch [3099] Speed: 99.967839 samples/sec accuracy=0.629496 lr=0.010000
Epoch[0] Batch [3149] Speed: 99.598852 samples/sec accuracy=0.630873 lr=0.010000
Epoch[0] Batch [3199] Speed: 99.237980 samples/sec accuracy=0.631934 lr=0.010000
Epoch[0] Batch [3249] Speed: 99.026619 samples/sec accuracy=0.633192 lr=0.010000
Epoch[0] Batch [3299] Speed: 98.723223 samples/sec accuracy=0.634659 lr=0.010000
Epoch[0] Batch [3349] Speed: 98.060766 samples/sec accuracy=0.635877 lr=0.010000
[Epoch 0] training: accuracy=0.636648
[Epoch 0] speed: 101 samples/sec time cost: 531.942785
[Epoch 0] validation: top1=0.909167 top5=0.997667
[Epoch 0] Current best top-1: 0.909167 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438/.trial_0/best_checkpoint.pkl
Applying the state from the best checkpoint...
Finished, total runtime is 569.02 s
{ 'best_config': { 'batch_size': 16,
'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
),
'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>,
'dist_ip_addrs': None,
'early_stop_baseline': -inf,
'early_stop_max_value': inf,
'early_stop_patience': 10,
'epochs': 1,
'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>,
'final_fit': False,
'gpus': [0],
'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438',
'lr': 0.01,
'model': 'resnet50_v1b',
'ngpus_per_trial': 1,
'nthreads_per_trial': 0,
'num_trials': 1,
'num_workers': 0,
'problem_type': 'multiclass',
'scheduler': 'local',
'search_strategy': 'random',
'searcher': 'random',
'seed': 139,
'time_limits': 7200,
'wall_clock_tick': 1624564107.043953},
'total_time': 448.58130168914795,
'train_acc': 0.6366481481481482,
'valid_acc': 0.9091666666666667}
print(classifier.fit_summary())
{'train_acc': 0.6366481481481482, 'valid_acc': 0.9091666666666667, 'total_time': 448.58130168914795, 'best_config': {'model': 'resnet50_v1b', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 0, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 0, 'gpus': [0], 'seed': 139, 'final_fit': False, 'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>, 'wall_clock_tick': 1624564107.043953, 'problem_type': 'multiclass'}, 'fit_history': {'train_acc': 0.6366481481481482, 'valid_acc': 0.9091666666666667, 'total_time': 448.58130168914795, 'best_config': {'model': 'resnet50_v1b', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16, 'nthreads_per_trial': 0, 'ngpus_per_trial': 1, 'time_limits': 7200, 'search_strategy': 'random', 'dist_ip_addrs': None, 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/cd63b438', 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f80e8252e10>, 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 0, 'gpus': [0], 'seed': 139, 'final_fit': False, 'estimator': <class 'gluoncv.auto.estimators.image_classification.image_classification.ImageClassificationEstimator'>, 'wall_clock_tick': 1624564107.043953, 'problem_type': 'multiclass'}}}
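The summary above reports `'searcher': 'random'`. As a conceptual illustration only, and not AutoGluon's searcher implementation, the sketch below shows how a random search might draw one configuration from the spaces defined in this tutorial: integers uniformly from an inclusive range, and log-scale reals uniformly in log space. All function names here are hypothetical.

```python
import math
import random


def sample_int(rng, lower, upper):
    """Draw an integer uniformly from the inclusive range [lower, upper]."""
    return rng.randint(lower, upper)


def sample_log_real(rng, lower, upper):
    """Draw a float whose logarithm is uniform between log(lower) and log(upper)."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


def sample_config(rng):
    """One random configuration over the spaces used in this tutorial."""
    return {
        "nstage1": sample_int(rng, 2, 4),
        "nstage2": sample_int(rng, 2, 4),
        "learning_rate": sample_log_real(rng, 1e-2, 1e-1),
        "wd": sample_log_real(rng, 1e-5, 1e-3),
    }


rng = random.Random(0)  # fixed seed so repeated runs draw the same configurations
configs = [sample_config(rng) for _ in range(3)]
for config in configs:
    print(config)
```

Each sampled dictionary corresponds to one trial: the integer entries would determine the network's `layers` list and the real-valued entries would be passed to the optimizer.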