Searchable Objects¶
When defining custom Python objects such as network architectures or specialized optimizers, it can be hard to decide what values to set for all of their attributes. AutoGluon provides an API that lets you instead specify a search space of candidate values for such attributes, within which the optimal value is automatically searched for at runtime. This tutorial demonstrates how easy this is to do, without having to modify your existing code at all!
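The building blocks for such search spaces are the primitives in autogluon.core.space. The snippet below is a minimal, standalone sketch (the variable names are chosen purely for illustration) showing integer, log-scaled real, and categorical spaces:

import autogluon.core as ag

# Illustrative sketch of search-space primitives (names are placeholders, not part of the tutorial).
num_layers = ag.space.Int(1, 5)                      # integer range, inclusive
learning_rate = ag.space.Real(1e-4, 1e-1, log=True)  # continuous range, sampled on a log scale
activation = ag.space.Categorical('relu', 'tanh')    # discrete set of choices
print(num_layers, learning_rate, activation)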
Example for Constructing a Network¶
This tutorial covers an example of selecting a neural network's architecture as a hyperparameter optimization (HPO) task. If you are interested in efficient neural architecture search (NAS), please refer to the dedicated NAS tutorial (sec_proxyless) instead.
CIFAR ResNet in GluonCV¶
GluonCV provides CIFARResNet, which allows users to specify how many layers to use at each stage. For example, we can construct a CIFAR ResNet with only 1 layer per stage:
import pickle
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1

layers = [1, 1, 1]            # one residual block per stage
channels = [16, 16, 32, 64]   # stem channels, followed by the width of each of the three stages
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)
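To sanity-check the constructed network, we can optionally push a dummy CIFAR-sized batch through it. This step is not part of the original tutorial; it is a small sketch assuming an MXNet runtime is available:

import mxnet as mx

net.initialize()                                 # random weight initialization
x = mx.nd.random.uniform(shape=(1, 3, 32, 32))   # a single fake 32x32 RGB image
print(net(x).shape)                              # expect (1, 10) class scores for CIFAR-10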
We can visualize the network:
import autogluon.core as ag
from autogluon.vision.utils import plot_network
plot_network(net, (1, 3, 32, 32))
Searchable Network Architecture Using AutoGluon Object¶
autogluon.obj() enables a customized search space for any user-defined class. It can also be used within autogluon.Categorical() if you have multiple networks to choose from (see the sketch after the configuration-space example below).
@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        nstage3 = 9 - nstage1 - nstage2          # keep the total number of blocks fixed at 9
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)
Create one network instance and print the configuration space:
mynet = MyCifarResNet()
print(mynet.cs)
Configuration space object:
Hyperparameters:
nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
nstage2, Type: UniformInteger, Range: [2, 4], Default: 3
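As mentioned above, several searchable networks can be combined into a single categorical choice. The following is a hedged sketch (not part of the original tutorial) using two MyCifarResNet instances with different fixed settings; the resulting object can be passed wherever a single network candidate is expected:

# Illustrative sketch: a categorical search space over two network candidates.
net_candidates = ag.space.Categorical(
    MyCifarResNet(nstage1=2),
    MyCifarResNet(nstage1=4),
)
print(net_candidates)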
We can also override the existing search spaces, for example fixing nstage1 while replacing the range of nstage2:
mynet1 = MyCifarResNet(nstage1=1,
nstage2=ag.space.Int(5, 10))
print(mynet1.cs)
Configuration space object:
Hyperparameters:
nstage2, Type: UniformInteger, Range: [5, 10], Default: 8
Decorate Existing Class¶
We can also use autogluon.obj() to easily decorate any existing class. For example, if we want to search the learning rate and weight decay of the Adam optimizer, we only need to add a decorator:
from mxnet import optimizer as optim
@ag.obj()
class Adam(optim.Adam):
    pass
Then we can create an instance:
myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True),
               wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)
Configuration space object:
Hyperparameters:
learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale
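The same pattern applies to other optimizers. As a hedged sketch (not part of the original tutorial), MXNet's SGD optimizer can be made searchable in exactly the same way:

@ag.obj()
class SGD(optim.SGD):
    pass

# Illustrative sketch: searchable learning rate and momentum for SGD.
mysgd = SGD(learning_rate=ag.Real(1e-3, 1e-1, log=True), momentum=ag.Real(0.85, 0.99))
print(mysgd.cs)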
Launch Experiments Using AutoGluon Object¶
AutoGluon Objects are compatible with the Fit API of AutoGluon tasks, and they also work with user-defined training scripts via autogluon.autogluon_register_args(). We can start fitting:
from autogluon.vision import ImagePredictor
classifier = ImagePredictor().fit('cifar10',
                                  hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1},
                                  ngpus_per_trial=1)
time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
    root.train.epochs                10 != 1
    root.train.batch_size            128 != 16
    root.train.lr                    0.1 != 0.01
    root.train.num_workers           4 != 8
    root.train.early_stop_baseline   0.0 != -inf
    root.train.early_stop_max_value  1.0 != inf
    root.train.early_stop_patience   -1 != 10
    root.train.num_training_samples  1281167 != -1
    root.train.data_dir              ~/.mxnet/datasets/imagenet != auto
    root.train.rec_train             ~/.mxnet/datasets/imagenet/rec/train.rec != auto
    root.train.rec_train_idx         ~/.mxnet/datasets/imagenet/rec/train.idx != auto
    root.train.rec_val               ~/.mxnet/datasets/imagenet/rec/val.rec != auto
    root.train.rec_val_idx           ~/.mxnet/datasets/imagenet/rec/val.idx != auto
    root.img_cls.model               resnet50_v1 != resnet50
    root.valid.batch_size            128 != 16
    root.valid.num_workers           4 != 8
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/77793fa2/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49]    Speed: 73.145565 samples/sec    accuracy=0.131250    lr=0.010000
Epoch[0] Batch [99]    Speed: 73.460455 samples/sec    accuracy=0.147500    lr=0.010000
...
Epoch[0] Batch [3299]  Speed: 67.921529 samples/sec    accuracy=0.225455    lr=0.010000
Epoch[0] Batch [3349]  Speed: 67.953105 samples/sec    accuracy=0.226175    lr=0.010000
[Epoch 0] training: accuracy=0.226389
[Epoch 0] speed: 68 samples/sec    time cost: 790.150836
[Epoch 0] validation: top1=0.297667 top5=0.856667
[Epoch 0] Current best top-1: 0.297667 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/77793fa2/.trial_0/best_checkpoint.pkl
Unable to pickle object due to the reason: Can't pickle <class '__main__.MyCifarResNet'>: it's not the same object as __main__.MyCifarResNet. This object is not saved.
Applying the state from the best checkpoint...
Unable to resume the state from the best checkpoint, using the latest state.
Finished, total runtime is 814.25 s
{ 'best_config': { 'batch_size': 16, 'custom_net': MyCifarResNet(...), 'custom_optimizer': <__main__.Adam object at 0x7fdb28334110>,
                   'dist_ip_addrs': None, 'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'early_stop_patience': 10,
                   'epochs': 1, 'final_fit': False, 'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/77793fa2',
                   'lr': 0.01, 'model': 'resnet50', 'ngpus_per_trial': 1, 'nthreads_per_trial': 128, 'num_trials': 1,
                   'num_workers': 8, 'problem_type': 'multiclass', 'scheduler': 'local', 'search_strategy': 'random',
                   'searcher': 'random', 'seed': 738, 'time_limits': 7200, 'wall_clock_tick': 1632366429.1247652},
  'total_time': 798.6581916809082, 'train_acc': 0.2263888888888889, 'valid_acc': 0.2976666666666667}
print(classifier.fit_summary())
{'train_acc': 0.2263888888888889, 'valid_acc': 0.2976666666666667, 'total_time': 798.6581916809082,
 'best_config': {'model': 'resnet50', 'lr': 0.01, 'num_trials': 1, 'epochs': 1, 'batch_size': 16,
                 'nthreads_per_trial': 128, 'ngpus_per_trial': 1, 'time_limits': 7200,
                 'search_strategy': 'random', 'dist_ip_addrs': None,
                 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/77793fa2',
                 'searcher': 'random', 'scheduler': 'local', 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7fdb28334110>, 'early_stop_patience': 10,
   'early_stop_baseline': -inf, 'early_stop_max_value': inf, 'num_workers': 8, 'gpus': [0],
   'seed': 738, 'final_fit': False, 'wall_clock_tick': 1632366429.1247652, 'problem_type': 'multiclass'},
 'fit_history': {...}}
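After fitting, the returned predictor can be used like any other ImagePredictor. The following is a hedged sketch that was not executed as part of this tutorial; 'test.jpg' is only a placeholder path:

# Illustrative sketch: running inference with the fitted predictor on a placeholder image path.
result = classifier.predict('test.jpg')
print(result)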