Searchable Objects¶
When defining custom Python objects such as network architectures or specialized optimizers, it can be hard to decide what values to set for all of their attributes. AutoGluon provides an API that lets you instead specify a search space of candidate values for such attributes, within which the optimal value is automatically searched for at runtime. This tutorial demonstrates how easy this is to do, without requiring any changes to your existing code.
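As a quick primer before the full example, here is a minimal sketch of the search space primitives this tutorial builds on (the variable names are just illustrative):

import autogluon.core as ag

# Each space declares candidate values instead of a single fixed attribute
lr_space = ag.space.Real(1e-4, 1e-2, log=True)             # continuous, log-scaled
depth_space = ag.space.Int(2, 4)                           # integers from 2 to 4, inclusive
block_space = ag.space.Categorical('basic', 'bottleneck')  # discrete choices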
Example for Constructing a Network¶
This tutorial covers an example of selecting a neural network's architecture as a hyperparameter optimization (HPO) task. If you are interested in efficient neural architecture search (NAS), please refer to the dedicated tutorial instead: sec_proxyless.
CIFAR ResNet in GluonCV¶
GluonCV provides CIFARResNet, which allows the user to specify the number of layers in each stage. For example, we can construct a CIFAR ResNet with only 1 layer per stage:
from gluoncv.model_zoo.cifarresnet import CIFARResNetV1, CIFARBasicBlockV1

# One residual block per stage; channels cover the stem plus the three stages
layers = [1, 1, 1]
channels = [16, 16, 32, 64]
net = CIFARResNetV1(CIFARBasicBlockV1, layers, channels)
We can visualize the network:
import autogluon.core as ag
from autogluon.vision.utils import plot_network

# Render the computation graph for a (batch, channel, height, width) = (1, 3, 32, 32) input
plot_network(net, (1, 3, 32, 32))
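If plotting is not available in your environment, a plain-text summary gives similar information; a minimal sketch, assuming a standard mxnet install (Gluon's Block.summary prints per-layer output shapes and parameter counts):

import mxnet as mx

net.initialize()                            # resolve the lazily-inferred shapes
net.summary(mx.nd.zeros((1, 3, 32, 32)))    # dummy CIFAR-sized input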
Searchable Network Architecture Using AutoGluon Object¶
autogluon.obj() enables adding a customized search space to any user-defined class. It can also be used within autogluon.Categorical() if you have multiple networks to choose from, as sketched after the configuration-space example below.
@ag.obj(
    nstage1=ag.space.Int(2, 4),
    nstage2=ag.space.Int(2, 4),
)
class MyCifarResNet(CIFARResNetV1):
    def __init__(self, nstage1, nstage2):
        # Keep the total depth fixed at 9 blocks across the three stages
        nstage3 = 9 - nstage1 - nstage2
        layers = [nstage1, nstage2, nstage3]
        channels = [16, 16, 32, 64]
        super().__init__(CIFARBasicBlockV1, layers=layers, channels=channels)
Create one network instance and print the configuration space:
mynet = MyCifarResNet()
print(mynet.cs)
Configuration space object:
Hyperparameters:
nstage1, Type: UniformInteger, Range: [2, 4], Default: 3
nstage2, Type: UniformInteger, Range: [2, 4], Default: 3
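As mentioned above, because mynet is now an AutoGluon object, it can also be one of several candidates inside autogluon's Categorical space, letting HPO choose the architecture itself. A minimal sketch (the second variant shown here is illustrative):

# Hypothetical: let the search also pick between two searchable variants
net_choice = ag.space.Categorical(mynet, MyCifarResNet(nstage1=2))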
We can also override existing search spaces, fixing some hyperparameters to constants while redefining others:
mynet1 = MyCifarResNet(nstage1=1,
                       nstage2=ag.space.Int(5, 10))
print(mynet1.cs)
Configuration space object:
Hyperparameters:
nstage2, Type: UniformInteger, Range: [5, 10], Default: 8
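The printed cs is a ConfigSpace configuration space, so, as a small aside assuming the standard ConfigSpace API, you can draw concrete configurations from it directly:

# Sample one concrete assignment of the remaining searchable hyperparameters
config = mynet1.cs.sample_configuration()
print(config.get_dictionary())   # e.g. {'nstage2': 7}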
Decorate Existing Class¶
We can also use autogluon.obj() to easily decorate any existing class. For example, if we want to search over the learning rate and weight decay of the Adam optimizer, we only need to add a decorator:
from mxnet import optimizer as optim

@ag.obj()
class Adam(optim.Adam):
    pass
Then we can create an instance:
myoptim = Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True),
               wd=ag.Real(1e-5, 1e-3, log=True))
print(myoptim.cs)
Configuration space object:
Hyperparameters:
learning_rate, Type: UniformFloat, Range: [0.01, 0.1], Default: 0.0316227766, on log-scale
wd, Type: UniformFloat, Range: [1e-05, 0.001], Default: 0.0001, on log-scale
Launch Experiments Using AutoGluon Object¶
AutoGluon objects are compatible with the Fit API in AutoGluon tasks, and also work with user-defined training scripts using autogluon.autogluon_register_args() (see the sketch at the end of this tutorial). We can start fitting:
from autogluon.vision import ImagePredictor

classifier = ImagePredictor().fit(
    'cifar10',
    hyperparameters={'net': mynet, 'optimizer': myoptim, 'epochs': 1},
    ngpus_per_trial=1,
)
time_limit=auto set to time_limit=7200.
Starting fit without HPO
modified configs(<old> != <new>): {
  root.train.rec_val_idx          ~/.mxnet/datasets/imagenet/rec/val.idx != auto
  root.train.rec_train_idx        ~/.mxnet/datasets/imagenet/rec/train.idx != auto
  root.train.num_training_samples 1281167 != -1
  root.train.num_workers          4 != 8
  root.train.epochs               10 != 1
  root.train.rec_train            ~/.mxnet/datasets/imagenet/rec/train.rec != auto
  root.train.early_stop_patience  -1 != 10
  root.train.lr                   0.1 != 0.01
  root.train.rec_val              ~/.mxnet/datasets/imagenet/rec/val.rec != auto
  root.train.early_stop_max_value 1.0 != inf
  root.train.batch_size           128 != 16
  root.train.early_stop_baseline  0.0 != -inf
  root.train.data_dir             ~/.mxnet/datasets/imagenet != auto
  root.img_cls.model              resnet50_v1 != resnet50
  root.valid.num_workers          4 != 8
  root.valid.batch_size           128 != 16
}
Saved config to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/4afab4dc/.trial_0/config.yaml
Start training from [Epoch 0]
Epoch[0] Batch [49]    Speed: 73.609623 samples/sec    accuracy=0.131250    lr=0.010000
Epoch[0] Batch [99]    Speed: 73.953846 samples/sec    accuracy=0.154375    lr=0.010000
... (per-batch log lines elided) ...
Epoch[0] Batch [3299]  Speed: 71.444014 samples/sec    accuracy=0.215246    lr=0.010000
Epoch[0] Batch [3349]  Speed: 71.080322 samples/sec    accuracy=0.215821    lr=0.010000
[Epoch 0] training: accuracy=0.216037
[Epoch 0] speed: 69 samples/sec    time cost: 774.745033
[Epoch 0] validation: top1=0.271333 top5=0.843000
[Epoch 0] Current best top-1: 0.271333 vs previous -inf, saved to /var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/4afab4dc/.trial_0/best_checkpoint.pkl
Unable to pickle object due to the reason: Can't pickle <class '__main__.MyCifarResNet'>: it's not the same object as __main__.MyCifarResNet. This object is not saved.
Applying the state from the best checkpoint...
Unable to resume the state from the best checkpoint, using the latest state.
Finished, total runtime is 798.08 s
{ 'best_config': { 'batch_size': 16,
                   'custom_net': MyCifarResNet(... architecture identical to the fit_summary printout below ...),
                   'custom_optimizer': <__main__.Adam object at 0x7f339016de50>,
                   'dist_ip_addrs': None,
                   'early_stop_baseline': -inf,
                   'early_stop_max_value': inf,
                   'early_stop_patience': 10,
                   'epochs': 1,
                   'final_fit': False,
                   'gpus': [0],
                   'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/4afab4dc',
                   'lr': 0.01,
                   'model': 'resnet50',
                   'ngpus_per_trial': 1,
                   'nthreads_per_trial': 128,
                   'num_trials': 1,
                   'num_workers': 8,
                   'problem_type': 'multiclass',
                   'scheduler': 'local',
                   'search_strategy': 'random',
                   'searcher': 'random',
                   'seed': 152,
                   'time_limits': 7200,
                   'wall_clock_tick': 1634595692.526918},
  'total_time': 784.789101600647,
  'train_acc': 0.21603703703703703,
  'valid_acc': 0.2713333333333333}
print(classifier.fit_summary())
{'train_acc': 0.21603703703703703,
 'valid_acc': 0.2713333333333333,
 'total_time': 784.789101600647,
 'best_config': {'model': 'resnet50',
                 'lr': 0.01,
                 'num_trials': 1,
                 'epochs': 1,
                 'batch_size': 16,
                 'nthreads_per_trial': 128,
                 'ngpus_per_trial': 1,
                 'time_limits': 7200,
                 'search_strategy': 'random',
                 'dist_ip_addrs': None,
                 'log_dir': '/var/lib/jenkins/workspace/workspace/autogluon-tutorial-course-v3/docs/_build/eval/tutorials/course/4afab4dc',
                 'searcher': 'random',
                 'scheduler': 'local',
                 'custom_net': MyCifarResNet(
(features): HybridSequential(
(0): Conv2D(None -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(16 -> 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(3): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(16 -> 32, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(1): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(2): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(3): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(4): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(5): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(6): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
(7): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(32 -> 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(4): HybridSequential(
(0): CIFARBasicBlockV1(
(body): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
(2): Activation(relu)
(3): Conv2D(64 -> 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(4): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
(downsample): HybridSequential(
(0): Conv2D(32 -> 64, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm(axis=1, eps=1e-05, momentum=0.9, fix_gamma=False, use_global_stats=False, in_channels=None)
)
)
)
(5): GlobalAvgPool2D(size=(1, 1), stride=(1, 1), padding=(0, 0), ceil_mode=True, global_pool=True, pool_type=avg, layout=NCHW)
)
(output): Dense(64 -> 10, linear)
), 'custom_optimizer': <__main__.Adam object at 0x7f339016de50>,
 'early_stop_patience': 10, 'early_stop_baseline': -inf, 'early_stop_max_value': inf,
 'num_workers': 8, 'gpus': [0], 'seed': 152, 'final_fit': False,
 'wall_clock_tick': 1634595692.526918, 'problem_type': 'multiclass'},
 'fit_history': {... identical to the metrics and best_config above, including the same MyCifarResNet printout ...}}
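Finally, for those writing their own training scripts rather than using the Fit API: a hedged sketch of the classic autogluon.core pattern, in which an args decorator binds searchable objects to a function that receives sampled values plus a reporter. The function body and metric here are placeholders, and AutoGluonObject.init() is assumed to instantiate the sampled network:

@ag.args(
    net=MyCifarResNet(),
    optimizer=Adam(learning_rate=ag.Real(1e-2, 1e-1, log=True),
                   wd=ag.Real(1e-5, 1e-3, log=True)),
    epochs=1,
)
def train_cifar(args, reporter):
    net = args.net.init()   # assumed helper: build the network from the sampled config
    # ... construct a Trainer from args.optimizer and train for args.epochs ...
    reporter(epoch=1, accuracy=0.0)   # placeholder metric reported to the scheduler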