.. _sec_customstorch:

Tune PyTorch Model on MNIST
===========================

In this tutorial, we demonstrate how to do Hyperparameter Optimization
(HPO) using AutoGluon with PyTorch. AutoGluon is a framework-agnostic
HPO toolkit, which is compatible with any training code written in
Python. The PyTorch code used in this tutorial is adapted from an
open-source git repo. In your applications, this code can be replaced
with your own PyTorch code.

Import the packages:

.. code:: python

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torchvision
    import torchvision.transforms as transforms
    from tqdm.auto import tqdm

Start with an MNIST Example
---------------------------

Data Transforms
~~~~~~~~~~~~~~~

We first apply standard image transforms to our training and validation
data:

.. code:: python

    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])

    # the datasets
    trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                          download=True, transform=transform)
    testset = torchvision.datasets.MNIST(root='./data', train=False,
                                         download=True, transform=transform)

.. parsed-literal::
    :class: output

    Downloading https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/mnist/train-images-idx3-ubyte.gz
    Downloading https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz

.. parsed-literal::
    :class: output

    0%|          | 0/9912422 [00:00<?, ?it/s]

Next, the tutorial defines a model together with its training and
evaluation loops, decorates the training function with an AutoGluon
search space over its hyperparameters (yielding ``ag_train_mnist``, the
function passed to the schedulers below; a sketch of such a decorated
function is given at the end of this tutorial), and creates a FIFO
scheduler with the default ``random`` searcher. Printing the scheduler
shows the resources it manages:

.. parsed-literal::
    :class: output

    FIFOScheduler(
    DistributedResourceManager{
      (Remote: Remote REMOTE_ID: 0, Resource: NodeResourceManager(8 CPUs, 1 GPUs))
    })

.. code:: python

    myscheduler.run()
    myscheduler.join_jobs()

.. parsed-literal::
    :class: output

    0%|          | 0/2 [00:00<?, ?it/s]

Search by Bayesian Optimization
-------------------------------

Apart from random search, AutoGluon provides Bayesian optimization
searchers. The ``skopt`` searcher is based on the scikit-optimize
package, whereas ``bayesopt`` is AutoGluon's own implementation. While
``skopt`` is currently somewhat more versatile (choice of acquisition
function, surrogate model), ``bayesopt`` is optimized for asynchronous
parallel scheduling. Importantly, ``bayesopt`` runs with both the FIFO
and the Hyperband scheduler (while ``skopt`` is restricted to the FIFO
scheduler).

When running the following examples to compare the different schedulers
and searchers, you need to increase ``num_trials`` (or use ``time_out``
instead, which specifies the search budget in wall-clock time; see the
sketch at the end of this tutorial) in order to see differences in
performance.

.. code:: python

    myscheduler = ag.scheduler.FIFOScheduler(
        ag_train_mnist,
        resource={'num_cpus': 4, 'num_gpus': 1},
        searcher='bayesopt',
        num_trials=2,
        time_attr='epoch',
        reward_attr='accuracy')
    print(myscheduler)

.. parsed-literal::
    :class: output

    FIFOScheduler(
    DistributedResourceManager{
      (Remote: Remote REMOTE_ID: 0, Resource: NodeResourceManager(8 CPUs, 1 GPUs))
    })

.. code:: python

    myscheduler.run()
    myscheduler.join_jobs()

.. parsed-literal::
    :class: output

    0%|          | 0/2 [00:00<?, ?it/s]

Search by Asynchronous Hyperband
--------------------------------

AutoGluon also supports Hyperband scheduling, which uses the
intermediate results reported after every epoch to stop unpromising
trials early:

.. code:: python

    myscheduler = ag.scheduler.HyperbandScheduler(
        ag_train_mnist,
        resource={'num_cpus': 4, 'num_gpus': 1},
        searcher='bayesopt',
        num_trials=2,
        time_attr='epoch',
        reward_attr='accuracy',
        grace_period=1,
        reduction_factor=3,
        brackets=1)
    print(myscheduler)

.. parsed-literal::
    :class: output

    HyperbandScheduler(terminator: HyperbandBracketManager(reward_attr: accuracy, time_attr: epoch, rung_levels: [1, 3], max_t: 5, rung_systems: [Rung system: Iter 3.000: None | Iter 1.000: None])

With ``grace_period=1``, ``reduction_factor=3``, and a maximum of 5
epochs per trial, trials can be stopped early at rung levels of 1 and 3
epochs, which is exactly the ``rung_levels: [1, 3]`` shown in the
printed configuration.

.. code:: python

    myscheduler.run()
    myscheduler.join_jobs()

.. parsed-literal::
    :class: output

    0%|          | 0/2 [00:00<?, ?it/s]
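
A Sketch of the Decorated Training Function
-------------------------------------------

For reference, here is a minimal sketch of what a decorated training
function such as ``ag_train_mnist`` can look like. It is not the exact
code used above: the ``autogluon.core`` import alias, the search space
over ``lr``, the simple network, and the batch sizes are all
illustrative assumptions. It reuses ``trainset`` and ``testset`` from
the data-transform step, and reports results under the keyword names
expected by the schedulers (``time_attr='epoch'``,
``reward_attr='accuracy'``).

.. code:: python

    import autogluon.core as ag  # assumed alias for the legacy AutoGluon core API


    @ag.args(
        lr=ag.space.Real(0.01, 0.2, log=True),  # illustrative searchable learning rate
        epochs=5,                                # matches max_t: 5 in the Hyperband output
    )
    def ag_train_mnist(args, reporter):
        """Train a small model on MNIST, reporting accuracy after each epoch."""
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        net = nn.Sequential(  # illustrative model, not the tutorial's exact network
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 10),
        ).to(device)
        optimizer = torch.optim.SGD(net.parameters(), lr=args.lr)
        train_loader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
        test_loader = torch.utils.data.DataLoader(testset, batch_size=1000)

        for epoch in range(1, args.epochs + 1):
            net.train()
            for data, target in train_loader:
                data, target = data.to(device), target.to(device)
                optimizer.zero_grad()
                loss = F.cross_entropy(net(data), target)
                loss.backward()
                optimizer.step()

            # evaluate on the held-out set and report intermediate results;
            # the keyword names must match time_attr and reward_attr above
            net.eval()
            correct = 0
            with torch.no_grad():
                for data, target in test_loader:
                    data, target = data.to(device), target.to(device)
                    correct += (net(data).argmax(dim=1) == target).sum().item()
            reporter(epoch=epoch, accuracy=correct / len(testset))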
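
After ``join_jobs()`` returns, you can inspect the outcome of the
search. A brief sketch, assuming the legacy scheduler API with its
``get_best_config`` and ``get_best_reward`` accessors:

.. code:: python

    # inspect the best hyperparameter configuration found and its reward
    print('Best config: {}'.format(myscheduler.get_best_config()))
    print('Best reward: {}'.format(myscheduler.get_best_reward()))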
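
Finally, the prose above mentions ``time_out`` as an alternative budget
to ``num_trials``. A sketch of a wall-clock search budget, assuming the
scheduler constructor accepts ``time_out`` in seconds:

.. code:: python

    # run the search for 120 seconds of wall-clock time
    # instead of a fixed number of trials
    myscheduler = ag.scheduler.FIFOScheduler(
        ag_train_mnist,
        resource={'num_cpus': 4, 'num_gpus': 1},
        searcher='bayesopt',
        time_out=120,
        time_attr='epoch',
        reward_attr='accuracy')
    myscheduler.run()
    myscheduler.join_jobs()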