.. _sec_automm_detection_quick_start_coco:
AutoMM Detection - Quick Start on a Tiny COCO Format Dataset
============================================================
In this section, our goal is to quickly finetune a pretrained model on a
small dataset in COCO format and evaluate it on the test set. Both the
training and test sets are in COCO format. See
:ref:`sec_automm_detection_convert_to_coco` for how to convert other
datasets to COCO format.
Setting up the imports
~~~~~~~~~~~~~~~~~~~~~~
To start, let’s import MultiModalPredictor:
.. code:: python
from autogluon.multimodal import MultiModalPredictor
Make sure ``mmcv-full`` and ``mmdet`` are installed:
.. code:: python
!mim install mmcv-full
!pip install mmdet
.. parsed-literal::
:class: output
Looking in links: https://download.openmmlab.com/mmcv/dist/cu117/torch1.13.0/index.html
Requirement already satisfied: mmcv-full in /home/ci/opt/venv/lib/python3.8/site-packages (1.7.1)
Requirement already satisfied: yapf in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (0.32.0)
Requirement already satisfied: packaging in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (23.0)
Requirement already satisfied: opencv-python>=3 in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (4.7.0.68)
Requirement already satisfied: pyyaml in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (5.4.1)
Requirement already satisfied: numpy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (1.23.5)
Requirement already satisfied: addict in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (2.4.0)
Requirement already satisfied: Pillow in /home/ci/opt/venv/lib/python3.8/site-packages (from mmcv-full) (9.4.0)
Requirement already satisfied: mmdet in /home/ci/opt/venv/lib/python3.8/site-packages (2.28.1)
Requirement already satisfied: six in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.16.0)
Requirement already satisfied: scipy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.10.0)
Requirement already satisfied: pycocotools in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (2.0.6)
Requirement already satisfied: terminaltables in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (3.1.10)
Requirement already satisfied: matplotlib in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (3.6.3)
Requirement already satisfied: numpy in /home/ci/opt/venv/lib/python3.8/site-packages (from mmdet) (1.23.5)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (1.4.4)
Requirement already satisfied: python-dateutil>=2.7 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (2.8.2)
Requirement already satisfied: packaging>=20.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (23.0)
Requirement already satisfied: pyparsing>=2.2.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (3.0.9)
Requirement already satisfied: contourpy>=1.0.1 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (1.0.7)
Requirement already satisfied: pillow>=6.2.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (9.4.0)
Requirement already satisfied: cycler>=0.10 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from matplotlib->mmdet) (4.38.0)
We also import some other packages that will be used in this tutorial:
.. code:: python
import os
import time
from autogluon.core.utils.loaders import load_zip
Downloading Data
~~~~~~~~~~~~~~~~
We have the sample dataset ready in the cloud. Let’s download it:
.. code:: python
zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip"
download_dir = "./tiny_motorbike_coco"
load_zip.unzip(zip_file, unzip_dir=download_dir)
data_dir = os.path.join(download_dir, "tiny_motorbike")
train_path = os.path.join(data_dir, "Annotations", "trainval_cocoformat.json")
test_path = os.path.join(data_dir, "Annotations", "test_cocoformat.json")
.. parsed-literal::
:class: output
Downloading ./tiny_motorbike_coco/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection_dataset/tiny_motorbike_coco.zip...
.. parsed-literal::
:class: output
100%|██████████| 21.8M/21.8M [00:00<00:00, 49.2MiB/s]
When using a dataset in COCO format, the input is the JSON annotation
file of the dataset split. In this example, ``trainval_cocoformat.json`` is
the annotation file of the train-and-validate split, and
``test_cocoformat.json`` is the annotation file of the test split.
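A COCO annotation file is a single JSON object, so you can peek at what
the predictor will consume. Below is a minimal sketch; the top-level key
names follow the COCO standard:

.. code:: python

    import json

    # Inspect the top-level structure of a COCO format annotation file.
    with open(train_path) as f:
        coco = json.load(f)

    print(list(coco.keys()))  # typically includes 'images', 'annotations', 'categories'
    print(f"{len(coco['images'])} images, {len(coco['annotations'])} annotations, "
          f"{len(coco['categories'])} categories")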
Creating the MultiModalPredictor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We select YOLOv3 with a MobileNetV2 backbone and a 320x320 input
resolution, pretrained on the COCO dataset. With this setting, finetuning
and inference are fast, and deployment is easy. We use all available GPUs
(if any):
.. code:: python
checkpoint_name = "yolov3_mobilenetv2_320_300e_coco"
num_gpus = -1 # use all GPUs
We create the MultiModalPredictor with the selected checkpoint name and
number of GPUs. We need to set the problem_type to
``"object_detection"``, and also provide a ``sample_data_path`` for the
predictor to infer the categories of the dataset. Here we provide the
``train_path``; any other split of this dataset works as well.
We also provide a ``path`` to save the predictor. If ``path`` is not
specified, the predictor will be saved to an automatically generated
directory with a timestamp under ``AutogluonModels``.
.. code:: python
# Init predictor
import uuid
model_path = f"./tmp/{uuid.uuid4().hex}-quick_start_tutorial_temp_save"
predictor = MultiModalPredictor(
    hyperparameters={
        "model.mmdet_image.checkpoint_name": checkpoint_name,
        "env.num_gpus": num_gpus,
    },
    problem_type="object_detection",
    sample_data_path=train_path,
    path=model_path,
)
.. parsed-literal::
:class: output
processing yolov3_mobilenetv2_320_300e_coco...
.. parsed-literal::
:class: output
Successfully downloaded yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
load checkpoint from local path: yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth
The model and loaded state dict do not match exactly
size mismatch for bbox_head.convs_pred.0.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.0.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
size mismatch for bbox_head.convs_pred.1.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.1.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
size mismatch for bbox_head.convs_pred.2.weight: copying a param with shape torch.Size([255, 96, 1, 1]) from checkpoint, the shape in current model is torch.Size([45, 96, 1, 1]).
size mismatch for bbox_head.convs_pred.2.bias: copying a param with shape torch.Size([255]) from checkpoint, the shape in current model is torch.Size([45]).
Finetuning the Model
~~~~~~~~~~~~~~~~~~~~
We set the learning rate to ``2e-4``. Note that we use a two-stage
learning rate option during finetuning by default, in which the model
head has a 100x learning rate. Using a two-stage learning rate with a
high learning rate only on the head layers makes the model converge
faster during finetuning. It usually gives better performance as well,
especially on small datasets with hundreds or thousands of images (a
sketch of customizing these options appears at the end of this
subsection). We also set the maximum number of epochs to 30 and the
per-GPU batch size to 32, and we time the fit process to get a sense of
the training speed. We ran it on a g4.2xlarge EC2 machine on AWS, and
part of the command output is shown below:
.. code:: python
start = time.time()
# Fit
predictor.fit(
    train_path,
    hyperparameters={
        "optimization.learning_rate": 2e-4,  # two-stage lr by default: the detection head has 100x lr
        "optimization.max_epochs": 30,
        "env.per_gpu_batch_size": 32,  # decrease this when the model is large
    },
)
train_end = time.time()
.. parsed-literal::
:class: output
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
Global seed set to 123
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
AutoMM starts to create your model. ✨
- Model will be saved to "/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save".
- Validation metric is "map".
- To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save
```
Enjoy your coffee, and let AutoMM do the job ☕☕☕ Learn more at https://auto.gluon.ai
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(val_check_interval=1.0)` was configured so validation will run at the end of the training epoch..
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
-----------------------------------------------------------------------
0 | model | MMDetAutoModelForObjectDetection | 3.7 M
1 | validation_metric | MeanAveragePrecision | 0
-----------------------------------------------------------------------
3.7 M Trainable params
0 Non-trainable params
3.7 M Total params
14.706 Total estimated model params size (MB)
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py:1609: PossibleUserWarning: The number of training batches (5) is smaller than the logging interval Trainer(log_every_n_steps=10). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
rank_zero_warn(
Epoch 2, global step 6: 'val_map' reached 0.00049 (best 0.00049), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=2-step=6.ckpt' as top 1
Epoch 5, global step 12: 'val_map' reached 0.03356 (best 0.03356), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=5-step=12.ckpt' as top 1
Epoch 8, global step 18: 'val_map' reached 0.03412 (best 0.03412), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=8-step=18.ckpt' as top 1
Epoch 11, global step 24: 'val_map' reached 0.06320 (best 0.06320), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=11-step=24.ckpt' as top 1
Epoch 14, global step 30: 'val_map' reached 0.07698 (best 0.07698), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=14-step=30.ckpt' as top 1
Epoch 17, global step 36: 'val_map' reached 0.08193 (best 0.08193), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=17-step=36.ckpt' as top 1
Epoch 20, global step 42: 'val_map' reached 0.08308 (best 0.08308), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=20-step=42.ckpt' as top 1
Epoch 23, global step 48: 'val_map' reached 0.09008 (best 0.09008), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=23-step=48.ckpt' as top 1
Epoch 26, global step 54: 'val_map' reached 0.10470 (best 0.10470), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=26-step=54.ckpt' as top 1
Epoch 29, global step 60: 'val_map' reached 0.11828 (best 0.11828), saving model to '/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/epoch=29-step=60.ckpt' as top 1
`Trainer.fit` stopped: `max_epochs=30` reached.
AutoMM has created your model 🎉🎉🎉
- To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save")
```
- You can open a terminal and launch Tensorboard to visualize the training log:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save
```
- If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub: https://github.com/autogluon/autogluon
Notice that at the end of each progress bar, if the checkpoint at the
current stage is saved, it prints the model's save path. In this
example, it is the ``model_path`` we specified when creating the
predictor.
Print out the time and we can see that it’s fast!
.. code:: python
print("This finetuning takes %.2f seconds." % (train_end - start))
.. parsed-literal::
:class: output
This finetuning takes 79.47 seconds.
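If you want more control over the two-stage learning rate described
earlier, the relevant options can be overridden in the
``hyperparameters`` passed to ``fit``. Below is a minimal sketch, not run
in this tutorial; the option names ``optimization.lr_choice`` and
``optimization.lr_mult`` are assumptions based on the AutoMM
customization documentation, so verify them for your AutoGluon version:

.. code:: python

    # A sketch of overriding the two-stage learning rate setup (not run here).
    # "optimization.lr_choice" and "optimization.lr_mult" are assumed option names;
    # check the AutoMM customization docs for your AutoGluon version.
    predictor.fit(
        train_path,
        hyperparameters={
            "optimization.learning_rate": 2e-4,
            "optimization.lr_choice": "two_stages",  # assumed: separate head/backbone learning rates
            "optimization.lr_mult": 100,             # assumed: head layers get 100x the base lr
            "optimization.max_epochs": 30,
            "env.per_gpu_batch_size": 32,
        },
    )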
Evaluation
~~~~~~~~~~
To evaluate the model we just trained, run the following code.
The evaluation results are shown in the command line output. The first
line is mAP in the COCO standard, and the second line is mAP in the VOC
standard (also known as mAP50). For more details about these metrics, see
the `COCO evaluation guideline <https://cocodataset.org/#detection-eval>`__.
Note that to present a fast finetuning we only used 30 epochs; you can
get a better result on this dataset by simply increasing the number of
epochs.
.. code:: python
predictor.evaluate(test_path)
eval_end = time.time()
.. parsed-literal::
:class: output
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20230214_020501/"
.. parsed-literal::
:class: output
saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20230214_020501/object_detection_result_cache.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.13s).
Accumulating evaluation results...
DONE (t=0.05s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.139
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.367
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.065
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.121
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.355
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.115
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.192
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.212
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.104
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.222
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450
Print out the evaluation time:
.. code:: python
print("The evaluation takes %.2f seconds." % (eval_end - train_end))
.. parsed-literal::
:class: output
The evaluation takes 0.89 seconds.
We can load a new predictor from the previous save path, and we can also
reset the number of GPUs to use if not all the devices are available:
.. code:: python
# Load and reset num_gpus
new_predictor = MultiModalPredictor.load(model_path)
new_predictor.set_num_gpus(1)
.. parsed-literal::
:class: output
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
.. parsed-literal::
:class: output
Load pretrained checkpoint: /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/tmp/bfc39b047ec4495d96d5d1e12cef44b0-quick_start_tutorial_temp_save/model.ckpt
Evaluating the new predictor gives us exactly the same result:
.. code:: python
# Evaluate new predictor
new_predictor.evaluate(test_path)
.. parsed-literal::
:class: output
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20230214_020507/"
.. parsed-literal::
:class: output
saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20230214_020507/object_detection_result_cache.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.36s).
Accumulating evaluation results...
DONE (t=0.05s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.139
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.367
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.065
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.121
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.355
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.115
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.192
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.212
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.104
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.222
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450
.. parsed-literal::
:class: output
{'map': 0.13892283785508422,
'mean_average_precision': 0.13892283785508422,
'map_50': 0.366640961042269,
'map_75': 0.0653530019465449,
'map_small': 0.015565106581493938,
'map_medium': 0.12067809623097113,
'map_large': 0.3551383940451113,
'mar_1': 0.11457693211181581,
'mar_10': 0.19172421893352126,
'mar_100': 0.2123260512097721,
'mar_small': 0.10375000000000001,
'mar_medium': 0.2222222222222222,
'mar_large': 0.4496626180836707}
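``evaluate`` returns these metrics as a plain Python dictionary, so you
can also read off individual values programmatically. A minimal sketch
using the keys shown above:

.. code:: python

    # evaluate() returns a dict of COCO metrics; pick out the ones you care about.
    result = new_predictor.evaluate(test_path)
    print("mAP (COCO standard): %.3f" % result["map"])
    print("mAP50 (VOC standard): %.3f" % result["map_50"])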
If we set the validation metric to ``"map"`` (mean average precision) and
the maximum number of epochs to ``50``, the predictor achieves better
performance with the same pretrained model (YOLOv3). We trained it
offline and uploaded it to S3. To load it and check the result:
.. code:: python
# Load Trained Predictor from S3
zip_file = "https://automl-mm-bench.s3.amazonaws.com/object_detection/quick_start/AP50_433.zip"
download_dir = "./AP50_433"
load_zip.unzip(zip_file, unzip_dir=download_dir)
better_predictor = MultiModalPredictor.load("./AP50_433/quick_start_tutorial_temp_save")
better_predictor.set_num_gpus(1)
# Evaluate new predictor
better_predictor.evaluate(test_path)
.. parsed-literal::
:class: output
Downloading ./AP50_433/file.zip from https://automl-mm-bench.s3.amazonaws.com/object_detection/quick_start/AP50_433.zip...
.. parsed-literal::
:class: output
100%|██████████| 27.8M/27.8M [00:00<00:00, 59.1MiB/s]
Unzipping ./AP50_433/file.zip to ./AP50_433
Start to upgrade the previous configuration trained by AutoMM version=0.5.3b20221111.
Loading a model that has been trained via AutoGluon Multimodal<=0.6.2. Try to update the timm image size.
/home/ci/opt/venv/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator LabelEncoder from version 1.0.2 when using version 1.1.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
/home/ci/opt/venv/lib/python3.8/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator StandardScaler from version 1.0.2 when using version 1.1.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
.. parsed-literal::
:class: output
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
processing yolov3_mobilenetv2_320_300e_coco...
yolov3_mobilenetv2_320_300e_coco_20210719_215349-d18dff72.pth exists in /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
Successfully dumped yolov3_mobilenetv2_320_300e_coco.py to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start
.. parsed-literal::
:class: output
Load pretrained checkpoint: /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AP50_433/quick_start_tutorial_temp_save/model.ckpt
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20230214_020514/"
.. parsed-literal::
:class: output
saving file at /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20230214_020514/object_detection_result_cache.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.13s).
Accumulating evaluation results...
DONE (t=0.05s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.195
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.433
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.135
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.036
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.206
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.450
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.158
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.231
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.244
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.138
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.295
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.508
.. parsed-literal::
:class: output
{'map': 0.19495386487978572,
'mean_average_precision': 0.19495386487978572,
'map_50': 0.4332857299534383,
'map_75': 0.13537716307477576,
'map_small': 0.03559795706853831,
'map_medium': 0.20600519224203545,
'map_large': 0.4499958167494408,
'mar_1': 0.15790885600187926,
'mar_10': 0.23102513507164674,
'mar_100': 0.24378999295278367,
'mar_small': 0.13833333333333334,
'mar_medium': 0.2949206349206349,
'mar_large': 0.5080251911830859}
To learn how to set those hyperparameters and finetune the model for
higher performance, see :ref:`sec_automm_detection_high_ft_coco`.
Inference
~~~~~~~~~
Now that we have gone through model setup, finetuning, and evaluation,
this section covers inference. Specifically, we lay out the steps for
using the model to make predictions and visualize the results.
To run inference on the entire test set, run:
.. code:: python
pred = predictor.predict(test_path)
print(pred)
.. parsed-literal::
:class: output
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
.. parsed-literal::
:class: output
image \
0 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
1 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
2 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
3 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
4 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
5 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
6 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
7 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
8 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
9 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
10 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
11 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
12 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
13 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
14 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
15 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
16 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
17 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
18 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
19 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
20 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
21 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
22 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
23 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
24 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
25 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
26 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
27 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
28 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
29 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
30 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
31 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
32 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
33 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
34 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
35 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
36 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
37 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
38 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
39 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
40 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
41 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
42 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
43 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
44 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
45 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
46 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
47 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
48 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
49 ./tiny_motorbike_coco/tiny_motorbike/Annotatio...
bboxes
0 [{'class': 'bicycle', 'bbox': [173.50233, 179....
1 [{'class': 'bicycle', 'bbox': [375.84824, 268....
2 [{'class': 'bicycle', 'bbox': [447.19183, 99.1...
3 [{'class': 'bicycle', 'bbox': [50.952763, 44.3...
4 [{'class': 'bicycle', 'bbox': [135.4371, 186.3...
5 [{'class': 'car', 'bbox': [23.612324, 36.72851...
6 [{'class': 'motorbike', 'bbox': [25.189875, 11...
7 [{'class': 'motorbike', 'bbox': [126.620674, 1...
8 [{'class': 'bicycle', 'bbox': [124.255196, 51....
9 [{'class': 'bicycle', 'bbox': [145.57304, -5.6...
10 [{'class': 'bicycle', 'bbox': [380.30948, 107....
11 [{'class': 'bicycle', 'bbox': [362.25558, 240....
12 [{'class': 'car', 'bbox': [456.4675, 17.214827...
13 [{'class': 'motorbike', 'bbox': [-13.019776, 4...
14 [{'class': 'car', 'bbox': [227.31514, 4.658209...
15 [{'class': 'motorbike', 'bbox': [212.12163, 18...
16 [{'class': 'bicycle', 'bbox': [442.01242, 71.1...
17 [{'class': 'bicycle', 'bbox': [30.3753, 238.42...
18 [{'class': 'car', 'bbox': [103.0758, -115.6833...
19 [{'class': 'bicycle', 'bbox': [145.97913, 85.6...
20 [{'class': 'bicycle', 'bbox': [-5.2200108, 204...
21 [{'class': 'motorbike', 'bbox': [107.0592, 202...
22 [{'class': 'bicycle', 'bbox': [303.01477, 213....
23 [{'class': 'bicycle', 'bbox': [349.40033, 0.05...
24 [{'class': 'bicycle', 'bbox': [1.1138946, -14....
25 [{'class': 'bicycle', 'bbox': [408.08844, 160....
26 [{'class': 'bicycle', 'bbox': [448.6404, -0.65...
27 [{'class': 'car', 'bbox': [-9.632367, 8.105296...
28 [{'class': 'motorbike', 'bbox': [4.1989803, 12...
29 [{'class': 'bicycle', 'bbox': [212.08734, 139....
30 [{'class': 'car', 'bbox': [34.584164, 39.71519...
31 [{'class': 'bicycle', 'bbox': [0.49745142, 95....
32 [{'class': 'motorbike', 'bbox': [119.40121, 18...
33 [{'class': 'bicycle', 'bbox': [350.8685, 36.85...
34 [{'class': 'bus', 'bbox': [404.8365, -3.599782...
35 [{'class': 'bicycle', 'bbox': [304.89337, 41.7...
36 [{'class': 'car', 'bbox': [-58.4035, 62.54875,...
37 [{'class': 'motorbike', 'bbox': [-9.7497225, 5...
38 [{'class': 'motorbike', 'bbox': [-41.79077, 25...
39 [{'class': 'motorbike', 'bbox': [170.32996, 27...
40 [{'class': 'bicycle', 'bbox': [34.81211, 192.8...
41 [{'class': 'bicycle', 'bbox': [3.5297365, 66.1...
42 [{'class': 'bicycle', 'bbox': [386.77277, 148....
43 [{'class': 'bus', 'bbox': [90.56901, 4.2331314...
44 [{'class': 'motorbike', 'bbox': [87.13502, 137...
45 [{'class': 'motorbike', 'bbox': [200.50403, 11...
46 [{'class': 'bicycle', 'bbox': [352.23834, 86.4...
47 [{'class': 'car', 'bbox': [72.093124, 40.08678...
48 [{'class': 'bicycle', 'bbox': [221.85867, 16.8...
49 [{'class': 'car', 'bbox': [54.908, 122.310745,...
The output ``pred`` is a ``pandas`` ``DataFrame`` with two columns,
``image`` and ``bboxes``.
In ``image``, each row contains the image path.
In ``bboxes``, each row is a list of dictionaries, each one representing
a bounding box:
``{"class": <class_name>, "bbox": [x1, y1, x2, y2], "score": <confidence_score>}``
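Since each entry in ``bboxes`` is a plain list of dictionaries, you can
post-process the predictions with ordinary ``pandas`` operations. A
minimal sketch that keeps only confident detections, where the threshold
value is just an example:

.. code:: python

    # Filter each image's detections by confidence score and count the survivors.
    conf_threshold = 0.5  # example threshold; tune it for your use case
    for _, row in pred.iterrows():
        confident = [box for box in row["bboxes"] if box["score"] > conf_threshold]
        print(f'{row["image"]}: {len(confident)} detections above {conf_threshold}')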
Note that, by default, ``predictor.predict`` does not save the detection
results to a file.
To run inference and save results, run the following:
.. code:: python
pred = better_predictor.predict(test_path, save_results=True)
.. parsed-literal::
:class: output
Using default root folder: ./tiny_motorbike_coco/tiny_motorbike/Annotations/... Specify `root=...` if you feel it is wrong...
.. parsed-literal::
:class: output
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
A new predictor save path is created.This is to prevent you to overwrite previous predictor saved here.You could check current save path at predictor._save_path.If you still want to use this path, set resume=True
No path specified. Models will be saved in: "AutogluonModels/ag-20230214_020516/"
Saved detection results to /home/ci/autogluon/docs/_build/eval/tutorials/multimodal/object_detection/quick_start/AutogluonModels/ag-20230214_020516/result.txt
Here, we save ``pred`` to a ``.txt`` file, which follows exactly the
same layout as ``pred``. You can use a predictor initialized in any way
(e.g., a finetuned predictor, a predictor with a pretrained model, etc.).
Here, we demonstrate using the ``better_predictor`` loaded previously.
Visualizing Results
~~~~~~~~~~~~~~~~~~~
To run visualizations, ensure that you have ``opencv`` installed. If you
haven’t already, install ``opencv`` by running
.. code:: python
!pip install opencv-python
.. parsed-literal::
:class: output
Requirement already satisfied: opencv-python in /home/ci/opt/venv/lib/python3.8/site-packages (4.7.0.68)
Requirement already satisfied: numpy>=1.17.0 in /home/ci/opt/venv/lib/python3.8/site-packages (from opencv-python) (1.23.5)
To visualize the detection bounding boxes, run the following:
.. code:: python
from autogluon.multimodal.utils import Visualizer
conf_threshold = 0.4 # Specify a confidence threshold to filter out unwanted boxes
image_result = pred.iloc[30]
img_path = image_result.image # Select an image to visualize
visualizer = Visualizer(img_path) # Initialize the Visualizer
out = visualizer.draw_instance_predictions(image_result, conf_threshold=conf_threshold) # Draw detections
visualized = out.get_image() # Get the visualized image
from PIL import Image
from IPython.display import display
img = Image.fromarray(visualized, 'RGB')
display(img)
.. figure:: output_quick_start_coco_f6564b_33_0.png
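If you would rather write the visualization to disk than display it
inline, ``PIL`` can save it directly; a minimal sketch, where the file
name is just an example:

.. code:: python

    # Save the visualized detections to a file instead of displaying them inline.
    img.save("visualized_detections.png")  # example file name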
Testing on Your Own Image
~~~~~~~~~~~~~~~~~~~~~~~~~
You can also download an image and run inference on that single image.
The following is an example:
Download the example image:
.. code:: python
from autogluon.multimodal import download
image_url = "https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg"
test_image = download(image_url)
.. parsed-literal::
:class: output
Downloading street_small.jpg from https://raw.githubusercontent.com/dmlc/web-data/master/gluoncv/detection/street_small.jpg...
Run inference:
.. code:: python
pred_test_image = better_predictor.predict({"image": [test_image]})
print(pred_test_image)
.. parsed-literal::
:class: output
/home/ci/opt/venv/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:577: LightningDeprecationWarning: The Trainer argument `auto_select_gpus` has been deprecated in v1.9.0 and will be removed in v2.0.0. Please use the function `pytorch_lightning.accelerators.find_usable_cuda_devices` instead.
rank_zero_deprecation(
.. parsed-literal::
:class: output
image bboxes
0 street_small.jpg [{'class': 'bicycle', 'bbox': [235.36739, 216....
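You can reuse the ``Visualizer`` from the previous section to draw this
single-image prediction as well; a minimal sketch using the same calls as
the visualization example above:

.. code:: python

    from autogluon.multimodal.utils import Visualizer
    from PIL import Image
    from IPython.display import display

    single_result = pred_test_image.iloc[0]       # the only row: our downloaded image
    visualizer = Visualizer(single_result.image)  # initialize with the image path
    out = visualizer.draw_instance_predictions(single_result, conf_threshold=0.4)
    display(Image.fromarray(out.get_image(), 'RGB'))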
Other Examples
~~~~~~~~~~~~~~
You may go to `AutoMM
Examples <https://github.com/autogluon/autogluon/tree/master/examples/automm>`__
to explore other examples about AutoMM.
Customization
~~~~~~~~~~~~~
To learn how to customize AutoMM, please refer to
:ref:`sec_automm_customization`.
Citation
~~~~~~~~
::
@misc{redmon2018yolov3,
    title={YOLOv3: An Incremental Improvement},
    author={Joseph Redmon and Ali Farhadi},
    year={2018},
    eprint={1804.02767},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}