Image-Text Semantic Matching with AutoMM
Vision and language are two central ways in which humans make sense of the real world. Image-text semantic matching, which measures the visual-semantic similarity between an image and a piece of text, plays a critical role in bridging the two. A typical solution is to learn a joint embedding space in which image and text feature vectors are aligned. Image-text matching is becoming increasingly important for vision-and-language tasks such as cross-modal retrieval, image captioning, text-to-image synthesis, and multimodal neural machine translation. This tutorial shows how to apply AutoMM to the image-text matching task.
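To make the idea of a joint space concrete, here is a minimal sketch of matching by cosine similarity. It does not use AutoMM: the embeddings are made-up 3-dimensional vectors standing in for the outputs of an image encoder and a text encoder, and the helper `l2_normalize` is a hypothetical name introduced only for this illustration.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


# Toy embeddings (made-up values), one row per image or caption.
image_emb = l2_normalize(np.array([[1.0, 0.0, 0.0],
                                   [0.0, 1.0, 0.0]]))
text_emb = l2_normalize(np.array([[0.9, 0.1, 0.0],    # caption close to image 0
                                  [0.1, 0.8, 0.1]]))  # caption close to image 1

# similarity[i, j] is the cosine similarity between image i and caption j.
similarity = image_emb @ text_emb.T

# For each image, pick the index of the most similar caption.
best_caption = similarity.argmax(axis=1)
print(best_caption)
```

In a trained model the two encoders are optimized so that matching pairs score higher than mismatched ones; the retrieval step itself stays this simple.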
import os
import warnings
from IPython.display import Image, display
import numpy as np
warnings.filterwarnings('ignore')
np.random.seed(123)
Dataset
In this tutorial, we use the Flickr30K dataset to demonstrate image-text matching. Flickr30K is a popular benchmark for sentence-based image description. It comprises 31,783 images of people engaged in everyday activities and events, and each image is paired with descriptive captions (five per image in the original benchmark). We have organized the dataset as pandas DataFrames. To get started, let's download the dataset.
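As a rough picture of what image-text pairs look like in a pandas DataFrame, the sketch below builds a tiny table by hand. The column names `image` and `caption`, the file paths, and the captions are all assumptions made up for illustration; they are not read from the downloaded files.

```python
import pandas as pd

# Hypothetical rows: paths and captions are invented for this example.
pairs = pd.DataFrame(
    {
        "image": ["images/0001.jpg", "images/0001.jpg", "images/0002.jpg"],
        "caption": [
            "A child plays in the park.",
            "A kid runs across the grass.",
            "Two people ride bicycles.",
        ],
    }
)

# One image can appear in several rows, once per caption.
captions_per_image = pairs.groupby("image")["caption"].count()
print(captions_per_image)
```

Storing one row per image-caption pair keeps the table flat, so an image with several captions simply repeats its path. With that picture in mind, the download proceeds as follows.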
import pandas as pd
from autogluon.core.utils.loaders import load_pd, load_zip
# Download the Flickr30K archive (about 4.4 GB) and extract it into download_dir.
download_dir = './ag_automm_tutorial_imgtxt'
zip_file = 'https://automl-mm-bench.s3.amazonaws.com/flickr30k.zip'
load_zip.unzip(zip_file, unzip_dir=download_dir)
Downloading ./ag_automm_tutorial_imgtxt/file.zip from https://automl-mm-bench.s3.amazonaws.com/flickr30k.zip...
67%|██████▋ | 2.94G/4.38G [02:12<01:20, 17.9MiB/s]
67%|██████▋ | 2.94G/4.38G [02:12<00:51, 27.7MiB/s]
67%|██████▋ | 2.95G/4.38G [02:12<00:53, 26.7MiB/s]
67%|██████▋ | 2.95G/4.38G [02:13<00:53, 26.5MiB/s]
68%|██████▊ | 2.96G/4.38G [02:13<00:38, 36.7MiB/s]
68%|██████▊ | 2.97G/4.38G [02:13<00:37, 37.4MiB/s]
68%|██████▊ | 2.97G/4.38G [02:13<00:53, 26.6MiB/s]
68%|██████▊ | 2.98G/4.38G [02:14<01:00, 23.1MiB/s]
68%|██████▊ | 2.98G/4.38G [02:14<01:03, 22.2MiB/s]
68%|██████▊ | 2.98G/4.38G [02:14<00:54, 25.6MiB/s]
68%|██████▊ | 2.99G/4.38G [02:14<01:02, 22.2MiB/s]
68%|██████▊ | 2.99G/4.38G [02:15<01:26, 16.0MiB/s]
68%|██████▊ | 3.00G/4.38G [02:15<01:30, 15.3MiB/s]
69%|██████▊ | 3.00G/4.38G [02:15<01:07, 20.4MiB/s]
69%|██████▊ | 3.00G/4.38G [02:15<01:18, 17.4MiB/s]
69%|██████▊ | 3.01G/4.38G [02:15<00:57, 23.7MiB/s]
69%|██████▉ | 3.01G/4.38G [02:15<01:08, 20.1MiB/s]
69%|██████▉ | 3.02G/4.38G [02:16<01:09, 19.7MiB/s]
69%|██████▉ | 3.02G/4.38G [02:16<01:10, 19.4MiB/s]
69%|██████▉ | 3.03G/4.38G [02:16<00:56, 23.9MiB/s]
69%|██████▉ | 3.03G/4.38G [02:16<00:59, 22.8MiB/s]
69%|██████▉ | 3.03G/4.38G [02:16<00:52, 25.6MiB/s]
69%|██████▉ | 3.04G/4.38G [02:17<01:06, 20.3MiB/s]
69%|██████▉ | 3.04G/4.38G [02:17<01:10, 18.9MiB/s]
70%|██████▉ | 3.05G/4.38G [02:17<01:17, 17.3MiB/s]
70%|██████▉ | 3.05G/4.38G [02:17<00:50, 26.1MiB/s]
70%|██████▉ | 3.06G/4.38G [02:17<00:57, 23.1MiB/s]
70%|██████▉ | 3.06G/4.38G [02:18<00:51, 25.9MiB/s]
70%|███████ | 3.07G/4.38G [02:18<00:41, 31.6MiB/s]
70%|███████ | 3.07G/4.38G [02:18<00:45, 28.5MiB/s]
70%|███████ | 3.08G/4.38G [02:18<00:58, 22.2MiB/s]
70%|███████ | 3.08G/4.38G [02:18<01:02, 20.8MiB/s]
70%|███████ | 3.08G/4.38G [02:18<00:53, 24.4MiB/s]
70%|███████ | 3.09G/4.38G [02:19<01:07, 19.3MiB/s]
71%|███████ | 3.09G/4.38G [02:19<01:12, 17.8MiB/s]
71%|███████ | 3.09G/4.38G [02:19<01:00, 21.3MiB/s]
71%|███████ | 3.10G/4.38G [02:19<01:02, 20.4MiB/s]
71%|███████ | 3.10G/4.38G [02:19<00:39, 32.1MiB/s]
71%|███████ | 3.11G/4.38G [02:19<00:45, 28.0MiB/s]
71%|███████ | 3.11G/4.38G [02:20<00:43, 29.1MiB/s]
71%|███████ | 3.12G/4.38G [02:20<00:42, 29.7MiB/s]
71%|███████▏ | 3.12G/4.38G [02:20<00:49, 25.4MiB/s]
71%|███████▏ | 3.13G/4.38G [02:20<00:53, 23.3MiB/s]
71%|███████▏ | 3.13G/4.38G [02:21<01:05, 19.0MiB/s]
72%|███████▏ | 3.14G/4.38G [02:21<00:51, 24.1MiB/s]
72%|███████▏ | 3.15G/4.38G [02:21<00:37, 33.3MiB/s]
72%|███████▏ | 3.15G/4.38G [02:21<00:38, 31.9MiB/s]
72%|███████▏ | 3.15G/4.38G [02:21<00:38, 32.0MiB/s]
72%|███████▏ | 3.16G/4.38G [02:21<00:29, 41.7MiB/s]
72%|███████▏ | 3.17G/4.38G [02:21<00:28, 42.2MiB/s]
72%|███████▏ | 3.17G/4.38G [02:22<00:47, 25.7MiB/s]
73%|███████▎ | 3.18G/4.38G [02:22<00:36, 33.0MiB/s]
73%|███████▎ | 3.18G/4.38G [02:22<00:35, 33.4MiB/s]
73%|███████▎ | 3.19G/4.38G [02:22<00:44, 26.5MiB/s]
73%|███████▎ | 3.19G/4.38G [02:22<00:34, 34.2MiB/s]
73%|███████▎ | 3.20G/4.38G [02:22<00:35, 33.0MiB/s]
73%|███████▎ | 3.20G/4.38G [02:23<00:39, 29.9MiB/s]
73%|███████▎ | 3.21G/4.38G [02:23<00:29, 39.7MiB/s]
73%|███████▎ | 3.22G/4.38G [02:23<00:31, 37.3MiB/s]
74%|███████▎ | 3.22G/4.38G [02:23<00:43, 26.7MiB/s]
74%|███████▎ | 3.23G/4.38G [02:24<00:49, 23.1MiB/s]
74%|███████▎ | 3.23G/4.38G [02:24<00:49, 23.5MiB/s]
74%|███████▍ | 3.23G/4.38G [02:24<00:57, 20.0MiB/s]
74%|███████▍ | 3.24G/4.38G [02:24<01:00, 19.0MiB/s]
74%|███████▍ | 3.24G/4.38G [02:24<00:45, 24.9MiB/s]
74%|███████▍ | 3.25G/4.38G [02:25<00:50, 22.3MiB/s]
74%|███████▍ | 3.25G/4.38G [02:25<00:43, 26.1MiB/s]
74%|███████▍ | 3.26G/4.38G [02:25<00:47, 23.8MiB/s]
74%|███████▍ | 3.26G/4.38G [02:25<00:37, 29.6MiB/s]
75%|███████▍ | 3.27G/4.38G [02:25<00:32, 34.0MiB/s]
75%|███████▍ | 3.27G/4.38G [02:25<00:35, 31.1MiB/s]
75%|███████▍ | 3.28G/4.38G [02:25<00:32, 33.8MiB/s]
75%|███████▍ | 3.28G/4.38G [02:26<00:37, 29.5MiB/s]
75%|███████▌ | 3.29G/4.38G [02:26<00:40, 26.9MiB/s]
75%|███████▌ | 3.29G/4.38G [02:26<00:54, 20.2MiB/s]
75%|███████▌ | 3.30G/4.38G [02:26<00:41, 26.4MiB/s]
75%|███████▌ | 3.30G/4.38G [02:26<00:53, 20.2MiB/s]
75%|███████▌ | 3.31G/4.38G [02:27<00:50, 21.3MiB/s]
76%|███████▌ | 3.31G/4.38G [02:27<00:36, 29.7MiB/s]
76%|███████▌ | 3.32G/4.38G [02:27<00:38, 27.7MiB/s]
76%|███████▌ | 3.32G/4.38G [02:27<00:50, 21.1MiB/s]
76%|███████▌ | 3.32G/4.38G [02:27<00:48, 21.7MiB/s]
76%|███████▌ | 3.33G/4.38G [02:28<00:41, 25.3MiB/s]
76%|███████▌ | 3.33G/4.38G [02:28<00:40, 25.8MiB/s]
76%|███████▌ | 3.34G/4.38G [02:28<00:35, 29.0MiB/s]
76%|███████▌ | 3.34G/4.38G [02:28<00:38, 26.7MiB/s]
76%|███████▋ | 3.35G/4.38G [02:28<00:34, 30.4MiB/s]
76%|███████▋ | 3.35G/4.38G [02:28<00:42, 24.6MiB/s]
77%|███████▋ | 3.36G/4.38G [02:28<00:31, 33.1MiB/s]
77%|███████▋ | 3.36G/4.38G [02:29<00:35, 29.2MiB/s]
77%|███████▋ | 3.36G/4.38G [02:29<00:36, 27.9MiB/s]
77%|███████▋ | 3.37G/4.38G [02:29<00:31, 31.9MiB/s]
77%|███████▋ | 3.37G/4.38G [02:29<00:34, 29.5MiB/s]
77%|███████▋ | 3.38G/4.38G [02:29<00:28, 35.1MiB/s]
77%|███████▋ | 3.39G/4.38G [02:29<00:24, 40.6MiB/s]
77%|███████▋ | 3.39G/4.38G [02:30<00:38, 25.6MiB/s]
78%|███████▊ | 3.40G/4.38G [02:30<00:39, 24.9MiB/s]
78%|███████▊ | 3.41G/4.38G [02:30<00:29, 33.6MiB/s]
78%|███████▊ | 3.41G/4.38G [02:30<00:27, 35.0MiB/s]
78%|███████▊ | 3.41G/4.38G [02:30<00:29, 33.2MiB/s]
78%|███████▊ | 3.42G/4.38G [02:31<00:36, 26.0MiB/s]
78%|███████▊ | 3.42G/4.38G [02:31<00:40, 23.4MiB/s]
78%|███████▊ | 3.43G/4.38G [02:31<00:38, 24.9MiB/s]
78%|███████▊ | 3.43G/4.38G [02:31<00:41, 22.9MiB/s]
78%|███████▊ | 3.44G/4.38G [02:31<00:33, 28.4MiB/s]
79%|███████▊ | 3.44G/4.38G [02:32<00:35, 26.8MiB/s]
79%|███████▊ | 3.45G/4.38G [02:32<00:36, 25.4MiB/s]
79%|███████▊ | 3.45G/4.38G [02:32<00:38, 24.1MiB/s]
79%|███████▉ | 3.46G/4.38G [02:32<00:28, 33.0MiB/s]
79%|███████▉ | 3.46G/4.38G [02:32<00:28, 32.6MiB/s]
79%|███████▉ | 3.46G/4.38G [02:32<00:25, 35.8MiB/s]
79%|███████▉ | 3.47G/4.38G [02:32<00:35, 25.8MiB/s]
79%|███████▉ | 3.47G/4.38G [02:33<00:33, 27.0MiB/s]
79%|███████▉ | 3.47G/4.38G [02:33<00:40, 22.5MiB/s]
79%|███████▉ | 3.48G/4.38G [02:33<00:41, 21.7MiB/s]
79%|███████▉ | 3.48G/4.38G [02:33<00:40, 22.1MiB/s]
80%|███████▉ | 3.49G/4.38G [02:33<00:32, 27.0MiB/s]
80%|███████▉ | 3.50G/4.38G [02:33<00:28, 31.0MiB/s]
80%|███████▉ | 3.50G/4.38G [02:34<00:34, 25.8MiB/s]
80%|███████▉ | 3.50G/4.38G [02:34<00:30, 28.5MiB/s]
80%|████████ | 3.51G/4.38G [02:34<00:33, 26.1MiB/s]
80%|████████ | 3.51G/4.38G [02:34<00:30, 28.1MiB/s]
80%|████████ | 3.52G/4.38G [02:34<00:25, 33.5MiB/s]
80%|████████ | 3.52G/4.38G [02:35<00:30, 28.5MiB/s]
81%|████████ | 3.53G/4.38G [02:35<00:28, 29.7MiB/s]
81%|████████ | 3.53G/4.38G [02:35<00:32, 26.0MiB/s]
81%|████████ | 3.54G/4.38G [02:35<00:24, 33.7MiB/s]
81%|████████ | 3.54G/4.38G [02:35<00:27, 30.8MiB/s]
81%|████████ | 3.55G/4.38G [02:35<00:30, 27.4MiB/s]
81%|████████ | 3.55G/4.38G [02:36<00:40, 20.6MiB/s]
81%|████████ | 3.56G/4.38G [02:36<00:33, 24.8MiB/s]
81%|████████ | 3.56G/4.38G [02:36<00:39, 20.9MiB/s]
81%|████████▏ | 3.56G/4.38G [02:36<00:47, 17.4MiB/s]
81%|████████▏ | 3.56G/4.38G [02:36<00:44, 18.4MiB/s]
81%|████████▏ | 3.56G/4.38G [02:36<00:47, 17.3MiB/s]
81%|████████▏ | 3.57G/4.38G [02:37<00:51, 15.7MiB/s]
82%|████████▏ | 3.57G/4.38G [02:37<00:35, 22.5MiB/s]
82%|████████▏ | 3.58G/4.38G [02:37<00:30, 26.3MiB/s]
82%|████████▏ | 3.58G/4.38G [02:37<00:36, 22.2MiB/s]
82%|████████▏ | 3.59G/4.38G [02:37<00:27, 29.0MiB/s]
82%|████████▏ | 3.60G/4.38G [02:38<00:37, 20.7MiB/s]
82%|████████▏ | 3.60G/4.38G [02:38<00:37, 20.6MiB/s]
82%|████████▏ | 3.61G/4.38G [02:38<00:36, 21.3MiB/s]
82%|████████▏ | 3.61G/4.38G [02:38<00:37, 20.5MiB/s]
82%|████████▏ | 3.61G/4.38G [02:38<00:31, 24.4MiB/s]
83%|████████▎ | 3.62G/4.38G [02:39<00:37, 20.2MiB/s]
83%|████████▎ | 3.62G/4.38G [02:39<00:28, 26.3MiB/s]
83%|████████▎ | 3.63G/4.38G [02:39<00:28, 26.7MiB/s]
83%|████████▎ | 3.63G/4.38G [02:39<00:24, 30.1MiB/s]
83%|████████▎ | 3.63G/4.38G [02:39<00:29, 25.6MiB/s]
83%|████████▎ | 3.64G/4.38G [02:39<00:24, 30.0MiB/s]
83%|████████▎ | 3.64G/4.38G [02:40<00:32, 22.8MiB/s]
83%|████████▎ | 3.65G/4.38G [02:40<00:26, 28.1MiB/s]
83%|████████▎ | 3.65G/4.38G [02:40<00:31, 22.9MiB/s]
83%|████████▎ | 3.66G/4.38G [02:40<00:27, 26.3MiB/s]
84%|████████▎ | 3.66G/4.38G [02:40<00:20, 35.4MiB/s]
84%|████████▎ | 3.67G/4.38G [02:40<00:22, 31.6MiB/s]
84%|████████▍ | 3.67G/4.38G [02:41<00:23, 29.6MiB/s]
84%|████████▍ | 3.68G/4.38G [02:41<00:27, 25.9MiB/s]
84%|████████▍ | 3.68G/4.38G [02:41<00:21, 32.6MiB/s]
84%|████████▍ | 3.69G/4.38G [02:41<00:23, 29.2MiB/s]
84%|████████▍ | 3.69G/4.38G [02:41<00:25, 27.5MiB/s]
84%|████████▍ | 3.69G/4.38G [02:41<00:31, 21.8MiB/s]
84%|████████▍ | 3.70G/4.38G [02:42<00:27, 25.0MiB/s]
84%|████████▍ | 3.70G/4.38G [02:42<00:28, 23.6MiB/s]
85%|████████▍ | 3.71G/4.38G [02:42<00:29, 22.6MiB/s]
85%|████████▍ | 3.71G/4.38G [02:42<00:35, 18.8MiB/s]
85%|████████▍ | 3.71G/4.38G [02:42<00:25, 26.0MiB/s]
85%|████████▍ | 3.72G/4.38G [02:42<00:25, 25.9MiB/s]
85%|████████▍ | 3.72G/4.38G [02:43<00:21, 30.2MiB/s]
85%|████████▌ | 3.73G/4.38G [02:43<00:25, 25.3MiB/s]
85%|████████▌ | 3.73G/4.38G [02:43<00:27, 23.9MiB/s]
85%|████████▌ | 3.73G/4.38G [02:43<00:28, 22.8MiB/s]
85%|████████▌ | 3.73G/4.38G [02:43<00:30, 21.4MiB/s]
85%|████████▌ | 3.74G/4.38G [02:43<00:22, 28.4MiB/s]
85%|████████▌ | 3.74G/4.38G [02:43<00:26, 24.1MiB/s]
86%|████████▌ | 3.75G/4.38G [02:44<00:19, 33.2MiB/s]
86%|████████▌ | 3.75G/4.38G [02:44<00:23, 27.0MiB/s]
86%|████████▌ | 3.76G/4.38G [02:44<00:23, 26.4MiB/s]
86%|████████▌ | 3.76G/4.38G [02:44<00:28, 21.4MiB/s]
86%|████████▌ | 3.76G/4.38G [02:44<00:28, 22.0MiB/s]
86%|████████▌ | 3.77G/4.38G [02:44<00:29, 21.1MiB/s]
86%|████████▌ | 3.77G/4.38G [02:45<00:23, 25.6MiB/s]
86%|████████▌ | 3.78G/4.38G [02:45<00:30, 19.8MiB/s]
86%|████████▋ | 3.78G/4.38G [02:45<00:28, 20.8MiB/s]
86%|████████▋ | 3.78G/4.38G [02:45<00:30, 19.3MiB/s]
87%|████████▋ | 3.79G/4.38G [02:45<00:24, 24.0MiB/s]
87%|████████▋ | 3.79G/4.38G [02:46<00:29, 20.2MiB/s]
87%|████████▋ | 3.80G/4.38G [02:46<00:20, 28.2MiB/s]
87%|████████▋ | 3.80G/4.38G [02:46<00:20, 27.7MiB/s]
87%|████████▋ | 3.81G/4.38G [02:46<00:23, 24.1MiB/s]
87%|████████▋ | 3.81G/4.38G [02:46<00:23, 23.9MiB/s]
87%|████████▋ | 3.82G/4.38G [02:46<00:16, 33.4MiB/s]
87%|████████▋ | 3.82G/4.38G [02:46<00:17, 32.5MiB/s]
87%|████████▋ | 3.83G/4.38G [02:47<00:19, 28.2MiB/s]
87%|████████▋ | 3.83G/4.38G [02:47<00:15, 35.0MiB/s]
88%|████████▊ | 3.84G/4.38G [02:47<00:21, 25.9MiB/s]
88%|████████▊ | 3.84G/4.38G [02:47<00:21, 24.8MiB/s]
88%|████████▊ | 3.84G/4.38G [02:48<00:27, 19.5MiB/s]
88%|████████▊ | 3.85G/4.38G [02:48<00:29, 17.8MiB/s]
88%|████████▊ | 3.85G/4.38G [02:48<00:32, 16.3MiB/s]
88%|████████▊ | 3.86G/4.38G [02:48<00:28, 18.5MiB/s]
88%|████████▊ | 3.86G/4.38G [02:49<00:29, 18.0MiB/s]
88%|████████▊ | 3.86G/4.38G [02:49<00:31, 16.4MiB/s]
88%|████████▊ | 3.86G/4.38G [02:49<00:27, 19.0MiB/s]
88%|████████▊ | 3.87G/4.38G [02:49<00:26, 19.7MiB/s]
88%|████████▊ | 3.87G/4.38G [02:49<00:26, 19.3MiB/s]
88%|████████▊ | 3.87G/4.38G [02:49<00:25, 20.2MiB/s]
88%|████████▊ | 3.88G/4.38G [02:49<00:27, 18.5MiB/s]
89%|████████▊ | 3.88G/4.38G [02:50<00:19, 26.0MiB/s]
89%|████████▊ | 3.88G/4.38G [02:50<00:21, 23.4MiB/s]
89%|████████▉ | 3.89G/4.38G [02:50<00:15, 30.7MiB/s]
89%|████████▉ | 3.89G/4.38G [02:50<00:19, 25.5MiB/s]
89%|████████▉ | 3.90G/4.38G [02:50<00:19, 24.4MiB/s]
89%|████████▉ | 3.90G/4.38G [02:50<00:22, 21.6MiB/s]
89%|████████▉ | 3.90G/4.38G [02:51<00:36, 13.1MiB/s]
89%|████████▉ | 3.91G/4.38G [02:51<00:38, 12.2MiB/s]
89%|████████▉ | 3.91G/4.38G [02:51<00:37, 12.6MiB/s]
89%|████████▉ | 3.92G/4.38G [02:51<00:22, 20.4MiB/s]
89%|████████▉ | 3.92G/4.38G [02:51<00:22, 20.2MiB/s]
89%|████████▉ | 3.92G/4.38G [02:52<00:23, 19.6MiB/s]
90%|████████▉ | 3.92G/4.38G [02:52<00:22, 20.5MiB/s]
90%|████████▉ | 3.93G/4.38G [02:52<00:26, 16.9MiB/s]
90%|████████▉ | 3.93G/4.38G [02:52<00:22, 19.8MiB/s]
90%|████████▉ | 3.93G/4.38G [02:52<00:22, 19.5MiB/s]
90%|████████▉ | 3.94G/4.38G [02:52<00:13, 31.5MiB/s]
90%|█████████ | 3.95G/4.38G [02:53<00:14, 29.1MiB/s]
90%|█████████ | 3.95G/4.38G [02:53<00:14, 28.7MiB/s]
90%|█████████ | 3.96G/4.38G [02:53<00:10, 38.9MiB/s]
90%|█████████ | 3.96G/4.38G [02:53<00:12, 34.2MiB/s]
91%|█████████ | 3.97G/4.38G [02:53<00:13, 30.4MiB/s]
91%|█████████ | 3.97G/4.38G [02:53<00:11, 36.8MiB/s]
91%|█████████ | 3.98G/4.38G [02:54<00:13, 30.2MiB/s]
91%|█████████ | 3.98G/4.38G [02:54<00:13, 29.5MiB/s]
91%|█████████ | 3.99G/4.38G [02:54<00:15, 25.5MiB/s]
91%|█████████ | 3.99G/4.38G [02:54<00:13, 28.1MiB/s]
91%|█████████ | 3.99G/4.38G [02:54<00:15, 25.3MiB/s]
91%|█████████▏| 4.00G/4.38G [02:54<00:13, 28.5MiB/s]
91%|█████████▏| 4.00G/4.38G [02:55<00:17, 21.7MiB/s]
91%|█████████▏| 4.01G/4.38G [02:55<00:14, 26.2MiB/s]
92%|█████████▏| 4.01G/4.38G [02:55<00:14, 26.0MiB/s]
92%|█████████▏| 4.02G/4.38G [02:55<00:11, 33.0MiB/s]
92%|█████████▏| 4.02G/4.38G [02:55<00:12, 29.0MiB/s]
92%|█████████▏| 4.03G/4.38G [02:55<00:12, 29.5MiB/s]
92%|█████████▏| 4.03G/4.38G [02:55<00:11, 30.4MiB/s]
92%|█████████▏| 4.03G/4.38G [02:56<00:15, 22.4MiB/s]
92%|█████████▏| 4.04G/4.38G [02:56<00:17, 19.8MiB/s]
92%|█████████▏| 4.04G/4.38G [02:56<00:13, 25.0MiB/s]
92%|█████████▏| 4.05G/4.38G [02:56<00:09, 34.4MiB/s]
93%|█████████▎| 4.06G/4.38G [02:56<00:09, 33.3MiB/s]
93%|█████████▎| 4.06G/4.38G [02:56<00:09, 33.3MiB/s]
93%|█████████▎| 4.06G/4.38G [02:57<00:12, 26.5MiB/s]
93%|█████████▎| 4.07G/4.38G [02:57<00:12, 24.3MiB/s]
93%|█████████▎| 4.08G/4.38G [02:57<00:12, 25.2MiB/s]
93%|█████████▎| 4.08G/4.38G [02:57<00:13, 23.2MiB/s]
93%|█████████▎| 4.08G/4.38G [02:58<00:11, 25.4MiB/s]
93%|█████████▎| 4.09G/4.38G [02:58<00:12, 22.7MiB/s]
93%|█████████▎| 4.09G/4.38G [02:58<00:10, 27.4MiB/s]
93%|█████████▎| 4.09G/4.38G [02:58<00:12, 22.1MiB/s]
94%|█████████▎| 4.10G/4.38G [02:58<00:10, 27.0MiB/s]
94%|█████████▍| 4.11G/4.38G [02:58<00:08, 32.5MiB/s]
94%|█████████▍| 4.11G/4.38G [02:59<00:10, 26.8MiB/s]
94%|█████████▍| 4.12G/4.38G [02:59<00:09, 28.1MiB/s]
94%|█████████▍| 4.12G/4.38G [02:59<00:10, 25.4MiB/s]
94%|█████████▍| 4.13G/4.38G [02:59<00:09, 27.5MiB/s]
94%|█████████▍| 4.13G/4.38G [02:59<00:12, 20.5MiB/s]
94%|█████████▍| 4.13G/4.38G [02:59<00:09, 26.9MiB/s]
94%|█████████▍| 4.14G/4.38G [03:00<00:09, 25.0MiB/s]
95%|█████████▍| 4.14G/4.38G [03:00<00:08, 29.5MiB/s]
95%|█████████▍| 4.15G/4.38G [03:00<00:09, 23.8MiB/s]
95%|█████████▍| 4.15G/4.38G [03:00<00:11, 19.3MiB/s]
95%|█████████▍| 4.15G/4.38G [03:01<00:13, 16.7MiB/s]
95%|█████████▍| 4.16G/4.38G [03:01<00:12, 17.5MiB/s]
95%|█████████▍| 4.16G/4.38G [03:01<00:13, 16.0MiB/s]
95%|█████████▌| 4.17G/4.38G [03:01<00:11, 18.9MiB/s]
95%|█████████▌| 4.17G/4.38G [03:02<00:12, 16.4MiB/s]
95%|█████████▌| 4.18G/4.38G [03:02<00:09, 21.3MiB/s]
95%|█████████▌| 4.18G/4.38G [03:02<00:09, 20.4MiB/s]
96%|█████████▌| 4.18G/4.38G [03:02<00:08, 23.8MiB/s]
96%|█████████▌| 4.19G/4.38G [03:02<00:09, 20.4MiB/s]
96%|█████████▌| 4.19G/4.38G [03:02<00:09, 21.3MiB/s]
96%|█████████▌| 4.19G/4.38G [03:02<00:07, 24.0MiB/s]
96%|█████████▌| 4.20G/4.38G [03:03<00:08, 21.0MiB/s]
96%|█████████▌| 4.20G/4.38G [03:03<00:12, 14.8MiB/s]
96%|█████████▌| 4.20G/4.38G [03:03<00:12, 13.9MiB/s]
96%|█████████▌| 4.21G/4.38G [03:04<00:09, 17.7MiB/s]
96%|█████████▌| 4.21G/4.38G [03:04<00:09, 17.4MiB/s]
96%|█████████▋| 4.22G/4.38G [03:04<00:06, 23.7MiB/s]
96%|█████████▋| 4.22G/4.38G [03:04<00:08, 18.5MiB/s]
96%|█████████▋| 4.23G/4.38G [03:04<00:06, 24.3MiB/s]
97%|█████████▋| 4.23G/4.38G [03:04<00:07, 21.6MiB/s]
97%|█████████▋| 4.24G/4.38G [03:05<00:05, 27.0MiB/s]
97%|█████████▋| 4.24G/4.38G [03:05<00:04, 33.5MiB/s]
97%|█████████▋| 4.25G/4.38G [03:05<00:04, 33.0MiB/s]
97%|█████████▋| 4.25G/4.38G [03:05<00:03, 35.9MiB/s]
97%|█████████▋| 4.26G/4.38G [03:05<00:04, 26.1MiB/s]
97%|█████████▋| 4.26G/4.38G [03:06<00:05, 20.7MiB/s]
97%|█████████▋| 4.26G/4.38G [03:06<00:05, 20.2MiB/s]
97%|█████████▋| 4.27G/4.38G [03:06<00:05, 20.3MiB/s]
97%|█████████▋| 4.27G/4.38G [03:06<00:06, 18.3MiB/s]
97%|█████████▋| 4.27G/4.38G [03:06<00:06, 17.4MiB/s]
98%|█████████▊| 4.27G/4.38G [03:07<00:09, 11.1MiB/s]
98%|█████████▊| 4.28G/4.38G [03:07<00:07, 14.8MiB/s]
98%|█████████▊| 4.28G/4.38G [03:07<00:07, 14.0MiB/s]
98%|█████████▊| 4.28G/4.38G [03:07<00:08, 12.0MiB/s]
98%|█████████▊| 4.28G/4.38G [03:07<00:06, 15.0MiB/s]
98%|█████████▊| 4.29G/4.38G [03:07<00:05, 16.2MiB/s]
98%|█████████▊| 4.29G/4.38G [03:07<00:05, 17.1MiB/s]
98%|█████████▊| 4.29G/4.38G [03:08<00:03, 22.2MiB/s]
98%|█████████▊| 4.30G/4.38G [03:08<00:03, 21.8MiB/s]
98%|█████████▊| 4.30G/4.38G [03:08<00:03, 22.0MiB/s]
98%|█████████▊| 4.30G/4.38G [03:08<00:03, 19.6MiB/s]
98%|█████████▊| 4.31G/4.38G [03:08<00:02, 25.3MiB/s]
99%|█████████▊| 4.32G/4.38G [03:08<00:01, 32.0MiB/s]
99%|█████████▊| 4.32G/4.38G [03:09<00:02, 25.0MiB/s]
99%|█████████▉| 4.33G/4.38G [03:09<00:01, 31.1MiB/s]
99%|█████████▉| 4.33G/4.38G [03:09<00:01, 25.7MiB/s]
99%|█████████▉| 4.34G/4.38G [03:09<00:01, 23.4MiB/s]
99%|█████████▉| 4.34G/4.38G [03:09<00:01, 22.7MiB/s]
99%|█████████▉| 4.34G/4.38G [03:09<00:01, 28.9MiB/s]
99%|█████████▉| 4.35G/4.38G [03:10<00:01, 24.4MiB/s]
99%|█████████▉| 4.35G/4.38G [03:10<00:01, 24.8MiB/s]
99%|█████████▉| 4.35G/4.38G [03:10<00:01, 24.6MiB/s]
100%|█████████▉| 4.36G/4.38G [03:10<00:00, 27.1MiB/s]
100%|█████████▉| 4.36G/4.38G [03:10<00:00, 24.5MiB/s]
100%|█████████▉| 4.37G/4.38G [03:11<00:00, 24.5MiB/s]
100%|█████████▉| 4.37G/4.38G [03:11<00:00, 23.7MiB/s]
100%|█████████▉| 4.37G/4.38G [03:11<00:00, 16.0MiB/s]
100%|█████████▉| 4.38G/4.38G [03:11<00:00, 14.4MiB/s]
100%|█████████▉| 4.38G/4.38G [03:11<00:00, 13.8MiB/s]
100%|█████████▉| 4.38G/4.38G [03:12<00:00, 12.2MiB/s]
100%|██████████| 4.38G/4.38G [03:12<00:00, 22.8MiB/s]
Next, we load the CSV files.
dataset_path = os.path.join(download_dir, 'flickr30k_processed')
train_data = pd.read_csv(f'{dataset_path}/train.csv', index_col=0)
val_data = pd.read_csv(f'{dataset_path}/val.csv', index_col=0)
test_data = pd.read_csv(f'{dataset_path}/test.csv', index_col=0)
image_col = "image"
text_col = "caption"
We also need to expand the relative image paths to use their absolute local paths.
def path_expander(path, base_folder):
    path_l = path.split(';')
    return ';'.join([os.path.abspath(os.path.join(base_folder, path)) for path in path_l])
train_data[image_col] = train_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
val_data[image_col] = val_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
test_data[image_col] = test_data[image_col].apply(lambda ele: path_expander(ele, base_folder=dataset_path))
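As a quick check of what path_expander does, the sketch below applies it to a made-up entry. The file names and the base folder are hypothetical and are not taken from the dataset.

```python
import os

def path_expander(path, base_folder):
    # Split a ';'-separated list of relative paths and make each one absolute.
    path_l = path.split(';')
    return ';'.join([os.path.abspath(os.path.join(base_folder, p)) for p in path_l])

# Hypothetical entry holding two relative image paths.
example = "images/a.jpg;images/b.jpg"
expanded = path_expander(example, base_folder="some_dataset_dir")

# Each part is now an absolute path rooted at the current working directory.
parts = expanded.split(';')
print(parts)
```

In this tutorial each row holds a single image path, so the split on ';' only matters for rows that list several images.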
Taking train_data as an example, let’s see what the data looks like in the dataframe.
train_data.head()
| | caption | image |
|---|---|---|
| 0 | Two young guys with shaggy hair look at their ... | /home/ci/autogluon/docs/tutorials/multimodal/s... |
| 1 | Two young White males are outside near many bu... | /home/ci/autogluon/docs/tutorials/multimodal/s... |
| 2 | Two men in green shirts are standing in a yard | /home/ci/autogluon/docs/tutorials/multimodal/s... |
| 3 | A man in a blue shirt standing in a garden | /home/ci/autogluon/docs/tutorials/multimodal/s... |
| 4 | Two friends enjoy time spent together | /home/ci/autogluon/docs/tutorials/multimodal/s... |
Each row is one image and text pair, implying that they match each other. Since one image corresponds to five captions in the dataset, we copy each image path five times to build the correspondences. We can visualize one image-text pair.
train_data[text_col][0]
'Two young guys with shaggy hair look at their hands while hanging out in the yard'
pil_img = Image(filename=train_data[image_col][0])
display(pil_img)

To perform evaluation or semantic search, we need to extract the unique image and text items from test_data and add a label column to test_data.
test_image_data = pd.DataFrame({image_col: test_data[image_col].unique().tolist()})
test_text_data = pd.DataFrame({text_col: test_data[text_col].unique().tolist()})
test_data_with_label = test_data.copy()
test_label_col = "relevance"
test_data_with_label[test_label_col] = [1] * len(test_data)
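The same preparation can be illustrated on a tiny made-up dataframe. This is only a sketch: the rows below are invented, and it uses two captions per image instead of five to keep it short.

```python
import pandas as pd

# Toy stand-in for test_data: two images, each paired with two captions.
toy = pd.DataFrame({
    "image": ["img_a.jpg", "img_a.jpg", "img_b.jpg", "img_b.jpg"],
    "caption": ["a dog runs", "a dog on grass", "a red car", "a car parked"],
})

# The unique items on each side form the retrieval pools.
toy_images = pd.DataFrame({"image": toy["image"].unique().tolist()})
toy_texts = pd.DataFrame({"caption": toy["caption"].unique().tolist()})

# Every listed pair is a positive match, so its relevance label is 1.
toy_with_label = toy.copy()
toy_with_label["relevance"] = 1

print(len(toy_images), len(toy_texts), int(toy_with_label["relevance"].sum()))
```

The labeled dataframe tells the evaluation which query and response items belong together, while the two pools define what can be retrieved.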
Initialize Predictor¶
To initialize a predictor for image-text matching, we need to set problem_type as image_text_similarity. query and response refer to the two dataframe columns in which two items in one row should match each other. You can set query=text_col and response=image_col, or query=image_col and response=text_col. In image-text matching, query and response are equivalent.
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(
query=text_col,
response=image_col,
problem_type="image_text_similarity",
eval_metric="recall",
)
By initializing the predictor for image_text_similarity, you have loaded the pretrained CLIP backbone openai/clip-vit-base-patch32.
Directly Evaluate on Test Dataset (Zero-shot)¶
You may be interested in getting the pretrained model’s performance on your data. Let’s compute the text-to-image and image-to-text retrieval scores.
txt_to_img_scores = predictor.evaluate(
data=test_data_with_label,
query_data=test_text_data,
response_data=test_image_data,
label=test_label_col,
cutoffs=[1, 5, 10],
)
img_to_txt_scores = predictor.evaluate(
data=test_data_with_label,
query_data=test_image_data,
response_data=test_text_data,
label=test_label_col,
cutoffs=[1, 5, 10],
)
print(f"txt_to_img_scores: {txt_to_img_scores}")
print(f"img_to_txt_scores: {img_to_txt_scores}")
txt_to_img_scores: {'recall@1': 0.58964, 'recall@5': 0.83533, 'recall@10': 0.90156}
img_to_txt_scores: {'recall@1': 0.15525, 'recall@5': 0.571, 'recall@10': 0.7176}
Here we report recall, the eval_metric set when initializing the predictor above. Each cutoff value k means that the score is computed over the top-k retrieved items. You may notice that the text-to-image recalls are much higher than the image-to-text recalls. This is because each image is paired with five texts. In image-to-text retrieval, the upper bound of recall@1 is 20%: even when the top-1 text is correct, it covers only one of the five texts to retrieve.
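The 20% bound can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of recall@k, namely the fraction of relevant items that appear among the top-k retrieved items. The helper recall_at_k and the item ids are illustrative and are not part of AutoMM.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    # Fraction of the relevant items that appear among the top-k retrieved items.
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)

# Image-to-text: one image query with five matching captions (ids 0 to 4),
# ranked perfectly ahead of five non-matching captions.
relevant_captions = [0, 1, 2, 3, 4]
perfect_ranking = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
r1 = recall_at_k(perfect_ranking, relevant_captions, k=1)
r5 = recall_at_k(perfect_ranking, relevant_captions, k=5)

# Text-to-image: one caption query with a single matching image (id 7).
t1 = recall_at_k([7, 3, 1], [7], k=1)

print(r1, r5, t1)
```

Even a perfect image-to-text ranking reaches only 1/5 at k=1 and needs k=5 to reach 1.0, whereas a text query has a single relevant image and can reach 1.0 at k=1.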
Finetune Predictor¶
After measuring the pretrained performance, we can finetune the model on our dataset to see whether we can get improvements. For a quick demo, here we set the time limit to 180 seconds.
predictor.fit(
train_data=train_data,
tuning_data=val_data,
time_limit=180,
)
No path specified. Models will be saved in: "AutogluonModels/ag-20241030_013927"
=================== System Info ===================
AutoGluon Version: 1.1.1b20241030
Python Version: 3.10.13
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Tue Sep 24 10:00:37 UTC 2024
CPU Count: 8
Pytorch Version: 2.3.1+cu121
CUDA Version: 12.1
Memory Avail: 27.03 GB / 30.95 GB (87.3%)
Disk Space Avail: 174.61 GB / 255.99 GB (68.2%)
===================================================
AutoMM starts to create your model. ✨✨✨
To track the learning progress, you can open a terminal and launch Tensorboard:
```shell
# Assume you have installed tensorboard
tensorboard --logdir /home/ci/autogluon/docs/tutorials/multimodal/semantic_matching/AutogluonModels/ag-20241030_013927
```
INFO: Seed set to 0
GPU Count: 1
GPU Count to be Used: 1
GPU 0 Name: Tesla T4
GPU 0 Memory: 0.57GB/15.0GB (Used/Total)
INFO: Using 16bit Automatic Mixed Precision (AMP)
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: HPU available: False, using: 0 HPUs
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:
| Name | Type | Params | Mode
------------------------------------------------------------------------
0 | query_model | CLIPForImageText | 151 M | train
1 | response_model | CLIPForImageText | 151 M | train
2 | validation_metric | CustomHitRate | 0 | train
3 | loss_func | MultiNegativesSoftmaxLoss | 0 | train
------------------------------------------------------------------------
151 M Trainable params
0 Non-trainable params
151 M Total params
605.109 Total estimated model params size (MB)
INFO: Time limit reached. Elapsed time is 0:03:00. Signaling Trainer to stop.
INFO: Epoch 0, global step 345: 'val_recall' reached 0.56111 (best 0.56111), saving model to '/home/ci/autogluon/docs/tutorials/multimodal/semantic_matching/AutogluonModels/ag-20241030_013927/epoch=0-step=345.ckpt' as top 3
Start to fuse 1 checkpoints via the greedy soup algorithm.
AutoMM has created your model. 🎉🎉🎉
To load the model, use the code below:
```python
from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor.load("/home/ci/autogluon/docs/tutorials/multimodal/semantic_matching/AutogluonModels/ag-20241030_013927")
```
If you are not satisfied with the model, try to increase the training time,
adjust the hyperparameters (https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/customization.html),
or post issues on GitHub (https://github.com/autogluon/autogluon/issues).
<autogluon.multimodal.predictor.MultiModalPredictor at 0x7fa678a53a00>
Evaluate the Finetuned Model on the Test Dataset¶
Now let’s evaluate the finetuned model. As before, we compute the recalls of text-to-image and image-to-text retrieval.
txt_to_img_scores = predictor.evaluate(
data=test_data_with_label,
query_data=test_text_data,
response_data=test_image_data,
label=test_label_col,
cutoffs=[1, 5, 10],
)
img_to_txt_scores = predictor.evaluate(
data=test_data_with_label,
query_data=test_image_data,
response_data=test_text_data,
label=test_label_col,
cutoffs=[1, 5, 10],
)
print(f"txt_to_img_scores: {txt_to_img_scores}")
print(f"img_to_txt_scores: {img_to_txt_scores}")
txt_to_img_scores: {'recall@1': 0.69768, 'recall@5': 0.90576, 'recall@10': 0.95128}
img_to_txt_scores: {'recall@1': 0.16865, 'recall@5': 0.6694, 'recall@10': 0.8152}
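We can observe clear improvements over the zero-shot predictor: text-to-image recall@1 rises from about 0.59 to about 0.70, and image-to-text recall@10 rises from about 0.72 to about 0.82. This suggests that finetuning CLIP on our own data can help achieve better performance, even with a time limit of only 180 seconds.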
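To make the comparison explicit, the sketch below subtracts the zero-shot scores from the finetuned scores. The numbers are copied from the two evaluation outputs above; the dictionary names are chosen for illustration only.

```python
# Scores copied from the zero-shot and finetuned evaluation outputs above.
zero_shot = {
    "txt_to_img": {"recall@1": 0.58964, "recall@5": 0.83533, "recall@10": 0.90156},
    "img_to_txt": {"recall@1": 0.15525, "recall@5": 0.571, "recall@10": 0.7176},
}
finetuned = {
    "txt_to_img": {"recall@1": 0.69768, "recall@5": 0.90576, "recall@10": 0.95128},
    "img_to_txt": {"recall@1": 0.16865, "recall@5": 0.6694, "recall@10": 0.8152},
}

# Absolute gain per retrieval direction and cutoff, rounded to 5 decimals.
gains = {
    direction: {
        metric: round(finetuned[direction][metric] - zero_shot[direction][metric], 5)
        for metric in zero_shot[direction]
    }
    for direction in zero_shot
}

for direction, per_metric in gains.items():
    print(direction, per_metric)
```

Image-to-text recall@1 moves the least, which is expected because it is already close to its 20% ceiling in the zero-shot setting.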
Predict Whether Image and Text Match¶
Whether finetuned or not, the predictor can predict whether image and text pairs match.
pred = predictor.predict(test_data.head(5))
print(pred)
0 1
1 1
2 1
3 1
4 1
dtype: int64
Predict Matching Probabilities¶
The predictor can also return the matching probabilities.
proba = predictor.predict_proba(test_data.head(5))
print(proba)
0 1
0 0.343417 0.656583
1 0.325477 0.674523
2 0.347010 0.652990
3 0.344534 0.655466
4 0.330271 0.669729
The second column is the probability of being a match.
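The relation between these probabilities and the labels printed earlier can be sketched as follows. This assumes that the predicted label is simply the more probable of the two classes, which is consistent with the outputs shown but is not stated by the tutorial.

```python
import numpy as np

# Probabilities copied from the output above: column 0 = no match, column 1 = match.
proba = np.array([
    [0.343417, 0.656583],
    [0.325477, 0.674523],
    [0.347010, 0.652990],
    [0.344534, 0.655466],
    [0.330271, 0.669729],
])

# Assumption: the label is the index of the larger probability in each row.
labels = proba.argmax(axis=1)
print(labels)
```

Under this assumption all five pairs are labeled 1, matching the predict output for the same rows.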
Extract Embeddings¶
Another common use case is to extract image and text embeddings.
image_embeddings = predictor.extract_embedding({image_col: test_image_data[image_col][:5].tolist()})
print(image_embeddings.shape)
(5, 512)
text_embeddings = predictor.extract_embedding({text_col: test_text_data[text_col][:5].tolist()})
print(text_embeddings.shape)
(5, 512)
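Because image and text embeddings live in the same 512-dimensional space, they can be compared directly. The sketch below computes a cosine-similarity matrix with NumPy; it uses random arrays as stand-ins for the real embeddings, so the similarity values themselves are meaningless, and the helper cosine_similarity_matrix is illustrative rather than part of AutoMM.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins with the same shape as the extracted embeddings, (5, 512).
image_emb = rng.normal(size=(5, 512))
text_emb = rng.normal(size=(5, 512))

def cosine_similarity_matrix(a, b):
    # Normalize each row to unit length, then take all pairwise dot products.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

# sim[i, j] is the similarity between text i and image j.
sim = cosine_similarity_matrix(text_emb, image_emb)
print(sim.shape)
```

With real embeddings, replacing the random arrays by the extracted ones gives a text-by-image similarity matrix whose row-wise maxima point to the best-matching image for each text.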
Semantic Search¶
We also provide an advanced utility function to conduct semantic search. First, given one or more texts, we can search for semantically similar images in an image database.
from autogluon.multimodal.utils import semantic_search
text_to_image_hits = semantic_search(
matcher=predictor,
query_data=test_text_data.iloc[[3]],
response_data=test_image_data,
top_k=5,
)
Let’s visualize the text query and top-1 image response.
test_text_data.iloc[[3]]
| | caption |
|---|---|
| 3 | A man in an orange hat starring at something |
pil_img = Image(filename=test_image_data[image_col][text_to_image_hits[0][0]['response_id']])
display(pil_img)

Similarly, given one or more images, we can retrieve texts with similar semantic meanings.
image_to_text_hits = semantic_search(
matcher=predictor,
query_data=test_image_data.iloc[[6]],
response_data=test_text_data,
top_k=5,
)
Let’s visualize the image query and one of the top retrieved texts. Note that the code below reads the hit at index 1 of the ranked list for the query.
pil_img = Image(filename=test_image_data[image_col][6])
display(pil_img)

test_text_data[text_col][image_to_text_hits[0][1]['response_id']]
'Several students waiting outside an igloo'
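The shape of the returned hits, one ranked list per query with a response_id in each entry, can be mimicked with a small top-k search over embeddings. The sketch below is a stand-in and not the implementation of semantic_search: the function toy_semantic_search, the score key, and the random embeddings are all assumptions made for illustration.

```python
import numpy as np

def toy_semantic_search(query_emb, response_emb, top_k=5):
    # Rank responses for each query by cosine similarity, highest first.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = response_emb / np.linalg.norm(response_emb, axis=1, keepdims=True)
    scores = q @ r.T
    hits = []
    for row in scores:
        top = np.argsort(-row)[:top_k]
        # 'response_id' indexes into the response pool; 'score' is an assumed key.
        hits.append([{"response_id": int(i), "score": float(row[i])} for i in top])
    return hits

rng = np.random.default_rng(0)
response_emb = rng.normal(size=(20, 8))

# Build one query as a slightly noisy copy of response 3 so it should rank first.
query_emb = response_emb[[3]] + 0.01 * rng.normal(size=(1, 8))

hits = toy_semantic_search(query_emb, response_emb, top_k=5)
print(hits[0][0]["response_id"])
```

Indexing follows the same pattern as in the tutorial: hits[q][rank]['response_id'] is the position of the retrieved item in the response dataframe, with rank 0 being the best match.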
Other Examples¶
You may go to AutoMM Examples to explore other examples about AutoMM.
Customization¶
To learn how to customize AutoMM, please refer to Customize AutoMM.