[toc]

『Deep Learning 7-Day Bootcamp · Final Project』

Unlock the PaddlePaddle high-level API for deep learning with no prior experience: seven days to master the most popular models and applications in CV and NLP.

  1. Course link

Link: https://aistudio.baidu.com/aistudio/course/introduce/6771

  2. Goals
  • Master the fundamentals of common deep learning models
  • Become proficient in a domestically developed open-source deep learning framework
  • Be able to complete deep learning tasks independently
  • Use what you have learned to add a touch of AI to the New Year festivities

Dataset:

Assigned dataset: cifar100, loaded through the high-level API.

You may write your own data augmentation and preprocessing; see the sketch below.
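For example, a custom preprocessing step can be any callable dropped into a paddle.vision.transforms.Compose chain. The RandomBrightness class below is a hypothetical illustration, not part of the assignment:

```python
import numpy as np

class RandomBrightness(object):
    """Hypothetical augmentation: random brightness jitter.

    Any callable can serve as a step in paddle.vision.transforms.Compose,
    so custom preprocessing needs no special base class.
    """
    def __init__(self, delta=0.1):
        self.delta = delta

    def __call__(self, img):
        a = np.asarray(img).astype('float32')       # accepts PIL.Image or ndarray
        a += np.random.uniform(-self.delta, self.delta) * 255
        return np.clip(a, 0, 255).astype('uint8')   # back to HWC uint8
```

Such a step would go into the Compose chain before ToTensor(), which expects HWC uint8 input.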

Model:

Any model may be used, and parameter initialization (e.g. uniform or normal) may be adjusted freely.

Model training:

All hyperparameters (e.g. epochs, batch_size) may be adjusted freely.

Scoring criteria

The final score is the accuracy value printed by model.evaluate (format shown below). It is computed simply by feeding eval_dataset into the evaluate interface; the Accuracy metric must be configured in model.prepare, and the dataset used for evaluation must not have been used for training.

{'loss': [6.4980035], 'acc': 0.8485721442885772}
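For reference, a minimal sketch of the scoring call (the resnet18 stand-in is hypothetical; any model wrapped in paddle.Model works the same way):

```python
import paddle
import paddle.vision.transforms as T

eval_dataset = paddle.vision.datasets.Cifar100(mode='test', transform=T.ToTensor())

network = paddle.vision.models.resnet18(num_classes=100)  # hypothetical stand-in
model = paddle.Model(network)
model.prepare(loss=paddle.nn.CrossEntropyLoss(),
              metrics=paddle.metric.Accuracy())  # the 'acc' key in the output comes from this
result = model.evaluate(eval_dataset, batch_size=128, verbose=1)
print(result)  # e.g. {'loss': [...], 'acc': ...}
```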

Import the required libraries

import paddle
import numpy as np

paddle.__version__
'2.0.0'
# Use the GPU
paddle.set_device('gpu')
CUDAPlace(0)
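On a machine without a GPU, a small fallback sketch keeps the notebook runnable:

```python
import paddle

# Pick the GPU when Paddle was built with CUDA support, otherwise fall back to CPU.
device = 'gpu' if paddle.is_compiled_with_cuda() else 'cpu'
paddle.set_device(device)
```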

② Data preparation

Data augmentation

import paddle.vision.transforms as T

# mean and std of the cifar100 dataset
CIFAR100_TRAIN_MEAN = (0.5070751592371323, 0.48654887331495095, 0.4409178433670343)
CIFAR100_TRAIN_STD = (0.2673342858792401, 0.2564384629170883, 0.27615047132568404)

train_transform = T.Compose([
    T.Resize((256, 256)),
    T.CenterCrop((224, 224)),
    T.RandomHorizontalFlip(0.5),      # random horizontal flip
    T.RandomRotation(degrees=15),     # rotation in (-degrees, +degrees)
    T.ToTensor(),                     # format conversion and scaling, HWC => CHW
    T.Normalize(mean=CIFAR100_TRAIN_MEAN, std=CIFAR100_TRAIN_STD)  # normalize the image
])

eval_transform = T.Compose([
    T.Resize(224),
    T.ToTensor(),                     # format conversion and scaling, HWC => CHW
    T.Normalize(mean=CIFAR100_TRAIN_MEAN, std=CIFAR100_TRAIN_STD)  # normalize the image
])

class paddle.vision.transforms.ToTensor

Converts a PIL.Image or numpy.ndarray of shape (H x W x C) to shape (C x H x W). To keep the original shape, set the data_format argument to 'HWC'.

In addition, if the input PIL.Image has one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1), or the input numpy.ndarray has dtype 'uint8', the values are rescaled from the range (0-255) to (0-1). In all other cases the input is left unchanged.
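A quick check of that behavior (a sketch; the random image is just a stand-in):

```python
import numpy as np
import paddle.vision.transforms as T

img = np.random.randint(0, 256, size=(32, 32, 3), dtype='uint8')  # HWC, uint8
t = T.ToTensor()(img)
print(t.shape)                            # [3, 32, 32]: channels moved to the front
print(t.min().numpy(), t.max().numpy())   # values rescaled into [0, 1]
```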

# Training dataset
train_dataset = paddle.vision.datasets.Cifar100(mode='train', transform=paddle.vision.transforms.ToTensor())

# Validation dataset
eval_dataset = paddle.vision.datasets.Cifar100(mode='test', transform=paddle.vision.transforms.ToTensor())

# Merge both splits into one tensor (DataLoader yields [images, labels] batches)
dataset = paddle.concat([d[0] for d in paddle.io.DataLoader(train_dataset)] +
                        [d[0] for d in paddle.io.DataLoader(eval_dataset)])
# Compute the per-channel mean and standard deviation of the data
print(f'mean:{dataset.mean(axis=[0, 2, 3]).numpy()} \n std:{dataset.std(axis=[0, 2, 3]).numpy()}')
mean:[0.5073715 0.4867007 0.441096 ]
std:[0.26750046 0.25658613 0.27630225]
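Concatenating all 60,000 images into one tensor costs roughly 700 MB (60000×3×32×32 float32). A memory-light alternative, as a sketch, accumulates per-channel sums instead:

```python
import numpy as np
import paddle
import paddle.vision.transforms as T

def channel_stats(datasets):
    s = np.zeros(3)    # per-channel sum of pixel values
    s2 = np.zeros(3)   # per-channel sum of squared pixel values
    n = 0              # pixels seen per channel
    for ds in datasets:
        for img, _ in ds:                  # img: CHW tensor scaled to [0, 1]
            a = img.numpy()
            s += a.sum(axis=(1, 2))
            s2 += (a ** 2).sum(axis=(1, 2))
            n += a.shape[1] * a.shape[2]
    mean = s / n
    std = np.sqrt(s2 / n - mean ** 2)      # Var[X] = E[X^2] - E[X]^2
    return mean, std

train_ds = paddle.vision.datasets.Cifar100(mode='train', transform=T.ToTensor())
test_ds = paddle.vision.datasets.Cifar100(mode='test', transform=T.ToTensor())
print(channel_stats([train_ds, test_ds]))
```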

Since the pretrained resnet101 model will be loaded, the $32\times 32$ CIFAR images are resized to $224\times 224$ here, keeping the feature-map sizes and receptive field consistent with pretraining.

import paddle.vision.transforms as T


# Reference stats:
# CIFAR100_TRAIN_MEAN = (0.5070751592371323, 0.48654887331495095, 0.4409178433670343)
# CIFAR100_TRAIN_STD = (0.2673342858792401, 0.2564384629170883, 0.27615047132568404)

# Stats computed above from the train + test splits
CIFAR100_MEAN = [0.5073715, 0.4867007, 0.441096]
CIFAR100_STD = [0.26750046, 0.25658613, 0.27630225]

# ImageNet stats, for reference:
# mean=[0.485, 0.456, 0.406]
# std=[0.229, 0.224, 0.225]

train_transform = T.Compose([
    T.Resize((256, 256)),
    T.CenterCrop((224, 224)),
    T.RandomHorizontalFlip(0.5),      # random horizontal flip
    T.RandomRotation(degrees=15),     # rotation in (-degrees, +degrees)
    T.ToTensor(),                     # format conversion and scaling, HWC => CHW
    T.Normalize(mean=CIFAR100_MEAN, std=CIFAR100_STD)  # normalize the image
])

eval_transform = T.Compose([
    T.Resize(224),
    T.ToTensor(),                     # format conversion and scaling, HWC => CHW
    T.Normalize(mean=CIFAR100_MEAN, std=CIFAR100_STD)  # normalize the image
])

# Training dataset
train_dataset = paddle.vision.datasets.Cifar100(mode='train', transform=train_transform)

# Validation dataset
eval_dataset = paddle.vision.datasets.Cifar100(mode='test', transform=eval_transform)
print(f'Train set size: {len(train_dataset)}, test set size: {len(eval_dataset)}')

print("train data shape:", train_dataset[0][0].shape)

print("eval data shape:", eval_dataset[0][0].shape)

# train_dataset[3][0]
Train set size: 50000, test set size: 10000
train data shape: [3, 224, 224]
eval data shape: [3, 224, 224]
import matplotlib.pyplot as plt

idx = np.random.randint(0, 50000, size=10)

for i in idx:
    img = train_dataset[i][0].numpy().transpose([1, 2, 0])  # CHW => HWC for imshow
    plt.imshow(img)
    plt.title(train_dataset[i][1])
    plt.show()
[WARNING 2021-02-11 13:53:09,315 image.py:664] Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).

(The same warning repeats once per image: after Normalize the pixel values lie outside [0, 1], so imshow clips them for display.)

[Sample augmented training images]

3.1 Model development

network = paddle.vision.models.resnet101(num_classes=100, pretrained=True)
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1263: UserWarning: Skip loading for fc.weight. fc.weight receives a shape [2048, 1000], but the expected shape is [2048, 100].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py:1263: UserWarning: Skip loading for fc.bias. fc.bias receives a shape [1000], but the expected shape is [100].
  warnings.warn(("Skip loading for {}. ".format(key) + str(err)))

These warnings are expected: the ImageNet-pretrained head has 1000 outputs, so the final fc layer is re-initialized for the 100 CIFAR classes while all other weights load normally.
3.2 Model visualization

model = paddle.Model(network)

model.summary((-1, 3, 224, 224))
-------------------------------------------------------------------------------
Layer (type) Input Shape Output Shape Param #
===============================================================================
Conv2D-105 [[1, 3, 224, 224]] [1, 64, 112, 112] 9,408
BatchNorm2D-105 [[1, 64, 112, 112]] [1, 64, 112, 112] 256
ReLU-35 [[1, 64, 112, 112]] [1, 64, 112, 112] 0
MaxPool2D-2 [[1, 64, 112, 112]] [1, 64, 56, 56] 0
Conv2D-107 [[1, 64, 56, 56]] [1, 64, 56, 56] 4,096
BatchNorm2D-107 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
ReLU-36 [[1, 256, 56, 56]] [1, 256, 56, 56] 0
Conv2D-108 [[1, 64, 56, 56]] [1, 64, 56, 56] 36,864
BatchNorm2D-108 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
Conv2D-109 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,384
BatchNorm2D-109 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024
Conv2D-106 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,384
BatchNorm2D-106 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024
BottleneckBlock-34 [[1, 64, 56, 56]] [1, 256, 56, 56] 0
Conv2D-110 [[1, 256, 56, 56]] [1, 64, 56, 56] 16,384
BatchNorm2D-110 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
ReLU-37 [[1, 256, 56, 56]] [1, 256, 56, 56] 0
Conv2D-111 [[1, 64, 56, 56]] [1, 64, 56, 56] 36,864
BatchNorm2D-111 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
Conv2D-112 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,384
BatchNorm2D-112 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024
BottleneckBlock-35 [[1, 256, 56, 56]] [1, 256, 56, 56] 0
Conv2D-113 [[1, 256, 56, 56]] [1, 64, 56, 56] 16,384
BatchNorm2D-113 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
ReLU-38 [[1, 256, 56, 56]] [1, 256, 56, 56] 0
Conv2D-114 [[1, 64, 56, 56]] [1, 64, 56, 56] 36,864
BatchNorm2D-114 [[1, 64, 56, 56]] [1, 64, 56, 56] 256
Conv2D-115 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,384
BatchNorm2D-115 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024
BottleneckBlock-36 [[1, 256, 56, 56]] [1, 256, 56, 56] 0
Conv2D-117 [[1, 256, 56, 56]] [1, 128, 56, 56] 32,768
BatchNorm2D-117 [[1, 128, 56, 56]] [1, 128, 56, 56] 512
ReLU-39 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-118 [[1, 128, 56, 56]] [1, 128, 28, 28] 147,456
BatchNorm2D-118 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
Conv2D-119 [[1, 128, 28, 28]] [1, 512, 28, 28] 65,536
BatchNorm2D-119 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048
Conv2D-116 [[1, 256, 56, 56]] [1, 512, 28, 28] 131,072
BatchNorm2D-116 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048
BottleneckBlock-37 [[1, 256, 56, 56]] [1, 512, 28, 28] 0
Conv2D-120 [[1, 512, 28, 28]] [1, 128, 28, 28] 65,536
BatchNorm2D-120 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
ReLU-40 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-121 [[1, 128, 28, 28]] [1, 128, 28, 28] 147,456
BatchNorm2D-121 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
Conv2D-122 [[1, 128, 28, 28]] [1, 512, 28, 28] 65,536
BatchNorm2D-122 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048
BottleneckBlock-38 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-123 [[1, 512, 28, 28]] [1, 128, 28, 28] 65,536
BatchNorm2D-123 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
ReLU-41 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-124 [[1, 128, 28, 28]] [1, 128, 28, 28] 147,456
BatchNorm2D-124 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
Conv2D-125 [[1, 128, 28, 28]] [1, 512, 28, 28] 65,536
BatchNorm2D-125 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048
BottleneckBlock-39 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-126 [[1, 512, 28, 28]] [1, 128, 28, 28] 65,536
BatchNorm2D-126 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
ReLU-42 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-127 [[1, 128, 28, 28]] [1, 128, 28, 28] 147,456
BatchNorm2D-127 [[1, 128, 28, 28]] [1, 128, 28, 28] 512
Conv2D-128 [[1, 128, 28, 28]] [1, 512, 28, 28] 65,536
BatchNorm2D-128 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048
BottleneckBlock-40 [[1, 512, 28, 28]] [1, 512, 28, 28] 0
Conv2D-130 [[1, 512, 28, 28]] [1, 256, 28, 28] 131,072
BatchNorm2D-130 [[1, 256, 28, 28]] [1, 256, 28, 28] 1,024
ReLU-43 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-131 [[1, 256, 28, 28]] [1, 256, 14, 14] 589,824
BatchNorm2D-131 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-132 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-132 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
Conv2D-129 [[1, 512, 28, 28]] [1, 1024, 14, 14] 524,288
BatchNorm2D-129 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-41 [[1, 512, 28, 28]] [1, 1024, 14, 14] 0
Conv2D-133 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-133 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-44 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-134 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-134 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-135 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-135 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-42 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-136 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-136 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-45 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-137 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-137 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-138 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-138 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-43 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-139 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-139 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-46 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-140 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-140 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-141 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-141 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-44 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-142 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-142 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-47 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-143 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-143 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-144 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-144 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-45 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-145 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-145 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-48 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-146 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-146 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-147 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-147 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-46 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-148 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-148 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-49 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-149 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-149 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-150 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-150 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-47 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-151 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-151 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-50 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-152 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-152 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-153 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-153 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-48 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-154 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-154 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-51 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-155 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-155 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-156 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-156 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-49 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-157 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-157 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-52 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-158 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-158 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-159 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-159 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-50 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-160 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-160 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-53 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-161 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-161 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-162 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-162 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-51 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-163 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-163 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-54 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-164 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-164 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-165 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-165 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-52 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-166 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-166 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-55 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-167 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-167 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-168 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-168 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-53 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-169 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-169 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-56 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-170 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-170 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-171 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-171 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-54 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-172 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-172 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-57 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-173 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-173 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-174 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-174 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-55 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-175 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-175 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-58 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-176 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-176 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-177 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-177 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-56 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-178 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-178 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-59 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-179 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-179 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-180 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-180 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-57 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-181 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-181 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-60 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-182 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-182 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-183 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-183 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-58 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-184 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-184 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-61 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-185 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-185 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-186 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-186 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-59 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-187 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-187 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-62 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-188 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-188 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-189 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-189 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-60 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-190 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-190 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-63 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-191 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-191 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-192 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-192 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-61 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-193 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-193 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-64 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-194 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-194 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-195 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-195 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-62 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-196 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,144
BatchNorm2D-196 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
ReLU-65 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-197 [[1, 256, 14, 14]] [1, 256, 14, 14] 589,824
BatchNorm2D-197 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024
Conv2D-198 [[1, 256, 14, 14]] [1, 1024, 14, 14] 262,144
BatchNorm2D-198 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096
BottleneckBlock-63 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0
Conv2D-200 [[1, 1024, 14, 14]] [1, 512, 14, 14] 524,288
BatchNorm2D-200 [[1, 512, 14, 14]] [1, 512, 14, 14] 2,048
ReLU-66 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0
Conv2D-201 [[1, 512, 14, 14]] [1, 512, 7, 7] 2,359,296
BatchNorm2D-201 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048
Conv2D-202 [[1, 512, 7, 7]] [1, 2048, 7, 7] 1,048,576
BatchNorm2D-202 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192
Conv2D-199 [[1, 1024, 14, 14]] [1, 2048, 7, 7] 2,097,152
BatchNorm2D-199 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192
BottleneckBlock-64 [[1, 1024, 14, 14]] [1, 2048, 7, 7] 0
Conv2D-203 [[1, 2048, 7, 7]] [1, 512, 7, 7] 1,048,576
BatchNorm2D-203 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048
ReLU-67 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0
Conv2D-204 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,359,296
BatchNorm2D-204 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048
Conv2D-205 [[1, 512, 7, 7]] [1, 2048, 7, 7] 1,048,576
BatchNorm2D-205 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192
BottleneckBlock-65 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0
Conv2D-206 [[1, 2048, 7, 7]] [1, 512, 7, 7] 1,048,576
BatchNorm2D-206 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048
ReLU-68 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0
Conv2D-207 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,359,296
BatchNorm2D-207 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048
Conv2D-208 [[1, 512, 7, 7]] [1, 2048, 7, 7] 1,048,576
BatchNorm2D-208 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192
BottleneckBlock-66 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0
AdaptiveAvgPool2D-2 [[1, 2048, 7, 7]] [1, 2048, 1, 1] 0
Linear-2 [[1, 2048]] [1, 100] 204,900
===============================================================================
Total params: 42,810,404
Trainable params: 42,599,716
Non-trainable params: 210,688
-------------------------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 391.63
Params size (MB): 163.31
Estimated Total Size (MB): 555.52
-------------------------------------------------------------------------------






{'total_params': 42810404, 'trainable_params': 42599716}

④ Model training and tuning

class paddle.optimizer.lr.PiecewiseDecay(boundaries, values, last_epoch=-1, verbose=False)

This API provides a piecewise-constant learning-rate schedule: values must have exactly one more entry than boundaries, and the rate drops to the next value at each boundary.

class paddle.optimizer.lr.LinearWarmup(learning_rate, warmup_steps, start_lr, end_lr, last_epoch=-1, verbose=False)

This API provides linear learning-rate warm-up: before the wrapped schedule takes over, the learning rate is first increased gradually from start_lr to end_lr.
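A small sketch of how the two compose (illustrative numbers; each loop iteration stands in for one scheduler step):

```python
import paddle

base_lr = 5e-4
# Piecewise schedule: len(values) must be len(boundaries) + 1
decay = paddle.optimizer.lr.PiecewiseDecay(boundaries=[5, 10],
                                           values=[base_lr, base_lr * 0.2, base_lr * 0.1])
# Linear warm-up from base_lr/5 to base_lr over the first 3 steps,
# then hand over to the piecewise schedule
sched = paddle.optimizer.lr.LinearWarmup(learning_rate=decay, warmup_steps=3,
                                         start_lr=base_lr / 5., end_lr=base_lr)

for i in range(12):
    print(i, sched.get_lr())
    sched.step()
```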

class paddle.callbacks.EarlyStopping(monitor='loss', mode='auto', patience=0, verbose=1, min_delta=0, baseline=None, save_best_model=True)

During evaluation, EarlyStopping stops training early once the model has stopped improving.

  • monitor (str, optional) - The quantity to monitor as the stopping criterion. Default: 'loss'.

  • mode (str, optional) - One of 'auto', 'min', or 'max'. In 'min' mode, training stops when the monitored quantity stops decreasing; in 'max' mode, when it stops increasing; in 'auto' mode, the mode is inferred from monitor: if monitor contains 'acc' it is treated as 'max', otherwise as 'min'. Default: 'auto'.

  • patience (int, optional) - Number of epochs with no improvement after which training stops. Default: 0.

  • verbose (int, optional) - 0 or 1. 1 prints a message when training is stopped early; 0 does not. Default: 1.

  • min_delta (int|float, optional) - Minimum change in the monitored quantity that counts as an improvement; changes smaller than min_delta are treated as no improvement. Default: 0.

  • baseline (int|float, optional) - Baseline value for the monitored quantity. Training stops if the model shows no improvement over the baseline after patience epochs. None means no baseline. Default: None.

  • save_best_model (bool, optional) - Whether to save the best model (best value of the monitored quantity). The file is saved under the save_dir passed to fit, with prefix best_model. Default: True.

def make_optimizer(parameters=None, momentum=0.9, weight_decay=5e-4, boundaries=None, values=None):

    learning_rate = paddle.optimizer.lr.PiecewiseDecay(
        boundaries=boundaries,
        values=values,
        verbose=False)

    # learning_rate = paddle.optimizer.lr.LinearWarmup(
    #     learning_rate=learning_rate,
    #     warmup_steps=warmup_steps,
    #     start_lr=base_lr / 5.,
    #     end_lr=base_lr,
    #     verbose=False)

    optimizer = paddle.optimizer.Momentum(
        learning_rate=learning_rate,
        weight_decay=weight_decay,
        momentum=momentum,
        parameters=parameters)

    # optimizer = paddle.optimizer.AdamW(
    #     learning_rate=learning_rate,
    #     weight_decay=weight_decay,
    #     parameters=parameters)

    return optimizer


base_lr = 5e-4
boundaries = [5, 10]

optimizer = make_optimizer(boundaries=boundaries, values=[base_lr, base_lr*0.2, base_lr*0.1], parameters=model.parameters())

model.prepare(
    # optimizer=paddle.optimizer.Adam(learning_rate=5e-4, weight_decay=paddle.regularizer.L2Decay(5e-4), parameters=model.parameters()),
    optimizer=optimizer,
    loss=paddle.nn.CrossEntropyLoss(),
    metrics=paddle.metric.Accuracy(topk=(1, 5))
)

# callbacks
visualdl = paddle.callbacks.VisualDL('./visualdl/resnet101')
earlystop = paddle.callbacks.EarlyStopping(  # stop when acc stops improving
    'acc',            # note: with topk=(1, 5) the metrics are actually named
    mode='max',       # acc_top1/acc_top5, hence the UserWarning in the log below
    patience=5,
    verbose=1,
    min_delta=0,
    baseline=None,
    save_best_model=True)

model.fit(
    train_dataset,
    eval_dataset,
    epochs=20,
    batch_size=128,
    save_freq=2,
    save_dir='checkpoint/resnet101',
    callbacks=[visualdl, earlystop],
    verbose=1
)
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/20
step 391/391 [==============================] - loss: 1.6645 - acc_top1: 0.5995 - acc_top5: 0.8853 - 889ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.5161 - acc_top1: 0.6217 - acc_top5: 0.8958 - 421ms/step
Eval samples: 10000
Epoch 2/20


/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/hapi/callbacks.py:808: UserWarning: Monitor of EarlyStopping should be loss or metric name.
'Monitor of EarlyStopping should be loss or metric name.')


step 391/391 [==============================] - loss: 1.6678 - acc_top1: 0.6264 - acc_top5: 0.8994 - 891ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.3930 - acc_top1: 0.6398 - acc_top5: 0.9048 - 421ms/step
Eval samples: 10000
Epoch 3/20
step 391/391 [==============================] - loss: 1.4620 - acc_top1: 0.6483 - acc_top5: 0.9111 - 890ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/2
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.2873 - acc_top1: 0.6574 - acc_top5: 0.9170 - 419ms/step
Eval samples: 10000
Epoch 4/20
step 391/391 [==============================] - loss: 1.2918 - acc_top1: 0.6655 - acc_top5: 0.9203 - 888ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.2550 - acc_top1: 0.6713 - acc_top5: 0.9235 - 422ms/step
Eval samples: 10000
Epoch 5/20
step 391/391 [==============================] - loss: 1.2747 - acc_top1: 0.6815 - acc_top5: 0.9260 - 890ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/4
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.1627 - acc_top1: 0.6817 - acc_top5: 0.9289 - 432ms/step
Eval samples: 10000
Epoch 6/20
step 391/391 [==============================] - loss: 1.2274 - acc_top1: 0.6934 - acc_top5: 0.9329 - 896ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.1079 - acc_top1: 0.6979 - acc_top5: 0.9341 - 429ms/step
Eval samples: 10000
Epoch 7/20
step 391/391 [==============================] - loss: 1.0835 - acc_top1: 0.7030 - acc_top5: 0.9362 - 895ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/6
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.0728 - acc_top1: 0.7092 - acc_top5: 0.9389 - 427ms/step
Eval samples: 10000
Epoch 8/20
step 391/391 [==============================] - loss: 1.0596 - acc_top1: 0.7117 - acc_top5: 0.9420 - 895ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.0304 - acc_top1: 0.7185 - acc_top5: 0.9434 - 428ms/step
Eval samples: 10000
Epoch 9/20
step 391/391 [==============================] - loss: 1.0755 - acc_top1: 0.7246 - acc_top5: 0.9442 - 895ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/8
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.0199 - acc_top1: 0.7282 - acc_top5: 0.9452 - 424ms/step
Eval samples: 10000
Epoch 10/20
step 391/391 [==============================] - loss: 1.1637 - acc_top1: 0.7312 - acc_top5: 0.9478 - 893ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 1.0062 - acc_top1: 0.7315 - acc_top5: 0.9464 - 428ms/step
Eval samples: 10000
Epoch 11/20
step 391/391 [==============================] - loss: 1.0794 - acc_top1: 0.7399 - acc_top5: 0.9518 - 894ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/10
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9966 - acc_top1: 0.7390 - acc_top5: 0.9493 - 426ms/step
Eval samples: 10000
Epoch 12/20
step 391/391 [==============================] - loss: 0.8478 - acc_top1: 0.7469 - acc_top5: 0.9539 - 894ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9767 - acc_top1: 0.7424 - acc_top5: 0.9512 - 425ms/step
Eval samples: 10000
Epoch 13/20
step 391/391 [==============================] - loss: 0.9958 - acc_top1: 0.7526 - acc_top5: 0.9555 - 890ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/12
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9697 - acc_top1: 0.7490 - acc_top5: 0.9539 - 426ms/step
Eval samples: 10000
Epoch 14/20
step 391/391 [==============================] - loss: 0.7780 - acc_top1: 0.7584 - acc_top5: 0.9579 - 894ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9391 - acc_top1: 0.7539 - acc_top5: 0.9564 - 423ms/step
Eval samples: 10000
Epoch 15/20
step 391/391 [==============================] - loss: 1.0491 - acc_top1: 0.7658 - acc_top5: 0.9614 - 891ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9187 - acc_top1: 0.7583 - acc_top5: 0.9571 - 420ms/step
Eval samples: 10000
Epoch 16/20
step 391/391 [==============================] - loss: 0.8458 - acc_top1: 0.7696 - acc_top5: 0.9617 - 890ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9076 - acc_top1: 0.7623 - acc_top5: 0.9589 - 422ms/step
Eval samples: 10000
Epoch 17/20
step 391/391 [==============================] - loss: 1.0365 - acc_top1: 0.7758 - acc_top5: 0.9628 - 892ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/16
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8588 - acc_top1: 0.7638 - acc_top5: 0.9598 - 424ms/step
Eval samples: 10000
Epoch 18/20
step 391/391 [==============================] - loss: 0.9865 - acc_top1: 0.7814 - acc_top5: 0.9650 - 890ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8834 - acc_top1: 0.7701 - acc_top5: 0.9618 - 429ms/step
Eval samples: 10000
Epoch 19/20
step 391/391 [==============================] - loss: 0.6649 - acc_top1: 0.7834 - acc_top5: 0.9670 - 893ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/18
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9119 - acc_top1: 0.7721 - acc_top5: 0.9620 - 427ms/step
Eval samples: 10000
Epoch 20/20
step 391/391 [==============================] - loss: 0.6904 - acc_top1: 0.7891 - acc_top5: 0.9680 - 898ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8655 - acc_top1: 0.7740 - acc_top5: 0.9628 - 430ms/step
Eval samples: 10000
save checkpoint at /home/aistudio/checkpoint/resnet101/final

VisualDL

[VisualDL curves: resize_epoch20_lr5e-4]

The loss is still decreasing, so we keep training.

# Load the epoch-14 checkpoint and fine-tune from it
model.load('./checkpoint/resnet101/14')


def make_optimizer(parameters=None, momentum=0.9, weight_decay=5e-4, boundaries=None, values=None):

    learning_rate = paddle.optimizer.lr.PiecewiseDecay(
        boundaries=boundaries,
        values=values,
        verbose=False)

    # learning_rate = paddle.optimizer.lr.LinearWarmup(
    #     learning_rate=learning_rate,
    #     warmup_steps=warmup_steps,
    #     start_lr=base_lr / 5.,
    #     end_lr=base_lr,
    #     verbose=False)

    # optimizer = paddle.optimizer.Momentum(
    #     learning_rate=learning_rate,
    #     weight_decay=weight_decay,
    #     momentum=momentum,
    #     parameters=parameters)

    optimizer = paddle.optimizer.Adam(
        learning_rate=learning_rate,
        weight_decay=weight_decay,
        parameters=parameters)

    return optimizer

base_lr = 5e-5
boundaries = [5]

optimizer = make_optimizer(boundaries=boundaries, values=[base_lr, base_lr*0.2], parameters=model.parameters())

model.prepare(
    optimizer=optimizer,
    loss=paddle.nn.CrossEntropyLoss(),
    metrics=paddle.metric.Accuracy(topk=(1, 5))
)

# callbacks
visualdl = paddle.callbacks.VisualDL('./visualdl/resnet101/14')
earlystop = paddle.callbacks.EarlyStopping(
    # stop when acc stops improving (defined here but not passed to fit below)
    'acc',
    mode='max',
    patience=4,
    verbose=1,
    min_delta=0,
    baseline=None,
    save_best_model=True)

model.fit(
    train_dataset,
    eval_dataset,
    epochs=10,
    batch_size=128,
    save_freq=2,
    save_dir='checkpoint/resnet101/14',
    callbacks=[visualdl],
    verbose=1
)
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/10
step 391/391 [==============================] - loss: 0.7212 - acc_top1: 0.7760 - acc_top5: 0.9648 - 892ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.9302 - acc_top1: 0.7996 - acc_top5: 0.9680 - 421ms/step
Eval samples: 10000
Epoch 2/10
step 391/391 [==============================] - loss: 0.5295 - acc_top1: 0.8301 - acc_top5: 0.9776 - 893ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8164 - acc_top1: 0.8103 - acc_top5: 0.9732 - 416ms/step
Eval samples: 10000
Epoch 3/10
step 391/391 [==============================] - loss: 0.4092 - acc_top1: 0.8622 - acc_top5: 0.9853 - 895ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14/2
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8366 - acc_top1: 0.8297 - acc_top5: 0.9747 - 416ms/step
Eval samples: 10000
Epoch 4/10
step 391/391 [==============================] - loss: 0.3479 - acc_top1: 0.8860 - acc_top5: 0.9896 - 899ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.7397 - acc_top1: 0.8325 - acc_top5: 0.9757 - 428ms/step
Eval samples: 10000
Epoch 5/10
step 391/391 [==============================] - loss: 0.3405 - acc_top1: 0.9086 - acc_top5: 0.9925 - 906ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14/4
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8492 - acc_top1: 0.8373 - acc_top5: 0.9780 - 430ms/step
Eval samples: 10000
Epoch 6/10
step 391/391 [==============================] - loss: 0.1852 - acc_top1: 0.9242 - acc_top5: 0.9948 - 902ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.7244 - acc_top1: 0.8436 - acc_top5: 0.9761 - 422ms/step
Eval samples: 10000
Epoch 7/10
step 391/391 [==============================] - loss: 0.3084 - acc_top1: 0.9387 - acc_top5: 0.9969 - 895ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14/6
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8724 - acc_top1: 0.8458 - acc_top5: 0.9767 - 419ms/step
Eval samples: 10000
Epoch 8/10
step 391/391 [==============================] - loss: 0.1378 - acc_top1: 0.9529 - acc_top5: 0.9979 - 893ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.8844 - acc_top1: 0.8443 - acc_top5: 0.9765 - 419ms/step
Eval samples: 10000
Epoch 9/10
step 391/391 [==============================] - loss: 0.1667 - acc_top1: 0.9623 - acc_top5: 0.9981 - 894ms/step
save checkpoint at /home/aistudio/checkpoint/resnet101/14/8
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.7474 - acc_top1: 0.8490 - acc_top5: 0.9775 - 416ms/step
Eval samples: 10000
Epoch 10/10
step 391/391 [==============================] - loss: 0.1841 - acc_top1: 0.9697 - acc_top5: 0.9989 - 895ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.7419 - acc_top1: 0.8471 - acc_top5: 0.9758 - 428ms/step
Eval samples: 10000
save checkpoint at /home/aistudio/checkpoint/resnet101/14/final

VisualDL

[VisualDL curves for the fine-tuning run]

model.save('./finetuning/resnet101/model', training=True)
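To reuse the saved weights later, a sketch (the path matches the save call above; eval_dataset is the test split defined earlier):

```python
# Rebuild the architecture, reload the fine-tuned weights, and re-score.
network = paddle.vision.models.resnet101(num_classes=100)
model = paddle.Model(network)
model.prepare(loss=paddle.nn.CrossEntropyLoss(),
              metrics=paddle.metric.Accuracy(topk=(1, 5)))
model.load('./finetuning/resnet101/model')
result = model.evaluate(eval_dataset, batch_size=128, verbose=1)
print(result)
```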

Scoring output

Eval samples: 10000
{'loss': [1.4640276], 'acc_top1': 0.6361581096849475, 'acc_top5': 0.8786464410735122}
result = model.evaluate(eval_dataset, batch_size=128, verbose=1)

print(result)
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 79/79 [==============================] - loss: 0.7419 - acc_top1: 0.8468 - acc_top5: 0.9757 - 417ms/step
Eval samples: 10000
{'loss': [0.74186254], 'acc_top1': 0.8467935528120714, 'acc_top5': 0.9757373113854595}

Reference results for networks on CIFAR-100

https://paperswithcode.com/sota/image-classification-on-cifar-100

Reference training results for custom models:

| dataset | network | params | top1 err | top5 err | epoch (lr=0.1) | epoch (lr=0.02) | epoch (lr=0.004) | epoch (lr=0.0008) | total epoch |
|---|---|---|---|---|---|---|---|---|---|
| cifar100 | mobilenet | 3.3M | 34.02 | 10.56 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | mobilenetv2 | 2.36M | 31.92 | 09.02 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | squeezenet | 0.78M | 30.59 | 8.36 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | shufflenet | 1.0M | 29.94 | 8.35 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | shufflenetv2 | 1.3M | 30.49 | 8.49 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | vgg11_bn | 28.5M | 31.36 | 11.85 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | vgg13_bn | 28.7M | 28.00 | 9.71 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | vgg16_bn | 34.0M | 27.07 | 8.84 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | vgg19_bn | 39.0M | 27.77 | 8.84 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnet18 | 11.2M | 24.39 | 6.95 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnet34 | 21.3M | 23.24 | 6.63 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnet50 | 23.7M | 22.61 | 6.04 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnet101 | 42.7M | 22.22 | 5.61 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnet152 | 58.3M | 22.31 | 5.81 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | preactresnet18 | 11.3M | 27.08 | 8.53 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | preactresnet34 | 21.5M | 24.79 | 7.68 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | preactresnet50 | 23.9M | 25.73 | 8.15 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | preactresnet101 | 42.9M | 24.84 | 7.83 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | preactresnet152 | 58.6M | 22.71 | 6.62 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnext50 | 14.8M | 22.23 | 6.00 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnext101 | 25.3M | 22.22 | 5.99 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | resnext152 | 33.3M | 22.40 | 5.58 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | attention59 | 55.7M | 33.75 | 12.90 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | attention92 | 102.5M | 36.52 | 11.47 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | densenet121 | 7.0M | 22.99 | 6.45 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | densenet161 | 26M | 21.56 | 6.04 | 60 | 60 | 60 | 40 | 200 |
| cifar100 | densenet201 | 18M | 21.46 | 5.9 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | googlenet | 6.2M | 21.97 | 5.94 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | inceptionv3 | 22.3M | 22.81 | 6.39 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | inceptionv4 | 41.3M | 24.14 | 6.90 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | inceptionresnetv2 | 65.4M | 27.51 | 9.11 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | xception | 21.0M | 25.07 | 7.32 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | seresnet18 | 11.4M | 23.56 | 6.68 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | seresnet34 | 21.6M | 22.07 | 6.12 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | seresnet50 | 26.5M | 21.42 | 5.58 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | seresnet101 | 47.7M | 20.98 | 5.41 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | seresnet152 | 66.2M | 20.66 | 5.19 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | nasnet | 5.2M | 22.71 | 5.91 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | wideresnet-40-10 | 55.9M | 21.25 | 5.77 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | stochasticdepth18 | 11.22M | 31.40 | 8.84 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | stochasticdepth34 | 21.36M | 27.72 | 7.32 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | stochasticdepth50 | 23.71M | 23.35 | 5.76 | 60 | 60 | 40 | 40 | 200 |
| cifar100 | stochasticdepth101 | 42.69M | 21.28 | 5.39 | 60 | 60 | 40 | 40 | 200 |