The previous two sections covered the classic AlexNet and VGG networks. Building on AlexNet, VGG tried many ways of deepening the network and arrived at VGG-16 and VGG-19, with fairly good results. Later, however, people found that once a network reaches a certain depth, making it deeper still runs into two problems. First, vanishing gradients: the signal has to travel too far, and activations can land in the saturated region where the gradient is essentially 0, so it vanishes; this can largely be addressed with normalized initialization and intermediate normalization layers. Second, accuracy degrades rapidly as depth grows, and this degradation is not caused by overfitting. The authors propose a network structure that reuses the input to ease optimization, and the results are surprisingly good.
Overall, the network is divided into six parts: the first is a convolution layer followed by a pooling layer, the last is a fully connected (FC) layer, and each of the four stages in between is made up of several residual blocks. Each residual block in turn consists of two or three convolution layers plus a shortcut connection (a shortcut that adds the block's original input x back to its output). Stacking these blocks yields a very deep convolutional neural network.
The weight layers inside a block usually consist of two or three convolutions, with batch normalization and ReLU applied between them (Kaiming He's follow-up paper discusses which ordering works best; see the blog post by Binbin Xu). After the last convolution, x is added back and a final ReLU produces the block's output. To restate the structure: the middle of the network has four stages, each stage consists of several residual blocks, and each block consists of several convolution layers plus the added input x. The identity shortcut works well here; as Binbin Xu notes, it has several advantages: it performs well and it adds no extra parameters.
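To make that concrete, here is a minimal sketch of a basic two-convolution residual block written against the TensorFlow 1.x layers API. The function name basic_residual_block and its arguments are hypothetical and are not taken from either repository quoted below; the sketch also assumes the input already has the target number of channels, so the identity shortcut needs no projection.

import tensorflow as tf

def basic_residual_block(x, filters, is_training):
    # Hypothetical sketch of a basic residual block: two 3x3 convs plus an identity shortcut.
    shortcut = x                                      # identity shortcut, adds no parameters
    # first weight layer: conv -> batch norm -> ReLU
    y = tf.layers.conv2d(x, filters, 3, padding='same', use_bias=False)
    y = tf.layers.batch_normalization(y, training=is_training)
    y = tf.nn.relu(y)
    # second weight layer: conv -> batch norm, no ReLU yet
    y = tf.layers.conv2d(y, filters, 3, padding='same', use_bias=False)
    y = tf.layers.batch_normalization(y, training=is_training)
    # add the input back, then apply the final ReLU
    return tf.nn.relu(y + shortcut)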
The rightmost column of the figure in the paper is a 34-layer ResNet: a convolution plus pooling at the top, then four stages with 3, 4, 6, and 3 blocks respectively. Each block contains two convolutions, i.e. (3+4+6+3=16)×2 = 32 layers; together with the initial convolution and the final FC layer (pooling layers are not counted) this gives the 34-layer structure.
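As a quick sanity check on that counting (plain arithmetic, not code from the paper or the repositories below):

blocks = [3, 4, 6, 3]      # residual blocks per stage
convs_per_block = 2        # a basic block holds two 3x3 convolutions
total = 1 + convs_per_block * sum(blocks) + 1   # initial conv + block convs + final fc
print(total)               # 34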
For CIFAR-10 the authors found that a more complex model might not even help, so they built a simpler one: the original 32×32 image is padded by 4 pixels on each side and a 32×32 crop is randomly sampled from the padded image. The network then has three stages of convolutions with feature maps of size 32×32×16, 16×16×32, and 8×8×64; each stage contains n blocks, each block has two convolution layers, and with the initial convolution and the final FC layer the total is 6n+2 layers. The 110-layer version reportedly works best, while versions with 1000+ layers actually do worse, possibly due to overfitting.
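The pad-then-random-crop augmentation mentioned above could be sketched in TensorFlow 1.x roughly as follows. This is only a sketch under assumptions: the function name augment_cifar_image is made up, and the horizontal flip comes from the paper's CIFAR setup rather than from the text above.

import tensorflow as tf

def augment_cifar_image(image):
    # Hypothetical sketch: pad a 32x32x3 image by 4 pixels per side, then crop back to 32x32.
    padded = tf.pad(image, [[4, 4], [4, 4], [0, 0]])    # zero-pad height and width by 4
    cropped = tf.random_crop(padded, [32, 32, 3])       # random 32x32 crop
    flipped = tf.image.random_flip_left_right(cropped)  # random horizontal flip
    return flipped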
Below is partial implementation code, from ry/tensorflow-resnet:
# This is what they use for CIFAR-10 and 100.
# See Section 4.2 in http://arxiv.org/abs/1512.03385
def inference_small(x,
                    is_training,
                    num_blocks=3,  # 6n+2 total weight layers will be used.
                    use_bias=False,  # defaults to using batch norm
                    num_classes=10):
    c = Config()
    c['is_training'] = tf.convert_to_tensor(is_training,
                                            dtype='bool',
                                            name='is_training')
    c['use_bias'] = use_bias
    c['fc_units_out'] = num_classes
    c['num_blocks'] = num_blocks
    c['num_classes'] = num_classes
    inference_small_config(x, c)

def inference_small_config(x, c):
    c['bottleneck'] = False
    c['ksize'] = 3
    c['stride'] = 1
    with tf.variable_scope('scale1'):
        c['conv_filters_out'] = 16
        c['block_filters_internal'] = 16
        c['stack_stride'] = 1
        x = conv(x, c)
        x = bn(x, c)
        x = activation(x)
        x = stack(x, c)

    with tf.variable_scope('scale2'):
        c['block_filters_internal'] = 32
        c['stack_stride'] = 2
        x = stack(x, c)

    with tf.variable_scope('scale3'):
        c['block_filters_internal'] = 64
        c['stack_stride'] = 2
        x = stack(x, c)

    # post-net
    x = tf.reduce_mean(x, reduction_indices=[1, 2], name="avg_pool")

    if c['num_classes'] != None:
        with tf.variable_scope('fc'):
            x = fc(x, c)

    return x

Another partial implementation, adapted from wenxinxu/resnet-in-tensorflow:
def inference(input_tensor_batch, n, reuse):
    '''
    The main function that defines the ResNet. total layers = 1 + 2n + 2n + 2n + 1 = 6n + 2
    :param input_tensor_batch: 4D tensor
    :param n: num_residual_blocks
    :param reuse: To build train graph, reuse=False. To build validation graph and share
    weights with train graph, reuse=True
    :return: last layer in the network. Not softmax-ed
    '''
    layers = []
    with tf.variable_scope('conv0', reuse=reuse):
        conv0 = conv_bn_relu_layer(input_tensor_batch, [3, 3, 3, 16], 1)
        activation_summary(conv0)
        layers.append(conv0)

    for i in range(n):
        with tf.variable_scope('conv1_%d' %i, reuse=reuse):
            if i == 0:
                conv1 = residual_block(layers[-1], 16, first_block=True)
            else:
                conv1 = residual_block(layers[-1], 16)
            activation_summary(conv1)
            layers.append(conv1)

    for i in range(n):
        with tf.variable_scope('conv2_%d' %i, reuse=reuse):
            conv2 = residual_block(layers[-1], 32)
            activation_summary(conv2)
            layers.append(conv2)

    for i in range(n):
        with tf.variable_scope('conv3_%d' %i, reuse=reuse):
            conv3 = residual_block(layers[-1], 64)
            layers.append(conv3)
        assert conv3.get_shape().as_list()[1:] == [8, 8, 64]

    with tf.variable_scope('fc', reuse=reuse):
        in_channel = layers[-1].get_shape().as_list()[-1]
        bn_layer = batch_normalization_layer(layers[-1], in_channel)
        relu_layer = tf.nn.relu(bn_layer)
        global_pool = tf.reduce_mean(relu_layer, [1, 2])

        assert global_pool.get_shape().as_list()[-1:] == [64]
        output = output_layer(global_pool, 10)
        layers.append(output)

    return layers[-1]

For the per-stage configurations, see Table 1 of the original paper. For example, in the 50-layer network the middle four stages contain 3+4+6+3 = 16 blocks, each block has three convolutions (1×1, 3×3, 1×1), i.e. 16×3 = 48 layers; adding the initial convolution and the final FC layer gives 50 layers (a standalone sketch of one such bottleneck block follows the code below). Below is partial implementation code, from ry/tensorflow-resnet:
def inference(x, is_training,
              num_classes=1000,
              num_blocks=[3, 4, 6, 3],  # defaults to 50-layer network
              use_bias=False,  # defaults to using batch norm
              bottleneck=True):
    c = Config()
    c['bottleneck'] = bottleneck
    c['is_training'] = tf.convert_to_tensor(is_training,
                                            dtype='bool',
                                            name='is_training')
    c['ksize'] = 3
    c['stride'] = 1
    c['use_bias'] = use_bias
    c['fc_units_out'] = num_classes
    c['num_blocks'] = num_blocks
    c['stack_stride'] = 2

    with tf.variable_scope('scale1'):
        c['conv_filters_out'] = 64
        c['ksize'] = 7
        c['stride'] = 2
        x = conv(x, c)
        x = bn(x, c)
        x = activation(x)

    with tf.variable_scope('scale2'):
        x = _max_pool(x, ksize=3, stride=2)
        c['num_blocks'] = num_blocks[0]
        c['stack_stride'] = 1
        c['block_filters_internal'] = 64
        x = stack(x, c)

    with tf.variable_scope('scale3'):
        c['num_blocks'] = num_blocks[1]
        c['block_filters_internal'] = 128
        assert c['stack_stride'] == 2
        x = stack(x, c)

    with tf.variable_scope('scale4'):
        c['num_blocks'] = num_blocks[2]
        c['block_filters_internal'] = 256
        x = stack(x, c)

    with tf.variable_scope('scale5'):
        c['num_blocks'] = num_blocks[3]
        c['block_filters_internal'] = 512
        x = stack(x, c)

    # post-net
    x = tf.reduce_mean(x, reduction_indices=[1, 2], name="avg_pool")

    if num_classes != None:
        with tf.variable_scope('fc'):
            x = fc(x, c)

    return x
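For reference, here is a minimal sketch of the 1×1, 3×3, 1×1 bottleneck block used by the 50/101/152-layer variants, written against the TensorFlow 1.x layers API. The name bottleneck_block and its arguments are hypothetical and do not reuse the helpers (conv, bn, stack, fc) from the repository code above; the 4× channel expansion and the projection shortcut follow the paper.

import tensorflow as tf

def bottleneck_block(x, internal_filters, is_training):
    # Hypothetical sketch of a bottleneck block: 1x1 reduce, 3x3, 1x1 expand, plus shortcut.
    out_filters = 4 * internal_filters            # the final 1x1 conv expands channels 4x
    shortcut = x
    if x.get_shape().as_list()[-1] != out_filters:
        # projection shortcut (1x1 conv) when the channel count changes
        shortcut = tf.layers.conv2d(x, out_filters, 1, use_bias=False)
        shortcut = tf.layers.batch_normalization(shortcut, training=is_training)
    y = tf.layers.conv2d(x, internal_filters, 1, use_bias=False)                  # 1x1 reduce
    y = tf.layers.batch_normalization(y, training=is_training)
    y = tf.nn.relu(y)
    y = tf.layers.conv2d(y, internal_filters, 3, padding='same', use_bias=False)  # 3x3
    y = tf.layers.batch_normalization(y, training=is_training)
    y = tf.nn.relu(y)
    y = tf.layers.conv2d(y, out_filters, 1, use_bias=False)                       # 1x1 expand
    y = tf.layers.batch_normalization(y, training=is_training)
    # add the shortcut, then the final ReLU
    return tf.nn.relu(y + shortcut)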