In convolutional neural networks we frequently encounter the pooling operation. A pooling layer usually follows a convolutional layer; pooling reduces the dimensionality of the feature maps produced by the convolutional layer and at the same time improves the results (overfitting becomes less likely).
Why is it acceptable to reduce the dimensionality?
Because images have a kind of "stationarity" property: a feature that is useful in one image region is very likely to be equally useful in another region. Therefore, to describe a large image, a natural idea is to aggregate statistics of the features at different locations; for example, one can compute the mean (or maximum) value of a particular feature over a region of the image and let it represent that region. [1]
Pooling is applied to non-overlapping regions of the image (this is different from convolution); the process is shown in the figure below.

We define the pooling window size as sizeX (the side length of the red square in the figure below) and the horizontal/vertical displacement between two adjacent pooling windows as the stride. In ordinary pooling the windows do not overlap, so sizeX = stride.

The most common pooling operations are mean pooling and max pooling (a small sketch follows the definitions below):
Mean pooling: the average value of the image region is taken as the pooled value of that region.
Max pooling: the maximum value of the image region is taken as the pooled value of that region.
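As a concrete illustration, here is a minimal MATLAB sketch (toy 4x4 input assumed) of mean and max pooling over non-overlapping 2x2 windows, i.e. sizeX = stride = 2:

% Mean and max pooling over non-overlapping 2x2 windows (toy example).
A = [1 2 5 6; 3 4 7 8; 9 10 13 14; 11 12 15 16];
sizeX = 2;  n = size(A, 1) / sizeX;          % number of windows per dimension
meanPooled = zeros(n);  maxPooled = zeros(n);
for r = 1:n
    for c = 1:n
        block = A((r-1)*sizeX+1 : r*sizeX, (c-1)*sizeX+1 : c*sizeX);
        meanPooled(r, c) = mean(block(:));   % mean pooling
        maxPooled(r, c)  = max(block(:));    % max pooling
    end
end
% meanPooled = [2.5 6.5; 10.5 14.5],  maxPooled = [4 8; 12 16]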
In [2] the authors used overlapping pooling; with all other settings unchanged, the top-1 and top-5 error rates dropped by 0.4% and 0.3%, respectively.
Spatial pyramid pooling converts the convolutional features of an image at any scale into a representation of the same dimension. This not only lets a CNN handle images of arbitrary scale, it also avoids the cropping and warping operations that discard information, which makes it very significant in practice.
Ordinary CNNs require the input image to be of a fixed size, because the fully connected layers need a fixed input dimension, whereas the convolution operation itself places no restriction on the image scale. The authors therefore proposed spatial pyramid pooling: the image first goes through the convolutional layers, and the resulting features are then converted into a fixed dimension before being fed to the fully connected layers. This extends a CNN to images of arbitrary size.

The idea of spatial pyramid pooling comes from the spatial pyramid model: it replaces a single pooling step with pooling at several scales. Applying pooling windows of different sizes to the convolutional features yields 1x1, 2x2 and 4x4 pooled maps. Since conv5 has 256 filters, this gives 1 feature of 256 dimensions, 4 features of 256 dimensions, and 16 features of 256 dimensions; concatenating these 21 256-dimensional features and feeding them to the fully connected layer turns images of different sizes into features of the same dimension.

To obtain pooled outputs of the same size for images of different sizes, the pooling window size and stride must be computed dynamically from the image size. Suppose conv5 outputs an a*a map and an n*n pooled result is required; following [3], the window size can be taken as sizeX = ceil(a/n) and the stride as floor(a/n). The figure below uses a conv5 output of size 13*13 as an example.
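To make the window/stride rule concrete, here is a minimal MATLAB sketch, not taken from the SPP-net code: the conv5 size a, the random activations and the use of max pooling inside each bin are assumptions for illustration only.

% SPP-style pooling over a hypothetical a x a x 256 conv5 output,
% pyramid levels {4, 2, 1}, i.e. 16 + 4 + 1 = 21 bins in total.
a = 13;  numFilters = 256;
convMap = rand(a, a, numFilters);      % stand-in for real conv5 activations
levels = [4 2 1];
sppFeature = [];
for n = levels
    sizeX  = ceil(a / n);              % pooling window size
    stride = floor(a / n);             % pooling stride
    for row = 1:n
        for col = 1:n
            r1 = (row-1)*stride + 1;          c1 = (col-1)*stride + 1;
            r2 = min(r1 + sizeX - 1, a);      c2 = min(c1 + sizeX - 1, a);  % clamp at the border
            window = convMap(r1:r2, c1:c2, :);
            % one pooled value per filter for this bin
            sppFeature = [sppFeature; reshape(max(max(window, [], 1), [], 2), numFilters, 1)];
        end
    end
end
% sppFeature is (16+4+1)*256 = 5376-dimensional, independent of a.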

A question: if the conv5 output is 14*14, then [pool1*1] has sizeX = stride = 14 and [pool2*2] has sizeX = stride = 7, which is fine; but for [pool4*4], with sizeX = 4 and stride = 3, the last row and the last column of features are never included in any pooling window.
SPP is really just pooling at multiple scales, which captures multi-scale information from the image; adding SPP to a CNN lets it handle inputs of arbitrary size, which makes the model much more flexible.
4. References
[1] UFLDL Tutorial
[2] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in NIPS, 2012.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," ECCV 2014.
Source: http://blog.csdn.net/danieljianfeng/article/details/42433475
======================================================================================================
Image size and the number of parameters:
The previous chapters dealt with small image patches; this chapter deals with large images, and the difference is substantial. Small images (e.g. 8*8, or MNIST's 28*28) can be handled with a fully connected scheme, i.e. the input layer connected directly to the hidden layer. For large images this becomes very expensive: a 96*96 image requires 96*96 input units, and training 100 features would already require 96*96*100 parameters (W, b) in this one layer, so training would take hundreds to tens of thousands of times longer than before. Hence a locally connected network is used: for images, each hidden unit is connected only to a small contiguous region of the input image.
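A rough back-of-the-envelope count of the weights involved (illustrative numbers only; biases ignored, and the 8*8 case assumes the weights are shared across locations, as in the convolution described next):

% Weight counts for the 96x96 example above: fully connected vs. shared 8x8 receptive fields.
numHidden    = 100;
fullWeights  = 96 * 96 * numHidden    % 921,600 weights
localWeights = 8 * 8 * numHidden      %   6,400 weights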
This leads to the method of convolution:
convolution:
Natural images have an inherent property of being "stationary": the statistics of one part of the image are the same as those of any other part. This also means that the features learned on one part of the image can be used on another part, so the same learned features can be used at every position of the image.
For images, if we randomly select a small patch, say 8x8, from a large image as a sample and learn some features from this small sample, we can then use the features learned from the 8x8 sample as detectors and apply them anywhere in the image. In particular, we can convolve the features learned from the 8x8 sample with the original large image, obtaining, at every position of the large image, an activation value for each feature.
The lecture notes give a concrete example, which makes this easier to understand:
Suppose you have learned the features of the 8x8 samples taken from a 96x96 image, say with a sparse autoencoder that has 100 hidden units. To obtain the convolved features, the convolution is applied to every 8x8 block of the 96x96 image: the 8x8 blocks are extracted with top-left corners at (1,1), (1,2), ..., up to (89,89), and the trained sparse autoencoder is run on each extracted block to obtain the feature activations. In this example we thus obtain 100 sets of convolved features, each containing 89x89 values. The gif in the lecture notes shows this more vividly, but I don't know how to include it here...
Finally, a summary of the convolution procedure:
Suppose we are given large images of size r * c, denoted xlarge. First, a sparse autoencoder is trained on small a * b image samples xsmall extracted from the large images, giving k features (k is the number of hidden units). Then, for every a*b block of xlarge, the activations fs are computed; sliding over all positions amounts to convolving the learned features with the image and yields an (r-a+1)*(c-b+1)*k array of convolved features.
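A minimal sketch of this step using MATLAB's conv2 (toy random data standing in for the image and for one learned feature; the autoencoder's bias and sigmoid are omitted here):

% "Valid" convolution of one learned a x b feature with an r x c image.
r = 96; c = 96; a = 8; b = 8;
xlarge  = rand(r, c);                 % stand-in for a large grayscale image
feature = rand(a, b);                 % stand-in for one learned feature (its weights)
fmap = conv2(xlarge, rot90(feature, 2), 'valid');  % flip the kernel so conv2 computes the patch response
size(fmap)                            % [89 89], i.e. (r-a+1) x (c-b+1)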
pooling:
Having obtained the features via convolution, the next step is to use them for classification. In principle one could connect all of the extracted features to a classifier such as softmax, but the computation would be enormous. For example, for a 96x96-pixel image, suppose 400 features have been learned over 8x8 inputs. Each convolution yields a (96 - 8 + 1) * (96 - 8 + 1) = 7921-dimensional result, and with 400 features the result set for each example reaches 89^2 * 400 = 3,168,400 features. Learning a classifier on an input of more than 3 million features is quite unwise and extremely prone to overfitting.
Hence the method called pooling (rendered in Chinese as "池化"; the English word is rather more evocative than the translation). The idea is simply to take the mean or the maximum over part of a feature map and use it to represent that region: taking the mean gives mean pooling, taking the maximum gives max pooling. The gif in the lecture notes illustrates this well too, but again I don't know how to put a gif here...
Why is pooling legitimate? We chose to use the convolved features because images have the "stationarity" property, which means a feature that is useful in one image region is very likely to be useful in another. Therefore, to describe a large image, a natural idea is to aggregate statistics of the features at different locations, and the mean or the maximum is exactly such an aggregate statistic.
Moreover, if the pooling regions are chosen as contiguous areas of the image and we only pool features produced by the same (replicated) hidden units, then the pooled units are translation invariant: even if the image undergoes a small translation, the (pooled) features stay the same. (A small question here: if so, is the invariance only guaranteed within a region the size of the pooling window?) In many tasks (e.g. object detection, audio recognition) translation-invariant features are preferable, because the label of an example (image) remains the same even after the image has been translated. For instance, for a digit from the MNIST dataset shifted to the left or to the right, we would still expect the classifier to classify it as the same digit no matter where it finally ends up.
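A toy illustration of this point (assumed data; max pooling over 4x4 blocks): shifting the single active pixel by one position within its pooling block leaves the pooled features unchanged.

% Max pooling over 4x4 blocks is unchanged under a one-pixel shift inside the block.
img = zeros(8, 8);         img(2, 2) = 1;     % feature detected at (2,2)
imgShifted = zeros(8, 8);  imgShifted(2, 3) = 1;  % same feature, shifted one pixel right
pool = @(x) [max(max(x(1:4,1:4))) max(max(x(1:4,5:8))); ...
             max(max(x(5:8,1:4))) max(max(x(5:8,5:8)))];
isequal(pool(img), pool(imgShifted))           % prints 1: pooled features identical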
Exercise:
Below is the exercise from the lecture notes. It reuses the setup of the previous chapter's exercise (i.e. the first step of the convolution pipeline: using a sparse autoencoder to learn the k features from xsmall).
The main programs are listed below:
Main program: cnnExercise.m

%% CS294A/CS294W Convolutional Neural Networks Exercise
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  convolutional neural networks exercise. In this exercise, you will only
%  need to modify cnnConvolve.m and cnnPool.m. You will not need to modify
%  this file.

%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

imageDim = 64;          % image dimension
imageChannels = 3;      % number of channels (rgb, so 3)

patchDim = 8;           % patch dimension
numPatches = 50000;     % number of patches

visibleSize = patchDim * patchDim * imageChannels;  % number of input units
outputSize = visibleSize;   % number of output units
hiddenSize = 400;           % number of hidden units

epsilon = 0.1;          % epsilon for ZCA whitening

poolDim = 19;           % dimension of pooling region

%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn
%  features from color patches. If you have completed the linear decoder
%  exercise, use the features that you have obtained from that exercise,
%  loading them into optTheta. Recall that we have to keep around the
%  parameters used in whitening (i.e., the ZCA whitening matrix and the
%  meanPatch)

% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with
% the optimal parameters:

%optTheta = zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);
%ZCAWhite = zeros(visibleSize, visibleSize);
%meanPatch = zeros(visibleSize, 1);
load STL10Features.mat;

% --------------------------------------------------------------------

% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);
displayColorNetwork( (W*ZCAWhite)');

%%======================================================================
%% STEP 2: Implement and test convolution and pooling
%  In this step, you will implement convolution and pooling, and test them
%  on a small part of the data set to ensure that you have implemented
%  these two functions correctly. In the next step, you will actually
%  convolve and pool the features with the STL10 images.

%% STEP 2a: Implement convolution
%  Implement convolution in the function cnnConvolve in cnnConvolve.m

% Note that we have to preprocess the images in the exact same way
% we preprocessed the patches before we can obtain the feature activations.

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels

%% Use only the first 8 images for testing
convImages = trainImages(:, :, :, 1:8);

% NOTE: Implement cnnConvolve in cnnConvolve.m first!
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);

%% STEP 2b: Checking your convolution
%  To ensure that you have convolved the features correctly, we have
%  provided some code to compare the results of your convolution with
%  activations from the sparse autoencoder

% For 1000 random points
for i = 1:1000
    featureNum = randi([1, hiddenSize]);
    imageNum = randi([1, 8]);
    imageRow = randi([1, imageDim - patchDim + 1]);
    imageCol = randi([1, imageDim - patchDim + 1]);

    patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
    patch = patch(:);
    patch = patch - meanPatch;
    patch = ZCAWhite * patch;

    features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch);

    if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
        fprintf('Convolved feature does not match activation from autoencoder\n');
        fprintf('Feature Number    : %d\n', featureNum);
        fprintf('Image Number      : %d\n', imageNum);
        fprintf('Image Row         : %d\n', imageRow);
        fprintf('Image Column      : %d\n', imageCol);
        fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
        fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));
        error('Convolved feature does not match activation from autoencoder');
    end
end

disp('Congratulations! Your convolution code passed the test.');

%% STEP 2c: Implement pooling
%  Implement pooling in the function cnnPool in cnnPool.m

% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);

%% STEP 2d: Checking your pooling
%  To ensure that you have implemented pooling, we will use your pooling
%  function to pool over a test matrix and check the results.

testMatrix = reshape(1:64, 8, 8);
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
                  mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8))); ];

testMatrix = reshape(testMatrix, 1, 1, 8, 8);

pooledFeatures = squeeze(cnnPool(4, testMatrix));

if ~isequal(pooledFeatures, expectedMatrix)
    disp('Pooling incorrect');
    disp('Expected');
    disp(expectedMatrix);
    disp('Got');
    disp(pooledFeatures);
else
    disp('Congratulations! Your pooling code passed the test.');
end

%%======================================================================
%% STEP 3: Convolve and pool with the dataset
%  In this step, you will convolve each of the features you learned with
%  the full large images to obtain the convolved features. You will then
%  pool the convolved features to obtain the pooled features for
%  classification.
%
%  Because the convolved features matrix is very large, we will do the
%  convolution and pooling 50 features at a time to avoid running out of
%  memory. Reduce this number if necessary

stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages, testImages, testLabels

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );

tic();

for convPart = 1:(hiddenSize / stepSize)

    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;

    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);

    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();

    clear convolvedFeaturesThis pooledFeaturesThis;

end

% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();

%%======================================================================
%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

patchSize = patchDim*patchDim;
numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps
WT = W*ZCAWhite;
bT = b-WT*meanPatch;
% --------------------------------------------------------

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:3

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      %feature = zeros(8,8); % You should replace this
      offset = (channel-1)*patchSize;
      feature = reshape(WT(featureNum,(offset+1):(offset+patchSize)),patchDim,patchDim);
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      convolveThisChannel = conv2(im,feature,'valid');
      % Sum over the three channels: the feature responds to all three channels of the patch jointly.
      convolvedImage = convolvedImage + convolveThisChannel;
      % ------------------------

    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = sigmoid(convolvedImage + bT(featureNum));
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
end
cnnPool.m
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(featureNum, imageNum, poolRow, poolCol)
%

numImages = size(convolvedFeatures, 2);
numFeatures = size(convolvedFeatures, 1);
convolvedDim = size(convolvedFeatures, 3);

pooledFeatures = zeros(numFeatures, numImages, floor(convolvedDim / poolDim), floor(convolvedDim / poolDim));

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the
%   numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim)
%   matrix pooledFeatures, such that
%   pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region
%   (see http://ufldl/wiki/index.php/Pooling )
%
%   Use mean pooling here.
% -------------------- YOUR CODE HERE --------------------

% Total number of blocks per dimension (57/19 here); for data of other sizes,
% should poolDim be chosen so that it divides the dimension exactly?
numBlocks = floor(convolvedDim/poolDim);
for featureNum = 1:numFeatures
    for imageNum = 1:numImages
        for poolRow = 1:numBlocks
            for poolCol = 1:numBlocks
                features = convolvedFeatures(featureNum,imageNum,(poolRow-1)*poolDim+1:poolRow*poolDim,(poolCol-1)*poolDim+1:poolCol*poolDim);
                pooledFeatures(featureNum,imageNum,poolRow,poolCol) = mean(features(:));
            end
        end
    end
end
end
Result:
Accuracy: 78.938%
This is close to the roughly 80% mentioned in the lecture notes.
PS: lecture notes:
http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
http://deeplearning.stanford.edu/wiki/index.php/Pooling
http://deeplearning.stanford.edu/wiki/index.php/Exercise:Convolution_and_Pooling
=====================================================================================================
Deep Learning CNN, Part 1: Convolution and Pooling
1 Convolution
Continuous case:
1-D convolution: s(t) = (x ∗ w)(t) = ∫ x(a) w(t − a) da
2-D convolution: S(i, j) = (K ∗ I)(i, j) = ∫∫ I(m, n) K(i − m, j − n) dm dn
Discrete case:
1-D convolution: s(t) = (x ∗ w)(t) = Σ_a x(a) w(t − a)
2-D convolution: S(i, j) = (K ∗ I)(i, j) = Σ_m Σ_n I(m, n) K(i − m, j − n)
Convolution is commutative, i.e.
(K ∗ I)(i, j) = (I ∗ K)(i, j), that is, Σ_m Σ_n I(m, n) K(i − m, j − n) = Σ_m Σ_n I(i − m, j − n) K(m, n).
In implementations, the 2-D "convolution" is usually computed as the cross-correlation
S(i, j) = Σ_m Σ_n I(i + m, j + n) K(m, n)
and this definition is no longer commutative.
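A quick numerical check of these claims in MATLAB (toy matrices assumed; filter2 gives the cross-correlation form that implementations typically compute):

% Commutativity of true convolution vs. the cross-correlation form.
I = magic(4);  K = [1 2; 3 4];
isequal(conv2(I, K, 'full'), conv2(K, I, 'full'))   % 1: true convolution commutes
C1 = filter2(K, I, 'valid');                        % cross-correlation (the "implementation" form)
C2 = conv2(I, rot90(K, 2), 'valid');                % same result via a 180-degree-flipped kernel
isequal(C1, C2)                                     % 1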
In these formulas, w and K are called kernels, and s(t), S(i, j) are sometimes called feature maps.
2 Convolutional Neural Networks
Convolutional neural networks exploit three main ideas: sparse connectivity, parameter sharing, and translation equivariance.
Sparse connectivity: the receptive field of a neuron is the set of other neurons that can influence it. In the figure above, the receptive field of x3 is s2, s3, s4. In a deep convolutional network, units in deeper layers have larger receptive fields than units in shallow layers.
Parameter sharing: sparse connectivity and parameter sharing together reduce the number of parameters dramatically.
Translation equivariance: parameter sharing gives rise to translation equivariance. We say f(x) is equivariant to g(x) if f(g(x)) = g(f(x)). For example, if I(x, y) is an image and g(I)(x, y) = I(x − 1, y) (a one-pixel shift), then (g(I) ∗ K) = g(I ∗ K).
Pooling: pooling outputs a summary statistic of a neighbouring region, usually a rectangular one. Variants include max pooling, average pooling, weighted (moving) average pooling, L2-norm pooling, and so on. Pooling makes the features approximately translation invariant, which is useful when we only care whether certain features are present rather than exactly where they are. Convolution also relates to translation, but note the distinction: convolution is equivariant to translations of the input, whereas pooling is invariant to small translations of the features. Pooling also reduces the amount of computation and of downstream parameters markedly; with a stride of k, the number of outputs shrinks by roughly a factor of k. Finally, pooling can handle inputs of different sizes: as in the figure below, the image is divided into four equal regions regardless of the image size.
The complete structure of a CNN:
As the two figures above show, each convolutional layer applies several different convolutions (channels) in order to extract different features. In general the input is not a single grayscale image but a colour image, so the input is a 3-D tensor V, where V(i, j, k) is the value at row j and column k of the i-th channel. The features S obtained by the convolution can then be written as
S(i, j, k) = Σ_{l,m,n} V(l, j + m − 1, k + n − 1) · K(i, l, m, n)
where K is a 4-D tensor and K(i, l, m, n) is the connection weight between the i-th output channel and the l-th input channel at a row offset of m and a column offset of n. The "− 1" appears because the formula indexes from 1; in languages such as C and Python, where indices start at 0, it disappears. A convolution with stride s is
S(i, j, k) = Σ_{l,m,n} V(l, (j − 1)·s + m, (k − 1)·s + n) · K(i, l, m, n)
In an implementation, attention must also be paid to how the input is zero-padded. Without padding, the feature maps are always smaller than the input because of the kernel size, which limits how deep the whole network can be.
As shown in the figure above, there are two common padding schemes: valid, i.e. no zero padding at all, and same, where zeros are added around the image border so that the output has the same size as the input.
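To make the channel-summed formula concrete, here is a direct loop sketch in MATLAB with small assumed sizes (toy random data; 'valid' convolution, i.e. no padding; not meant as an efficient implementation):

% S(i,j,k) = sum over l,m,n of V(l, j+m-1, k+n-1) * K(i,l,m,n)
L = 3;  H = 6;  Wd = 6;        % input: L channels, H x Wd spatial size (toy values)
numOut = 4;  kH = 3;  kW = 3;  % output channels and kernel size (toy values)
V = rand(L, H, Wd);
K = rand(numOut, L, kH, kW);
S = zeros(numOut, H - kH + 1, Wd - kW + 1);   % "valid": output shrinks by kernel size - 1
for i = 1:numOut
  for j = 1:size(S, 2)
    for k = 1:size(S, 3)
      acc = 0;
      for l = 1:L
        for m = 1:kH
          for n = 1:kW
            acc = acc + V(l, j+m-1, k+n-1) * K(i, l, m, n);
          end
        end
      end
      S(i, j, k) = acc;
    end
  end
end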