In convolutional neural networks we frequently encounter pooling operations. A pooling layer usually follows a convolutional layer; it reduces the dimensionality of the feature maps produced by the convolutional layer and at the same time improves the results (making overfitting less likely).
Why is it acceptable to reduce the dimensionality like this?
Because images have a kind of "stationarity" property: a feature that is useful in one region of the image is very likely to be equally useful in another region. Therefore, to describe a large image, a natural idea is to aggregate statistics of the features at different locations. For example, one can compute the mean (or maximum) value of a particular feature over a region of the image and use it to represent that region. [1]
Pooling is applied to non-overlapping regions of the image (unlike the convolution operation), as illustrated in the figure below.

We define the pooling window size as sizeX (the side length of the red square in the figure below), and the horizontal/vertical displacement between two adjacent pooling windows as the stride. In ordinary pooling the windows do not overlap, so sizeX = stride.

The most common pooling operations are mean pooling and max pooling:
Mean pooling: the average value of the image region is taken as the pooled value for that region.
Max pooling: the maximum value of the image region is taken as the pooled value for that region.
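As a concrete illustration, here is a minimal MATLAB sketch of both operations on a made-up 4x4 feature map, using a 2x2 window with stride 2 (all names and numbers here are only for this example):

% Toy 4x4 feature map
A = [ 1  2  3  4;
      5  6  7  8;
      9 10 11 12;
     13 14 15 16];

sizeX  = 2;                          % pooling window size
stride = 2;                          % non-overlapping: stride == sizeX
outDim = floor(size(A,1) / stride);

meanPooled = zeros(outDim);
maxPooled  = zeros(outDim);
for r = 1:outDim
    for c = 1:outDim
        block = A((r-1)*stride+1 : (r-1)*stride+sizeX, ...
                  (c-1)*stride+1 : (c-1)*stride+sizeX);
        meanPooled(r,c) = mean(block(:));   % mean pooling
        maxPooled(r,c)  = max(block(:));    % max pooling
    end
end
% meanPooled = [3.5 5.5; 11.5 13.5],  maxPooled = [6 8; 14 16]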
In the paper [2], the authors used overlapping pooling; with all other settings unchanged, the top-1 and top-5 error rates dropped by 0.4% and 0.3%, respectively.
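For reference, a toy MATLAB sketch of overlapping max pooling with window size 3 and stride 2, the configuration reported in [2] (the feature map below is made up; this is not code from the paper):

A = reshape(1:36, 6, 6)';            % toy 6x6 feature map, rows 1..6, 7..12, ...
sizeX  = 3;                           % window size
stride = 2;                           % overlapping: stride < sizeX
outDim = floor((size(A,1) - sizeX) / stride) + 1;   % = 2 here

pooled = zeros(outDim);
for r = 1:outDim
    for c = 1:outDim
        rows = (r-1)*stride + (1:sizeX);
        cols = (c-1)*stride + (1:sizeX);
        pooled(r, c) = max(max(A(rows, cols)));   % adjacent windows share a row/column
    end
end
% pooled = [15 17; 27 29]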
Spatial pyramid pooling (SPP) converts the convolutional features of an image of any size into a representation of fixed dimension. This not only lets a CNN handle images of arbitrary size, but also avoids the cropping and warping operations, which lose information, so it is quite significant.
An ordinary CNN requires a fixed input image size, because the fully connected layers need an input of fixed dimension, whereas the convolution operations place no constraint on the image size. The authors therefore proposed spatial pyramid pooling: the image first goes through the convolutional layers, and the resulting features are then converted into a fixed-dimensional vector that is fed to the fully connected layers. This extends CNNs to images of arbitrary size.

The idea of spatial pyramid pooling comes from the spatial pyramid model: a single pooling is replaced by pooling at multiple scales. Applying pooling windows of different sizes to the convolutional features yields 1x1, 2x2 and 4x4 pooled outputs. Since conv5 has 256 filters, this produces 1, 4 and 16 vectors of 256 dimensions each; these 21 vectors of 256 dimensions are concatenated and fed into the fully connected layers. In this way images of different sizes are converted into features of the same dimension.

To obtain pooled outputs of the same size for images of different sizes, the pooling window size and stride must be computed dynamically from the image size. Suppose the conv5 output has size a x a and we want an n x n pooled output; we can set the window size to sizeX = ceil(a/n) and the stride to floor(a/n). The figure below uses a conv5 output of size 13x13 as an example.

Question: if the conv5 output is 14x14, then for [pool1x1] sizeX = stride = 14 and for [pool2x2] sizeX = stride = 7, which works out fine; but for [pool4x4] the formula gives sizeX = 4 and stride = 3, so the windows only cover rows and columns 1 through 13, and the last row and last column of the feature map are never included in any pooling window.
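A tiny MATLAB sketch of this rule (purely illustrative; the variable names are mine, not from the SPP paper), which also reproduces the 14x14 coverage issue described above:

% Window size and stride for an a-by-a conv5 map pooled to an n-by-n grid,
% using sizeX = ceil(a/n) and stride = floor(a/n).
for a = [13 14]
    for n = [1 2 4]
        sizeX  = ceil(a / n);
        stride = floor(a / n);
        lastCovered = (n - 1) * stride + sizeX;   % last row/column any window touches
        fprintf('a = %2d, n = %d: sizeX = %2d, stride = %2d, windows cover 1..%d of %d\n', ...
                a, n, sizeX, stride, lastCovered, a);
    end
end
% For a = 14, n = 4 this prints "windows cover 1..13 of 14", i.e. the last
% row and column are dropped.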
SPP is essentially pooling at multiple scales, which captures multi-scale information from the image. Adding SPP to a CNN lets it handle inputs of arbitrary size, making the model much more flexible.
4. References
[1] UFLDL Tutorial.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in NIPS, 2012.
[3] K. He, X. Zhang, S. Ren, and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," in ECCV, 2014.
Source: http://blog.csdn.net/danieljianfeng/article/details/42433475
======================================================================================================
Image size and number of parameters:
The previous chapters dealt with small image patches; this chapter deals with large images, and the difference is significant. For small images (e.g. 8x8, or MNIST's 28x28) a fully connected network (input layer wired directly to the hidden layer) is feasible. For large images this becomes very expensive: a 96x96 image needs 96x96 input units, and to learn 100 features this single layer alone already requires about 96x96x100 parameters (W, b); training would take hundreds to tens of thousands of times longer than before. Hence locally connected networks are used here: each hidden unit is connected only to a small contiguous region of the input image.
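To make the arithmetic concrete, a quick back-of-the-envelope calculation in MATLAB (illustrative numbers only):

imageDim = 96;     % input image is 96 x 96
numFeat  = 100;    % number of features (hidden units)
patchDim = 8;      % local receptive field is 8 x 8

fullyConnected = imageDim^2 * numFeat;   % 921,600 weights, one per pixel per feature
locallyShared  = patchDim^2 * numFeat;   % 6,400 weights if each feature is a shared 8x8 filter
fprintf('fully connected: %d weights, shared 8x8 filters: %d weights\n', ...
        fullyConnected, locallyShared);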
This leads to the method of convolution:
convolution:
Natural images have an inherent property of stationarity: the statistics of one part of the image are the same as those of any other part. This means that features learned on one part of the image can also be used on another part, so the same learned features can be applied at every location of the image.
Concretely, after randomly sampling a small patch from a large image, say 8x8, and learning some features from that patch, we can use the features learned from the 8x8 sample as detectors and apply them anywhere in the image. In particular, we can convolve the large image with the features learned from the 8x8 samples, obtaining, at every position of the large image, an activation value for each feature.
The lecture notes give a concrete example, which makes this easier to understand:
Suppose you have learned the features of 8x8 patches sampled from a 96x96 image, using a sparse autoencoder with 100 hidden units. To obtain the convolved features, you run the convolution over every 8x8 region of the 96x96 image: extract the 8x8 regions starting at coordinates (1,1), (1,2), ..., up to (89,89), and run the trained sparse autoencoder on each extracted region to get its feature activations. This yields 100 sets of convolved features, each of size 89x89. (The animated figure in the notes illustrates this well; it is not reproduced here.)
Finally, a summary of the convolution procedure:
Given a large r x c image xlarge, first train a sparse autoencoder on small a x b patches xsmall sampled from it, obtaining k features (k is the number of hidden units). Then, for every a x b block of xlarge, compute the feature activations fs; doing this at every location is exactly a convolution of the learned features with the large image, and it produces (r-a+1) x (c-b+1) x k convolved feature values.
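A minimal MATLAB sketch of this size bookkeeping, using random data in place of the real image and learned features (it omits the autoencoder's whitening, bias and sigmoid, and only shows how a 'valid' convolution yields the (r-a+1) x (c-b+1) x k output):

r = 96; c = 96;        % large image size
a = 8;  b = 8;         % patch size
k = 100;               % number of learned features

xlarge  = rand(r, c);              % stand-in for the large image
filters = rand(a, b, k);           % stand-in for the k learned 8x8 features

convolved = zeros(r - a + 1, c - b + 1, k);
for f = 1:k
    % 'valid' keeps only positions where the filter fits entirely inside the image
    convolved(:, :, f) = conv2(xlarge, filters(:, :, f), 'valid');
end
size(convolved)                    % 89 x 89 x 100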
pooling:
Having obtained features through convolution, the next step is to use them for classification. In principle one could feed all the extracted features into a classifier such as softmax, but the computational cost is enormous. For example, for a 96x96-pixel image, suppose we have learned 400 features over 8x8 inputs. Each convolution yields a (96 - 8 + 1) * (96 - 8 + 1) = 7921-element result, and with 400 features the output for each example reaches 89^2 * 400 = 3,168,400 values. Training a classifier on inputs with over 3 million features is quite unwise and very prone to overfitting.
This is where pooling comes in (the English term "pooling" is quite descriptive): take a region of the feature map, summarize it by its mean or its maximum, and use that single value to represent the region. Taking the mean gives mean pooling; taking the maximum gives max pooling. (The animated figure in the notes illustrates the process nicely; it is not reproduced here.)
Why is pooling legitimate? For the same reason we decided to use convolved features in the first place: images have the "stationarity" property, meaning a feature that is useful in one image region is very likely useful in another. So, to describe a large image, a natural idea is to aggregate statistics of the features at different positions; the mean or the maximum is exactly such an aggregate statistic.
Moreover, if the pooling regions are chosen as contiguous areas of the image, and we pool only features produced by the same (replicated) hidden units, then the pooled units are translation invariant. This means that even after the image undergoes a small translation, the (pooled) features remain the same. (A small aside: if so, is this invariance only guaranteed within a region the size of the pooling window?) In many tasks (e.g. object detection, audio recognition), translation-invariant features are desirable, because the label of an example (image) stays the same even when the image is translated. For instance, when processing an MNIST digit, whether it is shifted to the left or to the right, you would expect your classifier to still classify it as the same digit regardless of its final position.
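A toy MATLAB check of this point, for the benign case where the shift stays inside one pooling block (which is exactly the caveat raised above; all data here is made up):

% A single strong activation inside one 4x4 pooling block
F1 = zeros(8, 8);  F1(2, 2) = 1;      % original feature map
F2 = zeros(8, 8);  F2(2, 3) = 1;      % same activation shifted right by one pixel

% 2x2 max pooling over non-overlapping 4x4 blocks
pool = @(F) [max(max(F(1:4,1:4))) max(max(F(1:4,5:8))); ...
             max(max(F(5:8,1:4))) max(max(F(5:8,5:8)))];

isequal(pool(F1), pool(F2))           % true: the pooled representation did not change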
Exercise:
Below is the exercise from the lecture notes. It reuses the structure of the previous chapter's exercise (i.e. the first step of the convolution stage: training a sparse autoencoder on xsmall to obtain the k features).
The main programs follow:
Main program: cnnExercise.m

%% CS294A/CS294W Convolutional Neural Networks Exercise
%  Instructions
%  ------------
%
%  This file contains code that helps you get started on the
%  convolutional neural networks exercise. In this exercise, you will only
%  need to modify cnnConvolve.m and cnnPool.m. You will not need to modify
%  this file.

%%======================================================================
%% STEP 0: Initialization
%  Here we initialize some parameters used for the exercise.

imageDim = 64;          % image dimension
imageChannels = 3;      % number of channels (rgb, so 3)

patchDim = 8;           % patch dimension
numPatches = 50000;     % number of patches

visibleSize = patchDim * patchDim * imageChannels;  % number of input units
outputSize = visibleSize;   % number of output units
hiddenSize = 400;           % number of hidden units

epsilon = 0.1;          % epsilon for ZCA whitening

poolDim = 19;           % dimension of pooling region

%%======================================================================
%% STEP 1: Train a sparse autoencoder (with a linear decoder) to learn
%  features from color patches. If you have completed the linear decoder
%  exercise, use the features that you have obtained from that exercise,
%  loading them into optTheta. Recall that we have to keep around the
%  parameters used in whitening (i.e., the ZCA whitening matrix and the
%  meanPatch)

% --------------------------- YOUR CODE HERE --------------------------
% Train the sparse autoencoder and fill the following variables with
% the optimal parameters:

%optTheta = zeros(2*hiddenSize*visibleSize+hiddenSize+visibleSize, 1);
%ZCAWhite = zeros(visibleSize, visibleSize);
%meanPatch = zeros(visibleSize, 1);
load STL10Features.mat;

% --------------------------------------------------------------------

% Display and check to see that the features look good
W = reshape(optTheta(1:visibleSize * hiddenSize), hiddenSize, visibleSize);
b = optTheta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

displayColorNetwork( (W*ZCAWhite)');

%%======================================================================
%% STEP 2: Implement and test convolution and pooling
%  In this step, you will implement convolution and pooling, and test them
%  on a small part of the data set to ensure that you have implemented
%  these two functions correctly. In the next step, you will actually
%  convolve and pool the features with the STL10 images.

%% STEP 2a: Implement convolution
%  Implement convolution in the function cnnConvolve in cnnConvolve.m

% Note that we have to preprocess the images in the exact same way
% we preprocessed the patches before we can obtain the feature activations.

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels

%% Use only the first 8 images for testing
convImages = trainImages(:, :, :, 1:8);

% NOTE: Implement cnnConvolve in cnnConvolve.m first!
convolvedFeatures = cnnConvolve(patchDim, hiddenSize, convImages, W, b, ZCAWhite, meanPatch);

%% STEP 2b: Checking your convolution
%  To ensure that you have convolved the features correctly, we have
%  provided some code to compare the results of your convolution with
%  activations from the sparse autoencoder

% For 1000 random points
for i = 1:1000
    featureNum = randi([1, hiddenSize]);
    imageNum = randi([1, 8]);
    imageRow = randi([1, imageDim - patchDim + 1]);
    imageCol = randi([1, imageDim - patchDim + 1]);

    patch = convImages(imageRow:imageRow + patchDim - 1, imageCol:imageCol + patchDim - 1, :, imageNum);
    patch = patch(:);
    patch = patch - meanPatch;
    patch = ZCAWhite * patch;

    features = feedForwardAutoencoder(optTheta, hiddenSize, visibleSize, patch);

    if abs(features(featureNum, 1) - convolvedFeatures(featureNum, imageNum, imageRow, imageCol)) > 1e-9
        fprintf('Convolved feature does not match activation from autoencoder\n');
        fprintf('Feature Number    : %d\n', featureNum);
        fprintf('Image Number      : %d\n', imageNum);
        fprintf('Image Row         : %d\n', imageRow);
        fprintf('Image Column      : %d\n', imageCol);
        fprintf('Convolved feature : %0.5f\n', convolvedFeatures(featureNum, imageNum, imageRow, imageCol));
        fprintf('Sparse AE feature : %0.5f\n', features(featureNum, 1));
        error('Convolved feature does not match activation from autoencoder');
    end
end

disp('Congratulations! Your convolution code passed the test.');

%% STEP 2c: Implement pooling
%  Implement pooling in the function cnnPool in cnnPool.m

% NOTE: Implement cnnPool in cnnPool.m first!
pooledFeatures = cnnPool(poolDim, convolvedFeatures);

%% STEP 2d: Checking your pooling
%  To ensure that you have implemented pooling, we will use your pooling
%  function to pool over a test matrix and check the results.

testMatrix = reshape(1:64, 8, 8);
expectedMatrix = [mean(mean(testMatrix(1:4, 1:4))) mean(mean(testMatrix(1:4, 5:8))); ...
                  mean(mean(testMatrix(5:8, 1:4))) mean(mean(testMatrix(5:8, 5:8))); ];

testMatrix = reshape(testMatrix, 1, 1, 8, 8);

pooledFeatures = squeeze(cnnPool(4, testMatrix));

if ~isequal(pooledFeatures, expectedMatrix)
    disp('Pooling incorrect');
    disp('Expected');
    disp(expectedMatrix);
    disp('Got');
    disp(pooledFeatures);
else
    disp('Congratulations! Your pooling code passed the test.');
end

%%======================================================================
%% STEP 3: Convolve and pool with the dataset
%  In this step, you will convolve each of the features you learned with
%  the full large images to obtain the convolved features. You will then
%  pool the convolved features to obtain the pooled features for
%  classification.
%
%  Because the convolved features matrix is very large, we will do the
%  convolution and pooling 50 features at a time to avoid running out of
%  memory. Reduce this number if necessary

stepSize = 50;
assert(mod(hiddenSize, stepSize) == 0, 'stepSize should divide hiddenSize');

load stlTrainSubset.mat % loads numTrainImages, trainImages, trainLabels
load stlTestSubset.mat  % loads numTestImages, testImages, testLabels

pooledFeaturesTrain = zeros(hiddenSize, numTrainImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );
pooledFeaturesTest = zeros(hiddenSize, numTestImages, ...
    floor((imageDim - patchDim + 1) / poolDim), ...
    floor((imageDim - patchDim + 1) / poolDim) );

tic();

for convPart = 1:(hiddenSize / stepSize)

    featureStart = (convPart - 1) * stepSize + 1;
    featureEnd = convPart * stepSize;

    fprintf('Step %d: features %d to %d\n', convPart, featureStart, featureEnd);
    Wt = W(featureStart:featureEnd, :);
    bt = b(featureStart:featureEnd);

    fprintf('Convolving and pooling train images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        trainImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTrain(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();
    clear convolvedFeaturesThis pooledFeaturesThis;

    fprintf('Convolving and pooling test images\n');
    convolvedFeaturesThis = cnnConvolve(patchDim, stepSize, ...
        testImages, Wt, bt, ZCAWhite, meanPatch);
    pooledFeaturesThis = cnnPool(poolDim, convolvedFeaturesThis);
    pooledFeaturesTest(featureStart:featureEnd, :, :, :) = pooledFeaturesThis;
    toc();

    clear convolvedFeaturesThis pooledFeaturesThis;

end

% You might want to save the pooled features since convolution and pooling takes a long time
save('cnnPooledFeatures.mat', 'pooledFeaturesTrain', 'pooledFeaturesTest');
toc();

%%======================================================================
%% STEP 4: Use pooled features for classification
%  Now, you will use your pooled features to train a softmax classifier,
%  using softmaxTrain from the softmax exercise.
%  Training the softmax classifier for 1000 iterations should take less than
%  10 minutes.

% Add the path to your softmax solution, if necessary
% addpath /path/to/solution/

% Setup parameters for softmax
softmaxLambda = 1e-4;
numClasses = 4;
% Reshape the pooledFeatures to form an input vector for softmax
softmaxX = permute(pooledFeaturesTrain, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTrain) / numTrainImages,...
    numTrainImages);
softmaxY = trainLabels;

options = struct;
options.maxIter = 200;
softmaxModel = softmaxTrain(numel(pooledFeaturesTrain) / numTrainImages,...
    numClasses, softmaxLambda, softmaxX, softmaxY, options);

%%======================================================================
%% STEP 5: Test classifier
%  Now you will test your trained classifier against the test images

softmaxX = permute(pooledFeaturesTest, [1 3 4 2]);
softmaxX = reshape(softmaxX, numel(pooledFeaturesTest) / numTestImages, numTestImages);
softmaxY = testLabels;

[pred] = softmaxPredict(softmaxModel, softmaxX);
acc = (pred(:) == softmaxY(:));
acc = sum(acc) / size(acc, 1);
fprintf('Accuracy: %2.3f%%\n', acc * 100);

% You should expect to get an accuracy of around 80% on the test images.
cnnConvolve.m
function convolvedFeatures = cnnConvolve(patchDim, numFeatures, images, W, b, ZCAWhite, meanPatch)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  patchDim - patch (feature) dimension
%  numFeatures - number of features
%  images - large images to convolve with, matrix in the form
%           images(r, c, channel, image number)
%  W, b - W, b for features from the sparse autoencoder
%  ZCAWhite, meanPatch - ZCAWhitening and meanPatch matrices used for
%                        preprocessing
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)

patchSize = patchDim*patchDim;
numImages = size(images, 4);
imageDim = size(images, 1);
imageChannels = size(images, 3);

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);

% Instructions:
%   Convolve every feature with every large image here to produce the
%   numFeatures x numImages x (imageDim - patchDim + 1) x (imageDim - patchDim + 1)
%   matrix convolvedFeatures, such that
%   convolvedFeatures(featureNum, imageNum, imageRow, imageCol) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + patchDim - 1, imageCol + patchDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 3 minutes
%   Convolving with 5000 images should take around an hour
%   (So to save time when testing, you should convolve with less images, as
%   described earlier)

% -------------------- YOUR CODE HERE --------------------
% Precompute the matrices that will be used during the convolution. Recall
% that you need to take into account the whitening and mean subtraction
% steps

WT = W*ZCAWhite;          % fold the ZCA whitening into the feature weights
bT = b - WT*meanPatch;    % fold the mean subtraction into the bias

% --------------------------------------------------------

convolvedFeatures = zeros(numFeatures, numImages, imageDim - patchDim + 1, imageDim - patchDim + 1);
for imageNum = 1:numImages
  for featureNum = 1:numFeatures

    % convolution of image with feature matrix for each channel
    convolvedImage = zeros(imageDim - patchDim + 1, imageDim - patchDim + 1);
    for channel = 1:3

      % Obtain the feature (patchDim x patchDim) needed during the convolution
      % ---- YOUR CODE HERE ----
      %feature = zeros(8,8); % You should replace this
      offset = (channel-1)*patchSize;
      feature = reshape(WT(featureNum, (offset+1):(offset+patchSize)), patchDim, patchDim);
      % ------------------------

      % Flip the feature matrix because of the definition of convolution, as explained later
      feature = flipud(fliplr(squeeze(feature)));

      % Obtain the image
      im = squeeze(images(:, :, channel, imageNum));

      % Convolve "feature" with "im", adding the result to convolvedImage
      % be sure to do a 'valid' convolution
      % ---- YOUR CODE HERE ----
      convolveThisChannel = conv2(im, feature, 'valid');
      % Sum over the three channels: all channels contribute jointly to
      % this feature's activation.
      convolvedImage = convolvedImage + convolveThisChannel;
      % ------------------------

    end

    % Subtract the bias unit (correcting for the mean subtraction as well)
    % Then, apply the sigmoid function to get the hidden activation
    % ---- YOUR CODE HERE ----
    convolvedImage = sigmoid(convolvedImage + bT(featureNum));
    % ------------------------

    % The convolved feature is the sum of the convolved values for all channels
    convolvedFeatures(featureNum, imageNum, :, :) = convolvedImage;
  end
end

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end

end
cnnPool.m
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(featureNum, imageNum, imageRow, imageCol)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(featureNum, imageNum, poolRow, poolCol)

numImages = size(convolvedFeatures, 2);
numFeatures = size(convolvedFeatures, 1);
convolvedDim = size(convolvedFeatures, 3);

pooledFeatures = zeros(numFeatures, numImages, floor(convolvedDim / poolDim), floor(convolvedDim / poolDim));

% -------------------- YOUR CODE HERE --------------------
% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the
%   numFeatures x numImages x (convolvedDim/poolDim) x (convolvedDim/poolDim)
%   matrix pooledFeatures, such that
%   pooledFeatures(featureNum, imageNum, poolRow, poolCol) is the
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region
%   (see http://ufldl/wiki/index.php/Pooling )
%
%   Use mean pooling here.
% -------------------- YOUR CODE HERE --------------------

% Number of pooling blocks along each dimension (57/19 = 3 here).
% For inputs of other sizes, poolDim should be chosen so that it divides
% convolvedDim exactly, otherwise the leftover rows/columns are dropped.
numBlocks = floor(convolvedDim/poolDim);
for featureNum = 1:numFeatures
    for imageNum = 1:numImages
        for poolRow = 1:numBlocks
            for poolCol = 1:numBlocks
                features = convolvedFeatures(featureNum, imageNum, ...
                    (poolRow-1)*poolDim+1:poolRow*poolDim, ...
                    (poolCol-1)*poolDim+1:poolCol*poolDim);
                pooledFeatures(featureNum, imageNum, poolRow, poolCol) = mean(features(:));
            end
        end
    end
end

end
Results:
Accuracy: 78.938%
This is close to the roughly 80% mentioned in the notes.
PS: the lecture notes are here:
http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
http://deeplearning.stanford.edu/wiki/index.php/Pooling
http://deeplearning.stanford.edu/wiki/index.php/Exercise:Convolution_and_Pooling
=====================================================================================================
Deep Learning CNN, Part 1: Convolution and Pooling
1 Convolution
Continuous case:
1-D convolution: $s(t) = (x * w)(t) = \int x(a)\,w(t-a)\,da$
2-D convolution: $S(i,j) = (I * K)(i,j) = \iint I(m,n)\,K(i-m,\,j-n)\,dm\,dn$
Discrete case:
1-D convolution: $s(t) = (x * w)(t) = \sum_a x(a)\,w(t-a)$
2-D convolution: $S(i,j) = (I * K)(i,j) = \sum_m \sum_n I(m,n)\,K(i-m,\,j-n)$
Convolution is commutative, i.e.
$(K * I)(i,j) = (I * K)(i,j)$, that is, $\sum_m \sum_n I(m,n)\,K(i-m,\,j-n) = \sum_m \sum_n I(i-m,\,j-n)\,K(m,n)$.
In actual implementations, the 2-D "convolution" is usually computed as cross-correlation:
$S(i,j) = (K \star I)(i,j) = \sum_m \sum_n I(i+m,\,j+n)\,K(m,n)$
and this definition, unlike the one above, is not commutative.
The functions $w$ and $K$ above are called the kernel, and $s(t)$ and $S(i,j)$ are sometimes called the feature map.
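A quick MATLAB check of these definitions on toy data: conv2 computes true (flipped-kernel) convolution, so commutativity holds, and cross-correlation is obtained by pre-flipping the kernel (all inputs here are made up):

I = magic(5);          % toy 5x5 "image"
K = [1 2; 3 4];        % toy 2x2 kernel

convFull = conv2(I, K, 'full');              % true convolution
isequal(convFull, conv2(K, I, 'full'))       % true: convolution is commutative

% Most CNN code actually computes cross-correlation, i.e. no kernel flip.
% With conv2 (which flips), this is obtained by pre-flipping the kernel:
xcorr = conv2(I, rot90(K, 2), 'valid');      % cross-correlation, 'valid' positions only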
2 Convolutional Neural Networks
Convolutional neural networks exploit three main ideas: sparse connectivity, parameter sharing, and equivariance to translation.
Sparse connectivity. The receptive field of a neuron is the set of other neurons that can influence it. In the figure (not reproduced here), the units influenced by x3 are s2, s3 and s4. In a deep convolutional network, units in deeper layers have larger receptive fields than units in shallow layers.
Parameter sharing. Sparse connectivity and parameter sharing together reduce the number of parameters dramatically.
Equivariance to translation. Parameter sharing makes the convolution layer equivariant to translation. We say that $f(x)$ is equivariant to $g(x)$ if $f(g(x)) = g(f(x))$. For example, if $I(x,y)$ is an image and $g(I) = I(x-1, y)$ is a shift, then $(g(I) * K) = g(I * K)$.
Pooling. The output of a pooling unit is a summary statistic of a neighbouring region, usually a rectangular one. Common variants are max pooling, average pooling, weighted-average pooling and $L^2$-norm pooling. Pooling makes the features approximately invariant to small translations, which is useful whenever we care more about whether some feature is present than about exactly where it is. Note the distinction from convolution: convolution is equivariant to translations of the input, whereas pooling gives invariance to small translations of the features. Pooling also significantly reduces the size of the representation: with a stride of $k$, the number of outputs (and hence the parameters of the following layer) shrinks roughly by a factor of $k$. Finally, pooling can cope with inputs of different sizes: as in the figure below, the image can always be pooled over four equal regions, regardless of its size.
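A small MATLAB sketch of this last point (a hypothetical quadrantPool helper, not taken from any framework): whatever the input size, pooling over four (near-)equal quadrants always yields a 2x2 summary.

function out = quadrantPool(F)
% Adaptive 2x2 max pooling: four (near-)equal quadrants, whatever the size of F.
    [h, w] = size(F);
    r = floor(h / 2);
    c = floor(w / 2);
    out = [max(max(F(1:r,   1:c)))  max(max(F(1:r,   c+1:w))); ...
           max(max(F(r+1:h, 1:c)))  max(max(F(r+1:h, c+1:w)))];
end

% Usage:
%   size(quadrantPool(rand(6, 6)))    % -> [2 2]
%   size(quadrantPool(rand(10, 8)))   % -> [2 2]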
The complete CNN architecture:
As the two figures above show, each convolutional layer uses several different convolutions (channels) in order to extract different kinds of features. In general the input is not just a single grayscale image but more often a colour image, so the input is a three-dimensional tensor $V_{i,j,k}$, the value at row $j$, column $k$ of channel $i$. The feature map $S$ obtained by convolution can then be written as
$S_{i,j,k} = \sum_{l,m,n} V_{l,\,j+m-1,\,k+n-1}\,K_{i,l,m,n}$
where $K$ is a four-dimensional tensor and $K_{i,l,m,n}$ is the weight connecting output channel $i$ to input channel $l$ at row offset $m$ and column offset $n$. The $-1$ in the formula is only an artifact of 1-based indexing; it would disappear in C or Python, where indices start from 0. For a convolution with stride $s$:
$S_{i,j,k} = \sum_{l,m,n} V_{l,\,(j-1)s+m,\,(k-1)s+n}\,K_{i,l,m,n}$
In an implementation one also has to decide how the input is zero-padded. Without padding, the kernel makes each feature map smaller than its input, which limits how deep the whole network can be.
As shown in the figure above, there are two padding schemes: valid, i.e. no zero padding at all; and same, where the image border is padded with zeros so that the input and output have the same size.
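For illustration, conv2's 'valid' and 'same' options correspond directly to these two schemes (toy sizes, chosen only for this example):

I = rand(13, 13);      % toy single-channel feature map
K = rand(3, 3);        % 3x3 kernel

sizeValid = size(conv2(I, K, 'valid'))   % 11 x 11: shrinks by kernelSize-1 each layer
sizeSame  = size(conv2(I, K, 'same'))    % 13 x 13: zero-padded so the size is preserved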