
Logistic Regression: A Python Implementation

2019-11-14 17:29:31
Contributed by: a site user

Logistic Regression

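Algorithm pros and cons:

1. Low computational cost; easy to understand and implement.
2. Prone to underfitting, so classification accuracy may be low.
3. Applicable data types: numeric and nominal.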

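Algorithm idea:

  • As I understand it, logistic regression is really just linear regression with a sigmoid function applied on top. The benefit of the sigmoid is that it maps the result into the interval from 0 to 1, and since sigmoid(0) = 0.5, the class can be read directly from whether the linear part is greater than or less than zero. The solver used here is gradient ascent. I won't go into the details; the resource I recommend most is still Ng's video lectures. The gradient descent presented there is the same thing, except that one moves in the direction of ascent and the other in the direction of descent; the procedure is otherwise identical.
  • One drawback of gradient ascent (more precisely, "batch gradient ascent") is its computational cost: every iteration has to go through all the data, so the cost becomes very large once the training set grows. For that reason stochastic gradient ascent is introduced later; the idea is to correct the weights using a single data point at a time. The final result may deviate somewhat, but the speed improves a lot, and after tuning the deviation is small. Another benefit of stochastic gradient ascent is that it is an online algorithm: it can keep processing new data as it arrives.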
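
The two points above can be made concrete with a small sketch. It is only an illustration: the toy data, the `predict` helper, and the variable names are made up here and are not part of the article's code. It shows that sigmoid(0) is 0.5, that the sign of the linear part decides the class, and that one batch step equals the sum of the per-sample corrections, whereas a stochastic step uses a single row.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, w):
    """Class 1 when the linear part w.x is positive, i.e. sigmoid(w.x) > 0.5."""
    return int(sigmoid(np.dot(x, w)) > 0.5)

# Toy data, made up for illustration: columns are X0 = 1, x1, x2.
X = np.array([[1.0, 2.0, 1.0],
              [1.0, -1.0, -2.0],
              [1.0, 0.5, 3.0]])
y = np.array([1, 0, 1])
w = np.ones(3)
alpha = 0.001

# Batch step: a single update that uses every row of X.
h = sigmoid(X @ w)
w_batch = w + alpha * X.T @ (y - h)

# The same step written as a sum of per-sample terms alpha * (y_i - h_i) * x_i.
w_sum = w.copy()
for i in range(len(y)):
    w_sum = w_sum + alpha * (y[i] - h[i]) * X[i]

# Stochastic step: an update computed from the first row only.
w_stoch = w + alpha * (y[0] - sigmoid(X[0] @ w)) * X[0]
```

The batch step touches all rows before the weights move once; the stochastic step moves the weights after every row, which is what makes it cheap per update and usable online.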

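Functions:

loadDataSet()
Builds the data set. The data lives in a file with three columns, two features and one label, but an extra X0 attribute (always 1.0) is added as each line is read.
sigmoid(inX)
Computes the sigmoid function, 1/(1+exp(-x)). Once the horizontal axis covers a large enough range, it looks very much like a step function.
gradAscend(dataMatIn, classLabels)
Implements gradient ascent using numpy arrays, with the number of iterations fixed at 500. For speed, everything is done as matrix computation. The update is roughly w = w + alpha*(y-h)x[i], summed over all samples i (I have been too lazy to typeset the formula, sorry...).
gradAscendWithDraw(dataMatIn, classLabels)
An enhanced version of the function above that also plots how each weight changes with the iteration count.
stocGradAscent0(dataMatrix, classLabels)
Uses stochastic gradient ascent to speed things up, i.e. the weights are adjusted from one sample at a time (well, admittedly nothing is actually random here; the random part comes in the function further below).
stocGradAscentWithDraw0(dataMatrix, classLabels)
An enhanced version of the function above that also plots how each weight changes with the iteration count.
stocGradAscent1(dataMatrix, classLabels, numIter=150)
This is where things really become random. The main benefit of picking samples at random is that it reduces the periodic fluctuations in the weights. In addition, alpha now changes with the iteration: it keeps shrinking as the iterations proceed but, because of the constant 0.01 term, never reaches 0.
stocGradAscentWithDraw1(dataMatrix, classLabels, numIter=150)
An enhanced version of the function above that also plots how each weight changes with the iteration count.
plotBestFit(wei)
Draws the fitted line from the computed weights so the result can be inspected visually.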
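
The step-size rule and the random sample order described for stocGradAscent1 can be looked at in isolation. The following is a minimal sketch, not the article's code: `step_size` and `epoch_order` are hypothetical helper names, and the sketch assumes the intent is to visit every sample exactly once per pass, in random order.

```python
import numpy as np

def step_size(j, i):
    """Step size at pass j, inner step i (both start at 0); always above 0.01."""
    return 4 / (1.0 + j + i) + 0.01

def epoch_order(m, rng):
    """Random order that visits each of the m samples exactly once."""
    return rng.permutation(m)

rng = np.random.default_rng(0)

# Step sizes for 3 passes over 5 samples: large at first, then shrinking.
alphas = [step_size(j, i) for j in range(3) for i in range(5)]

# One random visiting order for a pass over 5 samples.
order = epoch_order(5, rng)
```

Note that the step size is not strictly decreasing: i restarts at 0 with each new pass, so alpha jumps up a little at the start of a pass and then decays again, while the overall level falls as j grows.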

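Results analysis:
1. Gradient ascent:
Weight trend over iterations: [figure]
Classification result: [figure]
2. Stochastic gradient ascent, version 1
Weight trend over iterations: [figure]
Classification result: [figure]
This is much faster, but the result is not very good. Since each pass is so cheap, though, running 200 passes over the data should make a real difference. The result: [figure]
Indeed, much better.
3. Stochastic gradient ascent, version 2
Weight trend over iterations: [figure]
Classification result: [figure]
So that's it; the result is quite good. The plotting part of the code is a bit rough, sorry about that.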
# coding=utf-8
import numpy as np


def loadDataSet():
    # Each line of testSet.txt holds two features and a label; X0 = 1.0 is prepended.
    dataMat = []
    labelMat = []
    with open('testSet.txt') as fr:
        for line in fr:
            lineArr = line.strip().split()
            dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
            labelMat.append(int(lineArr[2]))
    return dataMat, labelMat


def sigmoid(inX):
    return 1.0 / (1 + np.exp(-inX))


def gradAscend(dataMatIn, classLabels):
    # Batch gradient ascent: every iteration uses the whole data set.
    dataMatrix = np.array(dataMatIn)                  # shape (m, n)
    labelMat = np.array(classLabels).reshape(-1, 1)   # shape (m, 1)
    m, n = dataMatrix.shape
    alpha = 0.001
    maxCycle = 500
    weight = np.ones((n, 1))
    for k in range(maxCycle):
        h = sigmoid(dataMatrix @ weight)
        error = labelMat - h
        weight += alpha * (dataMatrix.T @ error)
    return weight


def gradAscendWithDraw(dataMatIn, classLabels):
    # Same as gradAscend, but also plots each weight against the iteration number.
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    dataMatrix = np.array(dataMatIn)
    labelMat = np.array(classLabels).reshape(-1, 1)
    m, n = dataMatrix.shape
    alpha = 0.001
    maxCycle = 500
    weight = np.ones((n, 1))
    wei1 = []
    wei2 = []
    wei3 = []
    for k in range(maxCycle):
        h = sigmoid(dataMatrix @ weight)
        error = labelMat - h
        weight += alpha * (dataMatrix.T @ error)
        wei1.append(weight[0, 0])
        wei2.append(weight[1, 0])
        wei3.append(weight[2, 0])
    ax.plot(range(maxCycle), wei1)
    bx.plot(range(maxCycle), wei2)
    cx.plot(range(maxCycle), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def stocGradAscent0(dataMatrix, classLabels):
    # One pass over the data in file order, one update per sample.
    # dataMatrix is expected to be a NumPy array.
    m, n = np.shape(dataMatrix)
    alpha = 0.001
    weight = np.ones(n)
    for i in range(m):
        h = sigmoid(np.sum(dataMatrix[i] * weight))
        error = classLabels[i] - h
        weight = weight + alpha * error * dataMatrix[i]
    return weight


def stocGradAscentWithDraw0(dataMatrix, classLabels):
    # Like stocGradAscent0, but repeats the pass numIter times and plots the weights.
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    m, n = np.shape(dataMatrix)
    alpha = 0.001
    weight = np.ones(n)
    wei1 = []
    wei2 = []
    wei3 = []
    numIter = 200
    for j in range(numIter):
        for i in range(m):
            h = sigmoid(np.sum(dataMatrix[i] * weight))
            error = classLabels[i] - h
            weight = weight + alpha * error * dataMatrix[i]
            wei1.append(weight[0])
            wei2.append(weight[1])
            wei3.append(weight[2])
    ax.plot(range(m * numIter), wei1)
    bx.plot(range(m * numIter), wei2)
    cx.plot(range(m * numIter), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def stocGradAscent1(dataMatrix, classLabels, numIter=150):
    m, n = np.shape(dataMatrix)
    weight = np.ones(n)
    for j in range(numIter):
        dataIndex = list(range(m))  # samples not yet used in this pass
        for i in range(m):
            # alpha shrinks as j and i grow but stays above 0.01
            alpha = 4 / (1.0 + j + i) + 0.01
            randIndex = int(np.random.uniform(0, len(dataIndex)))
            # Map the random position back to the actual sample index.
            sample = dataIndex[randIndex]
            h = sigmoid(np.sum(dataMatrix[sample] * weight))
            error = classLabels[sample] - h
            weight = weight + alpha * error * dataMatrix[sample]
            del dataIndex[randIndex]
    return weight


def stocGradAscentWithDraw1(dataMatrix, classLabels, numIter=150):
    # Same as stocGradAscent1, but also plots each weight against the update count.
    import matplotlib.pyplot as plt
    fig = plt.figure()
    ax = fig.add_subplot(311, ylabel='x0')
    bx = fig.add_subplot(312, ylabel='x1')
    cx = fig.add_subplot(313, ylabel='x2')
    m, n = np.shape(dataMatrix)
    weight = np.ones(n)
    wei1 = []
    wei2 = []
    wei3 = []
    for j in range(numIter):
        dataIndex = list(range(m))
        for i in range(m):
            alpha = 4 / (1.0 + j + i) + 0.01
            randIndex = int(np.random.uniform(0, len(dataIndex)))
            sample = dataIndex[randIndex]
            h = sigmoid(np.sum(dataMatrix[sample] * weight))
            error = classLabels[sample] - h
            weight = weight + alpha * error * dataMatrix[sample]
            del dataIndex[randIndex]
            wei1.append(weight[0])
            wei2.append(weight[1])
            wei3.append(weight[2])
    ax.plot(range(len(wei1)), wei1)
    bx.plot(range(len(wei2)), wei2)
    cx.plot(range(len(wei3)), wei3)
    plt.xlabel('iter_num')
    plt.show()
    return weight


def plotBestFit(wei):
    import matplotlib.pyplot as plt
    # Accept weights of shape (n,) or (n, 1).
    weight = np.asarray(wei, dtype=float).ravel()
    dataMat, labelMat = loadDataSet()
    dataArr = np.array(dataMat)
    n = np.shape(dataArr)[0]
    xcord1 = []
    ycord1 = []
    xcord2 = []
    ycord2 = []
    for i in range(n):
        if int(labelMat[i]) == 1:
            xcord1.append(dataArr[i, 1])
            ycord1.append(dataArr[i, 2])
        else:
            xcord2.append(dataArr[i, 1])
            ycord2.append(dataArr[i, 2])
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
    ax.scatter(xcord2, ycord2, s=30, c='green')
    # Decision boundary: w0 + w1*x1 + w2*x2 = 0, solved for x2.
    x = np.arange(-3.0, 3.0, 0.1)
    y = (-weight[0] - weight[1] * x) / weight[2]
    ax.plot(x, y)
    plt.xlabel('X1')
    plt.ylabel('X2')
    plt.show()


def main():
    dataArr, labelMat = loadDataSet()
    #w = gradAscendWithDraw(dataArr, labelMat)
    w = stocGradAscentWithDraw0(np.array(dataArr), labelMat)
    plotBestFit(w)


if __name__ == '__main__':
    main()
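
The listing reads its data from testSet.txt, which is not reproduced here, and it judges the fit only by eye from the plot. As a rough sketch of how one might try things out without that file and put a number on the result, the snippet below builds a synthetic data set in the same layout (X0 = 1.0 followed by two features) and measures training accuracy for a given weight vector. Both `make_toy_data` and `accuracy` are hypothetical helpers introduced for illustration, and the synthetic data is an assumption, not the article's data.

```python
import numpy as np

def make_toy_data(m=100, seed=0):
    """Hypothetical stand-in for testSet.txt: two features and a 0/1 label."""
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(m, 2))
    # Label is 1 when x1 + x2 > 0, so the classes are linearly separable.
    labels = (features[:, 0] + features[:, 1] > 0).astype(int)
    data = np.column_stack([np.ones(m), features])  # prepend X0 = 1.0
    return data, labels

def accuracy(data, labels, weight):
    """Fraction of samples whose predicted class matches the label."""
    w = np.asarray(weight, dtype=float).ravel()  # accepts shape (n,) or (n, 1)
    predicted = (data @ w > 0).astype(int)       # sigmoid(z) > 0.5 iff z > 0
    return float(np.mean(predicted == labels))

data, labels = make_toy_data()
# Weights that match the labelling rule; trained weights can be passed the same way.
print(accuracy(data, labels, [0.0, 1.0, 1.0]))
```

The weight vector returned by any of the training functions in the listing can be passed to `accuracy` in place of the hand-picked one, since it accepts both the (n,) and (n, 1) shapes.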

     

    Machine learning notes index



