I had been using TensorFlow on previous projects, and since I recently needed to work on NLP problems I was still fairly new to PyTorch. So I spent some time learning how PyTorch is used for simple natural language processing tasks, and this post is a record of that.
I. PyTorch Basics
First, import the PyTorch packages:
```python
import torch
import torch.autograd as autograd  # autograd provides automatic differentiation for all Tensor operations
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
```
1) Tensors
a) Creating Tensors
```python
# create a tensor from data
x = torch.Tensor([[1, 2, 3], [4, 5, 6]])
# random tensor of size 2x3x4
x = torch.randn((2, 3, 4))
```
b) Tensor operations
```python
x = torch.Tensor([[1, 2], [3, 4]])
y = torch.Tensor([[5, 6], [7, 8]])
z = x + y
```
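Addition here is elementwise. A few other common operations are worth knowing at this point; the sketch below contrasts the elementwise product with the matrix product, and shows concatenation with `torch.cat`:

```python
import torch

x = torch.Tensor([[1, 2], [3, 4]])
y = torch.Tensor([[5, 6], [7, 8]])
z_mul = x * y            # elementwise product
z_mm = torch.mm(x, y)    # matrix product
# concatenate along rows (dim 0) and along columns (dim 1)
rows = torch.cat([x, y], 0)  # shape (4, 2)
cols = torch.cat([x, y], 1)  # shape (2, 4)
print(z_mm)
```

Note that `*` is elementwise, so `x * y` is not a matrix multiplication; use `torch.mm` (or `matmul`) for that.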
c) Reshape Tensors
```python
x = torch.randn(2, 3, 4)
# flatten to one dimension
x = x.view(-1)
# reshape to 4x6
x = x.view(4, 6)
```
2) Computation Graphs and Automatic Differentiation
a) Variables
```python
# wrap a Tensor in a Variable
x = autograd.Variable(torch.Tensor([1, 2, 3]), requires_grad=True)
# recover the underlying Tensor from a Variable
y = x.data
```
b) Backpropagation
```python
x = autograd.Variable(torch.Tensor([1, 2]), requires_grad=True)
y = autograd.Variable(torch.Tensor([3, 4]), requires_grad=True)
z = x + y
# sum to a scalar
s = z.sum()
# backpropagate gradients
s.backward()
print(x.grad)
```
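Since `s` sums every entry of `z = x + y`, the derivative of `s` with respect to each entry of `x` is exactly 1, so `x.grad` should come out as a vector of ones. A quick check of that claim:

```python
import torch
import torch.autograd as autograd

x = autograd.Variable(torch.Tensor([1, 2]), requires_grad=True)
y = autograd.Variable(torch.Tensor([3, 4]), requires_grad=True)
s = (x + y).sum()
s.backward()
# ds/dx_i = 1 for every i, so the gradient is a vector of ones
print(x.grad.data)
```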
c) Linear maps
```python
linear = nn.Linear(3, 5)  # linear map from 3 dimensions to 5
x = autograd.Variable(torch.randn(4, 3))
# output has shape (4, 5)
y = linear(x)
```
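Under the hood, `nn.Linear(3, 5)` stores a weight matrix `W` of shape (5, 3) and a bias `b` of shape (5,), and computes `x @ W.t() + b`. A small sketch verifying the layer against that manual computation:

```python
import torch
import torch.autograd as autograd
import torch.nn as nn

linear = nn.Linear(3, 5)
x = autograd.Variable(torch.randn(4, 3))
y = linear(x)
# nn.Linear computes x.matmul(W.t()) + b with W = linear.weight, b = linear.bias
manual = x.matmul(linear.weight.t()) + linear.bias
print(y.size())
```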
d) Non-linearities (activation functions)
```python
x = autograd.Variable(torch.randn(5))
print(x)
# relu clamps negative entries to zero
x_relu = F.relu(x)
# softmax maps the vector to a probability distribution
x_soft = F.softmax(x)
print(x_soft)
print(x_soft.sum())
# log_softmax gives the logarithm of the softmax probabilities
x_logsoft = F.log_softmax(x)
print(x_logsoft)
```
output:

```
Variable containing:
-0.9347
-0.9882
 1.3801
-0.1173
 0.9317
[torch.FloatTensor of size 5]

Variable containing:
 0.0481
 0.0456
 0.4867
 0.1089
 0.3108
[torch.FloatTensor of size 5]

Variable containing:
 1
[torch.FloatTensor of size 1]

Variable containing:
-3.0350
-3.0885
-0.7201
-2.2176
-1.1686
[torch.FloatTensor of size 5]
```
II. Building Networks with PyTorch
1) Word Embeddings
Word embeddings are created with nn.Embedding(m, n), where m is the vocabulary size and n is the embedding dimension.
```python
word_to_idx = {'hello': 0, 'world': 1}
embeds = nn.Embedding(2, 5)  # two words, each embedded in 5 dimensions
hello_idx = torch.LongTensor([word_to_idx['hello']])
hello_idx = autograd.Variable(hello_idx)
hello_embed = embeds(hello_idx)
print(hello_embed)
```
output:
```
Variable containing:
-0.6982  0.3909 -1.0760 -1.6215  0.4429
[torch.FloatTensor of size 1x5]
```
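`nn.Embedding` also accepts a whole batch of indices at once, returning one embedding row per index; a minimal sketch:

```python
import torch
import torch.autograd as autograd
import torch.nn as nn

word_to_idx = {'hello': 0, 'world': 1}
embeds = nn.Embedding(2, 5)
# look up both words in one call: the input is a LongTensor of indices
idxs = autograd.Variable(torch.LongTensor([word_to_idx['hello'], word_to_idx['world']]))
vectors = embeds(idxs)  # shape (2, 5): one embedding row per word
print(vectors.size())
```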
2) N-Gram Language Model
First, a brief introduction to the N-gram language model: given a word sequence w_1, ..., w_m, it computes P(w_i | w_{i-1}, ..., w_{i-n+1}), where w_i is the i-th word of the sequence.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.autograd as autograd
import torch.optim as optim
from six.moves import xrange
```
Tokenize the text:
```python
context_size = 2
embed_dim = 10
text_sequence = """When forty winters shall besiege thy brow,
And dig deep trenches in thy beauty's field,
Thy youth's proud livery so gazed on now,
Will be a totter'd weed of small worth held:
Then being asked, where all thy beauty lies,
Where all the treasure of thy lusty days;
To say, within thine own deep sunken eyes,
Were an all-eating shame, and thriftless praise.
How much more praise deserv'd thy beauty's use,
If thou couldst answer 'This fair child of mine
Shall sum my count, and make my old excuse,'
Proving his beauty by succession thine!
This were to be new made when thou art old,
And see thy blood warm when thou feel'st it cold.""".split()
# tokenize into trigrams: ([word_i, word_i+1], word_i+2)
trigrams = [([text_sequence[i], text_sequence[i + 1]], text_sequence[i + 2])
            for i in xrange(len(text_sequence) - 2)]
trigrams[:10]
```
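The post stops at building the trigrams. For completeness, here is a minimal sketch of the model and training loop that would typically follow; the `NGramLanguageModeler` class, hidden size of 128, learning rate, and epoch count below are illustrative assumptions, not from the original, and a short toy sentence stands in for the sonnet so the block is self-contained:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.autograd as autograd
import torch.optim as optim

# toy text in place of the sonnet above (assumption: any tokenized text works)
text_sequence = "When forty winters shall besiege thy brow".split()
trigrams = [([text_sequence[i], text_sequence[i + 1]], text_sequence[i + 2])
            for i in range(len(text_sequence) - 2)]
vocab = set(text_sequence)
word_to_idx = {w: i for i, w in enumerate(vocab)}

context_size = 2
embed_dim = 10

class NGramLanguageModeler(nn.Module):
    """Predict the next word from the context_size previous words."""
    def __init__(self, vocab_size, embed_dim, context_size):
        super(NGramLanguageModeler, self).__init__()
        self.embeddings = nn.Embedding(vocab_size, embed_dim)
        self.linear1 = nn.Linear(context_size * embed_dim, 128)
        self.linear2 = nn.Linear(128, vocab_size)

    def forward(self, inputs):
        # concatenate the context embeddings into one row vector
        embeds = self.embeddings(inputs).view(1, -1)
        out = F.relu(self.linear1(embeds))
        log_probs = F.log_softmax(self.linear2(out), dim=1)
        return log_probs

model = NGramLanguageModeler(len(vocab), embed_dim, context_size)
loss_fn = nn.NLLLoss()  # pairs with log_softmax outputs
optimizer = optim.SGD(model.parameters(), lr=0.001)

losses = []
for epoch in range(10):
    total_loss = 0.0
    for context, target in trigrams:
        context_var = autograd.Variable(torch.LongTensor([word_to_idx[w] for w in context]))
        model.zero_grad()
        log_probs = model(context_var)
        target_var = autograd.Variable(torch.LongTensor([word_to_idx[target]]))
        loss = loss_fn(log_probs, target_var)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    losses.append(total_loss)
print(losses[0], losses[-1])
```

The per-epoch loss should shrink as the model memorizes the trigrams of this tiny corpus.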