nn package

We've redesigned the nn package so that it is fully integrated with autograd. Let's review the changes.

Replace containers with autograd

You no longer have to use containers like ConcatTable, or modules like CAddTable, or use and debug nngraph. We will seamlessly use autograd to define our neural networks. For example,

  • output = nn.CAddTable():forward({input1, input2}) simply becomes output = input1 + input2
  • output = nn.MulConstant(0.5):forward(input) simply becomes output = input * 0.5
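
A minimal sketch of what the two rewrites above look like in practice (the tensor shapes are made up purely for illustration):

import torch
from torch.autograd import Variable

input1 = Variable(torch.randn(2, 3))
input2 = Variable(torch.randn(2, 3))

# nn.CAddTable():forward({input1, input2}) becomes a plain addition
output = input1 + input2

# nn.MulConstant(0.5):forward(input) becomes a plain scalar multiplication
output = output * 0.5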

State is no longer held in the module, but in the network graph:

Using recurrent networks should be simpler because of this. If you want to create a recurrent network, simply use the same Linear layer multiple times, without having to think about sharing weights.

[Figure: torch-nn vs pytorch-nn]

Simplified debugging:

Debugging is intuitive using Python's pdb debugger, and the debugger and stack traces stop exactly where an error occurred. What you see is what you get.
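
For instance (a tiny sketch, not part of the original text), you can drop a pdb.set_trace() anywhere and inspect intermediate Variables interactively when execution stops there:

import pdb
import torch
from torch.autograd import Variable

x = Variable(torch.randn(4, 4))
pdb.set_trace()   # execution pauses here; inspect x, step through the next lines, etc.
y = x * 2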

Example 1: ConvNet

Let's see how to create a small ConvNet.

All of your networks are derived from the base class nn.Module:

  • In the constructor, you declare all the layers you want to use.
  • In the forward function, you define how your model is going to be run, from input to output.

import torch
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

class MNISTConvNet(nn.Module):

    def __init__(self):
        # this is the place where you instantiate all your modules
        # you can later access them using the same names you've given them in
        # here
        super(MNISTConvNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, 5)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    # it's the forward function that defines the network structure
    # we're accepting only a single input in here, but if you want,
    # feel free to use more
    def forward(self, input):
        x = self.pool1(F.relu(self.conv1(input)))
        x = self.pool2(F.relu(self.conv2(x)))

        # in your model definition you can go full crazy and use arbitrary
        # python code to define your model structure
        # all these are perfectly legal, and will be handled correctly
        # by autograd:
        # if x.gt(0) > x.numel() / 2:
        #      ...
        #
        # you can even do a loop and reuse the same module inside it
        # modules no longer hold ephemeral state, so you can use them
        # multiple times during your forward pass
        # while x.norm(2) < 10:
        #    x = self.conv1(x)

        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x

Let's use the defined ConvNet now. You create an instance of the class first.

net = MNISTConvNet()
print(net)

Output:

MNISTConvNet (
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (pool1): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (pool2): MaxPool2d (size=(2, 2), stride=(2, 2), dilation=(1, 1))
  (fc1): Linear (320 -> 50)
  (fc2): Linear (50 -> 10)
)

Note:

torch.nn only supports mini-batches. The entire torch.nn package only supports inputs that are a mini-batch of samples, and not a single sample.

For example, nn.Conv2d takes a 4D Tensor of nSamples x nChannels x Height x Width.

If you have a single sample, just use input.unsqueeze(0) to add a fake batch dimension, as sketched below.
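
A small sketch of adding that fake batch dimension, reusing the imports from the ConvNet definition above (the shapes match the MNIST-sized example that follows):

sample = Variable(torch.randn(1, 28, 28))   # a single sample: nChannels x Height x Width
batch = sample.unsqueeze(0)                 # now nSamples x nChannels x Height x Width
print(batch.size())                         # torch.Size([1, 1, 28, 28])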

Create a mini-batch containing a single sample of random data and send the sample through the ConvNet.

input = Variable(torch.randn(1, 1, 28, 28))
out = net(input)
print(out.size())

Output:

torch.Size([1, 10])

Define a dummy target label and compute the error using a loss function.

target = Variable(torch.LongTensor([3]))
loss_fn = nn.CrossEntropyLoss()  # LogSoftmax + ClassNLL Loss
err = loss_fn(out, target)
err.backward()

print(err)

Output:

Variable containing:
 2.3186
[torch.FloatTensor of size 1]

The output of the ConvNet, out, is a Variable. We compute the loss using it, which gives err, also a Variable. Calling .backward on err will therefore propagate gradients all the way through the ConvNet to its weights.
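
One detail worth keeping in mind (not covered above): gradients accumulate in the .grad buffers across successive calls to .backward, so a typical training step clears them first with net.zero_grad(). A minimal sketch, reusing net, input, target and loss_fn from this example:

net.zero_grad()                      # reset all accumulated gradients to zero
err = loss_fn(net(input), target)
err.backward()                       # fresh gradients now sit in each parameter's .grad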

Let's access the weights and gradients of individual layers:

print(net.conv1.weight.grad.size())

Output:

torch.Size([10, 1, 5, 5])

print(net.conv1.weight.data.norm())  # norm of the weight
print(net.conv1.weight.grad.data.norm())  # norm of the gradients

Output:

1.8083854303685114 
0.1320870710384528
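
To look at every layer at once instead of one attribute at a time, you can iterate over net.parameters(); a small sketch (again reusing net from above):

for p in net.parameters():
    # each p is a Parameter; p.grad was filled in by the call to .backward() above
    print(p.size(), p.grad.data.norm())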

Forward and backward function hooks

We have inspected the weights and the gradients. But how about inspecting or modifying the output and grad_output of a layer?

We introduce hooks for this purpose.

You can register a function on a Module or a Variable. The hook can be a forward hook or a backward hook. The forward hook is executed when a forward call is made. The backward hook is executed in the backward phase. Let's look at an example.

We register a forward hook on conv2 and print some information:

def printnorm(self, input, output):
    # input is a tuple of packed inputs
    # output is a Variable. output.data is the Tensor we are interested
    print('Inside ' + self.__class__.__name__ + ' forward')
    print('')
    print('input: ', type(input))
    print('input[0]: ', type(input[0]))
    print('output: ', type(output))
    print('')
    print('input size:', input[0].size())
    print('output size:', output.data.size())
    print('output norm:', output.data.norm())

net.conv2.register_forward_hook(printnorm)

out = net(input)

Output:

Inside Conv2d forward

input:  <class 'tuple'>
input[0]:  <class 'torch.autograd.variable.Variable'>
output:  <class 'torch.autograd.variable.Variable'>

input size: torch.Size([1, 10, 12, 12])
output size: torch.Size([1, 20, 8, 8])
output norm: 13.36310708097026

We register a backward hook on conv2 and print some information:

def printgradnorm(self, grad_input, grad_output):
    print('Inside ' + self.__class__.__name__ + ' backward')
    print('Inside class:' + self.__class__.__name__)
    print('')
    print('grad_input: ', type(grad_input))
    print('grad_input[0]: ', type(grad_input[0]))
    print('grad_output: ', type(grad_output))
    print('grad_output[0]: ', type(grad_output[0]))
    print('')
    print('grad_input size:', grad_input[0].size())
    print('grad_output size:', grad_output[0].size())
    print('grad_input norm:', grad_input[0].data.norm())

net.conv2.register_backward_hook(printgradnorm)

out = net(input)
err = loss_fn(out, target)
err.backward()

Output:

Inside Conv2d forward

input:  <class 'tuple'>
input[0]:  <class 'torch.autograd.variable.Variable'>
output:  <class 'torch.autograd.variable.Variable'>

input size: torch.Size([1, 10, 12, 12])
output size: torch.Size([1, 20, 8, 8])
output norm: 13.36310708097026
Inside Conv2d backward
Inside class:Conv2d

grad_input:  <class 'tuple'>
grad_input[0]:  <class 'torch.autograd.variable.Variable'>
grad_output:  <class 'tuple'>
grad_output[0]:  <class 'torch.autograd.variable.Variable'>

grad_input size: torch.Size([1, 10, 12, 12])
grad_output size: torch.Size([1, 20, 8, 8])
grad_input norm: 0.024339322124524825
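
Both register_forward_hook and register_backward_hook return a handle, and calling .remove() on it detaches that hook again once you are done inspecting. A small sketch:

handle = net.conv2.register_forward_hook(printnorm)
out = net(input)   # this registration of printnorm fires during the forward pass
handle.remove()    # detach it again; hooks registered earlier without a handle stay attached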

A full and working MNIST example is located here: https://github.com/pytorch/examples/tree/master/mnist

Example 2: Recurrent Net

Next, let's look at building recurrent nets with PyTorch.

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.

class RNN(nn.Module):

    # you can also accept arguments in your model constructor
    def __init__(self, data_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.hidden_size = hidden_size
        input_size = data_size + hidden_size

        self.i2h = nn.Linear(input_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)

    def forward(self, data, last_hidden):
        input = torch.cat((data, last_hidden), 1)
        hidden = self.i2h(input)
        output = self.h2o(hidden)
        return hidden, output

rnn = RNN(50, 20, 10)

A more complete language modeling example using LSTMs and Penn Tree-bank is located in the pytorch/examples repository.

PyTorch by default has seamless CuDNN integration for ConvNets and Recurrent Nets.
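
For example (a minimal sketch, assuming a CUDA-capable machine), moving a module and its inputs to the GPU with .cuda() is all that is needed for the CuDNN-backed kernels to be used where they are available:

conv = nn.Conv2d(1, 10, 5).cuda()                  # parameters now live on the GPU
x = Variable(torch.randn(1, 1, 28, 28).cuda())     # input tensor on the GPU as well
y = conv(x)                                        # dispatched to CuDNN when available

The snippet below stays on the CPU and reuses the rnn defined above in a small training loop over several time steps: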

loss_fn = nn.MSELoss()

batch_size = 10
TIMESTEPS = 5

# Create some fake data
batch = Variable(torch.randn(batch_size, 50))
hidden = Variable(torch.zeros(batch_size, 20))
target = Variable(torch.zeros(batch_size, 10))

loss = 0
for t in range(TIMESTEPS):
    # yes! you can reuse the same network several times,
    # sum up the losses, and call backward!
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()
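
To actually update the weights after loss.backward(), you would typically pair this loop with an optimizer from torch.optim. A minimal sketch (the learning rate is arbitrary):

import torch.optim as optim

optimizer = optim.SGD(rnn.parameters(), lr=0.01)

optimizer.zero_grad()                               # clear previously accumulated gradients
hidden = Variable(torch.zeros(batch_size, 20))      # start again from a fresh hidden state
loss = 0
for t in range(TIMESTEPS):
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()
optimizer.step()                                    # apply one SGD update to the weights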

Total running time of the script: (0 minutes 0.035 seconds)

Download Python source code: nn_tutorial.py
Download Jupyter notebook: nn_tutorial.ipynb