{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Transfer Learning\n", "\n", "In this notebook, you'll learn how to use pre-trained networks to solved challenging problems in computer vision. Specifically, you'll use networks trained on [ImageNet](http://www.image-net.org/) [available from torchvision](http://pytorch.org/docs/0.3.0/torchvision/models.html). \n", "\n", "ImageNet is a massive dataset with over 1 million labeled images in 1000 categories. It's used to train deep neural networks using an architecture called convolutional layers. I'm not going to get into the details of convolutional networks here, but if you want to learn more about them, please [watch this](https://www.youtube.com/watch?v=2-Ol7ZB0MmU).\n", "\n", "Once trained, these models work astonishingly well as feature detectors for images they weren't trained on. Using a pre-trained network on images not in the training set is called transfer learning. Here we'll use transfer learning to train a network that can classify our cat and dog photos with near perfect accuracy.\n", "\n", "With `torchvision.models` you can download these pre-trained networks and use them in your applications. We'll include `models` in our imports now." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%config InlineBackend.figure_format = 'retina'\n", "\n", "import matplotlib.pyplot as plt\n", "\n", "import torch\n", "from torch import nn\n", "from torch import optim\n", "import torch.nn.functional as F\n", "from torchvision import datasets, transforms, models\n", "import os" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most of the pretrained models require the input to be 224x224 images. Also, we'll need to match the normalization used when the models were trained. Each color channel was normalized separately, the means are `[0.485, 0.456, 0.406]` and the standard deviations are `[0.229, 0.224, 0.225]`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "C:\\Users\\Daniel\\Projects\\git-repos\\udacity\\python\\Deep Learning\\Deep Learning with PyTorch\n" ] } ], "source": [ "data_dir = '\\\\Cat_Dog_data'\n", "\n", "# TODO: Define transforms for the training data and testing data\n", "train_transforms = transforms.Compose([transforms.RandomRotation(30),\n", " transforms.RandomResizedCrop(224),\n", " transforms.RandomHorizontalFlip(),\n", " transforms.ToTensor(),\n", " transforms.Normalize([0.485, 0.456, 0.406],\n", " [0.229, 0.224, 0.225])])\n", "\n", "test_transforms = transforms.Compose([transforms.Resize(255),\n", " transforms.CenterCrop(224),\n", " transforms.ToTensor(),\n", " transforms.Normalize([0.485, 0.456, 0.406],\n", " [0.229, 0.224, 0.225])])\n", "train_path = os.path.join('Cat_Dog_data', 'train')\n", "test_path = os.path.join('Cat_Dog_data', 'test')\n", "print(os.getcwd())\n", "# Pass transforms in here, then run the next cell to see how the transforms look\n", "train_data = datasets.ImageFolder(train_path, transform=train_transforms)\n", "test_data = datasets.ImageFolder(test_path, transform=test_transforms)\n", "\n", "trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)\n", "testloader = torch.utils.data.DataLoader(test_data, batch_size=64)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can load in a model such as [DenseNet](http://pytorch.org/docs/0.3.0/torchvision/models.html#id5). Let's print out the model architecture so we can see what's going on." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "DenseNet(\n", " (features): Sequential(\n", " (conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)\n", " (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu0): ReLU(inplace)\n", " (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)\n", " (denseblock1): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition1): _Transition(\n", " (norm): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock2): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition2): _Transition(\n", " (norm): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock3): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer13): _DenseLayer(\n", " (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer14): _DenseLayer(\n", " (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer15): _DenseLayer(\n", " (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer16): _DenseLayer(\n", " (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer17): _DenseLayer(\n", " (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer18): _DenseLayer(\n", " (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer19): _DenseLayer(\n", " (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer20): _DenseLayer(\n", " (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer21): _DenseLayer(\n", " (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer22): _DenseLayer(\n", " (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer23): _DenseLayer(\n", " (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer24): _DenseLayer(\n", " (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition3): _Transition(\n", " (norm): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock4): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer13): _DenseLayer(\n", " (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer14): _DenseLayer(\n", " (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer15): _DenseLayer(\n", " (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer16): _DenseLayer(\n", " (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " )\n", " (classifier): Linear(in_features=1024, out_features=1000, bias=True)\n", ")" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = models.densenet121(pretrained=True)\n", "model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This model is built out of two main parts, the features and the classifier. The features part is a stack of convolutional layers and overall works as a feature detector that can be fed into a classifier. The classifier part is a single fully-connected layer `(classifier): Linear(in_features=1024, out_features=1000)`. This layer was trained on the ImageNet dataset, so it won't work for our specific problem. That means we need to replace the classifier, but the features will work perfectly on their own. In general, I think about pre-trained networks as amazingly good feature detectors that can be used as the input for simple feed-forward classifiers." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Freeze parameters so we don't backprop through them\n", "for param in model.parameters():\n", " param.requires_grad = False\n", "\n", "from collections import OrderedDict\n", "classifier = nn.Sequential(OrderedDict([\n", " ('fc1', nn.Linear(1024, 500)),\n", " ('relu', nn.ReLU()),\n", " ('fc2', nn.Linear(500, 2)),\n", " ('output', nn.LogSoftmax(dim=1))\n", " ]))\n", " \n", "model.classifier = classifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With our model built, we need to train the classifier. However, now we're using a **really deep** neural network. If you try to train this on a CPU like normal, it will take a long, long time. Instead, we're going to use the GPU to do the calculations. The linear algebra computations are done in parallel on the GPU leading to 100x increased training speeds. It's also possible to train on multiple GPUs, further decreasing training time.\n", "\n", "PyTorch, along with pretty much every other deep learning framework, uses [CUDA](https://developer.nvidia.com/cuda-zone) to efficiently compute the forward and backwards passes on the GPU. In PyTorch, you move your model parameters and other tensors to the GPU memory using `model.to('cuda')`. You can move them back from the GPU with `model.to('cpu')` which you'll commonly do when you need to operate on the network output outside of PyTorch. As a demonstration of the increased speed, I'll compare how long it takes to perform a forward and backward pass with and without a GPU." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "import time" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.cuda.is_available()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "torch.backends.cudnn.enabled" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Device = cpu; Time per batch: 2.370 seconds\n", "Device = cuda; Time per batch: 0.019 seconds\n" ] } ], "source": [ "for device in ['cpu', 'cuda']:\n", "\n", " criterion = nn.NLLLoss()\n", " # Only train the classifier parameters, feature parameters are frozen\n", " optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)\n", "\n", " model.to(device)\n", "\n", " for ii, (inputs, labels) in enumerate(trainloader):\n", "\n", " # Move input and label tensors to the GPU\n", " inputs, labels = inputs.to(device), labels.to(device)\n", "\n", " start = time.time()\n", "\n", " outputs = model.forward(inputs)\n", " loss = criterion(outputs, labels)\n", " loss.backward()\n", " optimizer.step()\n", "\n", " if ii==3:\n", " break\n", " \n", " print(f\"Device = {device}; Time per batch: {(time.time() - start)/3:.3f} seconds\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can write device agnostic code which will automatically use CUDA if it's enabled like so:\n", "```python\n", "# at beginning of the script\n", "device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", "\n", "...\n", "\n", "# then whenever you get a new Tensor or Module\n", "# this won't copy if they are already on the desired device\n", "input = data.to(device)\n", "model = MyModule(...).to(device)\n", "```\n", "\n", "From here, I'll let you finish training the model. The process is the same as before except now your model is much more powerful. You should get better than 95% accuracy easily.\n", "\n", ">**Exercise:** Train a pretrained models to classify the cat and dog images. Continue with the DenseNet model, or try ResNet, it's also a good model to try out first. Make sure you are only training the classifier and the parameters for the features part are frozen." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "DenseNet(\n", " (features): Sequential(\n", " (conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)\n", " (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu0): ReLU(inplace)\n", " (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)\n", " (denseblock1): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition1): _Transition(\n", " (norm): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock2): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition2): _Transition(\n", " (norm): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock3): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer13): _DenseLayer(\n", " (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer14): _DenseLayer(\n", " (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer15): _DenseLayer(\n", " (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer16): _DenseLayer(\n", " (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer17): _DenseLayer(\n", " (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer18): _DenseLayer(\n", " (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer19): _DenseLayer(\n", " (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer20): _DenseLayer(\n", " (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer21): _DenseLayer(\n", " (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer22): _DenseLayer(\n", " (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer23): _DenseLayer(\n", " (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer24): _DenseLayer(\n", " (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (transition3): _Transition(\n", " (norm): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu): ReLU(inplace)\n", " (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n", " )\n", " (denseblock4): _DenseBlock(\n", " (denselayer1): _DenseLayer(\n", " (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer2): _DenseLayer(\n", " (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer3): _DenseLayer(\n", " (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer4): _DenseLayer(\n", " (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer5): _DenseLayer(\n", " (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer6): _DenseLayer(\n", " (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer7): _DenseLayer(\n", " (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer8): _DenseLayer(\n", " (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer9): _DenseLayer(\n", " (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer10): _DenseLayer(\n", " (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer11): _DenseLayer(\n", " (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer12): _DenseLayer(\n", " (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer13): _DenseLayer(\n", " (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer14): _DenseLayer(\n", " (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer15): _DenseLayer(\n", " (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " (denselayer16): _DenseLayer(\n", " (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu1): ReLU(inplace)\n", " (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n", " (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " (relu2): ReLU(inplace)\n", " (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n", " )\n", " )\n", " (norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", " )\n", " (classifier): Sequential(\n", " (0): Linear(in_features=1024, out_features=256, bias=True)\n", " (1): ReLU()\n", " (2): Dropout(p=0.2)\n", " (3): Linear(in_features=256, out_features=2, bias=True)\n", " (4): LogSoftmax()\n", " )\n", ")" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "## TODO: Use a pretrained model to classify the cat and dog images\n", "\n", "# Use GPU if it's available\n", "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "\n", "model = models.densenet121(pretrained=True)\n", "\n", "# Freeze Parameters so we don't back propagate through them\n", "for param in model.parameters():\n", " param.requires_grad = False\n", " \n", "model.classifier = nn.Sequential(nn.Linear(1024, 256),\n", " nn.ReLU(),\n", " nn.Dropout(0.2),\n", " nn.Linear(256, 2),\n", " nn.LogSoftmax(dim=1))\n", "\n", "criterion = nn.NLLLoss()\n", "\n", "# Only train the classifier parameters, feature parameters are frozen\n", "optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)\n", "\n", "model.to(device)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 1 out of 1\n", "Training Loss: 0.745\n", "Test Loss: 0.237\n", "Test Accuracy: 0.968\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.335\n", "Test Loss: 0.124\n", "Test Accuracy: 0.969\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.197\n", "Test Loss: 0.073\n", "Test Accuracy: 0.978\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.248\n", "Test Loss: 0.066\n", "Test Accuracy: 0.978\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.203\n", "Test Loss: 0.067\n", "Test Accuracy: 0.976\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.223\n", "Test Loss: 0.053\n", "Test Accuracy: 0.980\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.174\n", "Test Loss: 0.051\n", "Test Accuracy: 0.982\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.111\n", "Test Loss: 0.051\n", "Test Accuracy: 0.980\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.158\n", "Test Loss: 0.046\n", "Test Accuracy: 0.982\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.136\n", "Test Loss: 0.046\n", "Test Accuracy: 0.985\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.174\n", "Test Loss: 0.057\n", "Test Accuracy: 0.977\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.170\n", "Test Loss: 0.050\n", "Test Accuracy: 0.984\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.179\n", "Test Loss: 0.046\n", "Test Accuracy: 0.984\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.207\n", "Test Loss: 0.051\n", "Test Accuracy: 0.982\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.204\n", "Test Loss: 0.111\n", "Test Accuracy: 0.961\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.233\n", "Test Loss: 0.164\n", "Test Accuracy: 0.935\n", "\n", "Epoch: 1 out of 1\n", "Training Loss: 0.367\n", "Test Loss: 0.191\n", "Test Accuracy: 0.919\n", "\n" ] }, { "ename": "KeyboardInterrupt", "evalue": "", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 35\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 36\u001b[0m \u001b[1;32mwith\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mno_grad\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 37\u001b[1;33m \u001b[1;32mfor\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabels\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mtestloader\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 38\u001b[0m \u001b[1;31m# Move input and label tensors to the device\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 39\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabels\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0minputs\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabels\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\utils\\data\\dataloader.py\u001b[0m in \u001b[0;36m__next__\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m 558\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mnum_workers\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;31m# same-process loading\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 559\u001b[0m \u001b[0mindices\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msample_iter\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# may raise StopIteration\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 560\u001b[1;33m \u001b[0mbatch\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcollate_fn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdataset\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mi\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mindices\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 561\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 562\u001b[0m \u001b[0mbatch\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0m_utils\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory_batch\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mbatch\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\utils\\data\\dataloader.py\u001b[0m in \u001b[0;36m\u001b[1;34m(.0)\u001b[0m\n\u001b[0;32m 558\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mnum_workers\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m \u001b[1;31m# same-process loading\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 559\u001b[0m \u001b[0mindices\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msample_iter\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# may raise StopIteration\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 560\u001b[1;33m \u001b[0mbatch\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcollate_fn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdataset\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mi\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mi\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mindices\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 561\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 562\u001b[0m \u001b[0mbatch\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0m_utils\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpin_memory_batch\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mbatch\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torchvision\\datasets\\folder.py\u001b[0m in \u001b[0;36m__getitem__\u001b[1;34m(self, index)\u001b[0m\n\u001b[0;32m 136\u001b[0m \"\"\"\n\u001b[0;32m 137\u001b[0m \u001b[0mpath\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mtarget\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msamples\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mindex\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 138\u001b[1;33m \u001b[0msample\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mloader\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 139\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtransform\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 140\u001b[0m \u001b[0msample\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mtransform\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0msample\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torchvision\\datasets\\folder.py\u001b[0m in \u001b[0;36mdefault_loader\u001b[1;34m(path)\u001b[0m\n\u001b[0;32m 172\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0maccimage_loader\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 173\u001b[0m \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 174\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mpil_loader\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 175\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 176\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torchvision\\datasets\\folder.py\u001b[0m in \u001b[0;36mpil_loader\u001b[1;34m(path)\u001b[0m\n\u001b[0;32m 155\u001b[0m \u001b[1;32mwith\u001b[0m \u001b[0mopen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mpath\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'rb'\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mf\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 156\u001b[0m \u001b[0mimg\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mImage\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mopen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mf\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 157\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mimg\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mconvert\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'RGB'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 158\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 159\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\PIL\\Image.py\u001b[0m in \u001b[0;36mconvert\u001b[1;34m(self, mode, matrix, dither, palette, colors)\u001b[0m\n\u001b[0;32m 913\u001b[0m \"\"\"\n\u001b[0;32m 914\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 915\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mload\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 916\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 917\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mmode\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mmode\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;34m\"P\"\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\PIL\\ImageFile.py\u001b[0m in \u001b[0;36mload\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m 239\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 240\u001b[0m \u001b[0mb\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mb\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0ms\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 241\u001b[1;33m \u001b[0mn\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0merr_code\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mdecoder\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdecode\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mb\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 242\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[1;33m<\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 243\u001b[0m \u001b[1;32mbreak\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mKeyboardInterrupt\u001b[0m: " ] } ], "source": [ "epochs = 1\n", "steps = 0\n", "runningLoss = 0\n", "printEvery = 5\n", "\n", "for epoch in range(epochs):\n", " for inputs, labels in trainloader:\n", " steps += 1\n", " \n", " # Move input and label tensors to the device\n", " inputs, labels = inputs.to(device), labels.to(device)\n", " \n", " # Zero the gradients\n", " optimizer.zero_grad()\n", " \n", " # Make forward pass\n", " logps = model.forward(inputs)\n", " \n", " # Calculate loss\n", " loss = criterion(logps, labels)\n", " \n", " # Backpropagate\n", " loss.backward()\n", " \n", " # Update the weights\n", " optimizer.step()\n", " \n", " runningLoss += loss.item()\n", " \n", " # Do the validation pass\n", " if steps % printEvery == 0:\n", " testLoss = 0\n", " accuracy = 0\n", " model.eval()\n", " \n", " with torch.no_grad():\n", " for inputs, labels in testloader:\n", " # Move input and label tensors to the device\n", " inputs, labels = inputs.to(device), labels.to(device)\n", " \n", " # Get the output\n", " logps = model.forward(inputs)\n", " \n", " # Get the loss\n", " batchLoss = criterion(logps, labels)\n", " testLoss += batchLoss.item()\n", " \n", " # Find the accuracy\n", " # Get the probabilities\n", " ps = torch.exp(logps)\n", " \n", " # Get the most likely class for each prediction\n", " top_p, top_class = ps.topk(1, dim=1)\n", " \n", " # Check if the predictions match the actual label\n", " equals = top_class == labels.view(*top_class.shape)\n", " \n", " # Update accuracy\n", " accuracy += torch.mean(equals.type(torch.FloatTensor)).item()\n", " \n", " # Print output\n", " print(f'Epoch: {epoch+1} out of {epochs}')\n", " print(f'Training Loss: {runningLoss/printEvery:.3f}')\n", " print(f'Test Loss: {testLoss/len(testloader):.3f}')\n", " print(f'Test Accuracy: {accuracy/len(testloader):.3f}')\n", " print()\n", "\n", " runningLoss = 0\n", " model.train()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "device" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }