updating cuda for windows - first step

2019-07-24 00:45:59 +01:00
parent 46660262e0
commit a6177a8f2f
5 changed files with 1279 additions and 279 deletions


@@ -0,0 +1,945 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Transfer Learning\n",
"\n",
"In this notebook, you'll learn how to use pre-trained networks to solved challenging problems in computer vision. Specifically, you'll use networks trained on [ImageNet](http://www.image-net.org/) [available from torchvision](http://pytorch.org/docs/0.3.0/torchvision/models.html). \n",
"\n",
"ImageNet is a massive dataset with over 1 million labeled images in 1000 categories. It's used to train deep neural networks using an architecture called convolutional layers. I'm not going to get into the details of convolutional networks here, but if you want to learn more about them, please [watch this](https://www.youtube.com/watch?v=2-Ol7ZB0MmU).\n",
"\n",
"Once trained, these models work astonishingly well as feature detectors for images they weren't trained on. Using a pre-trained network on images not in the training set is called transfer learning. Here we'll use transfer learning to train a network that can classify our cat and dog photos with near perfect accuracy.\n",
"\n",
"With `torchvision.models` you can download these pre-trained networks and use them in your applications. We'll include `models` in our imports now."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"%config InlineBackend.figure_format = 'retina'\n",
"\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import torch\n",
"from torch import nn\n",
"from torch import optim\n",
"import torch.nn.functional as F\n",
"from torchvision import datasets, transforms, models\n",
"import os"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most of the pretrained models require the input to be 224x224 images. Also, we'll need to match the normalization used when the models were trained. Each color channel was normalized separately, the means are `[0.485, 0.456, 0.406]` and the standard deviations are `[0.229, 0.224, 0.225]`."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C:\\Users\\Daniel\\Projects\\git-repos\\udacity\\python\\Deep Learning\\Deep Learning with PyTorch\n"
]
}
],
"source": [
"data_dir = '\\\\Cat_Dog_data'\n",
"\n",
"# TODO: Define transforms for the training data and testing data\n",
"train_transforms = transforms.Compose([transforms.RandomRotation(30),\n",
" transforms.RandomResizedCrop(224),\n",
" transforms.RandomHorizontalFlip(),\n",
" transforms.ToTensor(),\n",
" transforms.Normalize([0.485, 0.456, 0.406],\n",
" [0.229, 0.224, 0.225])])\n",
"\n",
"test_transforms = transforms.Compose([transforms.Resize(255),\n",
" transforms.CenterCrop(224),\n",
" transforms.ToTensor(),\n",
" transforms.Normalize([0.485, 0.456, 0.406],\n",
" [0.229, 0.224, 0.225])])\n",
"train_path = os.path.join('Cat_Dog_data', 'train')\n",
"test_path = os.path.join('Cat_Dog_data', 'test')\n",
"print(os.getcwd())\n",
"# Pass transforms in here, then run the next cell to see how the transforms look\n",
"train_data = datasets.ImageFolder(train_path, transform=train_transforms)\n",
"test_data = datasets.ImageFolder(test_path, transform=test_transforms)\n",
"\n",
"trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)\n",
"testloader = torch.utils.data.DataLoader(test_data, batch_size=64)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can load in a model such as [DenseNet](http://pytorch.org/docs/0.3.0/torchvision/models.html#id5). Let's print out the model architecture so we can see what's going on."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Downloading: \"https://download.pytorch.org/models/densenet121-a639ec97.pth\" to C:\\Users\\Daniel/.torch\\models\\densenet121-a639ec97.pth\n",
"32342954it [00:01, 24558733.49it/s]\n"
]
},
{
"data": {
"text/plain": [
"DenseNet(\n",
" (features): Sequential(\n",
" (conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)\n",
" (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu0): ReLU(inplace)\n",
" (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)\n",
" (denseblock1): _DenseBlock(\n",
" (denselayer1): _DenseLayer(\n",
" (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer2): _DenseLayer(\n",
" (norm1): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(96, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer3): _DenseLayer(\n",
" (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer4): _DenseLayer(\n",
" (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer5): _DenseLayer(\n",
" (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer6): _DenseLayer(\n",
" (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" )\n",
" (transition1): _Transition(\n",
" (norm): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace)\n",
" (conv): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n",
" )\n",
" (denseblock2): _DenseBlock(\n",
" (denselayer1): _DenseLayer(\n",
" (norm1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(128, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer2): _DenseLayer(\n",
" (norm1): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(160, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer3): _DenseLayer(\n",
" (norm1): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(192, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer4): _DenseLayer(\n",
" (norm1): BatchNorm2d(224, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(224, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer5): _DenseLayer(\n",
" (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer6): _DenseLayer(\n",
" (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer7): _DenseLayer(\n",
" (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer8): _DenseLayer(\n",
" (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer9): _DenseLayer(\n",
" (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer10): _DenseLayer(\n",
" (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer11): _DenseLayer(\n",
" (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer12): _DenseLayer(\n",
" (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" )\n",
" (transition2): _Transition(\n",
" (norm): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace)\n",
" (conv): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n",
" )\n",
" (denseblock3): _DenseBlock(\n",
" (denselayer1): _DenseLayer(\n",
" (norm1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer2): _DenseLayer(\n",
" (norm1): BatchNorm2d(288, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(288, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer3): _DenseLayer(\n",
" (norm1): BatchNorm2d(320, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer4): _DenseLayer(\n",
" (norm1): BatchNorm2d(352, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(352, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer5): _DenseLayer(\n",
" (norm1): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(384, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer6): _DenseLayer(\n",
" (norm1): BatchNorm2d(416, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(416, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer7): _DenseLayer(\n",
" (norm1): BatchNorm2d(448, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(448, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer8): _DenseLayer(\n",
" (norm1): BatchNorm2d(480, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(480, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer9): _DenseLayer(\n",
" (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer10): _DenseLayer(\n",
" (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer11): _DenseLayer(\n",
" (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer12): _DenseLayer(\n",
" (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer13): _DenseLayer(\n",
" (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer14): _DenseLayer(\n",
" (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer15): _DenseLayer(\n",
" (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer16): _DenseLayer(\n",
" (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer17): _DenseLayer(\n",
" (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer18): _DenseLayer(\n",
" (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer19): _DenseLayer(\n",
" (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer20): _DenseLayer(\n",
" (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer21): _DenseLayer(\n",
" (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer22): _DenseLayer(\n",
" (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer23): _DenseLayer(\n",
" (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer24): _DenseLayer(\n",
" (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" )\n",
" (transition3): _Transition(\n",
" (norm): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu): ReLU(inplace)\n",
" (conv): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (pool): AvgPool2d(kernel_size=2, stride=2, padding=0)\n",
" )\n",
" (denseblock4): _DenseBlock(\n",
" (denselayer1): _DenseLayer(\n",
" (norm1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer2): _DenseLayer(\n",
" (norm1): BatchNorm2d(544, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(544, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer3): _DenseLayer(\n",
" (norm1): BatchNorm2d(576, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(576, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer4): _DenseLayer(\n",
" (norm1): BatchNorm2d(608, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(608, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer5): _DenseLayer(\n",
" (norm1): BatchNorm2d(640, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer6): _DenseLayer(\n",
" (norm1): BatchNorm2d(672, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(672, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer7): _DenseLayer(\n",
" (norm1): BatchNorm2d(704, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(704, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer8): _DenseLayer(\n",
" (norm1): BatchNorm2d(736, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(736, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer9): _DenseLayer(\n",
" (norm1): BatchNorm2d(768, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(768, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer10): _DenseLayer(\n",
" (norm1): BatchNorm2d(800, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(800, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer11): _DenseLayer(\n",
" (norm1): BatchNorm2d(832, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(832, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer12): _DenseLayer(\n",
" (norm1): BatchNorm2d(864, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(864, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer13): _DenseLayer(\n",
" (norm1): BatchNorm2d(896, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(896, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer14): _DenseLayer(\n",
" (norm1): BatchNorm2d(928, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(928, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer15): _DenseLayer(\n",
" (norm1): BatchNorm2d(960, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(960, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" (denselayer16): _DenseLayer(\n",
" (norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu1): ReLU(inplace)\n",
" (conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)\n",
" (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" (relu2): ReLU(inplace)\n",
" (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)\n",
" )\n",
" )\n",
" (norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n",
" )\n",
" (classifier): Linear(in_features=1024, out_features=1000, bias=True)\n",
")"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model = models.densenet121(pretrained=True)\n",
"model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This model is built out of two main parts, the features and the classifier. The features part is a stack of convolutional layers and overall works as a feature detector that can be fed into a classifier. The classifier part is a single fully-connected layer `(classifier): Linear(in_features=1024, out_features=1000)`. This layer was trained on the ImageNet dataset, so it won't work for our specific problem. That means we need to replace the classifier, but the features will work perfectly on their own. In general, I think about pre-trained networks as amazingly good feature detectors that can be used as the input for simple feed-forward classifiers."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# Freeze parameters so we don't backprop through them\n",
"for param in model.parameters():\n",
" param.requires_grad = False\n",
"\n",
"from collections import OrderedDict\n",
"classifier = nn.Sequential(OrderedDict([\n",
" ('fc1', nn.Linear(1024, 500)),\n",
" ('relu', nn.ReLU()),\n",
" ('fc2', nn.Linear(500, 2)),\n",
" ('output', nn.LogSoftmax(dim=1))\n",
" ]))\n",
" \n",
"model.classifier = classifier"
]
},
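{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, here's a minimal sketch (assuming the `model` from the cell above) that counts how many parameters will actually be trained; after freezing the features, only the new classifier's weights and biases should have `requires_grad=True`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: confirm that only the new classifier parameters are trainable\n",
"trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
"total = sum(p.numel() for p in model.parameters())\n",
"print(f'Trainable parameters: {trainable} of {total}')"
]
},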
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With our model built, we need to train the classifier. However, now we're using a **really deep** neural network. If you try to train this on a CPU like normal, it will take a long, long time. Instead, we're going to use the GPU to do the calculations. The linear algebra computations are done in parallel on the GPU leading to 100x increased training speeds. It's also possible to train on multiple GPUs, further decreasing training time.\n",
"\n",
"PyTorch, along with pretty much every other deep learning framework, uses [CUDA](https://developer.nvidia.com/cuda-zone) to efficiently compute the forward and backwards passes on the GPU. In PyTorch, you move your model parameters and other tensors to the GPU memory using `model.to('cuda')`. You can move them back from the GPU with `model.to('cpu')` which you'll commonly do when you need to operate on the network output outside of PyTorch. As a demonstration of the increased speed, I'll compare how long it takes to perform a forward and backward pass with and without a GPU."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"import time"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.cuda.is_available()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Device = cpu; Time per batch: 2.646 seconds\n"
]
},
{
"ename": "AssertionError",
"evalue": "Torch not compiled with CUDA enabled",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mAssertionError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-23-3723cf3e57d3>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0moptimizer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0moptim\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mAdam\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmodel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mclassifier\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mparameters\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlr\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0.001\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mmodel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 8\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mii\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m(\u001b[0m\u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabels\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32min\u001b[0m \u001b[0menumerate\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtrainloader\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36mto\u001b[1;34m(self, *args, **kwargs)\u001b[0m\n\u001b[0;32m 379\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdtype\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mis_floating_point\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32melse\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnon_blocking\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 380\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 381\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mconvert\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 382\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 383\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mregister_backward_hook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mhook\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 185\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 186\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mmodule\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mchildren\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 187\u001b[1;33m \u001b[0mmodule\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 188\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 189\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mparam\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_parameters\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 185\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 186\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mmodule\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mchildren\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 187\u001b[1;33m \u001b[0mmodule\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 188\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 189\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mparam\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_parameters\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 191\u001b[0m \u001b[1;31m# Tensors stored in modules are graph leaves, and we don't\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 192\u001b[0m \u001b[1;31m# want to create copy nodes, so we have to unpack the data.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 193\u001b[1;33m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 194\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 195\u001b[0m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36mconvert\u001b[1;34m(t)\u001b[0m\n\u001b[0;32m 377\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 378\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mconvert\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mt\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 379\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdtype\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mis_floating_point\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32melse\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnon_blocking\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 380\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 381\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mconvert\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\cuda\\__init__.py\u001b[0m in \u001b[0;36m_lazy_init\u001b[1;34m()\u001b[0m\n\u001b[0;32m 159\u001b[0m raise RuntimeError(\n\u001b[0;32m 160\u001b[0m \"Cannot re-initialize CUDA in forked subprocess. \" + msg)\n\u001b[1;32m--> 161\u001b[1;33m \u001b[0m_check_driver\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 162\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_init\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 163\u001b[0m \u001b[0m_cudart\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0m_load_cudart\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\cuda\\__init__.py\u001b[0m in \u001b[0;36m_check_driver\u001b[1;34m()\u001b[0m\n\u001b[0;32m 73\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_check_driver\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 74\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'_cuda_isDriverSufficient'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 75\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mAssertionError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"Torch not compiled with CUDA enabled\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 76\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_isDriverSufficient\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 77\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_getDriverVersion\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mAssertionError\u001b[0m: Torch not compiled with CUDA enabled"
]
}
],
"source": [
"for device in ['cpu', 'cuda']:\n",
"\n",
" criterion = nn.NLLLoss()\n",
" # Only train the classifier parameters, feature parameters are frozen\n",
" optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)\n",
"\n",
" model.to(device)\n",
"\n",
" for ii, (inputs, labels) in enumerate(trainloader):\n",
"\n",
" # Move input and label tensors to the GPU\n",
" inputs, labels = inputs.to(device), labels.to(device)\n",
"\n",
" start = time.time()\n",
"\n",
" outputs = model.forward(inputs)\n",
" loss = criterion(outputs, labels)\n",
" loss.backward()\n",
" optimizer.step()\n",
"\n",
" if ii==3:\n",
" break\n",
" \n",
" print(f\"Device = {device}; Time per batch: {(time.time() - start)/3:.3f} seconds\")"
]
},
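{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `'cuda'` pass above fails on this machine because the installed PyTorch build was compiled without CUDA support (hence the CUDA-on-Windows work this commit starts). As a minimal sketch, assuming the `model`, `trainloader`, `nn`, and `optim` objects from the cells above, the same timing comparison can be written to simply skip any device that isn't available:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: the same timing comparison, but only over devices that are actually available\n",
"import time\n",
"\n",
"available_devices = ['cpu'] + (['cuda'] if torch.cuda.is_available() else [])\n",
"\n",
"for device in available_devices:\n",
"\n",
"    criterion = nn.NLLLoss()\n",
"    # Only train the classifier parameters; feature parameters are frozen\n",
"    optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)\n",
"\n",
"    model.to(device)\n",
"\n",
"    start = time.time()\n",
"\n",
"    for ii, (inputs, labels) in enumerate(trainloader):\n",
"        # Move input and label tensors to the same device as the model\n",
"        inputs, labels = inputs.to(device), labels.to(device)\n",
"\n",
"        optimizer.zero_grad()\n",
"        outputs = model(inputs)\n",
"        loss = criterion(outputs, labels)\n",
"        loss.backward()\n",
"        optimizer.step()\n",
"\n",
"        if ii == 3:\n",
"            break\n",
"\n",
"    print(f'Device = {device}; Time per batch: {(time.time() - start)/4:.3f} seconds')"
]
},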
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can write device agnostic code which will automatically use CUDA if it's enabled like so:\n",
"```python\n",
"# at beginning of the script\n",
"device = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
"\n",
"...\n",
"\n",
"# then whenever you get a new Tensor or Module\n",
"# this won't copy if they are already on the desired device\n",
"input = data.to(device)\n",
"model = MyModule(...).to(device)\n",
"```\n",
"\n",
"From here, I'll let you finish training the model. The process is the same as before except now your model is much more powerful. You should get better than 95% accuracy easily.\n",
"\n",
">**Exercise:** Train a pretrained models to classify the cat and dog images. Continue with the DenseNet model, or try ResNet, it's also a good model to try out first. Make sure you are only training the classifier and the parameters for the features part are frozen."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/models/densenet.py:212: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.\n"
]
},
{
"data": {
"text/plain": [
"device(type='cuda')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"## TODO: Use a pretrained model to classify the cat and dog images\n",
"\n",
"# Use GPU if it's available\n",
"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"\n",
"model = models.densenet121(pretrained=True)\n",
"\n",
"# Freeze Parameters so we don't back propagate through them\n",
"for param in model.parameters():\n",
" param.requires_grad = False\n",
" \n",
"model.classifier = nn.Sequential(nn.Linear(1024, 256),\n",
" nn.ReLU(),\n",
" nn.Dropout(0.2),\n",
" nn.Linear(256, 2),\n",
" nn.LogSoftmax(dim=1))\n",
"\n",
"criterion = nn.NLLLoss()\n",
"\n",
"# Only train the classifier parameters, feature parameters are frozen\n",
"optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)\n",
"\n",
"model.to(device)\n",
"device"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch: 1 out of 1\n",
"Training Loss: 0.208\n",
"Test Loss: 0.076\n",
"Test Accuracy: 0.971\n",
"\n",
"Epoch: 1 out of 1\n",
"Training Loss: 0.214\n",
"Test Loss: 0.068\n",
"Test Accuracy: 0.978\n",
"\n",
"Epoch: 1 out of 1\n",
"Training Loss: 0.216\n",
"Test Loss: 0.075\n",
"Test Accuracy: 0.971\n",
"\n"
]
}
],
"source": [
"epochs = 1\n",
"steps = 0\n",
"runningLoss = 0\n",
"printEvery = 5\n",
"\n",
"for epoch in range(epochs):\n",
" for inputs, labels in trainloader:\n",
" steps += 1\n",
" \n",
" # Move input and label tensors to the device\n",
" inputs, labels = inputs.to(device), labels.to(device)\n",
" \n",
" # Zero the gradients\n",
" optimizer.zero_grad()\n",
" \n",
" # Make forward pass\n",
" logps = model.forward(inputs)\n",
" \n",
" # Calculate loss\n",
" loss = criterion(logps, labels)\n",
" \n",
" # Backpropagate\n",
" loss.backward()\n",
" \n",
" # Update the weights\n",
" optimizer.step()\n",
" \n",
" runningLoss += loss.item()\n",
" \n",
" # Do the validation pass\n",
" if steps % printEvery == 0:\n",
" testLoss = 0\n",
" accuracy = 0\n",
" model.eval()\n",
" \n",
" with torch.no_grad():\n",
" for inputs, labels in testloader:\n",
" # Move input and label tensors to the device\n",
" inputs, labels = inputs.to(device), labels.to(device)\n",
" \n",
" # Get the output\n",
" logps = model.forward(inputs)\n",
" \n",
" # Get the loss\n",
" batchLoss = criterion(logps, labels)\n",
" testLoss += batchLoss.item()\n",
" \n",
" # Find the accuracy\n",
" # Get the probabilities\n",
" ps = torch.exp(logps)\n",
" \n",
" # Get the most likely class for each prediction\n",
" top_p, top_class = ps.topk(1, dim=1)\n",
" \n",
" # Check if the predictions match the actual label\n",
" equals = top_class == labels.view(*top_class.shape)\n",
" \n",
" # Update accuracy\n",
" accuracy += torch.mean(equals.type(torch.FloatTensor)).item()\n",
" \n",
" # Print output\n",
" print(f'Epoch: {epoch+1} out of {epochs}')\n",
" print(f'Training Loss: {runningLoss/printEvery:.3f}')\n",
" print(f'Test Loss: {testLoss/len(testloader):.3f}')\n",
" print(f'Test Accuracy: {accuracy/len(testloader):.3f}')\n",
" print()\n",
"\n",
" runningLoss = 0\n",
" model.train()"
]
},
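{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of using the trained network (assuming the `model`, `device`, `testloader`, and `train_data` objects from the cells above), we can run one test batch through the model in eval mode, move the predictions back to the CPU, and map them to the class names that `ImageFolder` recorded in `train_data.class_to_idx`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: classify one test batch with the trained network\n",
"model.eval()\n",
"\n",
"# ImageFolder stores class name -> index; invert it so we can print names\n",
"idx_to_class = {v: k for k, v in train_data.class_to_idx.items()}\n",
"\n",
"images, labels = next(iter(testloader))\n",
"images = images.to(device)\n",
"\n",
"with torch.no_grad():\n",
"    log_ps = model(images)\n",
"\n",
"# Move the predictions back to the CPU before using them outside PyTorch\n",
"top_p, top_class = torch.exp(log_ps).topk(1, dim=1)\n",
"predicted = [idx_to_class[i] for i in top_class.cpu().view(-1).tolist()]\n",
"actual = [idx_to_class[i] for i in labels.tolist()]\n",
"\n",
"print('Predicted:', predicted[:8])\n",
"print('Actual:   ', actual[:8])\n",
"\n",
"model.train()"
]
},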
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}


@@ -30,7 +30,8 @@
"from torch import nn\n",
"from torch import optim\n",
"import torch.nn.functional as F\n",
"from torchvision import datasets, transforms, models"
"from torchvision import datasets, transforms, models\n",
"import os"
]
},
{
@@ -44,9 +45,17 @@
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C:\\Users\\Daniel\\Projects\\git-repos\\udacity\\python\\Deep Learning\\Deep Learning with PyTorch\n"
]
}
],
"source": [
"data_dir = 'Cat_Dog_data'\n",
"data_dir = '\\\\Cat_Dog_data'\n",
"\n",
"# TODO: Define transforms for the training data and testing data\n",
"train_transforms = transforms.Compose([transforms.RandomRotation(30),\n",
@@ -61,10 +70,12 @@
" transforms.ToTensor(),\n",
" transforms.Normalize([0.485, 0.456, 0.406],\n",
" [0.229, 0.224, 0.225])])\n",
"\n",
"train_path = os.path.join('Cat_Dog_data', 'train')\n",
"test_path = os.path.join('Cat_Dog_data', 'test')\n",
"print(os.getcwd())\n",
"# Pass transforms in here, then run the next cell to see how the transforms look\n",
"train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)\n",
"test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)\n",
"train_data = datasets.ImageFolder(train_path, transform=train_transforms)\n",
"test_data = datasets.ImageFolder(test_path, transform=test_transforms)\n",
"\n",
"trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)\n",
"testloader = torch.utils.data.DataLoader(test_data, batch_size=64)"
@@ -84,13 +95,6 @@
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/models/densenet.py:212: UserWarning: nn.init.kaiming_normal is now deprecated in favor of nn.init.kaiming_normal_.\n"
]
},
{
"data": {
"text/plain": [
@@ -645,7 +649,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
@@ -654,22 +658,72 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": []
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.cuda.is_available()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.backends.cudnn.enabled"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Device = cpu; Time per batch: 5.701 seconds\n",
"Device = cuda; Time per batch: 0.010 seconds\n"
"Device = cpu; Time per batch: 2.646 seconds\n"
]
},
{
"ename": "AssertionError",
"evalue": "Torch not compiled with CUDA enabled",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mAssertionError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-23-3723cf3e57d3>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[0moptimizer\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0moptim\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mAdam\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmodel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mclassifier\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mparameters\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlr\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;36m0.001\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[0mmodel\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 8\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 9\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mii\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m(\u001b[0m\u001b[0minputs\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlabels\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32min\u001b[0m \u001b[0menumerate\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtrainloader\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36mto\u001b[1;34m(self, *args, **kwargs)\u001b[0m\n\u001b[0;32m 379\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdtype\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mis_floating_point\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32melse\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnon_blocking\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 380\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 381\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mconvert\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 382\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 383\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mregister_backward_hook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mhook\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 185\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 186\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mmodule\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mchildren\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 187\u001b[1;33m \u001b[0mmodule\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 188\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 189\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mparam\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_parameters\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 185\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 186\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mmodule\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mchildren\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 187\u001b[1;33m \u001b[0mmodule\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mfn\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 188\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 189\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mparam\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_parameters\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mvalues\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36m_apply\u001b[1;34m(self, fn)\u001b[0m\n\u001b[0;32m 191\u001b[0m \u001b[1;31m# Tensors stored in modules are graph leaves, and we don't\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 192\u001b[0m \u001b[1;31m# want to create copy nodes, so we have to unpack the data.\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 193\u001b[1;33m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 194\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m \u001b[1;32mis\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 195\u001b[0m \u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mfn\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mparam\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_grad\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\nn\\modules\\module.py\u001b[0m in \u001b[0;36mconvert\u001b[1;34m(t)\u001b[0m\n\u001b[0;32m 377\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 378\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0mconvert\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mt\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 379\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mto\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mdevice\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mdtype\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mis_floating_point\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;32melse\u001b[0m \u001b[1;32mNone\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mnon_blocking\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 380\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 381\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_apply\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mconvert\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\cuda\\__init__.py\u001b[0m in \u001b[0;36m_lazy_init\u001b[1;34m()\u001b[0m\n\u001b[0;32m 159\u001b[0m raise RuntimeError(\n\u001b[0;32m 160\u001b[0m \"Cannot re-initialize CUDA in forked subprocess. \" + msg)\n\u001b[1;32m--> 161\u001b[1;33m \u001b[0m_check_driver\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 162\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_init\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 163\u001b[0m \u001b[0m_cudart\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0m_load_cudart\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\Anaconda3\\lib\\site-packages\\torch\\cuda\\__init__.py\u001b[0m in \u001b[0;36m_check_driver\u001b[1;34m()\u001b[0m\n\u001b[0;32m 73\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_check_driver\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 74\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mhasattr\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'_cuda_isDriverSufficient'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 75\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mAssertionError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"Torch not compiled with CUDA enabled\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 76\u001b[0m \u001b[1;32mif\u001b[0m \u001b[1;32mnot\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_isDriverSufficient\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 77\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_cuda_getDriverVersion\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;33m==\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mAssertionError\u001b[0m: Torch not compiled with CUDA enabled"
]
}
],
@@ -895,7 +949,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
"version": "3.7.3"
}
},
"nbformat": 4,