diff --git a/ME-390-Exercises/ME-390-07-convnets/convnets.ipynb b/ME-390-Exercises/ME-390-07-convnets/convnets.ipynb new file mode 100644 index 0000000..3d46768 --- /dev/null +++ b/ME-390-Exercises/ME-390-07-convnets/convnets.ipynb @@ -0,0 +1,936 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "level-steps", + "metadata": {}, + "source": [ + "# Convolutional Neural Networks" + ] + }, + { + "cell_type": "markdown", + "id": "inner-dispatch", + "metadata": {}, + "source": [ + "
\n", + "\n", + "This notebook is part of a series of exercises for the CIVIL-226 Introduction to Machine Learning for Engineers course at EPFL and adapted for the ME-390. Copyright (c) 2021 [VITA](https://www.epfl.ch/labs/vita/) lab at EPFL \n", + "Use of this source code is governed by an MIT-style license that can be found in the LICENSE file or at https://www.opensource.org/licenses/MIT\n", + "\n", + "**Author(s):** David Mizrahi\n", + "
\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "imported-agency", + "metadata": {}, + "source": [ + "In this exercise, we'll build on what was done in the previous exercise and implement Convolutional Neural Nets with PyTorch." + ] + }, + { + "cell_type": "markdown", + "id": "induced-manual", + "metadata": {}, + "source": [ + "*Run next cell to show tweet*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "clean-imagination", + "metadata": {}, + "outputs": [], + "source": [ + "%%html\n", + "

A demo from 1993 of 32-year-old Yann LeCun showing off the world's first convolutional network for text recognition. #tbt #ML #neuralnetworks #CNNs #MachineLearning pic.twitter.com/9eeibjJ4MK

— MIT CSAIL #AAAI2021 (@MIT_CSAIL) January 7, 2021
" + ] + }, + { + "cell_type": "markdown", + "id": "minus-router", + "metadata": {}, + "source": [ + "#### For Google Colab\n", + "You can run this notebook in Google Colab using the following link:\n", + "https://colab.research.google.com/github/SYCAMORE-Lab/ME-390-2022/blob/master/ME-390-Exercises/ME-390-07-convnets/convnets.ipynb" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "twenty-trauma", + "metadata": {}, + "outputs": [], + "source": [ + "try:\n", + " import google.colab\n", + " IN_COLAB = True\n", + "except:\n", + " IN_COLAB = False\n", + " \n", + "if IN_COLAB:\n", + " # Install torchsummary\n", + " !pip install torchsummary\n", + " # Clone the entire repo to access the files\n", + " !git clone -l -s https://github.com/SYCAMORE-Lab/ME-390-2022.git cloned-repo\n", + " %cd cloned-repo/ME-390-Exercises/ME-390-07-convnets" + ] + }, + { + "cell_type": "markdown", + "id": "adult-sarah", + "metadata": {}, + "source": [ + "## 1. Imports & set-up\n", + "\n", + "This part is nearly identical to last exercise on fully-connected neural networks.\n", + "\n", + "More specifically, we define:\n", + "\n", + "- the MNIST dataset & dataloader\n", + "- the training & test loop\n", + "- a 3-layer fully connected neural net (now called `three_layer_net` instead of `model`)\n", + "\n", + "Then this neural net is trained for 10 epochs. This time, we use **Adam instead of SGD** as our optimizer." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "special-agent", + "metadata": {}, + "outputs": [], + "source": [ + "# PyTorch & torchvision\n", + "import torch\n", + "import torch.nn as nn\n", + "import torch.nn.functional as F\n", + "import torch.optim as optim\n", + "\n", + "import torchvision\n", + "import torchvision.transforms as transforms\n", + "from torchvision.datasets import MNIST, FashionMNIST\n", + "\n", + "# torchsummary\n", + "#import torchsummary\n", + "\n", + "# Progress bar\n", + "from tqdm.auto import tqdm\n", + "\n", + "# Helper files\n", + "import helpers\n", + "import metrics" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "rental-drove", + "metadata": {}, + "outputs": [], + "source": [ + "torch.__version__" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "conservative-forum", + "metadata": {}, + "outputs": [], + "source": [ + "torchvision.__version__" + ] + }, + { + "cell_type": "markdown", + "id": "endangered-vegetarian", + "metadata": {}, + "source": [ + "As was done in last exercise, here is a brief description of these imported packages:\n", + "\n", + "**PyTorch:**\n", + "- `torch.nn` Contains the basic building blocks to implement neural nets (incl. different types of layers and loss functions) | [Documentation](https://pytorch.org/docs/stable/nn.html)\n", + "- `torch.nn.functional` A functional (stateless) approach to torch.nn, often used for stateless objects (e.g. 
ReLU) | [Documentation](https://pytorch.org/docs/stable/nn.functional.html) | [More info](https://discuss.pytorch.org/t/what-is-the-difference-between-torch-nn-and-torch-nn-functional/33597/2)\n",
+    "- `torch.optim` A package implementing various optimization algorithms, such as SGD and Adam | [Documentation](https://pytorch.org/docs/stable/optim.html)\n",
+    "\n",
+    "**torchvision:**\n",
+    "- `torchvision.transforms` Common image transformations\n",
+    "- `torchvision.datasets` Popular image datasets\n",
+    "\n",
+    "**`torchsummary`:** Provides additional information on network architecture\n",
+    "\n",
+    "**`tqdm`:** Popular package used to show progress bars | [Documentation](https://tqdm.github.io/)\n",
+    "\n",
+    "**`helpers`**: Contains functions to help visualize data and predictions\n",
+    "\n",
+    "**`metrics`:** Contains two simple classes that help keep track of and compute the loss and accuracy over a training epoch"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "organizational-helen",
+   "metadata": {},
+   "source": [
+    "### Dataset & dataloader"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "metallic-slovak",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Save dataset in a folder called \"data\"\n",
+    "root = \"data\"\n",
+    "\n",
+    "# transforms.ToTensor() is used to convert the downloaded PIL Image to a torch Tensor\n",
+    "train_data = MNIST(root, train=True, transform=transforms.ToTensor(), download=True)\n",
+    "test_data = MNIST(root, train=False, transform=transforms.ToTensor(), download=True)\n",
+    "\n",
+    "batch_size = 32\n",
+    "# Reshuffle training data at every epoch, but not the test data\n",
+    "train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True)\n",
+    "test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "prime-haven",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(f\"Images in training data: {len(train_data)}\")\n",
+    "print(f\"Images in test data: {len(test_data)}\")\n",
+    "# Show the mapping from target value to class name (if you're using MNIST, you won't be too surprised)\n",
+    "print(\"Mapping from target value to class name:\")\n",
+    "{i: class_name for i, class_name in enumerate(train_data.classes)}"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "endangered-sarah",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "images, targets = next(iter(train_loader))\n",
+    "helpers.imshow(torchvision.utils.make_grid(images, nrow=8))\n",
+    "print(targets.reshape(-1, 8))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "certified-benefit",
+   "metadata": {},
+   "source": [
+    "### Training loop & test accuracy"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "silver-transaction",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def train(model: torch.nn.Module, train_loader: torch.utils.data.DataLoader, loss_fn: torch.nn.Module, optimizer: torch.optim.Optimizer, epochs: int):\n",
+    "    \n",
+    "    # Initialize metrics for loss and accuracy\n",
+    "    loss_metric = metrics.LossMetric()\n",
+    "    acc_metric = metrics.AccuracyMetric(k=1)\n",
+    "    \n",
+    "    model.train()\n",
+    "    \n",
+    "    for epoch in range(1, epochs + 1):\n",
+    "        \n",
+    "        # Progress bar set-up\n",
+    "        pbar = tqdm(total=len(train_loader), leave=True)\n",
+    "        pbar.set_description(f\"Epoch {epoch}\")\n",
+    "        \n",
+    "        # Iterate through data\n",
+    "        for data, target in train_loader:\n",
+    "            
\n", + " # Zero-out the gradients\n", + " optimizer.zero_grad()\n", + " \n", + " # Forward pass\n", + " out = model(data)\n", + " \n", + " # Compute loss\n", + " loss = loss_fn(out, target)\n", + " \n", + " # Backward pass\n", + " loss.backward()\n", + " \n", + " # Optimizer step\n", + " optimizer.step()\n", + "\n", + " # Update metrics & progress bar\n", + " loss_metric.update(loss.item(), data.shape[0])\n", + " acc_metric.update(out, target)\n", + " pbar.update()\n", + " \n", + " # End of epoch, show loss and acc\n", + " pbar.set_postfix_str(f\"Train loss: {loss_metric.compute():.3f} | Train acc: {acc_metric.compute() * 100:.2f}%\")\n", + " loss_metric.reset()\n", + " acc_metric.reset()\n", + " \n", + "def test(model: torch.nn.Module, dataloader: torch.utils.data.DataLoader):\n", + " \n", + " # Initialize accuracy metric\n", + " acc_metric = metrics.AccuracyMetric(k=1)\n", + " \n", + " # Progress bar set-up\n", + " pbar = tqdm(total=len(test_loader), leave=True)\n", + " \n", + " model.eval()\n", + " \n", + " with torch.no_grad(): \n", + " # Iterate through data\n", + " for data, target in dataloader:\n", + " \n", + " # Forward pass\n", + " out = model(data)\n", + " \n", + " # Update accuracy metric\n", + " acc_metric.update(out, target)\n", + "\n", + " # Update progress bar\n", + " pbar.update()\n", + " \n", + " # End of epoch, show loss and acc\n", + " test_acc = acc_metric.compute() * 100\n", + " pbar.set_postfix_str(f\"Acc: {test_acc:.2f}%\")\n", + " print(f\"Accuracy is {test_acc:.2f}%\")" + ] + }, + { + "cell_type": "markdown", + "id": "eligible-territory", + "metadata": {}, + "source": [ + "### Three layer fully-connected NN" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "heard-distance", + "metadata": {}, + "outputs": [], + "source": [ + "class ThreeLayerNet(nn.Module):\n", + " \"\"\"3-Layer neural net\"\"\"\n", + " \n", + " def __init__(self) -> None:\n", + " super().__init__()\n", + " self.fc1 = nn.Linear(784, 100)\n", + " self.fc2 = nn.Linear(100, 100)\n", + " self.fc3 = nn.Linear(100, 10)\n", + "\n", + " def forward(self, x: torch.Tensor) -> torch.Tensor:\n", + " # Flatten to get tensor of shape (batch_size, 784)\n", + " x = x.flatten(start_dim=1)\n", + "\n", + " x = F.relu(self.fc1(x))\n", + " x = F.relu(self.fc2(x))\n", + " out = self.fc3(x)\n", + " return out\n", + "\n", + " def predict(self, x: torch.Tensor) -> torch.Tensor:\n", + " \"\"\"Predicts classes by calculating the softmax\"\"\"\n", + " logits = self.forward(x)\n", + " return F.softmax(logits, dim=1)\n", + "\n", + "# Note: Instance is called three_layer_net instead of model this time around\n", + "three_layer_net = ThreeLayerNet()" + ] + }, + { + "cell_type": "markdown", + "id": "recreational-google", + "metadata": {}, + "source": [ + "#### Loss & optimizer\n", + "\n", + "As before, we'll use the [Cross Entropy](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html) loss.\n", + "\n", + "However, this time, we'll switch up optimizers and use **[Adam](https://pytorch.org/docs/master/generated/torch.optim.Adam.html)** with the default settings for the learning rate and momentum. This should help us get faster convergence than with SGD.\n", + "\n", + "Implement both the loss and the optimizer in the next cell." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "compact-afghanistan", + "metadata": {}, + "outputs": [], + "source": [ + "### START CODE HERE ###\n", + "# Cross-Entropy loss\n", + "loss_fn = ...\n", + "### END CODE HERE ###\n", + "# Use Adam with default parameters\n", + "optimizer = optim.Adam(three_layer_net.parameters())" + ] + }, + { + "cell_type": "markdown", + "id": "hazardous-healing", + "metadata": {}, + "source": [ + "#### Training" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "collectible-passion", + "metadata": {}, + "outputs": [], + "source": [ + "train(three_layer_net, train_loader, loss_fn, optimizer, epochs=10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "involved-painting", + "metadata": {}, + "outputs": [], + "source": [ + "test(three_layer_net, test_loader)" + ] + }, + { + "cell_type": "markdown", + "id": "romantic-hacker", + "metadata": {}, + "source": [ + "**Expected result:** >96% test accuracy on MNIST" + ] + }, + { + "cell_type": "markdown", + "id": "wired-wound", + "metadata": {}, + "source": [ + "## 2. LeNet" + ] + }, + { + "cell_type": "markdown", + "id": "personalized-evaluation", + "metadata": {}, + "source": [ + "In this part, you'll see the implementation of a slightly modified version of LeNet5, a convolutional neural network proposed by [Yann Le Cun et al. in 1998](http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf). LeNet was one of the earliest convolutional neural networks, and helped promote the development of deep learning. Your goal is to reproduce this network architecture from just the paper's figure (see below) and a few extra tips." + ] + }, + { + "cell_type": "markdown", + "id": "visible-briefs", + "metadata": {}, + "source": [ + "#### LeNet5\n", + "\n", + "" + ] + }, + { + "cell_type": "markdown", + "id": "mathematical-presentation", + "metadata": {}, + "source": [ + "Here are some tips to understand implementation:\n", + "\n", + "- Our images are 28x28, but the figure shows 32x32 input images. Can you find a way to make our images fit? 
**Hint:** `nn.Conv2d` has a padding parameter.\n", + "- Both convolutional layers use 5x5 filters with stride 1\n", + "- We use ReLU as the activation function\n", + "- We use Max-Pooling whenever subsampling is needed\n", + "- We'll need to flatten the tensor at some point\n", + "- As before, no need to add softmax after the final layer, `nn.CrossEntropyLoss()` adds it automatically\n", + "\n", + "Furthermore, here is some helpful documentation:\n", + "- [`torch.nn` documentation](https://pytorch.org/docs/stable/nn.html)\n", + "- [`torch.nn.functional` documentation](https://pytorch.org/docs/stable/nn.functional.html)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "willing-strip", + "metadata": {}, + "outputs": [], + "source": [ + "class LeNet(nn.Module):\n", + " \"\"\"LeNet5 from `\"Gradient-Based Learning Applied To Document Recognition\"\n", + " `_\n", + " \"\"\"\n", + "\n", + " def __init__(self) -> None:\n", + " super().__init__()\n", + "\n", + " self.conv1 = nn.Conv2d(1, 6, kernel_size=5, padding=2)\n", + " self.conv2 = nn.Conv2d(6, 16, kernel_size=5)\n", + " self.fc1 = nn.Linear(16 * 5 * 5, 120)\n", + " self.fc2 = nn.Linear(120, 84)\n", + " self.fc3 = nn.Linear(84, 10)\n", + " \n", + "\n", + " def forward(self, x: torch.Tensor) -> torch.Tensor:\n", + " x = F.relu(self.conv1(x))\n", + " x = F.max_pool2d(x, 2)\n", + " x = F.relu(self.conv2(x))\n", + " x = F.max_pool2d(x, 2)\n", + " x = x.flatten(start_dim=1)\n", + " x = F.relu(self.fc1(x))\n", + " x = F.relu(self.fc2(x))\n", + " out = self.fc3(x)\n", + " return out\n", + " \n", + " def predict(self, x: torch.Tensor) -> torch.Tensor:\n", + " \"\"\"Predicts classes by calculating the softmax\"\"\"\n", + " logits = self.forward(x)\n", + " return F.softmax(logits, dim=1)\n", + "\n", + "\n", + "lenet = LeNet()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "straight-virtue", + "metadata": {}, + "outputs": [], + "source": [ + "# Check that a forward pass gives the correct output size\n", + "print(lenet(images).shape)" + ] + }, + { + "cell_type": "markdown", + "id": "representative-judgment", + "metadata": {}, + "source": [ + "**Expected output:** `torch.Size([32, 10])`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "homeless-german", + "metadata": {}, + "outputs": [], + "source": [ + "### START CODE HERE ###\n", + "# Cross-Entropy loss\n", + "loss_fn = ...\n", + "# Adam\n", + "optimizer = ...\n", + "### END CODE HERE ###" + ] + }, + { + "cell_type": "markdown", + "id": "widespread-gross", + "metadata": {}, + "source": [ + "#### Training" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ceramic-clear", + "metadata": {}, + "outputs": [], + "source": [ + "train(lenet, train_loader, loss_fn, optimizer, epochs=10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "built-arcade", + "metadata": {}, + "outputs": [], + "source": [ + "test(lenet, test_loader)" + ] + }, + { + "cell_type": "markdown", + "id": "macro-mailman", + "metadata": {}, + "source": [ + "**Expected result:** >98% test accuracy on MNIST" + ] + }, + { + "cell_type": "markdown", + "id": "deadly-subscription", + "metadata": {}, + "source": [ + "#### Visualizing predictions" + ] + }, + { + "cell_type": "markdown", + "id": "completed-establishment", + "metadata": {}, + "source": [ + "Let's visualize some of these predictions with the help of `view_prediction()` from `helpers`." 
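,
+    "\n",
+    "Besides the plots, you can also read off the top-1 class and its softmax score directly (a minimal sketch, assuming the `images` batch loaded earlier):\n",
+    "\n",
+    "```python\n",
+    "# Top-1 class index and its softmax score for each image in the batch\n",
+    "probs = lenet.predict(images)  # shape (batch_size, 10)\n",
+    "confidence, predicted_class = probs.max(dim=1)\n",
+    "print(predicted_class[:5], confidence[:5])\n",
+    "```"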
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "neural-computer",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "images, _ = next(iter(test_loader))\n",
+    "preds = lenet.predict(images)\n",
+    "\n",
+    "# Shows the image next to the classifier's softmax score\n",
+    "# Show for the first 5 images (change value to see more images)\n",
+    "for i in range(5):\n",
+    "    helpers.view_prediction(images[i], preds[i], test_data.classes)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "modified-flush",
+   "metadata": {},
+   "source": [
+    "## 3. Comparing networks"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fallen-arrival",
+   "metadata": {},
+   "source": [
+    "We've successfully trained two models on the MNIST dataset. But how do they differ? To find out, we'll compare their test accuracy and their architecture."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "biblical-alberta",
+   "metadata": {},
+   "source": [
+    "#### Test accuracy"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "municipal-mercy",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"3-layer fully-connected net test accuracy:\")\n",
+    "test(three_layer_net, test_loader)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "motivated-artist",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(\"LeNet-5 test accuracy:\")\n",
+    "test(lenet, test_loader)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "scenic-packet",
+   "metadata": {},
+   "source": [
+    "#### Model size"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "undefined-serial",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "torchsummary.summary(three_layer_net, (1, 28, 28), device=\"cpu\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "threaded-spine",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "torchsummary.summary(lenet, (1, 28, 28), device=\"cpu\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "expensive-render",
+   "metadata": {},
+   "source": [
+    "**Questions:** \n",
+    "- Which model has the highest accuracy?\n",
+    "- How many trainable parameters (weights) does each network have? Where do most of LeNet's trainable parameters come from?\n",
+    "- Which model takes longer to train? Look at the `it/s` metric displayed next to the progress bar."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "touched-bobby",
+   "metadata": {},
+   "source": [
+    "**Answers:** \n",
+    "YOUR ANSWERS HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "numerical-moldova",
+   "metadata": {},
+   "source": [
+    "## 4. Mixing it up"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "adolescent-migration",
+   "metadata": {},
+   "source": [
+    "LeNet performs quite well on MNIST. But what would happen if we applied a fixed random permutation to the pixels of the images?\n",
+    "\n",
+    "To find out, we'll create a dataset we'll call permuted MNIST. It simply takes the original dataset and permutes the pixels before feeding the images to the network."
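,
+    "\n",
+    "To make \"fixed random permutation\" concrete, here is a toy sketch: one permutation of the indices is drawn once, then applied identically to every input.\n",
+    "\n",
+    "```python\n",
+    "# Toy example: the same fixed permutation reorders two different inputs\n",
+    "torch.manual_seed(0)\n",
+    "perm = torch.randperm(4)\n",
+    "a = torch.tensor([10., 20., 30., 40.])\n",
+    "b = torch.tensor([1., 2., 3., 4.])\n",
+    "print(a[perm])  # reordered\n",
+    "print(b[perm])  # reordered the same way\n",
+    "```"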
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "published-lindsay",
+   "metadata": {},
+   "source": [
+    "### Permuted MNIST"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "optimum-shoot",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Fix random seed so permutation is identical across runs\n",
+    "torch.manual_seed(42)\n",
+    "perm_indices = torch.randperm(784)\n",
+    "# Reset the RNG to a nondeterministic seed\n",
+    "torch.random.seed()\n",
+    "\n",
+    "# The same permutation gets applied to each image\n",
+    "permute_transform = transforms.Compose([transforms.ToTensor(), transforms.Lambda(lambda x: x.flatten()[perm_indices].reshape(1, 28, 28))])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "pretty-phrase",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "permuted_train_data = MNIST(root, train=True, transform=permute_transform, download=True)\n",
+    "permuted_test_data = MNIST(root, train=False, transform=permute_transform, download=True)\n",
+    "\n",
+    "batch_size = 32\n",
+    "permuted_train_loader = torch.utils.data.DataLoader(permuted_train_data, batch_size=batch_size, shuffle=True)\n",
+    "permuted_test_loader = torch.utils.data.DataLoader(permuted_test_data, batch_size=batch_size, shuffle=False)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "precise-zambia",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Visualize permuted images\n",
+    "permuted_images, targets = next(iter(permuted_test_loader))\n",
+    "helpers.imshow(torchvision.utils.make_grid(permuted_images, nrow=8))\n",
+    "print(targets.reshape(-1, 8))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "engaged-sherman",
+   "metadata": {},
+   "source": [
+    "Pretty hard for us humans to tell which digit is which, right?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "exposed-pendant",
+   "metadata": {},
+   "source": [
+    "**Question:** Before starting the training process, how do you think this random permutation will affect the performance of the two networks (3-layer net and LeNet)?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "synthetic-torture",
+   "metadata": {},
+   "source": [
+    "**Answer:** \n",
+    "YOUR ANSWER HERE"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "liquid-colonial",
+   "metadata": {},
+   "source": [
+    "### Training on permuted images"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cardiac-thousand",
+   "metadata": {},
+   "source": [
+    "Let's now train our two network architectures on this permuted dataset. As only the dataset changes, the training procedure is almost exactly the same as before."
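,
+    "\n",
+    "(One aside, as a sketch using the `perm_indices` defined above: the permutation only reorders pixels, so no information is lost. `torch.argsort` of the permutation indices undoes it.)\n",
+    "\n",
+    "```python\n",
+    "# Invert the fixed permutation to recover the original pixel order\n",
+    "inv_indices = torch.argsort(perm_indices)\n",
+    "restored = permuted_images[0].flatten()[inv_indices].reshape(1, 28, 28)\n",
+    "helpers.imshow(restored)  # shows the unpermuted digit again\n",
+    "```"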
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ordered-modem",
+   "metadata": {},
+   "source": [
+    "#### Fully-connected NN (3-layer net)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "noble-strategy",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "permuted_three_layer_net = ThreeLayerNet()\n",
+    "loss_fn = nn.CrossEntropyLoss()\n",
+    "optimizer = optim.Adam(permuted_three_layer_net.parameters())\n",
+    "\n",
+    "train(permuted_three_layer_net, permuted_train_loader, loss_fn, optimizer, epochs=10)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "exposed-statement",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test(permuted_three_layer_net, permuted_test_loader)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "urban-richards",
+   "metadata": {},
+   "source": [
+    "#### LeNet"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bizarre-plenty",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "permuted_lenet = LeNet()\n",
+    "loss_fn = nn.CrossEntropyLoss()\n",
+    "optimizer = optim.Adam(permuted_lenet.parameters())\n",
+    "\n",
+    "train(permuted_lenet, permuted_train_loader, loss_fn, optimizer, epochs=10)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "banned-antique",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "test(permuted_lenet, permuted_test_loader)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "chubby-shield",
+   "metadata": {},
+   "source": [
+    "Our 3-layer net is essentially unaffected by the permutation, while the accuracy of LeNet decreases.\n",
+    "\n",
+    "This is to be expected. A ConvNet makes the explicit assumption that its inputs are images, which allows it to encode certain properties (such as locality) into the architecture, while a fully-connected neural net makes no assumption of the sort. When these assumptions hold, a ConvNet performs quite well, but it suffers otherwise. Note that LeNet still performs quite well here, in part thanks to its final few fully-connected layers, and because MNIST is a particularly easy dataset.\n",
+    "\n",
+    "As real-world images don't have all their pixels permuted by a malicious exercise maker, you can safely use ConvNets for most tasks involving images."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "classical-request",
+   "metadata": {},
+   "source": [
+    "Congratulations on finishing this exercise!"
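,
+    "\n",
+    "If you want to back up your answers in section 3 with numbers, here is a minimal sketch (using standard `nn.Module` methods) for counting trainable parameters:\n",
+    "\n",
+    "```python\n",
+    "# Count trainable parameters, overall and per layer\n",
+    "def count_params(model: nn.Module) -> int:\n",
+    "    return sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
+    "\n",
+    "print(count_params(three_layer_net), count_params(lenet))\n",
+    "for name, p in lenet.named_parameters():\n",
+    "    print(name, p.numel())\n",
+    "```"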
+ ] + }, + { + "cell_type": "markdown", + "id": "local-cornwall", + "metadata": {}, + "source": [ + "## (Optional) Additional PyTorch resources \n", + "- PyTorch basics: https://pytorch.org/tutorials/beginner/basics/intro.html\n", + "- PyTorch cheat sheet: https://pytorch.org/tutorials/beginner/ptcheat.html\n", + "- Other PyTorch tutorials: https://pytorch.org/tutorials/index.html\n", + "- PyTorch recipes: https://pytorch.org/tutorials/recipes/recipes_index.html (bite-sized code examples on specific PyTorch features)\n", + "- PyTorch examples: https://github.com/pytorch/examples" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/test.pt b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/test.pt new file mode 100644 index 0000000..b99ff69 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/test.pt differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/training.pt b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/training.pt new file mode 100644 index 0000000..eab5a2f Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/processed/training.pt differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte new file mode 100644 index 0000000..1170b2c Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte.gz b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte.gz new file mode 100644 index 0000000..5ace8ea Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-images-idx3-ubyte.gz differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte new file mode 100644 index 0000000..d1c3a97 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte.gz b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte.gz new file mode 100644 index 0000000..a7e1415 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/t10k-labels-idx1-ubyte.gz differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte new file mode 100644 index 0000000..bbce276 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte.gz b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte.gz new file mode 100644 index 0000000..b50e4b6 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-images-idx3-ubyte.gz differ diff --git 
a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte new file mode 100644 index 0000000..d6b4c5d Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte differ diff --git a/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte.gz b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte.gz new file mode 100644 index 0000000..707a576 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/data/MNIST/raw/train-labels-idx1-ubyte.gz differ diff --git a/ME-390-Exercises/ME-390-07-convnets/helpers.py b/ME-390-Exercises/ME-390-07-convnets/helpers.py new file mode 100644 index 0000000..f8c601f --- /dev/null +++ b/ME-390-Exercises/ME-390-07-convnets/helpers.py @@ -0,0 +1,43 @@ +from typing import Iterable + +import numpy as np +import torch +import matplotlib.pyplot as plt + + +def imshow(img: torch.Tensor) -> None: + fig, ax = plt.subplots() + ax.imshow(to_np_img(img), cmap="gray") + ax.axis("off") + plt.show() + + +def to_np_img(img: torch.Tensor) -> np.ndarray: + return np.transpose(img.numpy(), (1, 2, 0)).squeeze() + + +def view_prediction( + img: torch.Tensor, + pred: torch.Tensor, + classes: Iterable = range(10), +) -> None: + """Shows prediction for MNIST style datasets (with 10 classes) + + Args: + img: image to display (as tensor) + pred: model prediction + classes: class names (of size 10) + """ + pred = pred.data.numpy().squeeze() + + fig, (ax1, ax2) = plt.subplots(figsize=(6, 7), ncols=2) + plt.subplots_adjust(wspace=0.4) + ax1.imshow(to_np_img(img), cmap="gray") + ax1.axis("off") + ax2.barh(np.arange(10), pred) + ax2.set_aspect(0.1) + ax2.set_yticks(np.arange(10)) + ax2.set_yticklabels(classes) + ax2.set_xlim(0, 1.1) + ax2.set_title("Prediction") + plt.show() \ No newline at end of file diff --git a/ME-390-Exercises/ME-390-07-convnets/images/lenet.png b/ME-390-Exercises/ME-390-07-convnets/images/lenet.png new file mode 100644 index 0000000..5daeef6 Binary files /dev/null and b/ME-390-Exercises/ME-390-07-convnets/images/lenet.png differ diff --git a/ME-390-Exercises/ME-390-07-convnets/metrics.py b/ME-390-Exercises/ME-390-07-convnets/metrics.py new file mode 100644 index 0000000..c69296c --- /dev/null +++ b/ME-390-Exercises/ME-390-07-convnets/metrics.py @@ -0,0 +1,50 @@ +# Source: https://github.com/dmizr/phuber/blob/master/phuber/metrics.py +import torch + + +class LossMetric: + """Keeps track of the loss over an epoch""" + + def __init__(self) -> None: + self.running_loss = 0 + self.count = 0 + + def update(self, loss: float, batch_size: int) -> None: + self.running_loss += loss * batch_size + self.count += batch_size + + def compute(self) -> float: + return self.running_loss / self.count + + def reset(self) -> None: + self.running_loss = 0 + self.count = 0 + + +class AccuracyMetric: + """Keeps track of the top-k accuracy over an epoch + Args: + k (int): Value of k for top-k accuracy + """ + + def __init__(self, k: int = 1) -> None: + self.correct = 0 + self.total = 0 + self.k = k + + def update(self, out: torch.Tensor, target: torch.Tensor) -> None: + # Computes top-k accuracy + _, indices = torch.topk(out, self.k, dim=-1) + target_in_top_k = torch.eq(indices, target[:, None]).bool().any(-1) + total_correct = torch.sum(target_in_top_k, dtype=torch.int).item() + total_samples = target.shape[0] + + self.correct += total_correct + self.total += total_samples + + def compute(self) -> float: + 
return self.correct / self.total + + def reset(self) -> None: + self.correct = 0 + self.total = 0