diff --git a/Exercises/00-setup/.ipynb_checkpoints/jupyter-checkpoint.md b/Exercises/00-setup/.ipynb_checkpoints/jupyter-checkpoint.md deleted file mode 100644 index 4f14a79..0000000 --- a/Exercises/00-setup/.ipynb_checkpoints/jupyter-checkpoint.md +++ /dev/null @@ -1,92 +0,0 @@ -# Using Jupyter Notebooks and Jupyter Lab - - -## What is a Jupyter Notebook? - -From the [Project Jupyter website](https://jupyter.org/): - -> The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. - -## What is Jupyter Lab? - -Jupyter Lab is a web-based development environment for Jupyter Notebooks, code and data. You can also use it to view Markdown, PDF and text files. - - -## Interacting with Jupyter Lab and launching a Notebook - -When you open EPFL noto, you should see a screen as follows: - - - -You will notice there will be a directory structure on the left-hand side of your window. On the right-hand side, we have a "launcher" interface where we can create a Jupyter Notebook (boxed in red), or several other types of files. To start a new notebook, click on the "Python 3" logo beneath "Notebook". You can also open an existing notebook from the directory. - -Once you open a notebook, you will be brought to the following screen. - - - - -Some of the most important components of the notebook are highlighted in colored boxes. -- To rename your notebook (shown here as Untitled), you can right-click your document and select Rename. -- In purple is the cell formatting assignment. By default, it is set to "Code", but it can also be set to "Markdown". -- In red is a code cell, in which you can write Python code. -- In blue is a markdown cell, which is used to display nicely formatted text, images and mathematical equations. 
- - - -#### Markdown cells -Here are some useful resources for Markdown cells: -- https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet -- https://www.datacamp.com/community/tutorials/markdown-in-jupyter-notebook - -#### Code cells - -Code cells support Python code (and other languages with different kernels, although we will only use Python in this course). They can be executed in any order. - -- To run a single cell, use **Shift + Enter** or press on the ▶ button. -- To run all the cells starting from the beginning, go to `Run` -> `Run All Cells...`. - - - - -**Warning:** Code cells can be executed in any order. This means that you can overwrite your current variables by running cells out of order. This also means that variables declared in cells that were executed then deleted will still be present (known as a hidden state) unless the kernel is restarted. Therefore, **when coding in notebooks, be cautious of the order in which you run cells.** One solution to avoid hidden states is to frequently restart your kernel (and run up to your currently selected cell). - -#### Autocompletion and documentation -Autocompletion is possible with JupyterLab too! Use **Tab**. - - - -To view the documentation for a function or class, use **Shift + Tab**. - -#### Keyboard shortcuts - -You can gain a lot of time using [keyboard shortcuts](https://cheatography.com/weidadeyue/cheat-sheets/jupyter-notebook/pdf/) to navigate through notebooks. - -## The kernel - -The kernel maintains the state of a notebook's computations (such as current variables, declared functions and loaded data). For notebooks that do not take too long to run, it is desirable to frequently restart the kernel to ensure that there are no hidden states. - -Here is a list of what the different kernel related actions do: - -![](images/kernel_1.png) - -- **Interrupt ( or ■ button):** Causes the kernel to stop performing the current task without actually shutting the kernel down. 
You can use this option when you want to stop a very long task (e.g. to stop processing a large dataset). - -- **Restart (or ↻ button):** Stops the kernel and starts it again. This action causes you to lose all the state data. In some cases, this is precisely what you need to do when the environment has become "dirty" with hidden state data. - -- **Restart & Clear Output:** Stops the kernel, starts it again, and clears all the existing cell outputs. - -- **Restart Kernel & Run Up to Selected Cell:** Stops the kernel, starts it again, and then runs every cell starting from the top cell and ending with the currently selected cell. Use this when you are working on a Notebook and want to make sure there are no hidden states. - -- **Restart Kernel & Run All Cells:** Stops the kernel, starts it again, and then runs every cell starting from the top cell and ending with the last cell. Use this when you're done working on a Notebook and want to make sure everything runs properly and doesn't depend on hidden states. - -- **Shutdown:** Shuts the kernel down. You may perform this step in preparation for using a different kernel. - -- **Change Kernel:** Selects a different kernel from the list of kernels you have installed. For example, you may want to test an application using various Python versions to ensure that it runs on all of them. 
- -## Additional resources - -**Jupyter Notebook tutorial: https://www.dataquest.io/blog/jupyter-notebook-tutorial/** - -**Jupyter Notebook documentation: https://jupyterlab.readthedocs.io/en/stable/user/notebook.html** - -**Jupyter Lab interface documentation: https://jupyterlab.readthedocs.io/en/stable/user/interface.html** diff --git a/Exercises/00-setup/.ipynb_checkpoints/noto-checkpoint.md b/Exercises/00-setup/.ipynb_checkpoints/noto-checkpoint.md deleted file mode 100644 index 262adad..0000000 --- a/Exercises/00-setup/.ipynb_checkpoints/noto-checkpoint.md +++ /dev/null @@ -1,17 +0,0 @@ -# EPFL Noto - -EPFL provides a centralized JupyterLab platform for students called [Noto](https://www.epfl.ch/education/educational-initiatives/cede/digitaltools/noto/), allowing you to run all these notebooks without having to install anything on your computer. This is because those notebooks are running in the cloud (i.e. on a remote server, with which you directly interact through the JupyterLab interface). - -To clone this repository on EPFL Noto, click on this link: https://go.epfl.ch/ME-390-2022. - -As in a local Python installation, you will need to frequently `git pull` the most recent changes in order to ensure that your files are up-to-date. - -**Warning**: You'll need a stable internet connection to run these labs through Noto, as it runs in the cloud. This is not the case when running the notebooks locally, even though we're still using a web browser to access Jupyter Lab. - -## Other cloud-based notebook platforms -There exist other online platforms offering cloud-based Jupyter Notebooks. These platforms could enable faster computation time. Notably: - -- [Google Colab](https://colab.research.google.com/), which offers free GPU runtimes, which is very useful for some machine learning tasks (such as training neural networks). -- [Deepnote](https://deepnote.com/), which offers real-time collaboration (useful for group projects). 
- -**Note**: We do not provide any support for using these platforms. diff --git a/Exercises/01-python/.ipynb_checkpoints/intro_to_python-checkpoint.ipynb b/Exercises/01-python/.ipynb_checkpoints/intro_to_python-checkpoint.ipynb deleted file mode 100644 index 15bb71a..0000000 --- a/Exercises/01-python/.ipynb_checkpoints/intro_to_python-checkpoint.ipynb +++ /dev/null @@ -1,1528 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Introduction to Python" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "This notebook was developed for the CS-233 Introduction to Machine Learning course at EPFL, adapted for the CIVIL-226 Introduction to Machine Learning for Engineers course, and re-adapted for the ME-390. We thank contributors in CS-233 ([CVLab](https://www.epfl.ch/labs/cvlab)) and CIVIL-226 ([VITA](https://www.epfl.ch/labs/vita/)).\n", - " \n", - "**Author(s):** Sena Kiciroglu, minor changes by Tom Winandy and David Mizrahi\n", - "
\n", - "Welcome to the first exercise of Introduction to Machine Learning. Today we will get familiar with Python, the language we will use for all the exercises of this course. \n", - "\n", - "This week we will introduce some important concepts in the basics of Python. Next week, you will learn how to work with NumPy, a popular Python library used for scientific computing. \n", - "\n", - "Python is a popular language to use for machine learning tasks. This is especially true because of the selection of **libraries and frameworks**, developed specifically for machine learning and scientific computing. To name a few, you have Keras, TensorFlow and PyTorch for developing neural networks, SciPy and NumPy used for scientific computing, Pandas for data analysis, etc. (You might also get to dabble in PyTorch in the upcoming weeks.)\n", - "\n", - "Python also allows you to write quick, readable, high-level code. It's great for fast prototyping. \n", - "\n", - "You can find a useful Python cheatsheet at: https://www.pythoncheatsheet.org/\n", - "\n", - "Let's get into it!\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "## 1. Jupyter Notebook\n", - "\n", - "In these exercises we will use Jupyter Notebooks, which contain Python code, text explanations and visuals. \n", - "\n", - "The Jupyter Notebook document (such as the one you are looking at right now) consists of cells containing Python code, text or other content. You can run each cell by clicking on the button `Run` in the top toolbar, or you can use a keyboard shortcut `Ctrl` + `Enter` (run current cell) or `Shift` + `Enter` (run current cell and move to the cell below)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2. 
Indentation and Control Flow" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally we get to start doing some coding!\n", - "\n", - "First thing to know: Python does not separate different lines of code with a semicolon `;`. So just RUN the following cell with no worries." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# This is a Python comment. Start the line with `#` for a comment\n", - "print(\"First line of code. I will declare some variables\")\n", - "a = 1 # second line!!\n", - "b = 2\n", - "c = \"Fish\"\n", - "print(f\"My variables are: a = {a}, b = {b}, c = {c}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Easy! However, in Python you have to be careful and have perfect indentation (a reason why Python code is so readable). The reason is, Python uses indentation to keep track of what is part of the if statement, the loops and the functions. This is different from Java (this is assuming you know Java) where you would have curly brackets `{ }` for this purpose. \n", - "\n", - "Let's start with the if statement.\n", - "\n", - "### 2.1. If Statement\n", - "\n", - "The rule is, all indented parts after the `if condition :` belong to that branch of the if statement. \n", - "\n", - "```python\n", - "if condition :\n", - " inside the statement\n", - " still inside the statement\n", - "elif condition:\n", - " inside the else-if part of the statement\n", - "else:\n", - " inside the else part of the statement\n", - "outside the statement\n", - " ```\n", - " \n", - "Let's see it in action:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "if a + b == 3:\n", - " print(\"It's me again! 
We are inside the first if statement\")\n", - " print(\"It's optional to use parentheses for the condition a + b ==3\")\n", - " print(\"Don't forget to put a `:` at the end of the condition!!\")\n", - " if (c == \"Fish\"):\n", - " print(\"This is a second if statement inside the first one\")\n", - " print(\"I'm out of the second if statement, but still inside the first one\")\n", - "else:\n", - " print(\"This is the else part of the first if statement.\")\n", - " print(\"These lines will never be printed!\")\n", - "print(\"I'm not inside any of the if statements\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise \n", - "\n", - "Let's see another if statement example. Try to figure out what the output will be **BEFORE** running the cell below.\n", - "\n", - "Reminder, we declared\n", - "\n", - "```python\n", - "a = 1\n", - "b = 2\n", - "c = \"Fish\"\n", - " ```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Don't run me until you find the output first!\n", - "if a == 5:\n", - " print (\"1\")\n", - " if b == 1:\n", - " print(\"2\")\n", - "# here comes an else-if \n", - "elif a == 2 or c == \"Fish\":\n", - " print(\"3\")\n", - " \n", - " if b == 1:\n", - " print(\"4\")\n", - " if b == 2:\n", - " print(\"5\")\n", - " if b == 2:\n", - " print(\"6\")\n", - " if c == \"Fish\":\n", - " if a == 1:\n", - " if b == 100:\n", - " print(\"7\")\n", - " else:\n", - " print(\"8\")\n", - " elif a == 1:\n", - " print(\"9\")\n", - "print (\"10\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.2. Loops\n", - "\n", - "Let's talk about loops. 
The syntax for a while-loop is:\n", - "\n", - "```python\n", - "while condition:\n", - " inside the loop\n", - " inside the loop\n", - " inside the loop\n", - "outside the loop\n", - " ```\n", - " \n", - " A small example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "count = 0\n", - "while count < 3:\n", - " count += 1 # this is the same as count = count +1\n", - " print(f\"Count is {count}\")\n", - "print(\"Left the loop!\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For-loops iterate through sequences, in this way:\n", - "\n", - "```python\n", - "for x in sequence:\n", - " inside the loop\n", - " inside the loop\n", - " inside the loop\n", - "outside the loop\n", - "```\n", - " \n", - " An example is shown below:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Here is a basic list of strings\n", - "fish_list = [\"salmon\", \"trout\", \"parrot\", \"clown\", \"dory\"]\n", - "\n", - "#The for loop:\n", - "for fish in fish_list:\n", - " print(fish)\n", - " print(\"*\")\n", - "print(\"fish list over!\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "An incredibly useful built-in function to use in for loops is `range()`. Range allows you to create a sequence of integers from the start (default is 0), to the stop, with a given step size (default is 1). We can use `range()` in for loops as shown in the example below." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# \"default start is 0, default step size is 1\"\n", - "for number in range(7):\n", - " print (number)\n", - "print(\"**\")\n", - "\n", - "# now we also provide the start as 2.\n", - "# Default step size 1 is still used.\n", - "for number in range(2,7):\n", - " print(number)\n", - "print(\"**\")\n", - "\n", - "# now we also provide the step size as 2.\n", - "for number in range(2,7,2):\n", - " print(number)\n", - "print(\"**\") \n", - "\n", - "# what happens if step size is -1?\n", - "for number in range(6,-1,-1):\n", - " print(number)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "One more useful built-in function will be `enumerate()`. Let's go back to the fish list.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for fish in fish_list:\n", - " print(fish)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "What if I also want to keep track of the index of the list element? You can use `enumerate()` which creates a sequence of 2-tuples, where each tuple contains an integer index and an actual element of the original list. Here is what it looks like:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for item_index, fish in enumerate(fish_list):\n", - " print(f\"{item_index}: {fish}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Data Types and Basic Operations\n", - "\n", - "Python is a **dynamically typed** language. This means that the data type is inferred at run-time and can be changed during run-time. To check the type of a variable you can use the function `type()`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# var_1 is first defined as an integer\n", - "var_1 = 1\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "\n", - "# var_1's type is changed to string\n", - "var_1 = \"hi!\"\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "\n", - "# more types\n", - "var_1 = 0.312\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = 3.\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = 3+2j\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = True\n", - "print(f\"{var_1} is {type(var_1)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1. Type Casting\n", - "\n", - "Some examples of type casting in Python:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# From int to float\n", - "var_1 = 42\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = float(var_1)\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "print (\"**\")\n", - "\n", - "# From float to int\n", - "var_2 = 3.14\n", - "print(f\"{var_2} is {type(var_2)}\")\n", - "var_2 = int(var_2)\n", - "# This operation TRUNCATES toward zero, it does not round!\n", - "print(f\"{var_2} is {type(var_2)}\")\n", - "print (\"**\")\n", - "\n", - "# From string to int\n", - "var_3 = \"100\"\n", - "print(f\"{var_3} is {type(var_3)}\")\n", - "var_3 = int(var_3)\n", - "print(f\"{var_3} is {type(var_3)}\")\n", - "print(\"**\")\n", - "\n", - "# From float to string\n", - "var_4 = 1.23\n", - "print(f\"{var_4} is {type(var_4)}\")\n", - "var_4 = str(var_4)\n", - "print(f\"{var_4} is {type(var_4)}\")\n", - "print(\"**\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.2. Basic Operations\n", - "\n", - "Arithmetic operations are fairly standard. There are some examples below. 
\n", - "* Look out for the difference between `/` division and `//` integer division.\n", - "* `**` is used for power.\n", - "* `%` is modulo." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "a = 50\n", - "b = 7\n", - "\n", - "print(f\"a + b = {a + b}\")\n", - "print(f\"a - b = {a - b}\")\n", - "print(f\"a * b = {a * b}\")\n", - "print(f\"a / b = {a / b}\")\n", - "print(f\"a // b = {a // b}\") # integer division\n", - "print(f\"a ** b = {a ** b}\") # power\n", - "print(f\"a % b = {a % b}\") # modulo" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Boolean operations are also fairly standard:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(f\"(True and False) = {True and False}\")\n", - "print(f\"(True or False) = {True or False}\")\n", - "print(f\"((True and False) or True) = {(True and False) or True}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can declare strings with a single quote `'`, a double quote `\"` or three double quotes `\"\"\"`. A string declared with `\"\"\"` can span multiple lines. When it is the first statement of a function or class, it is known as a *docstring* and is used to document that function or class.\n", - "\n", - "**Note:** Throughout the exercises, we will be using f-strings to format our strings nicely. You can learn more about them [here](https://realpython.com/python-f-strings/)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "a = 'Life\\'s but a walking shadow, a poor player,' \n", - "print(a)\n", - "a = \"That struts and frets his hour upon the stage,\"\n", - "print(a)\n", - "a = \"\"\"And then is heard no more. 
It is a tale\n", - "Told by an idiot, full of sound and fury,\n", - "Signifying nothing.\"\"\"\n", - "print(a)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# The types of quotes do not change anything!\n", - "a = \"fish\" # double quote\n", - "b = 'fish' # single quote\n", - "print(a == b) # the string is the same!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4. Lists\n", - "\n", - "Lists are data types containing a sequence of values. The size of the list can change during run-time, as you add and remove elements from the list. \n", - "\n", - "Here is how you can create lists:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_a = [] # empty\n", - "print(f\"list_a {list_a}\")\n", - "\n", - "list_b = [1, 2, 3, 4] # 4 elements\n", - "print(f\"list_b {list_b}\")\n", - "\n", - "list_c = [1, 'cat', 0.23] # mixed types\n", - "print(f\"list_c {list_c}\")\n", - "\n", - "list_d = [1, ['cat', 'dog'], 2, 3] # list in list\n", - "print(f\"list_d {list_d}\")\n", - "\n", - "list_e = [1] * 10 # a list of 1s of length 10\n", - "print(f\"list_e {list_e}\")\n", - "\n", - "list_f = list(range(5)) # turns range object into a list\n", - "print(f\"list_f {list_f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Below we introduce some common operations with lists.\n", - "* Use `len(list1)` to find the length of the list.\n", - "* `list1.append(element)` to add an element to the end of the list.\n", - "* `list1.insert(index, element)` to add an element to an index in the list\n", - "* `list1.extend(list2)` to extend the elements of list1 with the elements of list2\n", - "* `list1.pop()` removes last element from the list\n", - "* `list1.pop(index)` removes the element at the given index\n", - "* `list1.remove(element)` removes the first instance of the given element" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Some common operations\n", - "b = [\"great\", \"minds\", \"think\", \"alike\"]\n", - "print(f\"b: {b}\")\n", - "\n", - "# finding the length\n", - "print(f\"length of b is {len(b)}\")\n", - "\n", - "# append element to list\n", - "b.append(\"sometimes\")\n", - "print(f\"b.append(\\\"sometimes\\\")= {b}\")\n", - "\n", - "# extend list\n", - "c = [\"-\", \"Abraham\", \"Lincoln\"]\n", - "b.extend(c)\n", - "print(f\"c: {c}\")\n", - "print(f\"b.extend(c) = {b}\")\n", - "\n", - "# removes element and specific index\n", - "b.pop(6) \n", - "print(f\"b.pop(6) = {b}\")\n", - "\n", - "# remove specific element\n", - "b.remove(\"Lincoln\") \n", - "b.remove(\"-\")\n", - "print(f\"b.remove(\\\"Lincoln\\\"); b.remove(\\\"-\\\") = {b}\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also check whether an element is in a list in the following way:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = [\"a\", \"b\", \"c\"]\n", - "if \"b\" in list_1:\n", - " print(\"\\\"b\\\" is in list\")\n", - "else:\n", - " print(\"\\\"b\\\" is not in list\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.1. List Indexing and Slicing:\n", - "\n", - "You can extract a single element from a list in the following way:\n", - "`list1[index]`\n", - "\n", - "In lists, the indices start from 0. You can also index elements from the end of the list to the beginning by $-1, -2, -3...$. Check out the image below for the example list:\n", - "\n", - "`list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "* You can extract multiple elements by slicing. 
This will give you elements from the start up to **(but not including)** the end index.\n", - "\n", - " `list1[start_index:end_index]`\n", - "\n", - "\n", - "* If you do not specify the `start_index`, you will retrieve the elements from index $0$ up to the `end_index`.\n", - "\n", - " `list1[:end_index]` is the same as `list1[0:end_index]`\n", - "\n", - "\n", - "* If you do not specify the `end_index`, you will retrieve the elements from the `start_index` up to (and **including**) the end of the list.\n", - "\n", - " `list1[start_index:]`\n", - "\n", - "\n", - "* You can provide a step size.\n", - " `list1[start_index:end_index:step_size]`\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "Try to write the output of the following code **BEFORE** running the cell." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Don't run BEFORE you solve it!\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "\n", - "print(f\"list_1[-3] = {list_1[-3]}\")\n", - "print(f\"list_1[0:2] = {list_1[0:2]}\")\n", - "print(f\"list_1[:4:2] = {list_1[:4:2]}\")\n", - "print(f\"list_1[::-1] = {list_1[::-1]}\")\n", - "print(f\"list_1[-4:-1] = {list_1[-4:-1]}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also assign new values to indices using slicing. Here is an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "\n", - "list_1[-1]= \"<3\"\n", - "print(list_1)\n", - "\n", - "list_1[0:2] = [\"x\", \"y\"]\n", - "print(list_1)\n", - "\n", - "list_1[::2] = [\":)\",\":(\", \":O\"]\n", - "print(list_1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.2. Copying\n", - "\n", - "We have one last thing to say about lists. 
Observe the behaviour of the following code:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 1:\n", - "\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before {list_1}\")\n", - "\n", - "list_2 = list_1\n", - "list_2.append(\"Z\")\n", - "\n", - "print(f\"list_1 after {list_1}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 2:\n", - "\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before function {list_1}\")\n", - "\n", - "def function_that_changes_list(input_list):\n", - " input_list.append(\"Z\")\n", - "\n", - "function_that_changes_list(list_1)\n", - "\n", - "print(f\"list_1 after function {list_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We never changed list_1 explicitly, but the values changed anyway. What's going on?\n", - "\n", - "Well, in Python, when you say `list_2 = list_1`, you are not actually creating a new list, you are only copying the **reference** to the same list. This means that they are actually two variables pointing to the same list! So when you change the values of `list_2`, the values of `list_1` also change (since they are referring to the same list). Something similar is at play when you pass this list to a function. So be careful!\n", - "\n", - "If you do not want this to happen, you can use the function `.copy()` to create a new object with the same values. \n", - "\n", - "#### Exercise\n", - "\n", - "Change the code below and fix the two cases given above using the `.copy()` function. Make sure the contents of `list_1` do not change." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 1\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before {list_1}\")\n", - "\n", - "list_2 = list_1\n", - "list_2.append(\"Z\")\n", - "\n", - "print(f\"list_1 after {list_1}\")\n", - "print(\"**\")\n", - "\n", - "# Case 2\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before function {list_1}\")\n", - "\n", - "def function_that_changes_list(input_list):\n", - " input_list.append(\"Z\")\n", - "\n", - "list_2 = list_1\n", - "function_that_changes_list(list_2)\n", - "\n", - "print(f\"list_1 after function {list_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "Now that we know how lists work, here is a quick exercise for you. Fill in the function below that takes a list and returns True if it is a palindrome, False if it is not. Palindromes are defined as sequences that read the same forwards and backwards.\n", - "Examples of palindrome lists:\n", - "* [\"cat\", \"dog\", \"fish\", \"dog\", \"cat\"]\n", - "* [0, 1, 2, 3, 3, 2, 1, 0]\n", - "* [1]\n", - "* []\n", - "\n", - "You may use a for-loop in this exercise. 
However, if you're feeling ambitious try to do it in 1 line, without using a for-loop (hint: use slicing)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def function_is_palindrome(input_list):\n", - " is_palindrome = True\n", - " # Your code here\n", - " return is_palindrome" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_list_1 = [\"cat\", \"dog\", \"fish\", \"dog\", \"cat\"]\n", - "res_1 = function_is_palindrome(test_list_1)\n", - "\n", - "test_list_2 = [\"cat\", \"dog\", \"fish\", \"bird\", \"dog\", \"cat\"]\n", - "res_2 = function_is_palindrome(test_list_2)\n", - "\n", - "test_list_3 = [\"cat\"]\n", - "res_3 = function_is_palindrome(test_list_3)\n", - "\n", - "test_list_4 = [\"cat\", \"cat\"]\n", - "res_4 = function_is_palindrome(test_list_4)\n", - "\n", - "if not (res_1 and not res_2 and res_3 and res_4):\n", - " print(\"Test failed\")\n", - "else:\n", - " print(\"Correct! :)\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5. Tuples\n", - "\n", - "Tuples are similar to lists but they are fixed in size and **immutable**, which means that change is not allowed.\n", - "We declare tuples in the following way using parentheses`()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tuple_1 = (\"wash\", \"your\", \"hands\", \"with\", \"soap\")\n", - "\n", - "print(f\"tuple_1 = {tuple_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since change is not allowed, observe the result of the following piece of code." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tuple_1[2] = (\"face\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can typecast from list to tuple and vice versa! 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sequence_1 = [\"here\", \"comes\", \"the\", \"sun\"]\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")\n", - "\n", - "\n", - "# from list to tuple\n", - "sequence_1 = tuple(sequence_1)\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")\n", - "\n", - "#from tuple to list\n", - "sequence_1 = list(sequence_1)\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6. Dictionaries\n", - "\n", - "An incredibly useful data type to know, you might also know dictionaries as \"hash maps\". Dictionaries are collections of \"key: value\" pairs. You can access the values using the keys in $O(1)$ time.\n", - "\n", - "The keys of a dictionary must be **immutable** and **unique**. Below we show how to define a dictionary.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "shopping_list = {\"apples\": 3, \"pears\":2, \"eggs\":6, \"bread\":1, \"yogurt\":1}\n", - "print(f\"shopping_list = {shopping_list}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "book_dict = {}\n", - "print(f\"book_dict = {book_dict}\")\n", - "#add key value pairs\n", - "book_dict[\"vonnegut\"] = \"cat\\'s cradle\"\n", - "book_dict[\"ishiguro\"] = \"never let me go\"\n", - "print(f\"book_dict = {book_dict}\")\n", - "print(\"**\")\n", - "\n", - "# we can retrieve the dict keys:\n", - "print(book_dict.keys())\n", - "# and the dict values:\n", - "print(book_dict.values())\n", - "print(\"**\")\n", - "\n", - "#we can also iterate through the dict keys and values with a for-loop\n", - "for key, value in book_dict.items():\n", - " print(f\"{key} : {value}\")\n", - "\n", - "print(\"**\")\n", - "#we can modify the value of a key\n", - "book_dict[\"ishiguro\"] = \"a pale view of hills\"\n", - "print(f\"modified book_dict = {book_dict}\")\n", - 
"print(\"**\")\n", - "\n", - "#and we can remove a key completely\n", - "removed_value = book_dict.pop(\"ishiguro\")\n", - "print(f\"book_dict with removed value = {book_dict}\")\n", - "print(f\"removed_value = {removed_value}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7. Functions\n", - "\n", - "You can define a function in Python in the following way:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def multiply(a, b):\n", - " return a * b\n", - "\n", - "print(f\"multiply(100, 2) = {multiply(100, 2)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can have default arguments by specifying their default value in the parameters." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def add(a, b, c=0, d=1):\n", - " return a + b + c + d\n", - "\n", - "# use no default arguments\n", - "print(f\"add(1, 2, 100, 1000) = {add(1, 2, 100, 1000)}\")\n", - "\n", - "# use the default value of d\n", - "print(f\"add(1, 2, 100) = {add(1, 2, 100)}\")\n", - "\n", - "# use the default value of c and d\n", - "print(f\"add(1, 2) = {add(1, 2)}\")\n", - "\n", - "# use the default value of c\n", - "print(f\"add(1, 2, d=1000) = {add(1, 2, d=1000)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "A function can return multiple values in a tuple. You can assign the values of the tuple to separate variables. This is called **tuple unpacking**." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def min_max(input_list):\n", - " return min(input_list), max(input_list)\n", - "\n", - "\n", - "test_list = [1,2,3,4]\n", - "min_val, max_val = min_max(test_list)\n", - "print(f\"min_val: {min_val}, max_val: {max_val}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note: You have seen tuple unpacking when using function `enumerate` in for-loop." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.1. Common Built-in Functions\n", - "\n", - "Here we introduce some nifty commonly used built-in functions. \n", - "\n", - "* You already learned `range()`, `enumerate()`!\n", - "* We have also seen `type()` to return the type of the object. We use `str()`, `int()`, `float()`, `list()`, `tuple()` for typecasting.\n", - "* The functions `len()`, `sum()`, `min()`, `max()`, `any()`, `all()`, `sorted()`, `zip()` are useful for lists and tuples.\n", - "\n", - "Let's see them in action below" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = list(range(5))\n", - "print(f\"list_1 = {list_1}\")\n", - "\n", - "print(f\"len(list_1) = {len(list_1)}\")\n", - "print(f\"sum(list_1) = {sum(list_1)}\")\n", - "print(f\"min(list_1) = {min(list_1)}\")\n", - "print(f\"max(list_1) = {max(list_1)}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "list_2 = [5,3,1,2,0,6]\n", - "print(f\"list_2 = {list_2}\")\n", - "print(f\"sorted(list_2) = {sorted(list_2)}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "# any checks whether there are any 1s in the list (OR)\n", - "# all checks whether all elements are 1s. 
(AND)\n", - "# in Python: 1 = True, 0 = False\n", - "list_3 = [1, 1, 1]\n", - "print(f\"list_3 = {list_3}\")\n", - "print(f\"any(list_3) = {any(list_3)}\")\n", - "print(f\"all(list_3) = {all(list_3)}\")\n", - "\n", - "list_4 = [0, 1, 1]\n", - "print(f\"list_4 = {list_4}\")\n", - "print(f\"any(list_4) = {any(list_4)}\")\n", - "print(f\"all(list_4) = {all(list_4)}\")\n", - "\n", - "list_5 = [0, 0, 0]\n", - "print(f\"list_5 = {list_5}\")\n", - "print(f\"any(list_5) = {any(list_5)}\")\n", - "print(f\"all(list_5) = {all(list_5)}\")\n", - "print(\"**\")\n", - "\n", - "# zip function:\n", - "x = [1,2,3]\n", - "y = [4,5,6]\n", - "zipped = zip(x,y)\n", - "print(f\"x {x}\")\n", - "print(f\"y {y}\")\n", - "print(f\"zipped {list(zipped)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.2. List Comprehensions\n", - "\n", - "One of the most practical things about Python is that you can do many things on just a single line. One popular example is the so-called *list comprehension*, a specific syntax to create and initialize lists of objects. Here are some examples.\n", - "\n", - "The syntax for a list comprehension is shown below:\n", - "`[thing for thing in list]`\n", - "\n", - "Let's make it more concrete with an example."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_of_numbers = [1, 2, 3, 101, 102, 103]\n", - "print(f\"list_of_numbers = {list_of_numbers}\")\n", - "\n", - "#I want to create a new list with all these items doubled.\n", - "doubled_list = [2 * elem for elem in list_of_numbers]\n", - "print(f\"doubled_list = {doubled_list}\")\n", - "\n", - "#A new list with all these items as floats\n", - "float_list = [float(elem) for elem in list_of_numbers]\n", - "print(f\"float_list = {float_list}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's make it more interesting by adding an if in there:\n", - "\n", - "`[thing for thing in list if condition]`\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#I want to create a new list with all these items doubled\n", - "#IF the element is above 100\n", - "conditional_doubled_list = [2 * elem for elem in list_of_numbers if elem > 100]\n", - "print(f\"conditional_doubled_list = {conditional_doubled_list}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "You will be given a list of vocabulary words. Your task is to use list comprehensions to iterate through a document and create a new list including the words that are included in the vocabulary. 
You don't need to worry about duplicates.\n", - "\n", - "Example: \n", - "```python\n", - "vocabulary = [\"a\", \"c\", \"e\"]\n", - "document = [\"a\", \"b\", \"c\", \"d\"]\n", - "new_list = [\"a\", \"c\"]\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vocabulary = ['epfl', 'europe', 'swiss', 'switzerland', 'best', 'education', 'high', 'higher', 'research', 'school', 'science', 'students', 'technology', 'top-tier', 'university']\n", - "\n", - "document = \"\"\"The École polytechnique fédérale de Lausanne (EPFL) is a research institute\n", - "and university in Lausanne, Switzerland, that specializes in natural sciences and engineering.\n", - "It is one of the two Swiss Federal Institutes of Technology, and it has three main missions: \n", - "education, research and technology transfer at the highest international level. EPFL is widely regarded \n", - "as a world leading university. The QS World University Rankings ranks EPFL 12th in the world \n", - "across all fields in their 2017/2018 ranking, whilst Times Higher Education World \n", - "University Rankings ranks EPFL as the world's 11th best school for Engineering and Technology.\"\"\"\n", - "document_parsed = document.split()\n", - "document_parsed = [word.lower() for word in document_parsed]\n", - "new_list = []\n", - "\n", - "#your code here\n", - "new_list = ...\n", - "\n", - "#We convert the list to a set and then back to a list. We do this because converting it to a set automatically\n", - "#removes duplicates (since sets are unordered collections that do not contain duplicates).
afterwards we sort it.\n", - "new_list = sorted(list(set(new_list)))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "correct_result = ['best', 'education', 'epfl', 'higher', 'research', 'school', 'swiss', 'technology', 'university']\n", - "\n", - "if new_list == correct_result:\n", - " print (\"Correct! :)\")\n", - "else:\n", - " print (\"Incorrect :(\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "tags": [] - }, - "source": [ - "## 8. (optional now - useful later on) Object-Oriented Programming\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Object-oriented programming is a programming paradigm that provides a means of structuring programs so that properties and behaviors are bundled into individual objects.\n", - "\n", - "For this end, we use classes. Classes are used to create user-defined data structures. Classes define functions called methods, which identify the behaviors and actions that an object created from the class can perform with its data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "All class definitions start with the class keyword, which is followed by the name of the class and a colon. Any code that is indented below the class definition is considered part of the class’s body. To start, let's declare an `EPFL_faculty` class." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "class EPFL_faculty:\n", - "\n", - " def __init__(self, name, number_of_students):\n", - " self.name = name\n", - " self.number_of_students = number_of_students\n", - "\n", - " # Instance method\n", - " def description(self):\n", - " return f\"The faculty {self.name} has {self.number_of_students} students\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "While the class is the blueprint, an instance is an object that is built from a class and contains real data. `.__init__()` sets the initial state of the object by assigning the values of the object’s properties. That is, `.__init__()` initializes each new instance of the class." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sti = EPFL_faculty(\"STI\", 500)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that the instance `sti` has been created, we can call the method `description()`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sti.description()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Inheritance is the process by which one class takes on the attributes and methods of another. Newly formed classes are called child classes, and the classes that child classes are derived from are called parent classes. Child classes inherit the parent's attributes and methods, but they can override methods. Let's define a class `EPFL_section` that inherits from `EPFL_faculty` and that overrides the method `description(self)`."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "class EPFL_section(EPFL_faculty):\n", - " \n", - " def description(self):\n", - " return f\"The section {self.name} has {self.number_of_students} students\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "genie_mechanique = EPFL_section(\"GM\", 200)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "genie_mechanique.description()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "__Disclaimer__: This part of the tutorial and its text was inspired by and taken from \"Object-Oriented Programming (OOP) in Python 3\" by David Amos. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Additional OOP resources" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For further information on object-oriented programming and classes in Python, here are two useful resources:\n", - "* Object-Oriented Programming (OOP) in Python 3: https://realpython.com/python3-object-oriented-programming/\n", - "* Classes from the official Python Tutorial: https://docs.python.org/3/tutorial/classes.html" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 9. (optional now - useful later on) Matplotlib" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Perhaps the most widely used plotting library in Python is Matplotlib. If you've ever used MATLAB, you'll find that the functions look pretty similar. \n", - "\n", - "In the following exercise sessions, we won't ask you to do any plotting. So this part is optional for those who are interested in having a short introduction.\n", - "\n", - "First, we will import Matplotlib." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Importing in Python\n", - "\n", - "* A short note on importing: to be able to use modules in our code, we import them. \n", - " \n", - " example: `import numpy`\n", - " \n", - "\n", - "* We can also select a name for the imported module.\n", - " \n", - " example: `import numpy as np`. Now when we call numpy functions, we will always use `np.` as a prefix, i.e. `np.zeros()`\n", - " \n", - "\n", - "* You can also choose to only import selected functions/variables/classes from the module. \n", - " \n", - " example: `from numpy import arange`. Now you can use this function as `arange(5)`. You cannot use any other functions from the numpy module as you did not import them." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# To import Matplotlib we do:\n", - "\n", - "import matplotlib.pyplot as plt" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's do some plotting! \n", - "\n", - "Let's start with the simplest of plots, the good old line-plot. 
The function we will use is `plot()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y = [2,3,5,1,0,2,3,0,0,1]\n", - "\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "#specifying a color and marker are optional.\n", - "#check out the documentation to see what else you can do with the plot function\n", - "plt.plot(x, y, marker=\"*\", color=\"r\")\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"just a random plot\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "You can plot two lines on top of one another by calling the `plt.plot()` function consecutively. Try to implement this! Also, specify the parameter `label` of the `plt.plot()` function and call the function `plt.legend()` to create a legend for your graph. It should look like the figure shown below." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![result](images/two_lines_plot.png)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y1 = [2,3,5,1,0,2,3,0,0,1]\n", - "y2 = [1,2,3,5,1,0,2,3,0,0]\n", - "\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "#your code here\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can create scatter plots (line plots without lines) with `scatter()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y1 = [2,3,5,1,0,2,3,0,0,1]\n", - "\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "p1 = plt.scatter(x, y1, marker=\"*\", color=\"r\")\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"just a random scatter plot\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And you can read and display images with `imread()` and `imshow()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "im = plt.imread(\"images/krabby_patty.jpg\")\n", - "plt.imshow(im)\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"krabby patty\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And that's all for this exercise! 
If you have any problems, just ask (or even Google) them. You can check out the official Python tutorials for further learning.\n", - "\n", - "https://docs.python.org/3/tutorial/" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.10" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/Exercises/01-python/.ipynb_checkpoints/intro_to_python_Sol-checkpoint.ipynb b/Exercises/01-python/.ipynb_checkpoints/intro_to_python_Sol-checkpoint.ipynb deleted file mode 100644 index eba5588..0000000 --- a/Exercises/01-python/.ipynb_checkpoints/intro_to_python_Sol-checkpoint.ipynb +++ /dev/null @@ -1,1565 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Introduction to Python - Solutions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "This notebook was developed for the CS-233 Introduction to Machine Learning course at EPFL, adapted for the CIVIL-226 Introduction to Machine Learning for Engineers course, and re-adapted for the ME-390. We thank contributors in CS-233 ([CVLab](https://www.epfl.ch/labs/cvlab)) and CIVIL-226 ([VITA](https://www.epfl.ch/labs/vita/)).\n", - " \n", - "**Author(s):** [Sena Kiciroglu](mailto:sena.kiciroglu@epfl.ch), minor changes by [Tom Winandy](mailto:tom.winandy@epfl.ch) and [David Mizrahi](mailto:david.mizrahi@epfl.ch)\n", - "
\n", - "\n", - "Welcome to the first exercise of Introduction to Machine Learning. Today we will get familiar with Python, the language we will use for all the exercises of this course. \n", - "\n", - "This week we will introduce some important concepts in the basics of Python. Next week, you will learn how to work with NumPy, a popular Python library used for scientific computing. \n", - "\n", - "Python is a popular language to use for machine learning tasks. This is especially true because of the selection of **libraries and frameworks**, developed specifically for machine learning and scientific computing. To name a few, you have Keras, TensorFlow and PyTorch for developing neural networks, SciPy and NumPy used for scientific computing, Pandas for data analysis, etc. (You might also get to dabble in PyTorch in the upcoming weeks.)\n", - "\n", - "Python also allows you to write quick, readable, high-level code. It's great for fast prototyping. \n", - "\n", - "You can find a useful Python cheatsheet at: https://www.pythoncheatsheet.org/\n", - "\n", - "Let's get into it!\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1. Anaconda and Jupyter Notebook\n", - "\n", - "If you're reading this Jupyter notebook as it is intended, chances are you already installed Anaconda, a Python distribution that comes with its own package management system, `conda`. Using `conda`, you can install and upgrade software packages and libraries. It will make managing the versions of the libraries you use very convenient.\n", - "\n", - "In these exercises we will use Jupyter Notebooks, which contain Python code, text explanations and visuals. \n", - "\n", - "The Jupyter Notebook document (such as the one you are looking at right now) consists of cells containing Python code, text or other content. 
You can run each cell by clicking on the button `Run` in the top toolbar, or you can use a keyboard shortcut `Ctrl` + `Enter` (run current cell) or `Shift` + `Enter` (run current cell and move to the cell below)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2. Indentation and Control Flow" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally we get to start doing some coding!\n", - "\n", - "First thing to know: Python does not separate different lines of code with a semicolon `;`. So just RUN the following cell with no worries." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# This is a Python comment. Start the line with `#` for a comment\n", - "print(\"First line of code. I will declare some variables\")\n", - "a = 1 # second line!!\n", - "b = 2\n", - "c = \"Fish\"\n", - "print(f\"My variables are: a = {a}, b = {b}, c = {c}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Easy! However, in Python you have to be careful and have perfect indentation (a reason why Python code is so readable). The reason is, Python uses indentation to keep track of what is part of the if statement, the loops and the functions. This is different from Java (this is assuming you know Java) where you would have curly brackets `{ }` for this purpose. \n", - "\n", - "Let's start with the if statement.\n", - "\n", - "### 2.1. If Statement\n", - "\n", - "The rule is, all indented parts after the `if condition :` belong to that branch of the if statement. 
\n", - "\n", - "```python\n", - "if condition :\n", - " inside the statement\n", - " still inside the statement\n", - "elif condition:\n", - " inside the else-if part of the statement\n", - "else:\n", - " inside the else part of the statement\n", - "outside the statement\n", - " ```\n", - " \n", - "Let's see it in action:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "if a + b == 3:\n", - " print(\"It's me again! We are inside the first if statement\")\n", - " print(\"It's optional to use parentheses for the condition a + b ==3\")\n", - " print(\"Don't forget to put a `:` at the end of the condition!!\")\n", - " if (c == \"Fish\"):\n", - " print(\"This is a second if statement inside the first one\")\n", - " print(\"I'm out of the second if statement, but still inside the first one\")\n", - "else:\n", - " print(\"This is the else part of the first if statement.\")\n", - " print(\"These lines will never be printed!\")\n", - "print(\"I'm not inside any of the if statements\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "Let's see another if statement example. 
Try to figure out what the output will be **BEFORE** running the cell below.\n", - "\n", - "Reminder, we declared\n", - "\n", - "```python\n", - "a = 1\n", - "b = 2\n", - "c = \"Fish\"\n", - " ```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Don't run me until you find the output first!\n", - "if a == 5:\n", - " print (\"1\")\n", - " if b == 1:\n", - " print(\"2\")\n", - "# here comes an else-if \n", - "elif a == 2 or c == \"Fish\":\n", - " print(\"3\")\n", - " \n", - " if b == 1:\n", - " print(\"4\")\n", - " if b == 2:\n", - " print(\"5\")\n", - " if b == 2:\n", - " print(\"6\")\n", - " if c == \"Fish\":\n", - " if a == 1:\n", - " if b == 100:\n", - " print(\"7\")\n", - " else:\n", - " print(\"8\")\n", - " elif a == 1:\n", - " print(\"9\")\n", - "print (\"10\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.3. Loops\n", - "\n", - "Let's talk about loops. The syntax for a while-loop is:\n", - "\n", - "```python\n", - "while condition:\n", - " inside the loop\n", - " inside the loop\n", - " inside the loop\n", - "outside the loop\n", - " ```\n", - " \n", - " A small example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "count = 0\n", - "while count < 3:\n", - " count += 1 # this is the same as count = count +1\n", - " print(f\"Count is {count}\")\n", - "print(\"Left the loop!\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For-loops iterate through sequences, in this way:\n", - "\n", - "```python\n", - "for x in sequence:\n", - " inside the loop\n", - " inside the loop\n", - " inside the loop\n", - "outside the loop\n", - "```\n", - " \n", - " An example is shown below:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Here is a basic list of strings\n", - "fish_list = [\"salmon\", \"trout\", 
\"parrot\", \"clown\", \"dory\"]\n", - "\n", - "#The for-loop:\n", - "for fish in fish_list:\n", - " print(fish)\n", - " print(\"*\")\n", - "print(\"fish list over!\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "An incredibly useful built-in function to use in for-loops is `range()`. Range allows you to create a sequence of integers from the start (default is 0), to the stop, with a given step size (default is 1). We can use `range()` in for-loops as shown in the example below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# \"default start is 0, default step size is 1\"\n", - "for number in range(7):\n", - " print (number)\n", - "print(\"**\")\n", - "\n", - "# now we also provide the start as 2.\n", - "# Default step size 1 is still used.\n", - "for number in range(2,7):\n", - " print(number)\n", - "print(\"**\")\n", - "\n", - "# now we also provide the step size as 2.\n", - "for number in range(2,7,2):\n", - " print(number)\n", - "print(\"**\") \n", - "\n", - "# what happens if step size is -1?\n", - "for number in range(6,-1,-1):\n", - " print(number)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "One more useful built-in function will be `enumerate()`. Let's go back to the fish list.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for fish in fish_list:\n", - " print(fish)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "What if I also want to keep track of the index of the list element? You can use `enumerate()` which creates a sequence of 2-tuples, where each tuple contains an integer index and an actual element of the original list. 
Here is what it looks like:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "for item_index, fish in enumerate(fish_list):\n", - " print(f\"{item_index}: {fish}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Data Types and Basic Operations\n", - "\n", - "Python is a **dynamically typed** language. This means that the data type is inferred at run-time and can be changed during run-time. To check the type of a variable you can use the function `type()`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# var_1 is first defined as an integer\n", - "var_1 = 1\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "\n", - "# var_1's type is changed to string\n", - "var_1 = \"hi!\"\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "\n", - "# more types\n", - "var_1 = 0.312\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = 3.\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = 3+2j\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = True\n", - "print(f\"{var_1} is {type(var_1)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1.
Type Casting\n", - "\n", - "Some examples of type casting in Python:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# From int to float\n", - "var_1 = 42\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "var_1 = float(var_1)\n", - "print(f\"{var_1} is {type(var_1)}\")\n", - "print (\"**\")\n", - "\n", - "# From float to int\n", - "var_2 = 3.14\n", - "print(f\"{var_2} is {type(var_2)}\")\n", - "var_2 = int(var_2)\n", - "# int() truncates toward zero (drops the fractional part); it does not round!\n", - "print(f\"{var_2} is {type(var_2)}\")\n", - "print (\"**\")\n", - "\n", - "# From string to int\n", - "var_3 = \"100\"\n", - "print(f\"{var_3} is {type(var_3)}\")\n", - "var_3 = int(var_3)\n", - "print(f\"{var_3} is {type(var_3)}\")\n", - "print(\"**\")\n", - "\n", - "# From float to string\n", - "var_4 = 1.23\n", - "print(f\"{var_4} is {type(var_4)}\")\n", - "var_4 = str(var_4)\n", - "print(f\"{var_4} is {type(var_4)}\")\n", - "print(\"**\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.2. Basic Operations\n", - "\n", - "Arithmetic operations are fairly standard. There are some examples below. \n", - "* Look out for the difference between `/` division and `//` integer division.\n", - "* `**` is used for power.\n", - "* `%` is modulo."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "a = 50\n", - "b = 7\n", - "\n", - "print(f\"a + b = {a + b}\")\n", - "print(f\"a - b = {a - b}\")\n", - "print(f\"a * b = {a * b}\")\n", - "print(f\"a / b = {a / b}\")\n", - "print(f\"a // b = {a // b}\") # integer division\n", - "print(f\"a ** b = {a ** b}\") # power\n", - "print(f\"a % b = {a % b}\") # modulo" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Boolean operations are also fairly standard:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(f\"(True and False) = {True and False}\")\n", - "print(f\"(True or False) = {True or False}\")\n", - "print(f\"((True and False) or True) = {(True and False) or True}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can declare strings with a single quote `'`, a double quote `\"` or three double quotes `\"\"\"`. A string declared with `\"\"\"` can span multiple lines; it is commonly used to write *docstrings*, which document functions and classes.\n", - "\n", - "**Note:** Throughout the exercises, we will be using f-strings to format our strings nicely. You can learn more about them [here](https://realpython.com/python-f-strings/)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "a = 'Life\\'s but a walking shadow, a poor player,' \n", - "print(a)\n", - "a = \"That struts and frets his hour upon the stage,\"\n", - "print(a)\n", - "a = \"\"\"And then is heard no more.
It is a tale\n", - "Told by an idiot, full of sound and fury,\n", - "Signifying nothing.\"\"\"\n", - "print(a)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# The types of quotes do not change anything!\n", - "a = \"fish\" # double quote\n", - "b = 'fish' # single quote\n", - "print(a == b) # the string is the same!" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4. Lists\n", - "\n", - "Lists are data types containing a sequence of values. The size of the list can change during run-time, as you add and remove elements from the list. \n", - "\n", - "Here is how you can create lists:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_a = [] # empty\n", - "print(f\"list_a {list_a}\")\n", - "\n", - "list_b = [1, 2, 3, 4] # 4 elements\n", - "print(f\"list_b {list_b}\")\n", - "\n", - "list_c = [1, 'cat', 0.23] # mixed types\n", - "print(f\"list_c {list_c}\")\n", - "\n", - "list_d = [1, ['cat', 'dog'], 2, 3] # list in list\n", - "print(f\"list_d {list_d}\")\n", - "\n", - "list_e = [1] * 10 # a list of 1s of length 10\n", - "print(f\"list_e {list_e}\")\n", - "\n", - "list_f = list(range(5)) # turns range object into a list\n", - "print(f\"list_f {list_f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Below we introduce some common operations with lists.\n", - "* Use `len(list1)` to find the length of the list.\n", - "* `list1.append(element)` to add an element to the end of the list.\n", - "* `list1.insert(index, element)` to add an element to an index in the list\n", - "* `list1.extend(list2)` to extend the elements of list1 with the elements of list2\n", - "* `list1.pop()` removes last element from the list\n", - "* `list1.pop(index)` removes the element at the given index\n", - "* `list1.remove(element)` removes the first instance of the given element" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Some common operations\n", - "b = [\"great\", \"minds\", \"think\", \"alike\"]\n", - "print(f\"b: {b}\")\n", - "\n", - "# finding the length\n", - "print(f\"length of b is {len(b)}\")\n", - "\n", - "# append element to list\n", - "b.append(\"sometimes\")\n", - "print(f\"b.append(\\\"sometimes\\\")= {b}\")\n", - "\n", - "# extend list\n", - "c = [\"-\", \"Abraham\", \"Lincoln\"]\n", - "b.extend(c)\n", - "print(f\"c: {c}\")\n", - "print(f\"b.extend(c) = {b}\")\n", - "\n", - "# removes element and specific index\n", - "b.pop(6) \n", - "print(f\"b.pop(6) = {b}\")\n", - "\n", - "# remove specific element\n", - "b.remove(\"Lincoln\") \n", - "b.remove(\"-\")\n", - "print(f\"b.remove(\\\"Lincoln\\\"); b.remove(\\\"-\\\") = {b}\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also check whether an element is in a list in the following way:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = [\"a\", \"b\", \"c\"]\n", - "if \"b\" in list_1:\n", - " print(\"\\\"b\\\" is in list\")\n", - "else:\n", - " print(\"\\\"b\\\" is not in list\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.1. List Indexing and Slicing:\n", - "\n", - "You can extract a single element from a list in the following way:\n", - "`list1[index]`\n", - "\n", - "In lists, the indices start from 0. You can also index elements from the end of the list to the beginning by $-1, -2, -3...$. Check out the image below for the example list:\n", - "\n", - "`list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "* You can extract multiple elements by slicing. 
This will give you elements from the start up to **(but not including)** the end index.\n", - "\n", - " `list1[start_index:end_index]`\n", - "\n", - "\n", - "* If you do not specify the `start_index`, you will retrieve the elements from index $0$ up to the `end_index`.\n", - "\n", - " `list1[:end_index]` is the same as `list1[0:end_index]`\n", - "\n", - "\n", - "* If you do not specify the `end_index`, you will retrieve the elements from the `start_index` up to (and **including**) the end of the list.\n", - "\n", - " `list1[start_index:]`\n", - "\n", - "\n", - "* You can provide a step size.\n", - " `list1[start_index:end_index:step_size]`\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "Try to write the output of the following code **BEFORE** running the cell." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Don't run BEFORE you solve it!\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "\n", - "print(f\"list_1[-3] = {list_1[-3]}\")\n", - "print(f\"list_1[0:2] = {list_1[0:2]}\")\n", - "print(f\"list_1[:4:2] = {list_1[:4:2]}\")\n", - "print(f\"list_1[::-1] = {list_1[::-1]}\")\n", - "print(f\"list_1[-4:-1] = {list_1[-4:-1]}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also assign new values to indices using slicing. Here is an example:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "\n", - "list_1[-1]= \"<3\"\n", - "print(list_1)\n", - "\n", - "list_1[0:2] = [\"x\", \"y\"]\n", - "print(list_1)\n", - "\n", - "list_1[::2] = [\":)\",\":(\", \":O\"]\n", - "print(list_1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 4.2. Copying\n", - "\n", - "We have one last thing to say about lists. 
Observe the behaviour of the following code:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 1:\n", - "\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before {list_1}\")\n", - "\n", - "list_2 = list_1\n", - "list_2.append(\"Z\")\n", - "\n", - "print(f\"list_1 after {list_1}\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 2:\n", - "\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before function {list_1}\")\n", - "\n", - "def function_that_changes_list(input_list):\n", - " input_list.append(\"Z\")\n", - "\n", - "function_that_changes_list(list_1)\n", - "\n", - "print(f\"list_1 after function {list_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We never changed list_1 explicitly, but the values changed anyway. What's going on?\n", - "\n", - "Well, in Python, when you say `list_2 = list_1`, you are not actually creating a new list, you are only copying the **reference** to the same list. This means that they are actually two variables pointing to the same list! So when you change the values of `list_2`, the values of `list_1` also change (since they are referring to the same list). Something similar is at play when you pass this list to a function. So be careful!\n", - "\n", - "If you do not want this to happen, you can use the function `.copy()` to create a new object with the same values. \n", - "\n", - "#### Exercise\n", - "\n", - "Change the code below and fix the two cases given above using the `.copy()` function. Make sure the contents of `list_1` do not change." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Case 1\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before {list_1}\")\n", - "\n", - "list_2 = list_1.copy()\n", - "list_2.append(\"Z\")\n", - "\n", - "print(f\"list_1 after {list_1}\")\n", - "print(\"**\")\n", - "\n", - "# Case 2\n", - "list_1 = [\"a\", \"b\", \"c\", \"d\", \"e\"]\n", - "print(f\"list_1 before function {list_1}\")\n", - "\n", - "def function_that_changes_list(input_list):\n", - " input_list.append(\"Z\")\n", - "\n", - "list_2 = list_1.copy()\n", - "function_that_changes_list(list_2)\n", - "\n", - "print(f\"list_1 after function {list_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "Now that we know how lists work, here is a quick exercise for you. Fill in the function below that takes a list and returns True if it is a palindrome, False if it is not. Palindromes are defined as sequences that read the same forwards and backwards.\n", - "Examples of palindrome lists:\n", - "* [\"cat\", \"dog\", \"fish\", \"dog\", \"cat\"]\n", - "* [0, 1, 2, 3, 3, 2, 1, 0]\n", - "* [1]\n", - "* []\n", - "\n", - "You may use a for-loop in this exercise. 
However, if you're feeling ambitious, try to do it in one line, without using a for-loop (hint: use slicing)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# with for-loop\n", - "def function_is_palindrome(input_list):\n", - "    is_palindrome = True\n", - "    \n", - "    # Your code here\n", - "    len_of_list = len(input_list) // 2\n", - "    for element in range(len_of_list):\n", - "        if input_list[element] != input_list[-element-1]:\n", - "            is_palindrome = False\n", - "    return is_palindrome" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# without for-loop\n", - "def function_is_palindrome(input_list):\n", - "    return input_list == input_list[::-1]" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "test_list_1 = [\"cat\", \"dog\", \"fish\", \"dog\", \"cat\"]\n", - "res_1 = function_is_palindrome(test_list_1)\n", - "\n", - "test_list_2 = [\"cat\", \"dog\", \"fish\", \"bird\", \"dog\", \"cat\"]\n", - "res_2 = function_is_palindrome(test_list_2)\n", - "\n", - "test_list_3 = [\"cat\"]\n", - "res_3 = function_is_palindrome(test_list_3)\n", - "\n", - "test_list_4 = [\"cat\", \"cat\"]\n", - "res_4 = function_is_palindrome(test_list_4)\n", - "\n", - "if not (res_1 and not res_2 and res_3 and res_4):\n", - "    print(\"Test failed\")\n", - "else:\n", - "    print(\"Correct! :)\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5.
Tuples\n", - "\n", - "Tuples are similar to lists but they are fixed in size and **immutable**, which means that change is not allowed.\n", - "We declare tuples in the following way using parentheses`()`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tuple_1 = (\"wash\", \"your\", \"hands\", \"with\", \"soap\")\n", - "\n", - "print(f\"tuple_1 = {tuple_1}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since change is not allowed, observe the result of the following piece of code." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tuple_1[2] = (\"face\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can typecast from list to tuple and vice versa! " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sequence_1 = [\"here\", \"comes\", \"the\", \"sun\"]\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")\n", - "\n", - "\n", - "# from list to tuple\n", - "sequence_1 = tuple(sequence_1)\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")\n", - "\n", - "#from tuple to list\n", - "sequence_1 = list(sequence_1)\n", - "print(f\"{sequence_1} is {type(sequence_1)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6. Dictionaries\n", - "\n", - "An incredibly useful data type to know, you might also know dictionaries as \"hash maps\". Dictionaries are collections of \"key: value\" pairs. You can access the values using the keys in $O(1)$ time.\n", - "\n", - "The keys of a dictionary must be **immutable** and **unique**. 
Below we show how to define a dictionary.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "shopping_list = {\"apples\": 3, \"pears\":2, \"eggs\":6, \"bread\":1, \"yogurt\":1}\n", - "print(f\"shopping_list = {shopping_list}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "book_dict = {}\n", - "print(f\"book_dict = {book_dict}\")\n", - "#add key value pairs\n", - "book_dict[\"vonnegut\"] = \"cat\\'s cradle\"\n", - "book_dict[\"ishiguro\"] = \"never let me go\"\n", - "print(f\"book_dict = {book_dict}\")\n", - "print(\"**\")\n", - "\n", - "# we can retrieve the dict keys:\n", - "print(book_dict.keys())\n", - "# and the dict values:\n", - "print(book_dict.values())\n", - "print(\"**\")\n", - "\n", - "#we can also iterate through the dict keys and values with a for-loop\n", - "for key, value in book_dict.items():\n", - " print(f\"{key} : {value}\")\n", - "\n", - "print(\"**\")\n", - "#we can modify the value of a key\n", - "book_dict[\"ishiguro\"] = \"a pale view of hills\"\n", - "print(f\"modified book_dict = {book_dict}\")\n", - "print(\"**\")\n", - "\n", - "#and we can remove a key completely\n", - "removed_value = book_dict.pop(\"ishiguro\")\n", - "print(f\"book_dict with removed value = {book_dict}\")\n", - "print(f\"removed_value = {removed_value}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7. Functions\n", - "\n", - "You can define a function in Python in the following way:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def multiply(a, b):\n", - " return a * b\n", - "\n", - "print(f\"multiply(100, 2) = {multiply(100, 2)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can have default arguments by specifying their default value in the parameters." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def add(a, b, c=0, d=1):\n", - "    return a + b + c + d\n", - "\n", - "# use no default arguments\n", - "print(f\"add(1, 2, 100, 1000) = {add(1, 2, 100, 1000)}\")\n", - "\n", - "# use the default value of d\n", - "print(f\"add(1, 2, 100) = {add(1, 2, 100)}\")\n", - "\n", - "# use the default values of c and d\n", - "print(f\"add(1, 2) = {add(1, 2)}\")\n", - "\n", - "# use the default value of c\n", - "print(f\"add(1, 2, d=1000) = {add(1, 2, d=1000)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "A function can return multiple values in a tuple. You can assign the values of the tuple to separate variables. This is called **tuple unpacking**." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def min_max(input_list):\n", - "    return min(input_list), max(input_list)\n", - "\n", - "\n", - "test_list = [1,2,3,4]\n", - "min_val, max_val = min_max(test_list)\n", - "print(f\"min_val: {min_val}, max_val: {max_val}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note: You have already seen tuple unpacking when using the `enumerate` function in a for-loop." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.1. Common Built-in Functions\n", - "\n", - "Here we introduce some handy, commonly used built-in functions. \n", - "\n", - "* You have already learned `range()` and `enumerate()`!\n", - "* We have also seen `type()` to return the type of the object.
We use `str()`, `int()`, `float()`, `list()`, `tuple()` for typecasting.\n", - "* The functions `len()`, `sum()`, `min()`, `max()`, `any()`, `all()`, `sorted()`, `zip()` are useful for lists and tuples.\n", - "\n", - "Let's see them in action below" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_1 = list(range(5))\n", - "print(f\"list_1 = {list_1}\")\n", - "\n", - "print(f\"len(list_1) = {len(list_1)}\")\n", - "print(f\"sum(list_1) = {sum(list_1)}\")\n", - "print(f\"min(list_1) = {min(list_1)}\")\n", - "print(f\"max(list_1) = {max(list_1)}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "list_2 = [5,3,1,2,0,6]\n", - "print(f\"list_2 = {list_2}\")\n", - "print(f\"sorted(list_2) = {sorted(list_2)}\")\n", - "print(\"**\")\n", - "\n", - "\n", - "# any checks whether there are any 1s in the list (OR)\n", - "# all checks whether all elements are 1s. (AND)\n", - "# in Python: 1 = True, 0 = False\n", - "list_3 = [1, 1, 1]\n", - "print(f\"list_3 = {list_3}\")\n", - "print(f\"any(list_3) = {any(list_3)}\")\n", - "print(f\"all(list_3) = {all(list_3)}\")\n", - "\n", - "list_4 = [0, 1, 1]\n", - "print(f\"list_4 = {list_4}\")\n", - "print(f\"any(list_4) = {any(list_4)}\")\n", - "print(f\"all(list_4) = {all(list_4)}\")\n", - "\n", - "list_5 = [0, 0, 0]\n", - "print(f\"list_5 = {list_5}\")\n", - "print(f\"any(list_5) = {any(list_5)}\")\n", - "print(f\"all(list_5) = {all(list_5)}\")\n", - "print(\"**\")\n", - "\n", - "# zip function:\n", - "x = [1,2,3]\n", - "y = [4,5,6]\n", - "zipped = zip(x,y)\n", - "print(f\"x {x}\")\n", - "print(f\"y {y}\")\n", - "print(f\"zipped {list(zipped)}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.2. List Comprehensions\n", - "\n", - "One of the most practical things about Python is that you can do many things on just a single line. 
One popular example is the so-called *list comprehension*, a specific syntax to create and initialize lists of objects.\n", - "\n", - "The basic syntax of a list comprehension is shown below:\n", - "`[thing for thing in list]`\n", - "\n", - "Let's make it more concrete with an example." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "list_of_numbers = [1, 2, 3, 101, 102, 103]\n", - "print(f\"list_of_numbers = {list_of_numbers}\")\n", - "\n", - "#I want to create a new list with all these items doubled.\n", - "doubled_list = [2 * elem for elem in list_of_numbers]\n", - "print(f\"doubled_list = {doubled_list}\")\n", - "\n", - "#A new list with all these items as floats\n", - "float_list = [float(elem) for elem in list_of_numbers]\n", - "print(f\"float_list = {float_list}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's make it more interesting by adding an if in there:\n", - "\n", - "`[thing for thing in list if condition]`\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#I want to create a new list with all these items doubled\n", - "#IF the element is above 100\n", - "conditional_doubled_list = [2 * elem for elem in list_of_numbers if elem > 100]\n", - "print(f\"conditional_doubled_list = {conditional_doubled_list}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "You will be given a list of vocabulary words. Your task is to use a list comprehension to iterate through a document and create a new list containing the document words that appear in the vocabulary.
You don't need to worry about duplicates.\n", - "\n", - "Example: \n", - "```python\n", - "vocabulary = [\"a\", \"c\", \"e\"]\n", - "document = [\"a\", \"b\", \"c\", \"d\"]\n", - "new_list = [\"a\", \"c\"]\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "vocabulary = ['epfl', 'europe', 'swiss', 'switzerland', 'best', 'education', 'high', 'higher', 'research', 'school', 'science', 'students', 'technology', 'top-tier', 'university']\n", - "\n", - "document = \"\"\"The École polytechnique fédérale de Lausanne (EPFL) is a research institute\n", - "and university in Lausanne, Switzerland, that specializes in natural sciences and engineering.\n", - "It is one of the two Swiss Federal Institutes of Technology, and it has three main missions: \n", - "education, research and technology transfer at the highest international level. EPFL is widely regarded \n", - "as a world leading university. The QS World University Rankings ranks EPFL 12th in the world \n", - "across all fields in their 2017/2018 ranking, whilst Times Higher Education World \n", - "University Rankings ranks EPFL as the world's 11th best school for Engineering and Technology.\"\"\"\n", - "document_parsed = document.split()\n", - "document_parsed = [word.lower() for word in document_parsed]\n", - "new_list = []\n", - "\n", - "# Your code here\n", - "new_list = [word for word in document_parsed if (word in vocabulary)]\n", - "\n", - "# We convert the list to a set and then back to a list. We do this because converting it to a set automatically\n", - "# removes duplicates (since sets are unordered collections that do not contain duplicates).
afterwards we sort it.\n", - "new_list = sorted(list(set(new_list)))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "correct_result = ['best', 'education', 'epfl', 'higher', 'research', 'school', 'swiss', 'technology', 'university']\n", - "\n", - "if new_list == correct_result:\n", - " print (\"Correct! :)\")\n", - "else:\n", - " print (\"Incorrect :(\")\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 8. (optional now - useful later on) Object-Oriented Programming" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Object-oriented programming is a programming paradigm that provides a means of structuring programs so that properties and behaviors are bundled into individual objects.\n", - "\n", - "For this end, we use classes. Classes are used to create user-defined data structures. Classes define functions called methods, which identify the behaviors and actions that an object created from the class can perform with its data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "All class definitions start with the class keyword, which is followed by the name of the class and a colon. Any code that is indented below the class definition is considered part of the class’s body. To start, let's declare an `EPFL_faculty` class." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "class EPFL_faculty:\n", - "\n", - "    def __init__(self, name, number_of_students):\n", - "        self.name = name\n", - "        self.number_of_students = number_of_students\n", - "\n", - "    # Instance method\n", - "    def description(self):\n", - "        return f\"The faculty {self.name} has {self.number_of_students} students\"" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "While the class is the blueprint, an instance is an object that is built from a class and contains real data. `.__init__()` sets the initial state of the object by assigning the values of the object’s properties. That is, `.__init__()` initializes each new instance of the class." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sti = EPFL_faculty(\"STI\", 1000)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that the instance `sti` has been created, we can call its `description()` method." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "sti.description()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Inheritance is the process by which one class takes on the attributes and methods of another. Newly formed classes are called child classes, and the classes that child classes are derived from are called parent classes. Child classes inherit the parent's attributes and methods, but they can override methods. Let's define a class `EPFL_section` that inherits from `EPFL_faculty` and overrides the method `description(self)`."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "class EPFL_section(EPFL_faculty):\n", - "    \n", - "    def description(self):\n", - "        return f\"The section {self.name} has {self.number_of_students} students\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "genie_mecanique = EPFL_section(\"GM\", 200)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "genie_mecanique.description()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "__Disclaimer__: This part of the tutorial and its text was inspired by and taken from \"Object-Oriented Programming (OOP) in Python 3\" by David Amos. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Additional OOP resources" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For further information on object-oriented programming and classes in Python, here are two useful resources:\n", - "* Object-Oriented Programming (OOP) in Python 3: https://realpython.com/python3-object-oriented-programming/\n", - "* Classes from the official Python Tutorial: https://docs.python.org/3/tutorial/classes.html" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 9. (optional now, useful later on) Matplotlib" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Perhaps the most widely used plotting library in Python is Matplotlib. If you've ever used MATLAB, you'll find that the functions look pretty similar. \n", - "\n", - "In the following exercise sessions, we won't ask you to do any plotting. So this part is optional for those who are interested in having a short introduction.\n", - "\n", - "First, we will import Matplotlib."
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Importing in Python\n", - "\n", - "* A short note on importing: to be able to use modules in our code, we import them. \n", - " \n", - " example: `import numpy`\n", - " \n", - "\n", - "* We can also select a name for the imported module.\n", - " \n", - " example: `import numpy as np`. Now when we call numpy functions, we will always use `np.` as a prefix, i.e. `np.zeros()`\n", - " \n", - "\n", - "* You can also choose to only import selected functions/variables/classes from the module. \n", - " \n", - " example: `from numpy import arange`. Now you can use this function as `arange(5)`. You cannot use any other functions from the numpy module as you did not import them." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# To import Matplotlib we do:\n", - "\n", - "import matplotlib.pyplot as plt" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's do some plotting! \n", - "\n", - "Let's start with the simplest of plots, the good old line-plot. 
The function we will use is `plot()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y = [2,3,5,1,0,2,3,0,0,1]\n", - "\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "#specifying a color and marker are optional.\n", - "#check out the documentation to see what else you can do with the plot function\n", - "plt.plot(x, y, marker=\"*\", color=\"r\")\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"just a random plot\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Exercise\n", - "\n", - "You can plot two lines on top of one another by calling the `plt.plot()` function consecutively. Try to implement this! Also, specify the parameter `label` of the `plt.plot()` function and call the function `plt.legend()` to create a legend for your graph. It should look like the figure shown below." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![result](images/two_lines_plot.png)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y1 = [2,3,5,1,0,2,3,0,0,1]\n", - "y2 = [1,2,3,5,1,0,2,3,0,0]\n", - "\n", - "#your code here\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "#let's add labels for the legend\n", - "p1 = plt.plot(x, y1, marker=\"*\", color=\"r\", label=\"red line\")\n", - "p2 = plt.plot(x, y2, marker=\"^\", color=\"b\", label=\"green line\")\n", - "plt.legend()\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"just a random plot\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can create scatter plots (line plots without lines) with `scatter()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#Let's create some data first to plot\n", - "x = list(range(10))\n", - "y1 = [2,3,5,1,0,2,3,0,0,1]\n", - "\n", - "#first create a figure\n", - "fig = plt.figure()\n", - "\n", - "#now do the plotting\n", - "p1 = plt.scatter(x, y1, marker=\"*\", color=\"r\")\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"just a random scatter plot\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And you can read and display images with `imread()` and `imshow()`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#first create a figure\n", - "fig = 
plt.figure()\n", - "\n", - "#now do the plotting\n", - "im = plt.imread(\"images/krabby_patty.jpg\")\n", - "plt.imshow(im)\n", - "\n", - "#axis labels and title\n", - "plt.xlabel(\"x\")\n", - "plt.ylabel(\"y\")\n", - "plt.title(\"krabby patty\")\n", - "\n", - "#so that we see the plot\n", - "plt.show()\n", - "\n", - "#close the plot\n", - "plt.close(fig)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "And that's all for this tutorial! If you have any problems, just ask (or even Google) them. You can check out the official Python tutorials for further learning.\n", - "\n", - "https://docs.python.org/3/tutorial/" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.10" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics-checkpoint.ipynb b/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics-checkpoint.ipynb deleted file mode 100644 index 45544f5..0000000 --- a/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics-checkpoint.ipynb +++ /dev/null @@ -1,1316 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# NumPy Basics" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "This notebook was developed for the CS-233 Introduction to Machine Learning course at EPFL, adapted for the CIVIL-226 Introduction to Machine Learning for Engineers course, and re-adapted for ME-390.\n", - "We thank the contributors in CS-233 ([CVLab](https://www.epfl.ch/labs/cvlab)) and CIVIL-226 ([VITA](https://www.epfl.ch/labs/vita/)).\n", - " \n", - "**Author(s):** Jan Bednarík, minor changes by Tom Winandy\n", - "
\n", - "\n", - "In this exercise we will work with NumPy, a popular Python library for scientific computing with N-dimensional arrays. You will again see some of the concepts introduced last week, such as indexing and slicing lists, but NumPy adds several new concepts: broadcasting, vectorization, indexing using masks, and a wide range of functions for working with arrays, all of which you will learn to use today. This exercise is quite long and you might not be able to finish it during the exercise session. However, the concepts introduced here will be used in the following weeks, so we encourage you to take some extra time and try to finish the whole exercise before next week. Getting familiar with NumPy will pay off when working on the following exercises (and possibly in other courses relying on NumPy as well). Let's get started!\n", - "\n", - "In the exercises you will often be referred to NumPy functions which you should use. Please consult the [NumPy reference/documentation](https://docs.scipy.org/doc/numpy/reference/) to find out how to use these functions.\n", - "\n", - "## 1 About NumPy\n", - "\n", - "### NumPy\n", - "\n", - "NumPy is a core library for scientific computing in Python. It offers high-performance multidimensional array computation capabilities. Furthermore, Python provides a wide ecosystem of libraries that take NumPy arrays as input.\n", - "\n", - "### NumPy Arrays\n", - "\n", - "NumPy arrays are high-performance homogeneous (= all elements of the same type) multidimensional arrays (think of an N-dimensional grid). They are indexed by a tuple of integers. The indexing syntax is similar to that of lists, tuples, and dictionaries, but NumPy adds some fancier indexing tools.\n", - "\n", - "Let us start by importing NumPy. By convention, it is imported as ``np``."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "# Let us also import plotting library\n", - "import matplotlib.pyplot as plt " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Working with Arrays\n", - "\n", - "### 2.1 Creating Arrays\n", - "\n", - "Two most common ways of creating NumPy arrays are\n", - "1. Converting array-like Python objects (e.g. lists, tuples) using the function [`np.array` (reference/documentation)](https://numpy.org/doc/stable/reference/generated/numpy.array.html).\n", - "2. Calling one of the built-in functions provided by NumPy.\n", - "\n", - "The following cells introduce the syntax to create the arrays and some common built-in NumPy functions.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Converting Python array-like objects.\n", - "\n", - "# 1D array from list, shape (4, ).\n", - "x_1d = np.array([1, 3, 5, 7])\n", - "\n", - "# 2D array from combination of lists and tuples, shape (3, 3).\n", - "x_2d = np.array([(1, 1, 1), [2, 2, 2], (3, 3, 3)])\n", - "\n", - "# Print the results.\n", - "print(f'x_1d:\\n{x_1d}\\n')\n", - "print(f'x_2d:\\n{x_2d}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using built-in functions provided by NumPy.\n", - "\n", - "# 2D array of zeros, 2 rows, 3 columns.\n", - "x_zeros = np.zeros((2, 3))\n", - "\n", - "# 3D array of ones, shape (2, 3, 4) - 2 matrices of 3 rows and 4 columns.\n", - "x_ones = np.ones((2, 3, 4))\n", - "\n", - "# Identity matrix with 4 rows and 4 columns.\n", - "x_identity = np.eye(4)\n", - "\n", - "# Sequence of numbers from 5 to 11 (11 not included) with step 1.\n", - "x_seq = np.arange(5, 11)\n", - "\n", - "# Sequence of ones of the same shape as `x_zeros`.\n", - "x_ones_as_zeros = np.ones_like(x_zeros)\n", - 
"\n", - "\n", - "# Print the results.\n", - "print(f'x_zeros:\\n{x_zeros}\\n')\n", - "print(f'x_ones:\\n{x_ones}\\n')\n", - "print(f'x_identity:\\n{x_identity}\\n')\n", - "print(f'x_seq:\\n{x_seq}\\n')\n", - "print(f'x_ones_as_zeros:\\n{x_ones_as_zeros}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.2 Data Types\n", - "\n", - "NumPy arrays can be given an explicit data type. Specifying a data type gets useful for instance when using arrays for indexing (integers) or masking (boolean). Full list of supported data types can be found [here](https://docs.scipy.org/doc/numpy/user/basics.types.html).\n", - "\n", - "Data type can be specified when creating an array using an argument ``dtype``, arrays can be also cast to a given datatype using function [``astype`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.astype.html)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create an array of 32bit integers.\n", - "x_int = np.array([1, 2, 3, 4, 5], dtype=np.int32)\n", - "\n", - "# Cast integer array to 32 bit float array.\n", - "x_float = x_int.astype(np.float32)\n", - "\n", - "# Print results.\n", - "print(f'Array x_int has data type {x_int.dtype}')\n", - "print(f'Array x_float has data type {x_float.dtype}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.3 Inspecting the Arrays\n", - "\n", - "When working with arrays, it is easy to lose track about current number shape or data type. The properties ``ndim``, ``shape``, ``size``, ``dtype`` facilitate working with arrays and debugging your code. Furthermore, you can also simply print out an array using Python's ``print`` function. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 3D array.\n", - "x_3d = np.array([[[1, 2, 3, 4], [4, 7, 1, 9], [0, 4, 6, 8]], \n", - " [[5, 2, 8, 0], [2, 4, 3, 1], [1, 0, 4, 9]]])\n", - "\n", - "# Check the number of dimensions, number of elements, shape, and data type.\n", - "print(f'Number of dimensions: {x_3d.ndim}')\n", - "print(f'Number of elements: {x_3d.size}')\n", - "print(f'Shape: {x_3d.shape}')\n", - "print(f'Data type: {x_3d.dtype}')\n", - "\n", - "# Simply print the array.\n", - "print(x_3d)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.4 Reshaping the Arrays\n", - "Arrays can be reshaped using the function [``reshape`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.reshape.html). Note that the requested shape has to have the same number of elements as the original array.\n", - "\n", - "The shape of an array is given as a tuple of integers representing the number of elements in each dimension. Here are a couple of examples of shapes:\n", - "- () - A 0D array, effectively a scalar.\n", - "- (4, ) - A 1D array (vector) of 4 elements.\n", - "- (3, 4) - A 2D array (matrix) of 3 rows and 4 columns.\n", - "- (2, 3, 4) - A 3D array (block), think of 2 2D matrices each having 3 rows and 4 columns.\n", - "\n", - "When reshaping an array, you can use a value ``-1`` for at most one axis, meaning that the number of elements for that axis will be computed automatically." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array filled with a sequence of numbers.\n", - "x_seq_2d = np.arange(12).reshape(4, 3)\n", - "\n", - "# Create a 3D array filled with ones, last axis computed automatically.\n", - "x_ones_3d = np.ones(8).reshape((2, 2, -1))\n", - "\n", - "# Print the results.\n", - "print(f'x_seq_2d:\\n{x_seq_2d}\\n')\n", - "print(f'x_ones_3d:\\n{x_ones_3d}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.5 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Create a 1D array of 10 elements of type float32 filled with a value 3.14.\n", - "## Hint: Use np.ones or np.full.\n", - "\n", - "array_pi = None # <<< YOUR CODE HERE\n", - "print(f'array_pi:\\n{array_pi}\\n')\n", - "\n", - "## Find the number of elements in the following array without using the `size` property.\n", - "## Hint: Use np.prod.\n", - "x = np.zeros((4, 5, 6, 7, 8))\n", - "\n", - "num_elements = None # <<< YOUR CODE HERE\n", - "print(f'Number of elements in x: {num_elements}')\n", - "\n", - "## Reshape the multidimensional array \"x_unknown\" to a 1D array. Note that you do not know the shape of the array.\n", - "## Hint: You can access the shape property, use the `-1` trick, or the function np.ndarray.flatten()\n", - "## (i.e. 
you have to call it as a method of the array, x.flatten())\n", - "x_unknown = np.zeros(np.random.randint(1, 5, 6))\n", - "\n", - "x_flat = None # <<< YOUR CODE HERE\n", - "print(f'Shape of x_flat: {x_flat.shape}')\n", - "\n", - "# Check the answers:\n", - "assert(array_pi.shape == (10, ) and array_pi.dtype == np.float32 and np.allclose(array_pi, 3.14))\n", - "assert(num_elements == x.size)\n", - "assert(x_flat.ndim == 1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3 Accessing Array Elements\n", - "In order to access the values of an array, **indexing** and **slicing** are used the same way you used them with Python array-like objects. Since NumPy arrays are N-dimensional, you can use a separate indexing/slicing expression for each axis.\n", - "\n", - "NumPy further extends the standard indexing/slicing by the following:\n", - "- indexing using an array of indices\n", - "- indexing using a boolean array (i.e. masking)\n", - "- structural indexing\n", - "\n", - "The indexing can be used not only for retrieving the values but also for modifying the values in the original array (using the indexed array as an L-value):\n", - "\n", - "- ``selection = x[3:5, 1::3] # Retrieving the selected values.``\n", - "- ``x[3:5, 1::3] = 3.14 # Replacing the selected values by 3.14``" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1 Standard Indexing and Slicing\n", - "Works the same way as for Python lists, but can be specified separately for every dimension. Use the familiar syntax ``[start : stop]`` or ``[start : stop : step]``. When specifying a range using ``start`` and ``stop``, remember that ``start`` is inclusive and ``stop`` is exclusive. E.g. writing ``x[2:4]`` will result in an array ``[x[2], x[3]]``.\n", - "\n", - "All `start`, `stop` and `step` values can be left out. 
Missing `start` defaults to `0`, missing `stop` defaults to the index of the last element plus one (remember that ``stop`` is exclusive), missing `step` defaults to `1`.\n", - "\n", - "Note that the step can be negative, in which case you traverse an array backwards.\n", - "\n", - "The image below depicts a 2D array of the shape (5, 6) and a couple of different indexing strategies. Let us try them out.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Access 3 elements in the 1st row.\n", - "orange = x[0, 2:5]\n", - "print(f'orange:\\n{orange}\\n')\n", - "\n", - "# Access the third column.\n", - "red = x[:, 2]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Access a 2x2 submatrix from the bottom right corner.\n", - "green = x[-2:, -2:]\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "# Access every other element (even column indices) in every other row, starting from the 3rd row.\n", - "magenta = x[2::2, ::2]\n", - "print(f'magenta:\\n{magenta}\\n')\n", - "\n", - "# Replace last two rows with zeros.\n", - "x[-2:, :] = 0\n", - "print(x)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.2 Indexing by an Array of Indices.\n", - "On top of standard indexing, NumPy also allows for providing a list of integer indices for every axis.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Access the 2nd, the 4th and the 5th columns.\n", - "red = x[:, [1, 3, 4]]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Access the elements from the 2nd 
and the 3rd rows in a zig-zag fashion.\n", - "magenta = x[[1, 2, 1, 2], range(4)]\n", - "print(f'magenta:\\n{magenta}\\n')\n", - "\n", - "# Replace the magenta elements with a value -1.\n", - "x[[1, 2, 1, 2], range(4)] = -1\n", - "print(x)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.3 Masking\n", - "We have seen indexing using arrays of integers, where the integer numbers pointed to given elements. Another approach is indexing using boolean arrays representing a binary mask. Such a mask must have the same shape as the indexed array, or it must match along the first dimensions (where the last dimensions are taken as is). A mask array can only contain boolean values ``True`` and ``False``, otherwise it would be interpreted as indexing by an integer array.\n", - "\n", - "Masking can be combined with traditional indexing/slicing and indexing using integer arrays. However, the mask must have the same shape as the dimension(s) for which we are using the mask.\n", - "\n", - "Masking is especially useful when you want to access those elements in an array which satisfy a certain condition. E.g. you might want to access all the elements greater than a given threshold. 
Comparison operators (`<`, `>`, `==`, `>=`, `<=`) and other NumPy functions can be used to compare an array to a given value and get a binary mask.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Creating the mask manually.\n", - "# Create a mask corresponding to the red squares.\n", - "mask = np.zeros((5, 6), dtype=bool)\n", - "mask[0, 1:4] = True\n", - "mask[2, 2] = True\n", - "mask[3, :2] = True\n", - "mask[-1, -2:] = True\n", - "print(f'mask:\\n{mask}\\n')\n", - "\n", - "# Select the values using the mask.\n", - "red = x[mask]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Combining traditional indexing/slicing and masking - select the green\n", - "# columns. 
Note that the mask is a 1D array whose size is the\n", - "# same as the size of the corresponding dimension of the original \n", - "# array `x`.\n", - "mask = np.array([True, False, False, False, False, True])\n", - "green = x[:, mask]\n", - "print(f'green:\\n{green}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Creating the mask using comparison operators.\n", - "\n", - "# Extract the values larger than 26.\n", - "mask = x > 26\n", - "sel = x[mask]\n", - "print(f'mask:\\n{mask}\\n')\n", - "print(f'bigger than 26:\\n{sel}\\n')\n", - "\n", - "# Extract the odd values.\n", - "mask = (x % 2) == 1\n", - "sel = x[mask]\n", - "print(f'mask:\\n{mask}\\n')\n", - "print(f'odd:\\n{sel}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.4 Structural Indexing\n", - "Finally, NumPy introduces an object ``np.newaxis`` and an *ellipsis* syntax to facilitate indexing/reshaping.\n", - "\n", - "``np.newaxis`` can be used within square brackets to create a new axis of length 1. E.g. if we have a 1D array of E elements and we want to make it a column vector explicitly, i.e. a matrix with E rows and 1 column, the ``np.newaxis`` object comes in handy. 
(Note that ``np.newaxis`` is in fact defined as ``None``, therefore you can use ``None`` instead.)\n", - "\n", - "```python\n", - ">>> col_vec = np.array([1, 2, 3])\n", - ">>> col_vec.shape\n", - " (3, )\n", - ">>> col_vec = col_vec[:, np.newaxis] # or col_vec[:, None]\n", - ">>> col_vec.shape\n", - " (3, 1)\n", - "```\n", - "\n", - "The ``ellipsis`` operator ``...`` stands for \"as many as needed\" consecutive symbols ``:`` used when slicing a multidimensional array.\n", - "\n", - "```python\n", - ">>> x = np.ones((3, 4, 5, 6))\n", - ">>> x.shape\n", - " (3, 4, 5, 6)\n", - ">>> a = x[0, :, :, 3]\n", - ">>> b = x[0, ..., 3]\n", - ">>> np.allclose(a, b)\n", - " True\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.5 Exercises\n", - "\n", - "Using only standard indexing/slicing, extract the subarrays as depicted in the figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using _only_ standard indexing and slicing, select the red, blue and green \n", - "# subarrays from the 2D array depicted above.\n", - "\n", - "# Create a 2D array and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Select the subarrays\n", - "\n", - "red = None # <<< YOUR CODE HERE\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "green = None # <<< YOUR CODE HERE\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "blue = None # <<< YOUR CODE HERE\n", - "print(f'blue:\\n{blue}\\n')\n", - "\n", - "# Bonus: Come up with indexing which selects from x the following submatrix:\n", - "# [[29, 28], \n", - "# [11, 10]].\n", - "\n", - "bonus = None # <<< YOUR CODE HERE\n", - "print(f'bonus:\\n{bonus}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.array([[1, 6], [7, 12], [13, 18], [19, 24], [25, 30]])))\n", - "assert(np.allclose(green, np.array([15, 16, 17])))\n", - 
"assert(np.allclose(blue, np.array([[2, 3], [14, 15], [26, 27]])))\n", - "assert(np.allclose(bonus, np.array([[29, 28], [11, 10]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We will move forward with the exercise session for now, but there are more exercises about indexing using lists of indices and masking at the end of the exercise. We encourage you to do them all when you get to the end, as these concepts will keep recurring in the upcoming exercises." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4 Iterating\n", - "\n", - "An N-dimensional array can be expressed as a list of (N-1)-dimensional arrays. \n", - "\n", - "For instance, a (2D) matrix ``x = np.ones((2, 3))`` can be thought of as a list of (1D) vectors of length 3. As you have seen in Section 3.1, we can access, say, the 2nd row as ``x[1, :]`` which is, however, equivalent to ``x[1]`` (i.e. omitting the ``:`` symbol referring to \"all the values in this dimension\").\n", - "\n", - "Similarly, a 3D array ``x = np.ones((4, 2, 3))`` can be thought of as a list of (2D) matrices of shape (2, 3). Again, we can access, say, the 1st matrix as ``x[0, :, :]``, which is equivalent to ``x[0]``.\n", - "\n", - "You have seen how to iterate through an array (Python list) using a ``for`` loop or a ``while`` loop in exercise 1. You can use the same strategy with NumPy arrays as well, i.e. treat an N-dimensional array as a list of (N-1)-dimensional arrays.\n", - "\n", - "Note that for many operations it is preferable _not_ to use an explicit ``for`` or ``while`` loop, as the same computation can usually be achieved orders of magnitude faster using so-called **vectorization**, which will be introduced later. However, explicit iteration still comes in handy at times, so it is useful to know how to use it." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Let us create a 3D array, iterate through its slices, i.e. matrices, and \n", - "# find the trace of every matrix.\n", - "x = np.random.uniform(0, 10, (5, 10, 10))\n", - "\n", - "for i, matrix in enumerate(x):\n", - " print(f'Trace of matrix {i}: {np.trace(matrix)}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5 Concatenating, Stacking, Splitting\n", - "\n", - "Arrays can be **concatenated** (i.e. gluing the arrays while keeping the number of dimensions) and **stacked** (gluing the arrays along a newly created dimension). **Splitting** is the counterpart operation to concatenating.\n", - "\n", - "All of the **concatenated** arrays must have the same shape along all the dimensions except the one along which we concatenate. E.g. we can concatenate two matrices of shapes (4, 2) and (4, 5) along *axis 1* to get a new matrix of shape (4, 7).\n", - "\n", - "All of the **stacked** arrays must have exactly the same shape; the size of the newly created dimension corresponds to the number of stacked arrays. E.g. we can stack 2 matrices of shapes (4, 3) and (4, 3) along the newly created dimension *axis 0* to get a 3D array of shape (2, 4, 3).\n", - "\n", - "The axis for concatenation or stacking is specified using an argument ``axis``.\n", - "\n", - "See the examples below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Concatenating.\n", - "\n", - "# Concatenate a couple of matrices vertically.\n", - "m1 = np.array([[1, 2, 3], [4, 5, 6]])\n", - "m2 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])\n", - "m3 = np.array([[100, 200, 300]])\n", - "m_cat = np.concatenate([m1, m2, m3], axis=0)\n", - "print(m_cat)\n", - "\n", - "m_cat_error = np.concatenate([m1, m2, m3], axis=1) # This will fail, study the error message." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Stacking\n", - "\n", - "# Stack a couple of matrices to create a 3D array.\n", - "m1 = np.array([[1, 2], [4, 5]])\n", - "m2 = np.array([[10, 20], [40, 50]])\n", - "m3 = np.array([[100, 200], [400, 500]])\n", - "\n", - "# We can stack along any of axes 0, 1, 2. Stacking along different\n", - "# axis results in \"rotating\" our newly created 3D cube.\n", - "m_stack_0 = np.stack([m1, m2, m3], axis=0)\n", - "m_stack_1 = np.stack([m1, m2, m3], axis=1)\n", - "m_stack_2 = np.stack([m1, m2, m3], axis=2)\n", - "\n", - "print(m_stack_0.shape)\n", - "print(m_stack_1.shape)\n", - "print(m_stack_2.shape)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5.1 Exercises\n", - "\n", - "Study the documentation for function [``np.split`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.split.html) and use it to solve the following exercise." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Create a 3D array of increasing sequence of even numbers (starting from 0) of shape (10, 5, 7).\n", - "\n", - "x = None # <<< YOUR CODE HERE\n", - "print(f'x:\\n{x}\\n')\n", - "\n", - "## Split the array into 5 arrays each of the shape (2, 5, 7)\n", - "\n", - "splits_5 = None # <<< YOUR CODE HERE\n", - "\n", - "## Split the array into 2 arrays of shapes (10, 2, 7) and (10, 3, 7)\n", - "\n", - "splits_2 = None # <<< YOUR CODE HERE\n", - "\n", - "# Check the answers.\n", - "assert((np.unique(x).size == 10 * 5 * 7) and np.all(x % 2 == 0) and np.min(x) == 0 and np.max(x) == 698)\n", - "assert(len(splits_5) == 5 and np.allclose(np.concatenate(splits_5, axis=0), x))\n", - "assert(len(splits_2) == 2 and np.allclose(np.concatenate(splits_2, axis=1), x))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6 Basic Arithmetic Operators, Linear Algebra\n", - "\n", - "Basic arithmetic operators `+`, `-`, `*`, `/`, `//`, `**`, `%` are applied element-wise as long as one of the operands is a scalar or both operands are arrays of the same shape. 
If the two arrays do not have the same shape, **broadcasting** will be applied (see Section 7 Broadcasting).\n", - "\n", - "Here are the most common linear algebra operators which you will mostly use for vectors (1D arrays) and matrices (2D arrays):\n", - "- `np.matmul` - Scalar product, vector-matrix or matrix-matrix multiplication (very similar to `np.dot`, read more about it [here](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html)).\n", - "- `@` - The same as `np.matmul`, syntactic sugar.\n", - "- [`np.linalg.inv` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html) - Matrix inversion.\n", - "- [`np.linalg.norm` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) - Norm computation (by default the L2 norm for vectors and the Frobenius norm for matrices).\n", - "- [`np.linalg.solve` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html) - Numerically stable solution to a system of linear equations given as Ax = b.\n", - "- `x.T` - Transposition.\n", - "\n", - "See the examples below." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Arithmetic operations.\n", - "\n", - "# When used for scalar and array operands, the scalar is applied to every element of an array regardless of its shape.\n", - "x = np.zeros((2, 2))\n", - "print(f'x:\\n{x}\\n')\n", - "x += 1\n", - "print(f'x + 1:\\n{x}\\n')\n", - "\n", - "# When used for two array operands, the operator is applied to their corresponding values pair-wise.\n", - "x1 = np.arange(10).reshape((2, 5))\n", - "x2 = np.zeros((2, 5))\n", - "print(f'x1:\\n{x1}\\n')\n", - "print(f'x2:\\n{x2}\\n')\n", - "x2min1 = x2 - x1\n", - "print(f'x2 - x1:\\n{x2min1}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Linear algebra.\n", - "\n", - "## Dot product of two orthogonal vectors.\n", - "v1 = np.array([0.0893, 0.9332, 0.3481])\n", - "v2 = np.array([-0.6949, -0.1920, 0.6930])\n", - "v_dot = np.dot(v1, v2)\n", - "\n", - "# If they are orthogonal, their dot product should be close to 0.\n", - "print('v1 and v2 are orthogonal: {}'.format(\n", - " ('FALSE', 'TRUE')[int(np.isclose(v_dot, 0., atol=1e-5))]))\n", - "\n", - "## Matrix multiplication.\n", - "m1 = np.eye(3)\n", - "m2 = np.random.uniform(-10., 10., (3, 8))\n", - "m_mult = m1 @ m2\n", - "\n", - "# m1 is an identity matrix, therefore the matrix multiplication with \n", - "# any matrix M will produce the same matrix M.\n", - "print('m_mult is the same as m2: {}'.format(\n", - " ('FALSE', 'TRUE')[np.allclose(m2, m_mult)]))\n", - "\n", - "## Solve a linear system Ax = b.\n", - "# All the coefficients are random so it is extremely unlikely that we would\n", - "# generate a rank deficient matrix A and therefore the system of linear\n", - "# equations will have a solution.\n", - "A = np.random.uniform(-1., 1., (10, 10))\n", - "b = np.random.uniform(-1., 1., (10, ))\n", - "x = np.linalg.solve(A, b)" - ] - }, - { - "cell_type": 
"markdown", - "metadata": {}, - "source": [ - "### 6.1 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Generate a matrix of shape (100, 100) filled with Euler's number. You cannot use np.full.\n", - "\n", - "eul = None # <<< YOUR CODE HERE\n", - "print(f'eul:\\n{eul}\\n')\n", - "\n", - "## Generate a 1D array of length 10 of powers of 2, i.e. [2^0, 2^1, ..., 2^9]\n", - "\n", - "pows = None # <<< YOUR CODE HERE\n", - "print(f'pows:\\n{pows}\\n')\n", - "\n", - "## Check the answers:\n", - "assert(np.allclose(eul, np.stack([[2.71828182] * 100] * 100, axis=0)))\n", - "assert(np.allclose(pows, [2**0, 2**1, 2**2, 2**3, 2**4, 2**5, 2**6, 2**7, 2**8, 2**9]))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Helper function to print an arrow.\n", - "def plot_arrow(pts, clr):\n", - " plt.plot(*pts[:2].T, color=clr, marker='*')\n", - " plt.plot(*pts[1:3].T, color=clr, marker='*')\n", - " plt.plot(*pts[[1, 3], :].T, color=clr, marker='*')\n", - "\n", - "## The array 'arrow' contains 4 2D points defining a blue arrow. The objective\n", - "## is to make the arrow 2 times shorter and thinner and rotate it by 45 degrees \n", - "## counter-clockwise. 
\n", - "## First you will rotate the arrow by multiplying the points with the rotation \n", - "## matrix, where the rotation matrix stands on the left.\n", - "## Then, you will scale the arrow by multiplying the previous result with the scale matrix\n", - "## where the scale matrix stands on the left.\n", - "\n", - "## Hint: You will need to do some transpose operations.\n", - "\n", - "arrow = np.array([[ 0., 0.,], \n", - " [ 0., 2.], \n", - " [-0.5, 1.5], \n", - " [ 0.5, 1.5]])\n", - "angle = np.pi / 4.\n", - "\n", - "rot = np.array([[np.cos(angle), -np.sin(angle)], \n", - " [np.sin(angle), np.cos(angle)]])\n", - "\n", - "scale = np.array([[0.5, 0.], \n", - " [0., 0.5]])\n", - "\n", - "arrow_sr = None # <<< YOUR CODE HERE\n", - "\n", - "# Plot the arrows.\n", - "plt.figure(figsize=(5, 5))\n", - "plt.xlim(-3, 3)\n", - "plt.ylim(-3, 3)\n", - "plot_arrow(arrow, 'b')\n", - "plot_arrow(arrow_sr, 'r')\n", - "\n", - "## Check the answers.\n", - "assert(np.allclose(arrow_sr, np.array([[ 0. , 0. ],\n", - " [-0.70710678, 0.70710678],\n", - " [-0.70710678, 0.35355339],\n", - " [-0.35355339, 0.70710678]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7 Broadcasting\n", - "\n", - "Broadcasting allows for performing arithmetic and other operations on arrays of different shape, where the smaller is \"broadcast\" over the larger array. For instance, adding a column vector *v* to a matrix *M*, M + v, will effectively take every column of the matrix and add the vector *v* element-wise.\n", - "\n", - "Broadcasting further allows for so called **vectorization**, i.e. 
performing a given operation in parallel where the actual looping occurs in highly-optimized C code rather than in Python, where looping is slow.\n", - "\n", - "Example:\n", - "\n", - "```python\n", - ">>> a = np.arange(6).reshape((2, 3)) # shape (2, 3)\n", - "array([[0, 1, 2],\n", - " [3, 4, 5]])\n", - "\n", - ">>> b = np.array([10, 20, 30]) # shape (3, )\n", - "array([10, 20, 30])\n", - "\n", - ">>> a + b\n", - "array([[10, 21, 32],\n", - " [13, 24, 35]]) # shape (2, 3)\n", - "```\n", - "\n", - "### 7.1 Broadcasting Rules\n", - "\n", - "The sizes of the corresponding dimensions of the 2 arrays must satisfy one of the following:\n", - "- They are equal.\n", - "- One of them is 1.\n", - "\n", - "Furthermore, the shapes are aligned starting from the last dimension, and missing leading dimensions are treated as 1.\n", - "\n", - "Here are a couple of examples of the input and output shapes to a binary operation (such as `+`) being applied on 2 arrays *A* and *B*:\n", - "\n", - "\n", - "\n", - "**Note:** Do not confuse the concept of _vectorization_ with NumPy's function `np.vectorize`, which is provided for programming convenience, not for performance, and thus does not guarantee the actual vectorization of an operation.\n", - "\n", - "If the concept is not clear, you can read more about broadcasting [here](https://numpy.org/devdocs/user/basics.broadcasting.html).\n", - "\n", - "Go through the examples below and try to understand how the arrays are constructed and computed." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Manual Python looping vs. 
vectorization - Multiplying a \n", - "# matrix by a vector row-wise.\n", - "m = np.arange(15).reshape(5, 3)\n", - "v = np.array([0, 4, 2])\n", - "m_loop = np.copy(m)\n", - "m_vect = np.copy(m)\n", - "\n", - "# Python loop.\n", - "for i in range(m.shape[0]):\n", - " m_loop[i] *= v\n", - "\n", - "# Vectorization.\n", - "m_vect *= v\n", - "\n", - "# Check that both results are the same.\n", - "assert(np.allclose(m_loop, m_vect))\n", - "\n", - "## Generate a matrix where each row holds a constant value \n", - "## which increases throughout the rows.\n", - "seq_mat = np.ones((5, 3)) * np.arange(5).reshape((-1, 1))\n", - "print(f'seq_mat:\\n{seq_mat}\\n')\n", - "\n", - "m = np.random.randint(0, 10, (4, 5))\n", - "print(f'm:\\n{m}\\n')\n", - "\n", - "## Add a vector to a matrix row-wise (horizontally).\n", - "add_rw = np.array([10, 20, 30, 40, 50])\n", - "m_add_rw = m + add_rw\n", - "print(f'm_add_rw:\\n{m_add_rw}\\n')\n", - "\n", - "## Add a vector to a matrix column-wise (vertically).\n", - "add_cw = np.array([10, 20, 30, 40]).reshape((-1, 1))\n", - "m_add_cw = m + add_cw\n", - "print(f'm_add_cw:\\n{m_add_cw}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.2 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Given a matrix 'data' defined below, Compute a matrix data_pow, \n", - "## where a value in each column is taken to the power of its column\n", - "## index.\n", - "\n", - "data = np.random.uniform(0, 5, (4, 5))\n", - "data_pow = None # <<< YOUR CODE HERE\n", - "print(f'data_pow:\\n{data_pow}\\n')\n", - "\n", - "## Generate a matrix of shape (5, 4), where each row is an integer \n", - "## sequence starting from 0 with an increment of a row index i. 
E.g.\n", - "## the first 3 rows would be:\n", - "##\n", - "## [0*0, 0*0, 0*0, 0*0] [0, 0, 0, 0]\n", - "## [0*1, 1*1, 2*1, 3*1] = [0, 1, 2, 3]\n", - "## [0*2, 1*2, 2*2, 3*2] [0, 2, 4, 6]\n", - "##\n", - "## You can use 1D arrays only and the rules of broadcasting.\n", - "# Hint: Use np.arange\n", - "\n", - "seqs = None # <<< YOUR CODE HERE\n", - "print(f'seqs:\\n{seqs}\\n')\n", - "\n", - "## Check the results:\n", - "data_pow_gt = np.copy(data)\n", - "for i in range(data.shape[1]):\n", - " data_pow_gt[:, i] = data_pow_gt[:, i] ** i\n", - "assert(np.allclose(data_pow, data_pow_gt))\n", - "assert(np.allclose(seqs, np.array([[ 0, 0, 0, 0],\n", - " [ 0, 1, 2, 3],\n", - " [ 0, 2, 4, 6],\n", - " [ 0, 3, 6, 9],\n", - " [ 0, 4, 8, 12]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 8 Common NumPy Functions\n", - "\n", - "NumPy offers plethora of functions to perform computations with N dimensional arrays. One of the common concepts is that these functions would accept an argument `axis` using which you can specify along which axis the operation should be performed. 
\n", - "\n", - "For example, given a matrix `A = np.array([[1, 2, 3], [4, 5, 6]])`, we might want to find a sum of values over rows and columns:\n", - "\n", - "\n", - "\n", - "```python\n", - ">>> sum_per_row = np.sum(A, axis=1)\n", - "array([6, 15])\n", - "\n", - ">>> sum_per_col = np.sum(A, axis=0)\n", - "array([5, 7, 9])\n", - "```\n", - "\n", - "Among the most common functions you might need the following: \n", - "- [`np.sum` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sum.html)\n", - "- [`np.prod` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.prod.html) \n", - "- [`np.mean` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.mean.html)\n", - "- [`np.std` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.std.html)\n", - "- `np.min`\n", - "- `np.max`\n", - "- [`np.argmin` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html)\n", - "- [`np.argmax` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html)\n", - "- [`np.sort` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sort.html)\n", - "- `np.abs` \n", - "- [`np.sqrt` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sqrt.html)\n", - "- [`np.unravel_index` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html).\n", - "\n", - "Please study their corresponding reference pages and fill in the following exercises." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 8.1 (OPTIONAL) Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Given a matrix M, find the product of the values in each row.\n", - "M = np.arange(12).reshape((4, 3))\n", - "\n", - "sm = None # <<< YOUR CODE HERE\n", - "\n", - "# Check the results.\n", - "assert(np.allclose(sm, np.array([0, 60, 336, 990])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write a function `rescale` which takes as input an array $x \\in \\mathbb R ^ {N \\times M}$ and scalars $a np.ndarray:\n", - " \"\"\" Rescales the input from range [min(x), max(x)] to range [a, b].\n", - " \n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " a (float): Lower bound of the output range.\n", - " b (float): Upper bound of the output range.\n", - " \n", - " Returns:\n", - " np.ndarray: Rescaled array, shape (N, M).\n", - " \"\"\"\n", - " y = None # <<< YOUR CODE HERE\n", - "\n", - " return y\n", - "\n", - "# Test.\n", - "x = 2 * np.random.rand(3, 2) - 1\n", - "a, b = 1, 3\n", - "y = rescale(x, a, b)\n", - "\n", - "assert(np.isclose(np.min(y), a) and np.isclose(np.max(y), b))\n", - "\n", - "print(f'Input array:\\n{x}\\n')\n", - "print(f'Required range: {a, b}\\n')\n", - "print(f'Output array:\\n{y}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write function `find_closest`, which given scalar $u \\in \\mathbb R$ and input array $x \\in \\mathbb R ^ {N \\times M}$, returns the closest element to the scalar in the array $x_{i^*,j^*} : (i^*,j^*)=\\text{argmin}_{i,j} | x_{i,j} - u |$. 
$\\text{argmin}$ is the operation that finds the index of the minimum element.\n", - "\n", - "Hint: Use `np.abs`, [`np.argmin` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html), [`np.unravel_index` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def find_closest(x: np.ndarray, u: float) -> float:\n", - " \"\"\" Finds the closest element to `u` in `x`.\n", - " \n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " u (float): A value to which the closest element in x is searched for.\n", - " \n", - " Returns:\n", - " float: Closest element in `x` to `u`.\n", - " \"\"\"\n", - " closest = None # <<< YOUR CODE HERE\n", - "\n", - " return closest\n", - "\n", - "# Test.\n", - "x = np.arange(11)\n", - "u = np.random.uniform(0, 10)\n", - "x_ij = find_closest(x, u)\n", - "\n", - "assert(x_ij == x[int(round(u))] )\n", - "\n", - "print(f'The closest element to {u:.3f} within {x} is {x_ij}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write a function `z_score_normalize` which takes as input an array $x \\in \\mathbb R ^ {N \\times M}$ and returns an output array $y \\in \\mathbb R ^ {N \\times M}$ such that $\\mathbb E [y]= 0$ and $\\sigma [y] = 1$, where $\\sigma[y]$ is a standard deviation of $y$.\n", - "\n", - "Hint: Use [`np.mean` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.mean.html) [`np.std` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.std.html)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def z_score_normalize(x: np.ndarray) -> np.ndarray:\n", - " \"\"\" Normalizes the input x so that its mean is 0 and std is 1.\n", - " \n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " \n", - " 
Returns:\n", - " np.ndarray: Normalized array, shape (N, M).\n", - " \"\"\"\n", - " normalized = None # <<< YOUR CODE HERE\n", - " \n", - " return normalized\n", - "\n", - "# Test.\n", - "x = 2 * np.random.rand(3, 2) - 1\n", - "y = z_score_normalize(x)\n", - "\n", - "assert(np.isclose(np.mean(y), 0))\n", - "assert(np.isclose(np.std(y), 1.))\n", - "\n", - "print(f'Mean/std for input array\\n{x}\\nis: {np.mean(x):.3f}/{np.std(x):.3f}\\n')\n", - "print(f'Mean/std for normalized array\\n{y} is:\\n{np.mean(y):.3f}/{np.std(y):.3f}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 9 Shallow and Deep Copy\n", - "\n", - "A simple assignment makes no copy of the underlying data. Have a look at the following example:\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a\n", - ">>> b[1] = 1.602e-19\n", - ">>> print(a, b)\n", - "array([1, 1.602e-19, 3])\n", - "array([1, 1.602e-19, 3])\n", - "```\n", - "\n", - "Assigning a slice of an array to a new variable works with the very same data as well, i.e. no copy is made:\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a[:2]\n", - ">>> b[0] = 6.626e-34\n", - ">>> print(a, b)\n", - "array([6.626e-34, 2, 3])\n", - "array([6.626e-34, 2])\n", - "```\n", - "\n", - "In order to truly copy the data, we need to make a deep copy using the NumPy function [`np.copy` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.copy.html):\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a.copy()\n", - ">>> b[2] = 9.807\n", - ">>> print(a, b)\n", - "array([1, 2, 3])\n", - "array([1, 2, 9.807])\n", - "```\n", - "\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 10 Random\n", - "\n", - "The module [`np.random` (documentation)](https://numpy.org/doc/stable/reference/random/index.html) implements pseudo-random number generators (RNG) for various distributions. 
We will frequently use it to randomly sample our data or to randomly initialize the parameters of our models.\n", - "\n", - "In order to fix the RNG and thus to be able to reproduce the computation, we can set the so-called *seed*. Setting the seed guarantees that the same sequence of numbers will be generated by the RNG in each run.\n", - "\n", - "```python\n", - ">>> constant = 3 # Any integer we want\n", - ">>> np.random.seed(constant)\n", - "```\n", - "\n", - "Here are some of the functions we will be using most frequently:\n", - "- np.random.randint\n", - "- np.random.shuffle\n", - "- np.random.uniform\n", - "- np.random.randn\n", - "- np.random.permutation\n", - "\n", - "Please study their respective [reference pages](https://docs.scipy.org/doc/numpy-1.16.0/reference/routines.random.html) and complete the exercises below." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 10.1 (OPTIONAL) Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Create a 1D array holding a randomly permuted sequence from 0 to 10 (inclusive).\n", - "x_perm = None # <<< YOUR CODE HERE\n", - "print(f'Permuted array:\\n{x_perm}\\n')\n", - "\n", - "## Check the answers.\n", - "assert(np.unique(x_perm).shape == (11, ) and np.min(x_perm) == 0 and np.max(x_perm) == 10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 11 (OPTIONAL) Extra Indexing Exercises" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Using indexing by integer arrays, extract the subarrays as depicted in the Figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using a combination of standard indexing/slicing and indexing by an array \n", - "# of indices, select the following subarrays.\n", - "\n", - "# Create a 2D array and print it out.\n", - "x = np.arange(1, 31).reshape((5, 
6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Select the subarrays.\n", - "\n", - "red = None # <<< YOUR CODE HERE\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "green = None # <<< YOUR CODE HERE\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "blue = None # <<< YOUR CODE HERE\n", - "print(f'blue:\\n{blue}\\n')\n", - "\n", - "# Bonus: Come up with indexing which results in matrix `x` being stacked 3 times \n", - "# vertically, i.e. the resulting shape is (15, 6).\n", - "# Hint: Use functions range and list, use operator * with a list.\n", - "\n", - "bonus = None # <<< YOUR CODE HERE\n", - "print(f'bonus:\\n{bonus}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.arange(1, 30, 7)))\n", - "assert(np.allclose(green, np.array([[5, 6], [11, 12], [23, 24]])))\n", - "assert(np.allclose(blue, np.array([19, 25, 26])))\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Using masking, extract the subarray as depicted in the Figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Construct a mask and then extract the red values above the main diagonal.\n", - "\n", - "## Extract the red values.\n", - "mask = None # <<< YOUR CODE HERE\n", - "print(f'mask:\\n{mask}\\n')\n", - "\n", - "red = None # <<< YOUR CODE HERE\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "## Bonus: Extract all the values which are not red.\n", - "## Hint: Use your already constructed `mask`, study NumPy's binary operations\n", - "## https://docs.scipy.org/doc/numpy/reference/routines.bitwise.html\n", - "\n", - "the_rest = None # <<< YOUR CODE HERE\n", - "print(f'the_rest:\\n{the_rest}\\n')\n", - "\n", - "## Bonus2: Extract the values which can be divided by both 2 and 3.\n", - "## Hint: Construct two masks and combine them using NumPy's binary operators.\n", - "\n", - "div23 = None # <<< YOUR CODE HERE\n", - 
"print(f'div23:\\n{div23}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.concatenate(\n", - " [range(2, 7), range(9, 13), range(16, 19), range(23, 25), [30]])))\n", - "assert(np.allclose(np.sort(np.concatenate([red, the_rest])), np.arange(1, 31)))\n", - "assert(not np.any(np.fmod(div23, 2.)) and not np.any(np.fmod(div23, 3.)))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 12 Next Steps\n", - "\n", - "Feel free to consult more thorough NumPy tutorials.\n", - "\n", - "- NumPy basics: [https://docs.scipy.org/doc/numpy/user/basics.html](https://docs.scipy.org/doc/numpy/user/basics.html)\n", - "- Official NumPy tutorial: [https://docs.scipy.org/doc/numpy/user/quickstart.html](https://docs.scipy.org/doc/numpy/user/quickstart.html)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.10" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics_Sol-checkpoint.ipynb b/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics_Sol-checkpoint.ipynb deleted file mode 100644 index 1f40a1a..0000000 --- a/Exercises/02-numpy/.ipynb_checkpoints/numpy_basics_Sol-checkpoint.ipynb +++ /dev/null @@ -1,1308 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# NumPy Basics - Solutions" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "This notebook was developed for the CS-233 Introduction to Machine Learning course at EPFL, adapted for the CIVIL-226 Introduction to Machine Learning for Engineers course, and re-adapted for the ME-390.\n", - "We thank contributors in CS-233 ([CVLab](https://www.epfl.ch/labs/cvlab)) and CIVIL-226 ([VITA](https://www.epfl.ch/labs/vita/)).\n", - " \n", - " \n", - "**Author(s):** [Jan Bednarík](mailto:jan.bednarik@epfl.ch), minor changes by [Tom Winandy](mailto:tom.winandy@epfl.ch)\n", - "
\n", - "\n", - "In this exercise we will work with a popular Python library for scientific computing with N-dimensional arrays - NumPy. You will see again some of the concepts introduced last week, such as indexing and slicing lists, but NumPy adds multiple new concepts, namely broadcasting, vectorization, indexing using masking and a wide range of functions to work with the arrays, which you will learn to use today. This exercise is quite long and you might not be able to finish it during the exercise sessions. However, the introduced concepts will be used during the following weeks, so we would like to encourage you to take extra time and try to finish the whole exercise before next week, since getting familiar with NumPy will pay off when working on the following exercises (and possibly in other courses relying on NumPy as well). Let's get started!\n", - "\n", - "In the exercises you will often be referred to NumPy functions which you should use. Please inspect the [NumPy reference/documentation](https://docs.scipy.org/doc/numpy/reference/) and find out how to use the functions.\n", - "\n", - "## 1 About NumPy\n", - "\n", - "### NumPy\n", - "\n", - "NumPy is a core library for scientific computing in Python. It offers high-performance multidimensional array computation capabilities. Furthermore, Python provides a wide ecosystem of libraries that take NumPy arrays as input.\n", - "\n", - "### NumPy Arrays\n", - "\n", - "NumPy arrays are high-performance homogeneous (= all elements of the same type) multidimensional arrays (think of an N dimensional grid). They are indexed by a tuple of integers. Indexing syntax is similar to lists, tuples, and dictionaries, but NumPy adds some more fancy indexing tools.\n", - "\n", - "Let us start with importing NumPy. By convention, it is imported as ``np``." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "# Let us also import plotting library\n", - "import matplotlib.pyplot as plt " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2 Working with Arrays\n", - "\n", - "### 2.1 Creating Arrays\n", - "\n", - "Two most common ways of creating NumPy arrays are\n", - "1. Converting array-like Python objects (e.g. lists, tuples) using the function [`np.array` (reference/documentation)](https://numpy.org/doc/stable/reference/generated/numpy.array.html).\n", - "2. Calling one of the built-in functions provided by NumPy.\n", - "\n", - "The following cells introduce the syntax to create the arrays and some common built-in NumPy functions.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Converting Python array-like objects.\n", - "\n", - "# 1D array from list, shape (4, ).\n", - "x_1d = np.array([1, 3, 5, 7])\n", - "\n", - "# 2D array from combination of lists and tuples, shape (3, 3).\n", - "x_2d = np.array([(1, 1, 1), [2, 2, 2], (3, 3, 3)])\n", - "\n", - "# Print the results.\n", - "print(f'x_1d:\\n{x_1d}\\n')\n", - "print(f'x_2d:\\n{x_2d}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using built-in functions provided by NumPy.\n", - "\n", - "# 2D array of zeros, 2 rows, 3 columns.\n", - "x_zeros = np.zeros((2, 3))\n", - "\n", - "# 3D array of ones, shape (2, 3, 4) - 2 matrices of 3 rows and 4 columns.\n", - "x_ones = np.ones((2, 3, 4))\n", - "\n", - "# Identity matrix with 4 rows and 4 columns.\n", - "x_identity = np.eye(4)\n", - "\n", - "# Sequence of numbers from 5 to 11 (11 not included) with step 1.\n", - "x_seq = np.arange(5, 11)\n", - "\n", - "# Sequence of ones of the same shape as `x_zeros`.\n", - "x_ones_as_zeros = np.ones_like(x_zeros)\n", - 
"\n", - "\n", - "# Print the results.\n", - "print(f'x_zeros:\\n{x_zeros}\\n')\n", - "print(f'x_ones:\\n{x_ones}\\n')\n", - "print(f'x_identity:\\n{x_identity}\\n')\n", - "print(f'x_seq:\\n{x_seq}\\n')\n", - "print(f'x_ones_as_zeros:\\n{x_ones_as_zeros}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.2 Data Types\n", - "\n", - "NumPy arrays can be given an explicit data type. Specifying a data type gets useful for instance when using arrays for indexing (integers) or masking (boolean). Full list of supported data types can be found [here](https://docs.scipy.org/doc/numpy/user/basics.types.html).\n", - "\n", - "Data type can be specified when creating an array using an argument ``dtype``, arrays can be also cast to a given datatype using function [``astype`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.astype.html)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create an array of 32bit integers.\n", - "x_int = np.array([1, 2, 3, 4, 5], dtype=np.int32)\n", - "\n", - "# Cast integer array to 32 bit float array.\n", - "x_float = x_int.astype(np.float32)\n", - "\n", - "# Print results.\n", - "print(f'Array x_int has data type {x_int.dtype}')\n", - "print(f'Array x_float has data type {x_float.dtype}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.3 Inspecting the Arrays\n", - "\n", - "When working with arrays, it is easy to lose track about current number shape or data type. The properties ``ndim``, ``shape``, ``size``, ``dtype`` facilitate working with arrays and debugging your code. Furthermore, you can also simply print out an array using Python's ``print`` function. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Crate a 3D array.\n", - "x_3d = np.array([[[1, 2, 3, 4], [4, 7, 1, 9], [0, 4, 6, 8]], \n", - " [[5, 2, 8, 0], [2, 4, 3, 1], [1, 0, 4, 9]]])\n", - "\n", - "# Check the number of dimensions, number of elements, shape, and data type.\n", - "print(f'Number of dimensions: {x_3d.ndim}')\n", - "print(f'Number of elements: {x_3d.size}')\n", - "print(f'Shape: {x_3d.shape}')\n", - "print(f'Data type: {x_3d.dtype}')\n", - "\n", - "# Simply print the array.\n", - "print(x_3d)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.4 Reshaping the Arrays\n", - "Arrays can be reshaped using a function [``reshape`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.reshape.html). Note that the requested shape has to have the same number of elements as the original array.\n", - "\n", - "The shape of an array is given as a tuple of integers representing the number of elements in each dimension. Here a couple of examples of the shapes:\n", - "- () - A 0D array, effectively a scalar.\n", - "- (4, ) - A 1D array (vector) of 4 elements.\n", - "- (3, 4) - A 2D array (matrix) of 3 rows and 4 columns.\n", - "- (2, 3, 4) - A 3D array (block), think of 2 2D matrices each having 3 rows and 4 columns.\n", - "\n", - "When reshaping an array, you can use a value ``-1`` for at most one axis, meaning that the number of elements for that axis will be computed automatically." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array filled with a sequence of numbers.\n", - "x_seq_2d = np.arange(12).reshape(4, 3)\n", - "\n", - "# Create a 3D array filled with ones, last axis computed automatically.\n", - "x_ones_3d = np.ones(8).reshape((2, 2, -1))\n", - "\n", - "# Print the results.\n", - "print(f'x_seq_2d:\\n{x_seq_2d}\\n')\n", - "print(f'x_ones_3d:\\n{x_ones_3d}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 2.5 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Create a 1D array of 10 elements of type float32 filled with a value 3.14.\n", - "## Hint: Use np.ones or np.full.\n", - "array_pi = np.full(10, 3.14, dtype=np.float32) \n", - "array_pi = np.ones(10, dtype=np.float32) * 3.14\n", - "print(f'array_pi:\\n{array_pi}\\n')\n", - "\n", - "## Find the number of elements in the following array without using the `size` property.\n", - "## Hint: Use np.prod.\n", - "x = np.zeros((4, 5, 6, 7, 8))\n", - "num_elements = np.prod(x.shape) \n", - "print(f'Number of elements in x: {num_elements}')\n", - "\n", - "## Reshape the multidimensional array \"x_unknown\" to a 1D array. Note that you do not know the shape of the array.\n", - "## Hint: You can access the shape property, use the `-1` trick, or function np.ndarray.flatten()\n", - "## (i.e. 
you have to call it as a method of the array, x.flatten())\n", - "x_unknown = np.zeros(np.random.randint(1, 5, 6))\n", - "# 3 example solutions:\n", - "x_flat = x_unknown.reshape(np.prod(x_unknown.shape)) \n", - "x_flat = x_unknown.reshape(-1)\n", - "x_flat = x_unknown.flatten()\n", - "print(f'Shape of x_flat: {x_flat.shape}')\n", - "\n", - "# Check the answers:\n", - "assert(array_pi.shape == (10, ) and array_pi.dtype == np.float32 and np.allclose(array_pi, 3.14))\n", - "assert(num_elements == x.size)\n", - "assert(x_flat.ndim == 1)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3 Accessing Array Elements\n", - "In order to access the values of an array, **indexing** and **slicing** are used the same way you used them to slice Python array-like objects. Since NumPy arrays are N-dimensional, you can use a separate indexing/slicing expression for each axis.\n", - "\n", - "NumPy further extends the standard indexing/slicing by the following:\n", - "- indexing using an array of indices\n", - "- indexing using a boolean array (i.e. masking)\n", - "- structural indexing\n", - "\n", - "The indexing can be used not only for retrieving the values but also modifying the values in the original array (using the indexed array as an L-value):\n", - "\n", - "- ``selection = x[3:5, 1::3] # Retrieving the selected values.``\n", - "- ``x[3:5, 1::3] = 3.14 # Replacing the selected values by 3.14``" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.1 Standard Indexing and Slicing\n", - "Works the same way as for Python lists, but can be specified separately for every dimension. Use the familiar syntax ``[start : end]`` or ``[start : end : step]``. When specifying a range using ``start`` and ``end``, remember that ``start`` is inclusive and ``end`` is exclusive. E.g. writing ``x[2:4]`` will result in an array of ``[x[2], x[3]]``.\n", - "\n", - "All `start`, `end` and `step` values can be left out. 
Missing `start` defaults to `0`, missing `end` defaults to the index of the last element plus one (remember that ``end`` is exclusive), missing `step` defaults to `1`.\n", - "\n", - "Note that the step can be negative, in which case you traverse an array backwards.\n", - "\n", - "The image below depicts a 2D array of the shape (5, 6) and a couple of different indexing strategies. Let us try them out.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Access 3 elements in the 1st row.\n", - "orange = x[0, 2:5]\n", - "print(f'orange:\\n{orange}\\n')\n", - "\n", - "# Access the third column.\n", - "red = x[:, 2]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Access a 2x2 submatrix from the bottom right corner.\n", - "green = x[-2:, -2:]\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "# Access elements from even indices starting from the 3rd row.\n", - "magenta = x[2::2, ::2]\n", - "print(f'magenta:\\n{magenta}\\n')\n", - "\n", - "# Replace last two rows with zeros.\n", - "x[-2:, :] = 0\n", - "print(x)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.2 Indexing by an Array of Indices.\n", - "On top of standard indexing, NumPy also allows for providing a list of integer indices for every axis.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Access the 2nd, the 4th and the 5th columns.\n", - "red = x[:, [1, 3, 4]]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Access the elements from the 2nd 
and the 3rd rows in a zig-zag fashion.\n", - "magenta = x[[1, 2, 1, 2], range(4)]\n", - "print(f'magenta:\\n{magenta}\\n')\n", - "\n", - "# Replace the magenta elements with a value -1.\n", - "x[[1, 2, 1, 2], range(4)] = -1\n", - "print(x)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.3 Masking\n", - "We have seen indexing using arrays of integers, where the integer numbers pointed to given elements. Another approach is indexing using boolean arrays representing a binary mask. Such a mask must have the same shape as the indexed array, or it must match along the first dimensions (where the last dimensions are taken as is). A mask array can only contain boolean values ``True`` and ``False``, otherwise it would be interpreted as indexing by an integer array.\n", - "\n", - "Masking can be combined with traditional indexing/slicing and indexing using integer arrays. However, the mask must have the same shape as the dimension(s) for which we are using the mask.\n", - "\n", - "Masking is especially useful when you want to access those elements in an array which satisfy a certain condition. E.g. you might want to access all the elements bigger than a given threshold. 
Comparison operators (`<`, `>`, `==`, `>=`, `<=`) and other NumPy functions can be used to compare an array to a given value and get a binary mask.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 2D array, which will be used in the following cells, and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Creating the mask manually.\n", - "# Create a mask corresponding to the red squares.\n", - "mask = np.zeros((5, 6), dtype=bool)\n", - "mask[0, 1:4] = True\n", - "mask[2, 2] = True\n", - "mask[3, :2] = True\n", - "mask[-1, -2:] = True\n", - "print(f'mask:\\n{mask}\\n')\n", - "\n", - "# Select the values using a mask\n", - "red = x[mask]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Combining traditional indexing/slicing and masking - select the green\n", - "# columns. 
Note that the mask is a 1D array whose size is the\n", - "# same as the size of the corresponding dimension of the original \n", - "# array `x`.\n", - "mask = np.array([True, False, False, False, False, True])\n", - "green = x[:, mask]\n", - "print(f'green:\\n{green}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Creating the mask using comparison operators.\n", - "\n", - "# Extract the values larger than 26.\n", - "mask = x > 26\n", - "sel = x[mask]\n", - "print(f'mask:\\n{mask}\\n')\n", - "print(f'bigger than 26:\\n{sel}\\n')\n", - "\n", - "# Extract the odd values.\n", - "mask = (x % 2) == 1\n", - "sel = x[mask]\n", - "print(f'mask:\\n{mask}\\n')\n", - "print(f'odd:\\n{sel}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.4 Structural Indexing\n", - "Finally, NumPy introduces an object ``np.newaxis`` and an *ellipsis* syntax to facilitate indexing/reshaping.\n", - "\n", - "``np.newaxis`` can be used within square brackets to create a new axis of length 1. E.g. if we have a 1D array of E elements and we want to make it a column vector explicitly, i.e. a matrix with E rows and 1 column, the ``np.newaxis`` object comes in handy. 
(Note that ``np.newaxis`` is in fact defined as ``None``, therefore you can use ``None`` instead.)\n", - "\n", - "```python\n", - ">>> col_vec = np.array([1, 2, 3])\n", - ">>> col_vec.shape\n", - " (3, )\n", - ">>> col_vec = col_vec[:, np.newaxis] # or col_vec[:, None]\n", - ">>> col_vec.shape\n", - " (3, 1)\n", - "```\n", - "\n", - "The ``ellipsis`` operator ``...`` stands for \"as many as needed\" consecutive symbols ``:`` used when slicing a multidimensional array.\n", - "\n", - "```python\n", - ">>> x = np.ones((3, 4, 5, 6))\n", - ">>> x.shape\n", - " (3, 4, 5, 6)\n", - ">>> a = x[0, :, :, 3]\n", - ">>> b = x[0, ..., 3]\n", - ">>> np.allclose(a, b)\n", - " True\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 3.5 Exercises\n", - "\n", - "Using only standard indexing/slicing, extract the subarrays as depicted in the Figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using _only_ standard indexing and slicing, select the red, blue and green \n", - "# subarrays from the 2D array depicted above.\n", - "\n", - "# Create a 2D array and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Select the subarrays\n", - "red = x[:, ::5]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "green = x[2, 2:5]\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "blue = x[::2, 1:3] \n", - "print(f'blue:\\n{blue}\\n')\n", - "\n", - "# Bonus: Come up with indexing which selects from x the following submatrix:\n", - "# [[29, 28], \n", - "# [11, 10]].\n", - "bonus = x[-1::-3, -2:-4:-1]\n", - "print(f'bonus:\\n{bonus}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.array([[1, 6], [7, 12], [13, 18], [19, 24], [25, 30]])))\n", - "assert(np.allclose(green, np.array([15, 16, 17])))\n", - "assert(np.allclose(blue, np.array([[2, 3], [14, 15], [26, 27]])))\n", - 
"assert(np.allclose(bonus, np.array([[29, 28], [11, 10]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We will move forward with the exercise session for now, but there are more exercises about indexing using list of indices and masking at the end of the exercise. We encourage you to do them all when you get to the end, as these concepts will keep reocurring in the upcoming exercises." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4 Iterating\n", - "\n", - "An N dimensional array can be expressed as a list of N-1 dimensional arrays. \n", - "\n", - "For instance, a (2D) matrix ``x = np.ones((2, 3))`` can be thought of as a list of (1D) vectors of lenght 3. As you have seen in Section 3.1, we can access, say, the 2nd row as ``x[1, :]`` which is, however, equivalent to ``x[1]`` (i.e. omitting the ``:`` symbol referring to \"all the values in this dimension\").\n", - "\n", - "Similarly, a 3D array ``x = np.ones((4, 2, 3))`` can be thought of as a list of (2D) matrices of shape (2, 3). Again, we can access, say, the 1st matrix as ``x[0, :, :]``, which is equivalent to ``x[0]``.\n", - "\n", - "You have seen how to iterate through an array (Python list) using ``for``-loop or ``while``-loop in the exercise 1. You can use the same strategy with NumPy arrays as well. I.e. treat an N dimensional array as a list of N-1 dimensional arrays.\n", - "\n", - "Note that for many operations it is preferable _not_ to use an explicit ``for`` or ``while`` loop as the same computation can be usually achieved orders of magnitude faster using so called **vectorization** which will be introduced later. However, explicit iteration still comes in handy at times so it is useful to know how to use it." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Let us create a 3D array, iterate through it's slices, i.e. 
matrices, and \n", - "# find the trace of every matrix.\n", - "x = np.random.uniform(0, 10, (5, 10, 10))\n", - "\n", - "for i, matrix in enumerate(x):\n", - " print(f'Trace of matrix {i}: {np.trace(matrix)}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5 Concatenating, Stacking, Splitting\n", - "\n", - "Arrays can be **concatenated** (i.e. gluing the arrays while keeping the number of dimensions) and **stacked** (gluing the arrays along a newly created dimension). **Splitting** is the counterpart operation to concatenating.\n", - "\n", - "All of the **concatenated** arrays must have the same shape along all the dimensions except the one along which we concatenate. E.g. we can concatenate two matrices of shapes (4, 2) and (4, 5) along *axis 1* to get a new matrix of shape (4, 7).\n", - "\n", - "All of the **stacked** arrays must have exactly the same shape; the size of the newly created dimension corresponds to the number of stacked arrays. E.g. we can stack 2 matrices of shapes (4, 3) and (4, 3) along the newly created dimension *axis 0* to get a 3D array of shape (2, 4, 3).\n", - "\n", - "The axis for concatenation or stacking is specified using an argument ``axis``.\n", - "\n", - "See the examples below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Concatenating.\n", - "\n", - "# Concatenate a couple of matrices vertically.\n", - "m1 = np.array([[1, 2, 3], [4, 5, 6]])\n", - "m2 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])\n", - "m3 = np.array([[100, 200, 300]])\n", - "m_cat = np.concatenate([m1, m2, m3], axis=0)\n", - "print(m_cat)\n", - "\n", - "m_cat_error = np.concatenate([m1, m2, m3], axis=1) # This will fail, study the error message." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Stacking\n", - "\n", - "# Stack a couple of matrices to create a 3D array.\n", - "m1 = np.array([[1, 2], [4, 5]])\n", - "m2 = np.array([[10, 20], [40, 50]])\n", - "m3 = np.array([[100, 200], [400, 500]])\n", - "\n", - "# We can stack along any of axes 0, 1, 2. Stacking along different\n", - "# axis results in \"rotating\" our newly created 3D cube.\n", - "m_stack_0 = np.stack([m1, m2, m3], axis=0)\n", - "m_stack_1 = np.stack([m1, m2, m3], axis=1)\n", - "m_stack_2 = np.stack([m1, m2, m3], axis=2)\n", - "\n", - "print(m_stack_0.shape)\n", - "print(m_stack_1.shape)\n", - "print(m_stack_2.shape)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 5.1 Exercises\n", - "\n", - "Study the documentation for function [``np.split`` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.split.html) and use it to solve the following exercise." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Create a 3D array of increasing sequence of even numbers (starting from 0) of shape (10, 5, 7).\n", - "x = np.arange(0, 10 * 5 * 7 * 2, 2).reshape((10, 5, 7))\n", - "print(f'x:\\n{x}\\n')\n", - "\n", - "# Split the array into 5 arrays each of the shape (2, 5, 7)\n", - "splits_5 = np.split(x, 5, axis=0)\n", - "\n", - "# Split the array into 2 arrays of shapes (10, 2, 7) and (10, 3, 7)\n", - "splits_2 = np.split(x, [2], axis=1)\n", - "\n", - "# Check the answers.\n", - "assert((np.unique(x).size == 10 * 5 * 7) and np.all(x % 2 == 0) and np.min(x) == 0 and np.max(x) == 698)\n", - "assert(len(splits_5) == 5 and np.allclose(np.concatenate(splits_5, axis=0), x))\n", - "assert(len(splits_2) == 2 and np.allclose(np.concatenate(splits_2, axis=1), x))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6 Basic Arithmetic Operators, Linear Algebra\n", - "\n", - "Basic arithmetic operators `+`, `-`, `*`, `/`, `//`, `**`, `%` are applied element-wise as long as one of the operands is a scalar or both operands are arrays of the same shape. 
If the two arrays do not have the same shape, **broadcasting** will be applied (see Section 7 Broadcasting).\n", - "\n", - "Here are the most common linear algebra operators which you will mostly use for vectors (1D arrays) and matrices (2D arrays):\n", - "- `np.matmul` - Scalar product, vector-matrix or matrix-matrix multiplication (very similar to `np.dot`, read more about it [here](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html)).\n", - "- `@` - The same as `np.matmul`, syntactic sugar.\n", - "- [`np.linalg.inv` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html) - Matrix inversion.\n", - "- [`np.linalg.norm` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.norm.html) - Norm computation (L2 norm by default).\n", - "- [`np.linalg.solve` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html) - Numerically stable solution to a system of linear equations given as Ax = b.\n", - "- `x.T` - Transposition.\n", - "\n", - "See the examples below." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Arithmetic operations.\n", - "\n", - "# When used for scalar and array operands, the scalar is applied to every element of an array regardless of its shape.\n", - "x = np.zeros((2, 2))\n", - "print(f'x:\\n{x}\\n')\n", - "x += 1\n", - "print(f'x + 1:\\n{x}\\n')\n", - "\n", - "# When used for two array operands, the operator is applied to their corresponding values pair-wise.\n", - "x1 = np.arange(10).reshape((2, 5))\n", - "x2 = np.zeros((2, 5))\n", - "print(f'x1:\\n{x1}\\n')\n", - "print(f'x2:\\n{x2}\\n')\n", - "x2min1 = x2 - x1\n", - "print(f'x2 - x1:\\n{x2min1}\\n')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Linear algebra.\n", - "\n", - "## Dot product of two orthogonal vectors.\n", - "v1 = np.array([0.0893, 0.9332, 0.3481])\n", - "v2 = np.array([-0.6949, -0.1920, 0.6930])\n", - "v_dot = np.dot(v1, v2)\n", - "\n", - "# If they are orthogonal, their dot product should be close to 0.\n", - "print('v1 and v2 are orthogonal: {}'.format(\n", - " ('FALSE', 'TRUE')[int(np.isclose(v_dot, 0., atol=1e-5))]))\n", - "\n", - "## Matrix multiplication.\n", - "m1 = np.eye(3)\n", - "m2 = np.random.uniform(-10., 10., (3, 8))\n", - "m_mult = m1 @ m2\n", - "\n", - "# m1 is an identity matrix, therefore the matrix multiplication with \n", - "# any matrix M will produce the same matrix M.\n", - "print('m_mult is the same as m2: {}'.format(\n", - " ('FALSE', 'TRUE')[np.allclose(m2, m_mult)]))\n", - "\n", - "## Solve a linear system Ax = b.\n", - "# All the coefficients are random so it is extremely unlikely that we would\n", - "# generate a rank deficient matrix A and therefore the system of linear\n", - "# equations will have a solution.\n", - "A = np.random.uniform(-1., 1., (10, 10))\n", - "b = np.random.uniform(-1., 1., (10, ))\n", - "x = np.linalg.solve(A, b)" - ] - }, - { - "cell_type": 
"markdown", - "metadata": {}, - "source": [ - "### 6.1 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Generate a matrix of shape (100, 100) filled with Euler's number. You cannot use np.full.\n", - "eul = np.ones((100, 100)) * np.exp(1)\n", - "print(f'eul:\\n{eul}\\n')\n", - "\n", - "## Generate a 1D array of length 10 of powers of 2, i.e. [2^0, 2^1, ..., 2^9]\n", - "pows = 2. ** np.arange(10)\n", - "print(f'pows:\\n{pows}\\n')\n", - "\n", - "## Check the answers:\n", - "assert(np.allclose(eul, np.stack([[2.71828182] * 100] * 100, axis=0)))\n", - "assert(np.allclose(pows, [2**0, 2**1, 2**2, 2**3, 2**4, 2**5, 2**6, 2**7, 2**8, 2**9]))" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# Helper function to print an arrow.\n", - "def plot_arrow(pts, clr):\n", - " plt.plot(*pts[:2].T, color=clr, marker='*')\n", - " plt.plot(*pts[1:3].T, color=clr, marker='*')\n", - " plt.plot(*pts[[1, 3], :].T, color=clr, marker='*')\n", - "\n", - "## The array 'arrow' contains 4 2D points defining a blue arrow. The objective\n", - "## is to make the arrow 2 times shorter and thinner and rotate it by 45 degrees \n", - "## counter-clockwise. 
\n", - "## First you will rotate the arrow by multiplying the points with the rotation \n", - "## matrix, where the rotation matrix stands on the left.\n", - "## Then, you will scale the arrow by multiplying the previous result with the scale matrix\n", - "## where the scale matrix stands on the left.\n", - "\n", - "## Hint: You will need to do some transpose operations.\n", - "\n", - "arrow = np.array([[ 0., 0.,], \n", - " [ 0., 2.], \n", - " [-0.5, 1.5], \n", - " [ 0.5, 1.5]])\n", - "angle = np.pi / 4.\n", - "rot = np.array([[np.cos(angle), -np.sin(angle)], \n", - " [np.sin(angle), np.cos(angle)]])\n", - "scale = np.array([[0.5, 0.], \n", - " [0., 0.5]])\n", - "\n", - "# When doing matrix multiplication, always know your shapes.\n", - "# Rotation matrix in D dimension is always (DxD), here D=2\n", - "# The arrow is made of 4 points (4x2).\n", - "# To rotate it with matrix operation on the left, you need to transpose it:\n", - "rotated_arrow_T = rot @ arrow.T # (shapes: (2x2)@(2x4) = (2x4))\n", - "# Now scale and return the arrow to how you defined your space (NxD)\n", - "arrow_sr = (scale @ rotated_arrow_T).T\n", - "\n", - "plt.figure(figsize=(5, 5))\n", - "plt.xlim(-3, 3)\n", - "plt.ylim(-3, 3)\n", - "plot_arrow(arrow, 'b')\n", - "plot_arrow(arrow_sr, 'r')\n", - "\n", - "## Check the answers.\n", - "assert(np.allclose(arrow_sr, np.array([[ 0. , 0. ],\n", - " [-0.70710678, 0.70710678],\n", - " [-0.70710678, 0.35355339],\n", - " [-0.35355339, 0.70710678]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 7 Broadcasting\n", - "\n", - "Broadcasting allows for performing arithmetic and other operations on arrays of different shape, where the smaller is \"broadcast\" over the larger array. For instance, adding a column vector *v* to a matrix *M*, M + v, will effectively take every column of the matrix and add the vector *v* element-wise.\n", - "\n", - "Broadcasting further allows for so called **vectorization**, i.e. 
performing a given operation in parallel where the actual looping occurs in highly-optimized C code rather than in Python, where looping is slow.\n", - "\n", - "Example:\n", - "\n", - "```python\n", - ">>> a = np.arange(6).reshape((2, 3)) # shape (2, 3)\n", - "array([[0, 1, 2],\n", - " [3, 4, 5]])\n", - "\n", - ">>> b = np.array([10, 20, 30]) # shape (3, )\n", - "array([10, 20, 30])\n", - "\n", - ">>> a + b\n", - "array([[10, 21, 32],\n", - " [13, 24, 35]]) # shape (2, 3)\n", - "```\n", - "\n", - "### 7.1 Broadcasting Rules\n", - "\n", - "Each pair of corresponding dimensions of the 2 arrays must satisfy one of the following:\n", - "- Both dimensions have the same size.\n", - "- One of the dimensions has size 1.\n", - "\n", - "Furthermore, non-existent dimensions are treated as 1.\n", - "\n", - "Here are a couple of examples of the input and output shapes to a binary operation (such as `+`) being applied on 2 arrays *A* and *B*:\n", - "\n", - "\n", - "\n", - "**Note:** Do not confuse the concept of _vectorization_ with NumPy's function `np.vectorize`, which is provided for programming convenience, not for performance, and thus does not guarantee the actual vectorization of an operation.\n", - "\n", - "If the concept is not clear, you can read more about broadcasting [here](https://numpy.org/devdocs/user/basics.broadcasting.html).\n", - "\n", - "Go through the examples below and try to understand how the arrays are constructed and computed." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Manual Python looping vs. 
vectorization - Multiplying a \n", - "# matrix by a vector row-wise.\n", - "m = np.arange(15).reshape(5, 3)\n", - "v = np.array([0, 4, 2])\n", - "m_loop = np.copy(m)\n", - "m_vect = np.copy(m)\n", - "\n", - "# Python loop.\n", - "for i in range(m.shape[0]):\n", - " m_loop[i] *= v\n", - "\n", - "# Vectorization.\n", - "m_vect *= v\n", - "\n", - "# Check that both results are the same.\n", - "assert(np.allclose(m_loop, m_vect))\n", - "\n", - "## Generate a matrix where each row holds a constant value \n", - "## which increases throughout the rows.\n", - "seq_mat = np.ones((5, 3)) * np.arange(5).reshape((-1, 1))\n", - "print(f'seq_mat:\\n{seq_mat}\\n')\n", - "\n", - "m = np.random.randint(0, 10, (4, 5))\n", - "print(f'm:\\n{m}\\n')\n", - "\n", - "## Add a vector to a matrix row-wise (horizontally).\n", - "add_rw = np.array([10, 20, 30, 40, 50])\n", - "m_add_rw = m + add_rw\n", - "print(f'm_add_rw:\\n{m_add_rw}\\n')\n", - "\n", - "## Add a vector to a matrix column-wise (vertically).\n", - "add_cw = np.array([10, 20, 30, 40]).reshape((-1, 1))\n", - "m_add_cw = m + add_cw\n", - "print(f'm_add_cw:\\n{m_add_cw}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 7.2 Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Given a matrix 'data' defined below, Compute a matrix data_pow, \n", - "## where a value in each column is taken to the power of its column\n", - "## index.\n", - "data = np.random.uniform(0, 5, (4, 5))\n", - "data_pow = data ** np.arange(5)\n", - "print(f'data_pow:\\n{data_pow}\\n')\n", - "\n", - "## Generate a matrix of shape (5, 4), where each row is an integer \n", - "## sequence starting from 0 with an increment of a row index i. 
E.g.\n", - "## the first 3 rows would be:\n", - "##\n", - "## [0*0, 1*0, 2*0, 3*0] [0, 0, 0, 0]\n", - "## [0*1, 1*1, 2*1, 3*1] = [0, 1, 2, 3]\n", - "## [0*2, 1*2, 2*2, 3*2] [0, 2, 4, 6]\n", - "##\n", - "## You can use 1D arrays only and the rules of broadcasting.\n", - "# Hint: Use np.arange\n", - "seqs = np.arange(5)[:, None] * np.arange(4)\n", - "print(f'seqs:\\n{seqs}\\n')\n", - "\n", - "## Check the results:\n", - "data_pow_gt = np.copy(data)\n", - "for i in range(data.shape[1]):\n", - " data_pow_gt[:, i] = data_pow_gt[:, i] ** i\n", - "assert(np.allclose(data_pow, data_pow_gt))\n", - "assert(np.allclose(seqs, np.array([[ 0, 0, 0, 0],\n", - " [ 0, 1, 2, 3],\n", - " [ 0, 2, 4, 6],\n", - " [ 0, 3, 6, 9],\n", - " [ 0, 4, 8, 12]])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 8 Common NumPy Functions\n", - "\n", - "NumPy offers a plethora of functions to perform computations with N dimensional arrays. A common concept is that these functions accept an argument `axis`, which specifies the axis along which the operation should be performed. 
\n", - "\n", - "For example, given a matrix `A = np.array([[1, 2, 3], [4, 5, 6]])`, we might want to find a sum of values over rows and columns:\n", - "\n", - "\n", - "\n", - "```python\n", - ">>> sum_per_row = np.sum(A, axis=1)\n", - "array([6, 15])\n", - "\n", - ">>> sum_per_col = np.sum(A, axis=0)\n", - "array([5, 7, 9])\n", - "```\n", - "\n", - "Among the most common functions you might need the following: \n", - "- [`np.sum` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sum.html)\n", - "- [`np.prod` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.prod.html) \n", - "- [`np.mean` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.mean.html)\n", - "- [`np.std` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.std.html)\n", - "- `np.min`\n", - "- `np.max`\n", - "- [`np.argmin` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html)\n", - "- [`np.argmax` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html)\n", - "- [`np.sort` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sort.html)\n", - "- `np.abs` \n", - "- [`np.sqrt` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.sqrt.html)\n", - "- [`np.unravel_index` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html).\n", - "\n", - "Please study their corresponding reference pages and fill in the following exercises." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 8.1 (OPTIONAL) Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Given a matrix M, find the product of the values in each row.\n", - "M = np.arange(12).reshape((4, 3))\n", - "sm = np.prod(M, axis=1)\n", - "\n", - "# Check the results.\n", - "assert(np.allclose(sm, np.array([0, 60, 336, 990])))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write a function `rescale` which takes as input an array $x \\in \\mathbb R ^ {N \\times M}$ and scalars $a < b$, and returns an array $y \\in \\mathbb R ^ {N \\times M}$ rescaled from the range $[\\min(x), \\max(x)]$ to the range $[a, b]$." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def rescale(x: np.ndarray, a: float, b: float) -> np.ndarray:\n", - " \"\"\" Rescales the input from range [min(x), max(x)] to range [a, b].\n", - " \n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " a (float): Lower bound of the output range.\n", - " b (float): Upper bound of the output range.\n", - " \n", - " Returns:\n", - " np.ndarray: Rescaled array, shape (N, M).\n", - " \"\"\"\n", - " return (b - a) * (x - np.min(x)) / (np.max(x) - np.min(x)) + a\n", - "\n", - "# Test.\n", - "x = 2 * np.random.rand(3, 2) - 1\n", - "a, b = 1, 3\n", - "y = rescale(x, a, b)\n", - "\n", - "assert(np.isclose(np.min(y), a) and np.isclose(np.max(y), b))\n", - "\n", - "print(f'Input array:\\n{x}\\n')\n", - "print(f'Required range: {a, b}\\n')\n", - "print(f'Output array:\\n{y}\\n')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write a function `find_closest` which, given a scalar $u \\in \\mathbb R$ and an input array $x \\in \\mathbb R ^ {N \\times M}$, returns the closest element to the scalar in the array $x_{i^*,j^*} : (i^*,j^*)=\\text{argmin}_{i,j} | x_{i,j} - u |$. 
$\\text{argmin}$ is the operation that finds the index of the minimum element.\n", - "\n", - "Hint: Use `np.abs`, [`np.argmin` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html), [`np.unravel_index` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.unravel_index.html)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def find_closest(x: np.ndarray, u: float) -> float:\n", - " \"\"\" Finds the closest element to `u` in `x`.\n", - " \n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " u (float): A value to which the closest element in x is searched for.\n", - " \n", - " Returns:\n", - " float: Closest element in `x` to `u`.\n", - " \"\"\"\n", - " index_min = (np.abs(x - u)).argmin()\n", - " index_min_unravelled = np.unravel_index(index_min, x.shape)\n", - " return x[index_min_unravelled]\n", - "\n", - "# Test.\n", - "x = np.arange(12).reshape(3,4)\n", - "u = np.random.uniform(0, 11)\n", - "x_ij = find_closest(x, u)\n", - "\n", - "assert(x_ij == round(u))\n", - "\n", - "print(f'The closest element to {u:.3f} within\\n{x} is {x_ij}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Write a function `z_score_normalize` which takes as input an array $x \\in \\mathbb R ^ {N \\times M}$ and returns an output array $y \\in \\mathbb R ^ {N \\times M}$ such that $\\mathbb E [y]= 0$ and $\\sigma [y] = 1$, where $\\sigma[y]$ is a standard deviation of $y$.\n", - "\n", - "Hint: Use [`np.mean` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.mean.html) [`np.std` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.std.html)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def z_score_normalize(x: np.ndarray) -> np.ndarray:\n", - " \"\"\" Normalizes the input x so that its mean is 0 and std is 1.\n", - " 
\n", - " Args:\n", - " x (np.ndarray): Input array, shape (N, M).\n", - " \n", - " Returns:\n", - " np.ndarray: Normalized array, shape (N, M).\n", - " \"\"\"\n", - " return (x - np.mean(x)) / np.std(x)\n", - "\n", - "# Test.\n", - "x = 2 * np.random.rand(3,2) - 1\n", - "y = z_score_normalize(x)\n", - "\n", - "assert(np.isclose(np.mean(y), 0))\n", - "assert(np.isclose(np.std(y), 1.))\n", - "\n", - "print(f'Mean/std for input array\\n{x}\\nis: {np.mean(x):.3f}/{np.std(x):.3f}\\n')\n", - "print(f'Mean/std for normalized array\\n{y} is:\\n{np.mean(y):.3f}/{np.std(y):.3f}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 9 Shallow and Deep Copy\n", - "\n", - "A simple assignment makes no copy of the underlying data, have a look at the following example:\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a\n", - ">>> b[1] = 1.602e-19\n", - ">>> print(a, b)\n", - "array([1, 1.602e-19, 3])\n", - "array([1, 1.602e-19, 3])\n", - "```\n", - "\n", - "Assigning a slice of an array to a new array works with the very same data as well, i.e. 
no copy is made:\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a[:2]\n", - ">>> b[0] = 6.626e-34\n", - ">>> print(a, b)\n", - "array([6.626e-34, 2, 3])\n", - "array([6.626e-34, 2])\n", - "```\n", - "\n", - "In order to truly copy the data, we need to make a deep copy using the NumPy function [`np.copy` (documentation)](https://numpy.org/doc/stable/reference/generated/numpy.copy.html):\n", - "\n", - "```python\n", - ">>> a = np.array([1., 2., 3.])\n", - ">>> b = a.copy()\n", - ">>> b[2] = 9.807\n", - ">>> print(a, b)\n", - "array([1, 2, 3])\n", - "array([1, 2, 9.807])\n", - "```\n", - "\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 10 Random\n", - "\n", - "The module [`np.random` (documentation)](https://numpy.org/doc/stable/reference/random/index.html) implements pseudo-random number generators (RNG) for various distributions. We will frequently use it to randomly sample our data or to randomly initialize the parameters of our models.\n", - "\n", - "In order to fix the RNG and thus to be able to reproduce the computation, we can set the so-called *seed*. Setting the seed guarantees that the same sequence of numbers will be generated by the RNG in each run.\n", - "\n", - "```python\n", - ">>> constant = 3 # Any we want\n", - ">>> np.random.seed(constant)\n", - "```\n", - "\n", - "Here are some of the functions we will be using most frequently:\n", - "- np.random.randint\n", - "- np.random.shuffle\n", - "- np.random.uniform\n", - "- np.random.randn\n", - "- np.random.permutation\n", - "\n", - "Please study their respective [reference pages](https://docs.scipy.org/doc/numpy-1.16.0/reference/routines.random.html) and complete the exercises below." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 10.1 (OPTIONAL) Exercises" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "## Create a 1D array of randomly permuted sequence from 0 to 10.\n", - "x_perm = np.random.permutation(11) \n", - "x_perm = np.arange(11)\n", - "np.random.shuffle(x_perm)\n", - "print(f'Permuted array:\\n{x_perm}\\n')\n", - "\n", - "## Check the answers.\n", - "assert(np.unique(x_perm).shape == (11, ) and np.min(x_perm) == 0 and np.max(x_perm) == 10)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 11 (OPTIONAL) Extra Indexing Exercises" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Using indexing by integer arrays, extract the subarrays as depicted in the Figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Using a combination of standard indexing/slicing and an indexing by an array \n", - "# of indices, select the following subarrays.\n", - "\n", - "# Create a 2D array and print it out.\n", - "x = np.arange(1, 31).reshape((5, 6))\n", - "print(f'Array x:\\n{x}\\n')\n", - "\n", - "# Select the subarrays.\n", - "red = x[range(5), range(5)]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "green = x[[0, 1, 3], 4:]\n", - "print(f'green:\\n{green}\\n')\n", - "\n", - "blue = x[[3, 4, 4], [0, 0, 1]]\n", - "print(f'blue:\\n{blue}\\n')\n", - "\n", - "# Bonus: Come up with indexing which results in matrix `x` being stacked 3 times \n", - "# vertically, i.e. 
the resulting shape is (15, 6).\n", - "# Hint: Use functions range and list, use operator * with a list.\n", - "bonus = x[list(range(5)) * 3, :]\n", - "print(f'bonus:\\n{bonus}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.arange(1, 30, 7)))\n", - "assert(np.allclose(green, np.array([[5, 6], [11, 12], [23, 24]])))\n", - "assert(np.allclose(blue, np.array([19, 25, 26])))\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Using masking, extract the subarray as depicted in the Figure below.\n", - "\n", - "\"slicing\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "### Construct a mask and then extract the red values above the main diagonal.\n", - "\n", - "# Extract the red values.\n", - "mask = np.zeros_like(x, dtype=bool)\n", - "for i in range(5):\n", - " mask[i, i + 1:] = True\n", - "print(f'mask:\\n{mask}\\n')\n", - "\n", - "red = x[mask]\n", - "print(f'red:\\n{red}\\n')\n", - "\n", - "# Bonus: Extract all the values, which are not red.\n", - "# Hint: Use your already constructed `mask`, study NumPy's binary operations\n", - "# https://docs.scipy.org/doc/numpy/reference/routines.bitwise.html\n", - "the_rest = x[~mask]\n", - "print(f'the_rest:\\n{the_rest}\\n')\n", - "\n", - "# Bonus2: Extract the values which can be divided by both 2 and 3.\n", - "# Hint: Construct two masks and combine them using NumPy's binary operators.\n", - "mask = (x % 2 == 0) & (x % 3 == 0)\n", - "div23 = x[mask]\n", - "print(f'div23:\\n{div23}\\n')\n", - "\n", - "# Check the results:\n", - "assert(np.allclose(red, np.concatenate(\n", - " [range(2, 7), range(9, 13), range(16, 19), range(23, 25), [30]])))\n", - "assert(np.allclose(np.sort(np.concatenate([red, the_rest])), np.arange(1, 31)))\n", - "assert(not np.any(np.fmod(div23, 2.)) and not np.any(np.fmod(div23, 3.)))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 12 Next Steps\n", - 
"\n", - "Feel free to consult more thorough NumPy tutorials.\n", - "\n", - "- NumPy basics: [https://docs.scipy.org/doc/numpy/user/basics.html](https://docs.scipy.org/doc/numpy/user/basics.html)\n", - "- Official NumPy tutorial: [https://docs.scipy.org/doc/numpy/user/quickstart.html](https://docs.scipy.org/doc/numpy/user/quickstart.html)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.10" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/Exercises/03-linear-regression/.ipynb_checkpoints/linear_regression-checkpoint.ipynb b/Exercises/03-linear-regression/.ipynb_checkpoints/linear_regression-checkpoint.ipynb deleted file mode 100644 index 5980034..0000000 --- a/Exercises/03-linear-regression/.ipynb_checkpoints/linear_regression-checkpoint.ipynb +++ /dev/null @@ -1,1319 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Linear Regression" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "\n", - "This notebook is part of a series of exercises for the CIVIL-226 Introduction to Machine Learning for Engineers course at EPFL, and adapted for the ME-390. Copyright (c) 2021 [VITA](https://www.epfl.ch/labs/vita/) lab at EPFL. Use of this source code is governed by an MIT-style license that can be found in the LICENSE file or at https://www.opensource.org/licenses/MIT\n", - "\n", - "**Author(s):** Tom Winandy and David Mizrahi\n", - "
\n" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "# Function to align all tables to the left (useful for later on)" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "\n" - ], - "text/plain": [ - "" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "%%html\n", - "" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from typing import Any, Callable\n", - "\n", - "# Helper file with functions for pre-processing and visualization\n", - "import helpers" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 0. Intro \n", - "\n", - "In the first part of the exercise, you're tasked with implementing linear regression with only one variable to predict profits for a restaurant. This is known as **[simple linear regression](https://en.wikipedia.org/wiki/Simple_linear_regression)**, as opposed to **multiple linear regression** (where multiple variables are taken into account for the prediction). You'll see later on that the code implemented here will work just as well for multiple linear regression.\n", - "\n", - "**Question:** How does a regression problem differ from a classification problem?\n", - "\n", - "**Answer:** YOUR ANSWER HERE\n", - "\n", - "*Background: Suppose you're the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has restaurants in various cities and you have data for profits and populations from the cities. You would like to use this data to predict the profit of a restaurant based on where it opens.*" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 1. 
Data loading & pre-processing" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, we'll use a dataset containing 97 restaurants, with the population of the city (in 10'000's of inhabitants) they operate in and their respective profit (in 10'000's of USD). Take a look at the file `restaurant_data.csv` and see how it's loaded by running the cell below." - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "There are 97 rows and 2 columns.\n" - ] - }, - { - "data": { - "text/html": [ - "
\n", - "\n", - "\n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - " \n", - "
populationprofit
06.110117.5920
15.52779.1302
28.518613.6620
37.003211.8540
45.85986.8233
\n", - "
" - ], - "text/plain": [ - " population profit\n", - "0 6.1101 17.5920\n", - "1 5.5277 9.1302\n", - "2 8.5186 13.6620\n", - "3 7.0032 11.8540\n", - "4 5.8598 6.8233" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "restaurant_df = pd.read_csv('data/restaurant_data.csv')\n", - "\n", - "print(f\"There are {restaurant_df.shape[0]} rows and {restaurant_df.shape[1]} columns.\")\n", - "# Show the first 5 rows of the data\n", - "restaurant_df.head(5)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Run the cell below to get a plot of the data. " - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - }, - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjMAAAGwCAYAAABcnuQpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAA6f0lEQVR4nO3de3xU9Z3/8fcAIRBIhksCSSSGIAEvWDflTmwEu6J0taA+WoytXErVbiEUqQ+17bJi7Qpaxa5x1dpShFJTtytQfquPWlqBKBe5mKiopcEkSBfYXICEJBpCOL8/2Ewzycxk7uecmdfz8cjjYc6cGb5zMs685/v9fL9fh2EYhgAAAGyql9kNAAAACAVhBgAA2BphBgAA2BphBgAA2BphBgAA2BphBgAA2BphBgAA2FofsxsQaRcuXNDx48eVnJwsh8NhdnMAAIAfDMPQ2bNnlZmZqV69fPe9xHyYOX78uLKyssxuBgAACMKxY8c0YsQIn+fEfJhJTk6WdPFipKSkmNwaAADgj8bGRmVlZbk+x32J+TDTMbSUkpJCmAEAwGb8KRExtQB41apVmjhxopKTkzVs2DDNmTNHhw8fdjtnwYIFcjgcbj9TpkwxqcUAAMBqTA0zO3fu1OLFi7V3715t27ZN58+f18yZM9Xc3Ox23k033aQTJ064fl5//XWTWgwAAKzG1GGmP/zhD26/r1u3TsOGDdPBgwdVUFDgOp6YmKj09HS/HrO1tVWtra2u3xsbG8PTWAAAYEmWWmemoaFBkjRkyBC34zt27NCwYcM0ZswY3X333aqpqfH6GKtWrZLT6XT9MJMJAIDY5jAMwzC7EdLF+eSzZ8/W6dOn9dZbb7mOv/LKKxo4cKCys7NVVVWlFStW6Pz58zp48KASExO7PY6nnpmsrCw1NDRQAAwAgE00NjbK6XT69fltmdlMS5Ys0fvvv6+3337b7fjcuXNd/z1u3DhNmDBB2dnZeu2113Tbbbd1e5zExESPIQcAAMQmS4SZoqIibd26VaWlpT0ujJORkaHs7GxVVFREqXUAAMDKTA0zhmGoqKhImzdv1o4dO5STk9Pjferr63Xs2DFlZGREoYUAAMDqTC0AXrx4
sTZu3KiXX35ZycnJOnnypE6ePKnPPvtMktTU1KT7779fe/bsUXV1tXbs2KFbbrlFqampuvXWW81sOgAAsAhTC4C9req3bt06LViwQJ999pnmzJmjsrIynTlzRhkZGZoxY4YeffRRv2cpBVJABAAArME2BcA95aj+/fvrjTfeiFJrAABAICprm3T0VItGDh2gnNQBprXDEgXAAADAPs60nNPSknKVVtS6jhXkpqm4ME/OpISot8dSi+YBAADrW1pSrl1H6tyO7TpSp6KSMlPaQ5gBAAB+q6xtUmlFrdq7lIq0G4ZKK2pVVdfs5Z6RQ5gBAAB+O3qqxeft1fWEGQAAYGHZQ5J83j5yaPQLgQkzAADAb6PSBqogN029uyyv0tvhUEFumimzmggzAAAgIMWFecofnep2LH90qooL80xpD1OzAQBAQJxJCdqwaJKq6ppVXd/MOjMAAMCeclLNDTEdGGYCAAC2RpgBAAC2RpgBAAC2RpgBAAC2RpgBAAC2RpgBAAC2RpgBAAC2RpgBAAC2RpgBAAC2xgrAAIC4UFnbpKOnWkxfeh/hR5gBAMS0My3ntLSkXKUVta5jBblpKi7MkzMpwcSWIVwYZgIAxLSlJeXadaTO7diuI3UqKikzqUUIN8IMACBmVdY2qbSiVu2G4Xa83TBUWlGrqrpmk1qGcCLMAABi1tFTLT5vr64nzMQCwgwAIGZlD0nyefvIoRQCxwLCDAAgZo1KG6iC3DT1djjcjvd2OFSQm8asphhBmAEAxLTiwjzlj051O5Y/OlXFhXkmtQjhxtRsAEBMcyYlaMOiSaqqa1Z1fTPrzMQgwgwAIC7kpBJiYhXDTAAAwNbomQEAIMrYWiG8CDMAAEQJWytEBsNMAABECVsrRAZhBgCAKGBrhcghzAAAEAVsrRA5hBkAAKKArRUihzADAEAUsLVC5BBmAACIErZWiAymZgMAECVsrRAZ9MwAABBlOakDNGPsMBmGoe2Ha5jJFCJ6ZgAAiDIWzwsvemYAAIgyFs8LL8IMAABRxOJ54UeYAQAgilg8L/wIMwAARBGL54UfYQYAgChi8bzwI8wAABBlLJ4XXkzNBgAgylg8L7wIMwAAmCQnlRATDgwzAQAAWyPMAAAAWyPMAAAAWzM1zKxatUoTJ05UcnKyhg0bpjlz5ujw4cNu5xiGoZUrVyozM1P9+/fX9OnT9eGHH5rUYgAAYDWmhpmdO3dq8eLF2rt3r7Zt26bz589r5syZam7+++qHTzzxhNasWaNnn31W+/fvV3p6um644QadPXvWxJYDAACrcBhGl80hTFRbW6thw4Zp586dKigokGEYyszM1LJly/Tggw9KklpbWzV8+HA9/vjjuvfee3t8zMbGRjmdTjU0NCglJSXSTwEAAIRBIJ/flqqZaWhokCQNGTJEklRVVaWTJ09q5syZrnMSExN13XXXaffu3R4fo7W1VY2NjW4/AAAgdlkmzBiGoeXLl+vaa6/VuHHjJEknT56UJA0fPtzt3OHDh7tu62rVqlVyOp2un6ysrMg2HAAAmMoyYWbJkiV6//33VVJS0u02R5f9KwzD6Hasww9+8AM1NDS4fo4dOxaR9gIAAGuwxArARUVF2rp1q0pLSzVixAjX8fT0dEkXe2gyMjJcx2tqarr11nRITExUYmJiZBsMAAAsw9SeGcMwtGTJEm3atElvvvmmcnJy3G7PyclRenq6tm3b5jp27tw57dy5U9OmTYt2cwEAgAWZ2jOzePFivfzyy/r973+v5ORkVx2M0+lU//795XA4tGzZMj322GPKzc1Vbm6uHnvsMSUlJenOO+80s+kAAMAiTA0zzz//vCRp+vTpbsfXrVunBQsWSJIeeOABffbZZ/rud7+r06dPa/LkyfrjH/+o5OTkKLcWAABYkaXWmYkE1pkBAMB+bLvODAAAQKAIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYI
MwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNYIMwAAwNb6mN0AAAAQGZW1TTp6qkUjhw5QTuoAs5sTMYQZAABizJmWc1paUq7SilrXsYLcNBUX5smZlGBiyyKDYSYAAGLM0pJy7TpS53Zs15E6FZWUmdSiyCLMAAAQQyprm1RaUat2w3A73m4YKq2oVVVds0ktixzCjI1U1jZp++GamHwhAgDC4+ipFp+3V9fH3mcINTM2EG9jnwCA4GUPSfJ5+8ihsVcITM+MDcTb2CcAIHij0gaqIDdNvR0Ot+O9HQ4V5KbF5KwmwozFxePYJwAgNMWFecofnep2LH90qooL80xqUWQxzGRx/ox9xmLKBgAEz5mUoA2LJqmqrlnV9c2sMwNzxePYJwAgPHJSYzvEdGCYyeLicewTAIBAEGZsIN7GPgEgEljeInYxzGQD8Tb2CQDhxPIWsY+eGRvJSR2gGWOHEWQAIAAsbxH7CDMAgJjF8hbxgTADAIhZ8bi0fzwizAAAYhbLW8QHwgwAIGaxvEV8IMwAAGIay1vEPqZmAwBiGstbxD7CDAAgLsTL0v7xiGEmAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga6wADCDuVNY26eipFpa1B2IEYQZA3DjTck5LS8pVWlHrOlaQm6biwjw5kxJMbBmAUDDMBCBuLC0p164jdW7Hdh2pU1FJmUktAhAOhBkAcaGytkmlFbVqNwy34+2GodKKWlXVNZvUMgChIswAiAtHT7X4vL26njAD2BVhBkBcyB6S5PP2kUMpBAbsijADIC6MShuogtw09XY43I73djhUkJvGrCbAxkwNM6WlpbrllluUmZkph8OhLVu2uN2+YMECORwOt58pU6aY01gAtldcmKf80alux/JHp6q4MM+kFgEIB1OnZjc3N+uaa67RwoULdfvtt3s856abbtK6detcv/ft2zdazQMQY5xJCdqwaJKq6ppVXd/MOjNAjDA1zMyaNUuzZs3yeU5iYqLS09P9fszW1la1tra6fm9sbAy6fQBiU04qIQaIJZavmdmxY4eGDRumMWPG6O6771ZNTY3P81etWiWn0+n6ycrKilJLAYRDZW2Tth+uYao0AL85DKPLogsmcTgc2rx5s+bMmeM69sorr2jgwIHKzs5WVVWVVqxYofPnz+vgwYNKTEz0+DieemaysrLU0NCglJSUSD8NAEFidV4AnTU2NsrpdPr1+W3p7Qzmzp3r+u9x48ZpwoQJys7O1muvvabbbrvN430SExO9Bh0A1uVrdd4NiyaZ1CoAdmD5YabOMjIylJ2drYqKCrObAiCMWJ0XQChsFWbq6+t17NgxZWRkmN0UAGHE6rwAQmHqMFNTU5OOHDni+r2qqkrl5eUaMmSIhgwZopUrV+r2229XRkaGqqur9cMf/lCpqam69dZbTWw1gHBjdV4AoTC1Z+bAgQPKy8tTXt7FBauWL1+uvLw8/eu//qt69+6tDz74QLNnz9aYMWM0f/58jRkzRnv27FFycrKZzQYQZqzOCyAUlpnNFCmBVEMDME9DS5uKSsqYzQRAUgzNZgIQP1idF0CwCDMALIXVeQEEylazmQAAALoizAAAAFsjzAAAAFujZga2UFnbpKOnWigKBQB0Q5iBpbH5IACgJwwzwdJ8bT4Ie6msbdL2wzXsswQg7OiZgWV1bD7YVefNBxlysj561wBEGj0zsCw2H4wN9K4BiDTCDCyLzQftr6N3rb3Lrimde9cAIFSEGVgWmw/aH71rAKKBMANLKy7MU/7oVLdj+aNTVVyYZ1KLEAh61wBEAwXAsDQ2H7S3jt61XUfq3Iaaejscyh+dyt8SQFjQMwNbyEkdoBljh/HhZ0P0rgGINHpmAEQUvWsAIo0wAyAqclIJMQAig2EmAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga4QZAABga6wzg7hT
Wduko6daWLwNAGIEYQZx40zLOS0tKVdpRa3rWEFumooL8+RMSjCxZbA7AjJgLsIMbCWUD42lJeXadaTO7diuI3UqKinThkWTwtlMxAkCMmANhBlYVufgMjgpIaQPjcraJrf7dmg3DJVW1Kqqrplv1AgYARmwBsIMJFmrm9zTt93BSQlq/KzN7bxAPjSOnmrxeXt1PWEGgSEgA9ZBmIlzVuwm9/Rt93RLW7fzAvnQyB6S5PP2kUP50EFgCMiAdTA1O8756iY3Q8e33XbD8Ps+1fXNPZ4zKm2gCnLT1NvhcDve2+FQQW4aHzoIGAEZsI6gwsyoUaNUX1/f7fiZM2c0atSokBuF6PAWHDr3eERbT992PfH3Q6O4ME/5o1PdjuWPTlVxYV7A/yZAQAasI6hhpurqarW3t3c73traqv/5n/8JuVGIDit2k/f0bbez3g6H8ken+t1GZ1KCNiyapKq6ZlXXN1uiPgj2VlyYp6KSMrdhWgIyEH0BhZmtW7e6/vuNN96Q0+l0/d7e3q4///nPGjlyZNgah8iyYjd5x7fdXUfq3HqMeuliGOlcOxPsh0ZOKiEG4UFABqzBYRj+Fyf06nVxVMrhcKjr3RISEjRy5Eg99dRTuvnmm8PbyhA0NjbK6XSqoaFBKSkpZjfHcuat3dctOHT0eJg1tbShpa3bt92OouRTLef40ACAOBDI53dAYaZDTk6O9u/fr9TU1J5PNhlhxjdfwcHsRb/4tgsA8SviYcZOCDP+ITgAAKwkkM9vv2tmnnnmGd1zzz3q16+fnnnmGZ/nLl261N+HhUVQRwIAsCu/e2ZycnJ04MABDR06VDk5Od4f0OFQZWVl2BoYKnpmAACwn4j0zJSXl7tmL1VVVYXWQgAAgDDxe9G8IUOGqKamRpJ0/fXX68yZM5FqEwAAgN/8DjMDBw50rfq7Y8cOtbV13ysHAAAg2vweZvrHf/xHzZgxQ1dccYUk6dZbb1Xfvn09nvvmm2+Gp3UAAAA98DvMbNy4UevXr9cnn3yinTt36qqrrlJSkv9LzwMAAERCUOvMzJgxQ5s3b9agQYMi0KTwYjYTAAD2E5HZTJ1t377d9d8dWcjRZedYAACAaPC7ALirDRs26Oqrr1b//v3Vv39/feELX9Cvf/3rcLYNAACgR0H1zKxZs0YrVqzQkiVLlJ+fL8MwtGvXLn3nO99RXV2d7rvvvnC3EwAAwKOgN5p85JFHNG/ePLfj69ev18qVKy21qB41MwAA2E8gn99BDTOdOHFC06ZN63Z82rRpOnHiRDAPCQAAEJSgwszo0aP1n//5n92Ov/LKK8rNzQ25UQBiU2Vtk7YfrlFVXbPZTQEQQ4KqmXnkkUc0d+5clZaWKj8/Xw6HQ2+//bb+/Oc/eww53pSWluqnP/2pDh48qBMnTmjz5s2aM2eO63bDMPTII4/oxRdf1OnTpzV58mT9x3/8h6666qpgmg3AJGdazmlpSblKK2pdxwpy01RcmCdnUoKJLQMQC4Lqmbn99tu1b98+paamasuWLdq0aZNSU1O1b98+3XrrrX4/TnNzs6655ho9++yzHm9/4okntGbNGj377LPav3+/0tPTdcMNN+js2bPBNBuASZaWlGvXkTq3Y7uO1KmopMykFgGIJQH3zLS1temee+7RihUrtHHjxpD+8VmzZmnWrFkebzMMQz/72c/0ox/9SLfddpukiwXGw4cP18svv6x77703pH8bQHRU1ja59ch0aDcMlVbUqqquWTmpA0xoGYBYEXDPTEJCgjZv3hyJtripqqrSyZMnNXPmTNexxMREXXfdddq9e7fX+7W2tqqxsdHtB4B5jp5q8Xl7dT31MwBCE9Qw06233qotW7aEuSnuTp48KUkaPny42/Hhw4e7bvNk1apVcjqdrp+srKyIthOAb9lDfO/hNnIovTIAQhNUAfDo0aP16KOPavfu3Ro/frwGDHB/M1q6dGlYGid13ybBMAyfWyf84Ac/0PLly12/NzY2
EmgAE41KG6iC3DTtOlKn9k7LWvV2OJQ/OpUhJgAhCyrM/PKXv9SgQYN08OBBHTx40O02h8MRljCTnp4u6WIPTUZGhut4TU1Nt96azhITE5WYmBjyvw8gfIoL81RUUuZWO5M/OlXFhXkmtgpArAgqzHRe4TdSG03m5OQoPT1d27ZtU17exTe8c+fOaefOnXr88cfD+m8BiCxnUoI2LJqkqrpmVdc3a+TQAfTIAAiboDeaXLt2rcaNG6d+/fqpX79+GjdunH75y18G9BhNTU0qLy9XeXm5pIshqby8XJ9++qkcDoeWLVumxx57TJs3b9ahQ4e0YMECJSUl6c477wy22QBMlJM6QDPGDiPIAAiroHpmVqxYoaefflpFRUWaOnWqJGnPnj267777VF1drZ/85Cd+Pc6BAwc0Y8YM1+8dtS7z58/XSy+9pAceeECfffaZvvvd77oWzfvjH/+o5OTkYJoNAABiUFAbTaampqq4uFiFhYVux0tKSlRUVKS6ujov94w+NpoEAMB+Avn8Dqpnpr29XRMmTOh2fPz48Tp//nwwDxk3KmubdPRUCzUDAACESVBh5pvf/Kaef/55rVmzxu34iy++qG984xthaVisYW8aIHh8CQDgS1DDTEVFRdqwYYOysrI0ZcoUSdLevXt17NgxzZs3TwkJf/9w7hp4os0qw0zz1u7zus7GhkWTTGsXYGV8CQDiV8SHmQ4dOqQvfvGLkqRPPvlEkpSWlqa0tDQdOnTIdV64p2vbFXvTAMHxtUElXwIAdAgqzGzfvj3c7Yhp/uxNQ5gB3PElAIC/gl5nBv5jbxogcGxQCcBfhJko6NibpneXYbfeDocKctP4dgl4wJcAAP4izERJcWGe8kenuh1jbxrAO74EAPBXULOZ7MQqs5k6sDcN4L+GlrZuG1QymwmID4F8fhNmAFgeXwKA+BPxqdkAEE05qYQYAN5RMwMAAGyNMAMAAGyNMAMAAGyNmhkgDrFxI4BYQpgB4ggbNwKIRQwzAXHE18aNAGBXhBkgQiprm7T9cI2q6qyxh1DHxo3tXZaW6rxxIwDYEcNMQJhZdSiH3dsBxCp6ZoAws+pQDhs3AohVhBnAT/4MG1l5KIeNGwHEKoaZgB4EMmxk9aGc4sK8bhs3sns7ALsjzAA98DVstGHRJLfjVh/KcSYlaMOiSWzcCCCmMMwE+BDosJFdhnJyUgdoxthhlmkPAISCMGNBVpvSaxYrXAd/ho26Ki7MU/7oVLdjDOUAQOQwzGQhVp3SG21Wug7BDBsxlAMA0UXPjIVYdUpvtFnpOoQybBTOoRwr9FIBgFXRM2MRHbUZXXWuzYiHb/dWvA5mzgCyUi8VAFgVYcYirD6lN1qseB3MHDYKZCYVAMQrwoxFWH1Kb7RE+jpU1jbp6KmWoAJJTmp0a1+s2EsFAFZEmLGIIQP6anBSgk63tLkd7+2Q8kdbZ0qvJ74CQqDhoaNGZdeROrfp0L0dDuWPTg36Opg1XBNKeLJiLxUAWBFhxiKWlpSroUuQkaSU/gmWndLrKyAYMoIOD5GoUYnmcE1lbZM+PNGoDburtb/6tOt4oOGJ3joA8A9hxgK8DSdI0umWNp1qORdQ70EovQGB6GnWUSDhoWubw1mjEq3hGk/hrrNAw1OkeqkAINYQZiwgXMMJ0RxK6SkgeOIpPPhqc7hqVKI1XOMp3HUWTHhiLyUA6BlhxgLCNZwQzaGUngKCL53DQzTaHI3hGl+9a10FEp5YgA8AesaieRYQjv18At1DKFQ9BQRfOsJDtNocjf2SAgl3wYQn9lICAO8IMxYR6n4+wewhFIqeAoI/4SGabY70fkn+hDurbTYJALGCYSaLCHU4wYyZLz3Vc/RU6xHNNkd6uMZbsW5n1LoAQGQ4DMPLO2+MaGxslNPpVENDg1JSUsxuTkTNW7vP68yXSK4W6ysg9BQezGpzJDS0tHULcBOzB2vBtJG68hInPTIAEIBAPr8JMzHE04fphOzBWmjh
D1NPbbb73kMU6wJA6AgzncRTmOlQVdesQ8cbQl60LZoIAACAzgL5/KYAOAblpA7Q7/b/Te8ePeN2vPOCdlbDbB0AQLAIMzEo2tO0w6mytknbD9dYuo0AAGthNlMIorVtQKDsuEGhWRtBAgDsjzATBKt/8Npxg8Jorl4MAIgtDDMFoacNFqOt69BMNFa8DSc7D4sBAMxHz0yAorUDsz989RDZaYNCOw6LAQCsgzATICt98PrqIVr51Su18NqRursgR+cvGH7X9ZhRB2SVYTGr1kBFQjw9VwCxjzATICt98PrqIbr+qZ2uYx29Nb6YWQfkbSuAjpWAI/1ha/UaqHCKp+cKIH5QMxMgq9SjBLJLsz/1PGbXAUV6I0hfzH7u0RRPzxVA/KBnJgie6lG+mD0oqvUo/uzS3KGneh4r1AGFuhFksMMmVnju0RJPzxVAfKFnJgjOpAQ9U/gPmpg92HVsf/VpFZWUqaGlLSptGJU2UCn9Asui1fWeZwX5UwcULYGuBHym5Zzmrd2n65/aqYXr9mvGkzs0b+0+v/8OVnrukRZPzxVAfLF0mFm5cqUcDofbT3p6utnNknSxu/7dT8+4HYtmd31lbZMaPz8f0H281fOEow7IrJV7Qx02sUoNVDTE03MFEF8sP8x01VVX6U9/+pPr9969e5vYmous0F0fSM1MT4W0vgpw8y4d5PrG7un+ZhaUhuPvYHbxcTTF03MFEF8s3TMjSX369FF6errrJy0tzewmWaK7PpCaGX8KaT0V4Kb076MDR0/7HL4xs6A0XH8HM4uPoy2eniuA+GH5npmKigplZmYqMTFRkydP1mOPPaZRo0Z5Pb+1tVWtra2u3xsbG8PeJit01//9W3at2t0XztXgpAStXzhJ9S3n/C6I7VqA+9z2I1533e7YXsDsHqpw/R1CLT62k3h6rgDih6V7ZiZPnqwNGzbojTfe0C9+8QudPHlS06ZNU319vdf7rFq1Sk6n0/WTlZUV9nZZZXr2xW/Z7j1VE0cO1o77Z+gLWYMCKqTtkJM6QNlDkrS/+nSP2wuY3UMV7r9DoMXHdhZPzxVA7HMYRpdPLAtrbm7WZZddpgceeEDLly/3eI6nnpmsrCw1NDQoJSUlbG1paGnrNj070FqRcK3CGu5v2dsP12jhuv1eb1+3cKJmjB2mytomt8X5uj3O/dPd2hOJVWfD8XcAAFhPY2OjnE6nX5/flh9m6mzAgAG6+uqrVVFR4fWcxMREJSYmRrwtoXTXh7toNic1vEMF/g7f+FtQGskiYYZNAACWHmbqqrW1VR9//LEyMjLMbopLMN31Vl+FNZDhG38KSqPxfBk2AYD4Zememfvvv1+33HKLLr30UtXU1OgnP/mJGhsbNX/+fLObFjSzi2b95c+u2x3DRo/MvkqSPPaM2OX5AgDsy9Jh5m9/+5sKCwtVV1entLQ0TZkyRXv37lV2drbZTQtKZW2T/t/7x32eE81dt33xNXzz3rHT+tHmQzp0/O8zxQpy0/T9mWO6rUkTrl3G2eUZAOCNpcPMb3/7W7ObEBaeaka8sdoqrJ3rcXw9j9KKWo81MaFOn2aXZwBAT2xVM2NXnmpGuor2tO5gLC0p19tHeg5k0t9rYnzV30zIHqzq+mafWyBYvb4IAGA+wkyEddSMdF2zpSurr8La8Twu+DmRv3NNTLCrC3u7dl3XuwEAxDdLDzPFgp5qRu67IVdfveYSU3pk/K1DOdNyTkt/G1xPSEdNTKCrC0vhq7fxhjocAIgNhJkI66lmxIwgE2gdytKScn10PLhtITrXxOSkDpBhGNpffbrbeZ5mN0Vq2wjqcAAgtjDMFILK2iZtP1zjc7jDKlsfdBZIHUqgw0sdvD2/QLZAiNS1ow4HAGILYSYIZ1rOad7afbr+qZ0+az46WGmn4kDrUHoKH954e36B9raE+9pRhwMAsYdhpiD4+mbfueajQ7SW3PenBiTQOpSewoc3j8y+yuOQjb9bIHQI97WLdB0O
ACD6CDMBCnRF264BIxIflIHUgATaM+ItfPSSdMHH4/gKBf6sLtxVuK5dpOpwAADmIcwEyN9v9tEsMg2kpyjQnhHJc/j4YvZgHTjavZC3g69QYObmkME8fwCAtVEzEyB/v9n3VGTqT/GwP4KpAQm0DqUjfGy/f7rWLZyo7fdP13/987SQi3PN2hzSSjVMAIDQ0TMTIH++2fc0FPW1F3a7TU8OpccmmBqQQHpGfA2TBTNcZAVm9gwBAMKPMBOEnj7EewoYB7sMz7x9pFbfWLtXxYVfDPhDNZQaEF91KP4Mk9k9FESqhgkAEF0Ow+hhnX2ba2xslNPpVENDg1JSUsL62N4+xCtrm3T9UzuDekx/emm69pbMW7vPa0+Rp9lV/ojEYwIA4K9APr/pmQmBt2/2wc4Aki7uPv2djQdVcs+Ubrd56y35tznj9KMth8I23BPojC0AAMxEmIkQT0NRXxgxSOV/O9PjffdU1nsMDN6Kin+05RBrsQAA4hZhJkI81ZM8/PsP/b7/3sr6bkNX/vSWsBYLACDeMDU7BP5Mr+6Yfmz8X+jwl6PL74HsaRQqK+4nBQCAN/TMBCGYBfEC3eNo8qihbr9Hu7fErtOuAQDxhzAThED3ZpIC2+No2mVDu/V+RHvlWrtPuwYAxA+GmQIU7K7L3oZuuirITdPz3xjv8TYzVq41a5VeAAD8Rc9MgEKZ6eNp6KYgN0333zhG9c3neuz96Npb0tvhULth6FTLubDv9wQAgF0QZgIUSu2Kr6GbytomVxFvT70gg5MS9PDvq6OyiSUAAFZHmAlQOGpXOk+hDqaYOJiaHQAAYhU1M0EIZ+1KT7trdxVszQ4AALGKnpkghGumTzDbBrA6LwAA7ggzIQh1xd1gggmr8wIA4I5hpjDwtBKwP6sDBxNMWJ0XAAB39MyEwFPx7tRRQ+VwSLs/qXcd81bQG2wxMavzAgDwdw7D6FJJGmMaGxvldDrV0NCglJSUsD72vLX7ugURTzrCiaeZRu8dO6Mfbf5Ah443uo5dfUmK7r3uMl2V6fTZ08LqvACAWBXI5zc9M0HyVrzriaeCXk+9Oh0++J9GLXn54mwmX9O0w7VLNgAAdkbNTJAC3ThSct/Z2tOUbE98TdMGAACEmaAFsnFkhz69LhbtelsrxhPWjwEAwDfCTJD83Tiys/MXLoaXUHt1AADA3xFmQuBpJWBfOqZaB9Orw/oxAAB4RgFwCDytBPzw7z/scaq1tynZnviz51NlbZOOnmphVhMAIC4xNTvMGlrauq0B42lGkqfzPJmQPVgLp43UlZd0n6YdzCaVAADYQSCf34SZCPF3DZjO50kXa2P69HLozGdt2rC7WvurT7vO7RpUPK1z42tNGwAA7IIw04lZYSZUPQWVytomXf/UTq/3337/dIacAAC2FcjnNwXAFuRt6nbnadr+bFIZyr/f075SAABYBQXAUVRZ26R3quolOTRl1FCvPSf+BJVI7J5NDQ4AwI4IM1FwpuWcvvubd902n5Qubkr5wjfHdwsK/gSVnNQBQW1S6YunVYk7ViCmBgcAYFUMM0XB0pLybkFGkvZU1nvcqmBU2kBNHTXU42NN7dSj42mdm2B3z/ZnaAsAACuiZybCetqQsrSiVm9V1OpLuWlux70tLNz5uKd1boIt+vVnaIuCYgCAFdEzE2H+bF1w19p9mrd2nxpa2iRdDECeenIkafcn9d16ScIxIS0SNTgAAEQDPTNh4GsFXn+3Luhcm+JvL0k4C3a9rUocSg0OAADRQM9MCM60nNO8tft0/VM7tXDdfs14codbD8vR+mbd/vxuvx6rc22Kv70kvgp2gxHOGhwAAKKFnpkQ9DT7Z85/7NLp/ws2/ioqeVe/WTSlx14Sb7U4nUNRoL0p4azBAQAgWuiZCVJPs39e2f9pwEFGkj463qiikrIee0kiuWheTuoAzRg7jCADALAFemaC1FOY2FPpuYC3JxeMizOcTrWc89lL4s9QFLtpAwDi
AWEmSD2FiamjhmpL2XGvt2c6++l4w+deb+8o8u346cpXwe7kUUP08O8/ZCVfAEBcYJgpSB1horeXBWFee/+knP29BwdfQUbybyq0t6Eow1BYC4OjiX2hAACBYtfsEDS0tKmopMxjIW5vh0PjswepoqYpoNqZzjtj+6vzUJRhGLbcTZt9oQAAncXcrtnPPfeccnJy1K9fP40fP15vvfWW2U2SdHH2z8qvXunxtnbD0L7q09r03Xw9cfsX/H7MYKZCdy7YjWRhcCSFe5o5ACB+WL5m5pVXXtGyZcv03HPPKT8/Xz//+c81a9YsffTRR7r00kvNbp5f4SEtJdHnOatvu1rDnf3CUqhrx5V8IzHNHAAQPyzfM7NmzRotWrRI3/72t3XFFVfoZz/7mbKysvT88897PL+1tVWNjY1uP5HkT3jo6ZzJo4aGbSq0t1qe3g6HCnLTLBkK7NqbBACwBkuHmXPnzungwYOaOXOm2/GZM2dq927PK+uuWrVKTqfT9ZOVlRXRNvoTHqIdMOy2kq8de5MAANZh6TBTV1en9vZ2DR8+3O348OHDdfLkSY/3+cEPfqCGhgbXz7FjxyLeTn/CQzQDRsdKvtvvn651Cydq+/3TtWHRJMsW0tqxNwkAYB2Wr5mRJEeXDznDMLod65CYmKjERN81KuHmzzYAZmwV4G2NGisqLszrNjPMyr1JAADrsHSYSU1NVe/evbv1wtTU1HTrrbECf8KDnQJGNLEvFAAgWJYeZurbt6/Gjx+vbdu2uR3ftm2bpk2bZlKrEEnsCwUACJSle2Ykafny5brrrrs0YcIETZ06VS+++KI+/fRTfec73zG7aQAAwAIsH2bmzp2r+vp6/fjHP9aJEyc0btw4vf7668rOzja7ad2wsSMAANHHdgZhwFL8AACEV8xtZ2B1LMUPAIB5CDMh6liKv71LB1fnpfgBAEDkEGZCxFL8AACYizATIpbiBwDAXISZEHlbir+XQyzFDwBAFBBmwqC4ME+TRw1xO3bBkNraL6ihpc2kVgEAEB8IM2HgTEpQn169ul3MfVWnojajqbK2SdsP11BwDACIO5ZfNM8OOmY0ddV5RlOkhptY4wYAEO/omQkDM2c0scYNACDeEWbCwKwZTaxxAwAAYSYsvM1o6u1wRHRGE2vcAABAmAmb4sI85Y9OdTuWPzpVxYV5Efs3WeMGAAAKgMPGmZSgDYsmqaquWdX1zVHZObujR2jXkTq3oabeDofyR6eyxg0AIC7QMxNmOakDNGPssKgFCTN6hAAAsBJ6ZmzOjB4hAACshDATI3JSCTEAgPjEMBMAALA1wgwAALA1wgwAALA1wgwAALA1wgwAALA1wgwAALA1pmaHoLK2SUdPtbC2CwAAJiLMBOFMyzktLSlXaUWt61hBbpqKC/PkTEowsWUAAMQfhpmCsLSkXLuO1Lkd23WkTkUlZSa1CACA+EWYCVBlbZNKK2rdNnaUpHbDUGlFrarqmk1qGQAA8YkwE6Cjp1p83l5dT5gBACCaCDMByh6S5PP2kUMpBAYAIJoIMwEalTZQBblp6u1wuB3v7XCoIDeNWU0AAEQZYSYIxYV5yh+d6nYsf3SqigvzTGoRAADxi6nZQXAmJWjDokmqqmtWdX0z68wAAGAiwkwIclIJMQAAmI1hJgAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGuEGQAAYGsxv52BYRiSpMbGRpNbAgAA/NXxud3xOe5LzIeZs2fPSpKysrJMbgkAAAjU2bNn5XQ6fZ7jMPyJPDZ24cIFHT9+XMnJyXI4HGF73MbGRmVlZenYsWNKSUkJ2+PaCdfgIq7DRVwHrkEHrsNFXIfQroFhGDp79qwyMzPVq5fvqpiY75np1auXRowYEbHHT0lJidsXaQeuwUVch4u4DlyDDlyHi7gOwV+DnnpkOlAADAAAbI0wAwAAbI0wE6TExEQ9/PDDSkxM
NLsppuEaXMR1uIjrwDXowHW4iOsQvWsQ8wXAAAAgttEzAwAAbI0wAwAAbI0wAwAAbI0wAwAAbI0w48HKlSvlcDjcftLT033eZ+fOnRo/frz69eunUaNG6YUXXohSayNn5MiR3a6Dw+HQ4sWLPZ6/Y8cOj+f/5S9/iXLLg1daWqpbbrlFmZmZcjgc2rJli9vthmFo5cqVyszMVP/+/TV9+nR9+OGHPT7uq6++qiuvvFKJiYm68sortXnz5gg9g/DwdR3a2tr04IMP6uqrr9aAAQOUmZmpefPm6fjx4z4f86WXXvL4+vj8888j/GyC09NrYcGCBd2ey5QpU3p83Fh6LUjy+Dd1OBz66U9/6vUx7fZaWLVqlSZOnKjk5GQNGzZMc+bM0eHDh93OiYf3hp6ug5nvDYQZL6666iqdOHHC9fPBBx94Pbeqqkpf+cpX9KUvfUllZWX64Q9/qKVLl+rVV1+NYovDb//+/W7XYNu2bZKkr33taz7vd/jwYbf75ebmRqO5YdHc3KxrrrlGzz77rMfbn3jiCa1Zs0bPPvus9u/fr/T0dN1www2uPcA82bNnj+bOnau77rpL7733nu666y59/etf1zvvvBOppxEyX9ehpaVF7777rlasWKF3331XmzZt0l//+ld99atf7fFxU1JS3F4bJ06cUL9+/SLxFELW02tBkm666Sa35/L666/7fMxYey1I6vb3/NWvfiWHw6Hbb7/d5+Pa6bWwc+dOLV68WHv37tW2bdt0/vx5zZw5U83Nza5z4uG9oafrYOp7g4FuHn74YeOaa67x+/wHHnjAuPzyy92O3XvvvcaUKVPC3DJzfe973zMuu+wy48KFCx5v3759uyHJOH36dHQbFiGSjM2bN7t+v3DhgpGenm6sXr3adezzzz83nE6n8cILL3h9nK9//evGTTfd5HbsxhtvNO64446wtzkSul4HT/bt22dIMo4ePer1nHXr1hlOpzO8jYsST9dg/vz5xuzZswN6nHh4LcyePdu4/vrrfZ5j59eCYRhGTU2NIcnYuXOnYRjx+97Q9Tp4Eq33BnpmvKioqFBmZqZycnJ0xx13qLKy0uu5e/bs0cyZM92O3XjjjTpw4IDa2toi3dSoOHfunDZu3KhvfetbPW7YmZeXp4yMDH35y1/W9u3bo9TCyKuqqtLJkyfd/taJiYm67rrrtHv3bq/38/b68HUfu2loaJDD4dCgQYN8ntfU1KTs7GyNGDFCN998s8rKyqLTwAjZsWOHhg0bpjFjxujuu+9WTU2Nz/Nj/bXwv//7v3rttde0aNGiHs+182uhoaFBkjRkyBBJ8fve0PU6eDsnGu8NhBkPJk+erA0bNuiNN97QL37xC508eVLTpk1TfX29x/NPnjyp4cOHux0bPny4zp8/r7q6umg0OeK2bNmiM2fOaMGCBV7PycjI0IsvvqhXX31VmzZt0tixY/XlL39ZpaWl0WtoBJ08eVKSPP6tO27zdr9A72Mnn3/+uR566CHdeeedPjeSu/zyy/XSSy9p69atKikpUb9+/ZSfn6+KioootjZ8Zs2apd/85jd688039dRTT2n//v26/vrr1dra6vU+sf5aWL9+vZKTk3Xbbbf5PM/OrwXDMLR8+XJde+21GjdunKT4fG/wdB26iuZ7Q8zvmh2MWbNmuf776quv1tSpU3XZZZdp/fr1Wr58ucf7dO2tMP5vYeWeejHsYu3atZo1a5YyMzO9njN27FiNHTvW9fvUqVN17NgxPfnkkyooKIhGM6PC09+6p79zMPexg7a2Nt1xxx26cOGCnnvuOZ/nTpkyxa1ANj8/X1/84hdVXFysZ555JtJNDbu5c+e6/nvcuHGaMGGCsrOz9dprr/n8MI/V14Ik/epXv9I3vvGNHmsd7PxaWLJkid5//329/fbb3W6Lp/cGX9dBiv57Az0zfhgwYICuvvpqrykxPT29W5KuqalRnz59NHTo0Gg0MaKOHj2qP/3pT/r2t78d8H2n
TJlii29b/uiY0ebpb93121XX+wV6Hztoa2vT17/+dVVVVWnbtm0+v3l50qtXL02cODFmXh8ZGRnKzs72+Xxi9bUgSW+99ZYOHz4c1PuEXV4LRUVF2rp1q7Zv364RI0a4jsfbe4O369DBjPcGwowfWltb9fHHHysjI8Pj7VOnTnXN9Onwxz/+URMmTFBCQkI0mhhR69at07Bhw/RP//RPAd+3rKzM63Wzm5ycHKWnp7v9rc+dO6edO3dq2rRpXu/n7fXh6z5W1/FmVVFRoT/96U9BhXbDMFReXh4zr4/6+nodO3bM5/OJxddCh7Vr12r8+PG65pprAr6v1V8LhmFoyZIl2rRpk958803l5OS43R4v7w09XQfJxPeGkMqHY9T3v/99Y8eOHUZlZaWxd+9e4+abbzaSk5ON6upqwzAM46GHHjLuuusu1/mVlZVGUlKScd999xkfffSRsXbtWiMhIcH4r//6L7OeQti0t7cbl156qfHggw92u63rdXj66aeNzZs3G3/961+NQ4cOGQ899JAhyXj11Vej2eSQnD171igrKzPKysoMScaaNWuMsrIyVyX+6tWrDafTaWzatMn44IMPjMLCQiMjI8NobGx0PcZdd91lPPTQQ67fd+3aZfTu3dtYvXq18fHHHxurV682+vTpY+zduzfqz89fvq5DW1ub8dWvftUYMWKEUV5ebpw4ccL109ra6nqMrtdh5cqVxh/+8Afjk08+McrKyoyFCxcaffr0Md555x0znmKPfF2Ds2fPGt///veN3bt3G1VVVcb27duNqVOnGpdccklcvRY6NDQ0GElJScbzzz/v8THs/lr453/+Z8PpdBo7duxwe723tLS4zomH94aeroOZ7w2EGQ/mzp1rZGRkGAkJCUZmZqZx2223GR9++KHr9vnz5xvXXXed23127Nhh5OXlGX379jVGjhzp9X9qu3njjTcMScbhw4e73db1Ojz++OPGZZddZvTr188YPHiwce211xqvvfZaFFsbuo7p5V1/5s+fbxjGxSmYDz/8sJGenm4kJiYaBQUFxgcffOD2GNddd53r/A6/+93vjLFjxxoJCQnG5ZdfbvmA5+s6VFVVebxNkrF9+3bXY3S9DsuWLTMuvfRSo2/fvkZaWpoxc+ZMY/fu3dF/cn7ydQ1aWlqMmTNnGmlpaUZCQoJx6aWXGvPnzzc+/fRTt8eI9ddCh5///OdG//79jTNnznh8DLu/Fry93tetW+c6Jx7eG3q6Dma+Nzj+r4EAAAC2RM0MAACwNcIMAACwNcIMAACwNcIMAACwNcIMAACwNcIMAACwNcIMAACwNcIMAACwNcIMAFuaPn26li1bZpnHAWCePmY3AACiYceOHZoxY4ZOnz6tQYMGuY5v2rQpJjaEBeIZYQZAXBsyZIjZTQAQIoaZAARk+vTpWrJkiZYsWaJBgwZp6NCh+pd/+Rd1bPN2+vRpzZs3T4MHD1ZSUpJmzZqliooK1/1feuklDRo0SFu2bNGYMWPUr18/3XDDDTp27JjrnAULFmjOnDlu/+6yZcs0ffp0r+3auHGjJkyYoOTkZKWnp+vOO+9UTU2NJKm6ulozZsyQJA0ePFgOh0MLFixwPZ/Ow0z+tv+NN97QFVdcoYEDB+qmm27SiRMngrmcAMKAMAMgYOvXr1efPn30zjvv6JlnntHTTz+tX/7yl5IuBpEDBw5o69at2rNnjwzD0Fe+8hW1tbW57t/S0qJ/+7d/0/r167Vr1y41NjbqjjvuCKlN586d06OPPqr33ntPW7ZsUVVVlSuwZGVl6dVXX5UkHT58WCdOnNC///u/e3wcf9v/5JNP6te//rVKS0v16aef6v777w+p/QCCxzATgIBlZWXp6aeflsPh0NixY/XBBx/o6aef1vTp07V161bt2rVL06ZNkyT95je/UVZWlrZs2aKvfe1rkqS2tjY9++yzmjx5sqSL4eiKK67Qvn37NGnSpKDa9K1vfcv136NGjdIzzzyj
SZMmqampSQMHDnQNJw0bNsytZqaziooKv9v/wgsv6LLLLpMkLVmyRD/+8Y+DajeA0NEzAyBgU6ZMkcPhcP0+depUVVRU6KOPPlKfPn1cIUWShg4dqrFjx+rjjz92HevTp48mTJjg+v3yyy/XoEGD3M4JVFlZmWbPnq3s7GwlJye7hqQ+/fRTvx/j448/9qv9SUlJriAjSRkZGa4hLQDRR5gBEHGGYbiFH0ndfu98rFevXq4anA6dh3m6am5u1syZMzVw4EBt3LhR+/fv1+bNmyVdHH4KpJ3+tL/r7CeHw+H1vgAijzADIGB79+7t9ntubq6uvPJKnT9/Xu+8847rtvr6ev31r3/VFVdc4Tp2/vx5HThwwPX74cOHdebMGV1++eWSpLS0tG4FteXl5V7b85e//EV1dXVavXq1vvSlL+nyyy/v1lPSt29fSVJ7e7vXx/G3/QCshTADIGDHjh3T8uXLdfjwYZWUlKi4uFjf+973lJubq9mzZ+vuu+/W22+/rffee0/f/OY3dckll2j27Nmu+yckJKioqEjvvPOO3n33XS1cuFBTpkxx1ctcf/31OnDggDZs2KCKigo9/PDDOnTokNf2XHrpperbt6+Ki4tVWVmprVu36tFHH3U7Jzs7Ww6HQ//93/+t2tpaNTU1dXscf9sPwFoIMwACNm/ePH322WeaNGmSFi9erKKiIt1zzz2SpHXr1mn8+PG6+eabNXXqVBmGoddff91taCYpKUkPPvig7rzzTk2dOlX9+/fXb3/7W9ftN954o1asWKEHHnhAEydO1NmzZzVv3jyv7UlLS9NLL72k3/3ud7ryyiu1evVqPfnkk27nXHLJJXrkkUf00EMPafjw4VqyZInHx/Kn/QCsxWEw0AsgANOnT9c//MM/6Gc/+1lQ93/ppZe0bNkynTlzJqztAhC/6JkBAAC2RpgBAAC2xjATAACwNXpmAACArRFmAACArRFmAACArRFmAACArRFmAACArRFmAACArRFmAACArRFmAACArf1/YHvcI+OqMSEAAAAASUVORK5CYII=\n", - "text/plain": [ - "
" - ] - }, - "metadata": {}, - "output_type": "display_data" - } - ], - "source": [ - "restaurant_df.plot(kind='scatter', x='population', y='profit')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "To simplify things this time around, we'll omit the validation set. Given that there is no validation set, and that we won't implement cross-validation, we won't be able to perform any hyper-parameter search.\n", - "\n", - "Here, the target label is the `profit`, and the (only) feature is the `population`." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "# We'll use 80% of our data as training data and the remaining 20% as test data\n", - "# Here, we use a random seed to ensure that the data shuffling and splitting can be reproduced\n", - "X_train, y_train, X_test, y_test, feature_names = helpers.preprocess_data(restaurant_df, label=\"profit\", train_size=0.8, seed=42)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Adding the intercept" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The goal of linear regression is to fit a line of slope $w_1$ and of intercept $b$ such that for any data $x^{(i)}$, the prediction $\\hat{y}^{(i)}$ is:\n", - "$$\\hat{y}^{(i)} = w_{1}x^{(i)} + b$$\n", - "\n", - "Note that this can also be written as:\n", - "$$\\hat{y}^{(i)} = \\begin{bmatrix} b & w_1 \\end{bmatrix} \\cdot \\begin{bmatrix} 1 \\\\ x^{(i)} \\end{bmatrix}$$" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Therefore, in order to take into account the offset term ($b$) directly in our matrix, we add an additional first column to `X` and set it to all ones. 
Then, we treat the intercept as another feature (`b` will be treated as `w_0`), which will make our matrix computation easier.\n", - "\n", - "__Note__: The same principle applies if the data has multiple features:\n", - "$$\\hat{y}^{(i)} = w_{D}x^{(i)}_{D} + \\ ... \\ + w_{2}x^{(i)}_{2} + w_{1}x^{(i)}_{1} + b$$\n", - "is equivalent to\n", - "$$\\hat{y}^{(i)} = \\begin{bmatrix} b & w_1 & w_2 & ... & w_D \\end{bmatrix} \\cdot \\begin{bmatrix} 1 \\\\ x^{(i)}_{1} \\\\ x^{(i)}_{2} \\\\ ... \\\\ x^{(i)}_{D} \\end{bmatrix}$$\n", - "\n", - "Let's add a column of ones (known as the offset term / constant term) to the feature matrix `X`." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "
\n", - "As a rule, for linear regression, the constant is always included in the feature matrix $\\mathbf{X}$, and the intercept / bias term will be part of the weight vector $\\mathbf{w}$. \n", - "\n", - "However, this **will not be the case** in future exercises, where the bias term will be separate.\n", - " \n", - "
" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "def add_constant(X: np.ndarray) -> np.ndarray:\n", - " \"\"\" Adds an constant term to the dataset (as the first column)\n", - "\n", - " Args:\n", - " X (np.ndarray): Dataset of shape (N, D-1)\n", - "\n", - " Returns: \n", - " Dataset with offset term added, of shape (N, D)\n", - "\n", - " \"\"\"\n", - " X_with_offset = np.insert(X, 0, 1, axis=1)\n", - "\n", - " return X_with_offset\n", - "\n", - "X_train = add_constant(X_train)\n", - "X_test = add_constant(X_test)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Question:** In simple linear regression, what happens if no intercept is added?" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Answer:** YOUR ANSWER HERE" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Data preview" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Features: ['population']\n" - ] - } - ], - "source": [ - "print(f\"Features: {feature_names}\")" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Training set features:\n", - "X_train: \n", - " [[ 1. 8.2951]\n", - " [ 1. 9.3102]\n", - " [ 1. 20.341 ]\n", - " [ 1. 6.0062]\n", - " [ 1. 7.0032]\n", - " [ 1. 8.5781]\n", - " [ 1. 8.2111]\n", - " [ 1. 8.0959]\n", - " [ 1. 5.1301]\n", - " [ 1. 5.0269]]\n", - "\n", - "Training set labels:\n", - "y_train: \n", - " [ 5.7442 3.9624 20.992 1.2784 11.854 12. 
6.5426 4.1164\n", - " 0.56077 -2.6807 ]\n" - ] - } - ], - "source": [ - "# Visualisation of X_train and y_train (separation of the features and the labels)\n", - "print('Training set features:')\n", - "print(f'X_train: \\n {X_train[:10]}')\n", - "\n", - "print('\\nTraining set labels:')\n", - "print(f'y_train: \\n {y_train[:10]}')" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Training set shape:\n", - "X: (77, 2), y: (77,)\n", - "\n", - "Test set shape:\n", - "X: (20, 2), y: (20,)\n" - ] - } - ], - "source": [ - "# Show shapes\n", - "print('Training set shape:')\n", - "print(f'X: {X_train.shape}, y: {y_train.shape}')\n", - "\n", - "print('\\nTest set shape:')\n", - "print(f'X: {X_test.shape}, y: {y_test.shape}')" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Notation" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we have added the constant term, here's what our data looks like:\n", - "\n", - "- features: $\\boldsymbol{X} \\in \\mathbb{R}^{N \\times (d+1)}$, $\\forall \\ \\boldsymbol{x}^{(i)} \\in \\boldsymbol{X}: \\boldsymbol{x}^{(i)} \\in \\mathbb{R}^{(d+1)}$.\n", - "- labels: $\\boldsymbol{y} \\in \\mathbb{R}^{N}$, $\\forall \\ y^{(i)} \\in \\boldsymbol{y}: y^{(i)} \\in \\mathbb{R}$ \n", - " \n", - " where $N$ is the number of examples in our dataset, and $d$ is the number of features per example. In other words, $d$ is the dimension of the independent variables. \n", - " \n", - "\n", - "For the weights, we have:\n", - " \n", - " \n", - " - weights: $\\mathbf{w} \\in \\mathbb{R}^{d+1}$, where $w_0$ (or $b$) is known as the intercept." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - " **Note:**\n", - " $\\boldsymbol{X}$ is sometimes called the design matrix, where $\\boldsymbol{X}_{i, :}$ denotes $\\boldsymbol{x}^{(i)}$. 
\n", - " Note that a single example $\\boldsymbol{x}^{(i)}$ is a column vector of dimension (shape, in Python terms) $((d+1) \\times 1)$, while the design matrix $\\boldsymbol{X}$ is of dimension (shape) $(N \\times (d+1))$, where each row represents an example and each column represents a feature. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 2. Loss function" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "One of the first steps when working on a machine learning problem is to pick a loss / cost function. Here, we will use the Mean Squared Error (MSE), defined as: \n", - "\n", - "$$\n", - "\\begin{align}\n", - "J(\\mathbf{w}) &= \\frac{1}{N} \\sum_{i=1}^{N} (\\hat{y}^{(i)} - y^{(i)})^{2} \\\\\n", - "&= \\frac{1}{N} \\sum_{i=1}^{N} (\\mathbf{w}^T{\\boldsymbol{x}}^{(i)} - y^{(i)})^{2} \\\\\n", - "&= \\frac{1}{N} (\\mathbf{X} \\mathbf{w}-\\mathbf{y})^{T} (\\mathbf{X} \\mathbf{w}-\\mathbf{y})\n", - "\\end{align}$$\n", - "\n", - "where $N$ is the number of examples, $\\hat{y}^{(i)}$ is the prediction for the $i^{th}$ example, and ${y}^{(i)}$ is the ground-truth for the $i^{th}$ example.\n", - "\n", - "Implement the function `mse_loss()`.\n", - "\n", - "**Note about loss / cost:** The function we want to minimize or maximize is called the cost function, loss function, or error function. In this exercise, we use these terms interchangeably, though some machine learning publications assign special meaning to some of these terms.\n", - "\n", - "**Hint**: Use the matrix form shown above and make use of NumPy operations."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def mse_loss(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:\n", - " \"\"\"Compute the Mean Squared Error (MSE)\n", - " \n", - " Args:\n", - " X (np.ndarray): Dataset of shape (N, D)\n", - " y (np.ndarray): Labels of shape (N, )\n", - " w (np.ndarray): Weights of shape (D, )\n", - "\n", - " Returns:\n", - " MSE loss (scalar)\n", - " \"\"\"\n", - " ### START CODE HERE ### (≈ 3 lines of code)\n", - "\n", - " loss = ...\n", - " ### END CODE HERE ###\n", - " return loss" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's initialize the weights to 0 and look at the current loss." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "zero_weights = np.zeros(X_train.shape[1])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_loss = mse_loss(X_train, y_train, zero_weights)\n", - "test_loss = mse_loss(X_test, y_test, zero_weights)\n", - "print(f\"Train loss: {train_loss:.5f}\")\n", - "print(f\"Test loss: {test_loss:.5f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Expected output:** \n", - "\n", - "| | |\n", - "|---|--------------------------------------------------|\n", - "| **Train loss** | 62.15811 |\n", - "| **Test loss** | 71.79680 |" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "helpers.plot_linear_regression_2d(X=X_train, y=y_train, w=zero_weights, feature_name=\"population\", label_name=\"profit\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Not great, right? 
We'll see in the next sections how to fit our model in order to get a much better predictor.\n", - "\n", - " **Question:** Before proceeding, based on the lecture, what are some ways you could think of to determine a better set of weights? \n", - "\n", - "**Answer:** YOUR ANSWER HERE\n", - "" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 3. Gradient Descent" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now we need to define a function to perform gradient descent on the weights $\\mathbf{w}$ using the update rule given below. First, write a function that computes the gradient of the loss function (`mse_gradient()`) and then use it in the `gradient_descent()` function to update the weights at every iteration.\n", - "\n", - "As seen in the previous section, our loss is:\n", - "$$\n", - "J(\\mathbf{w}) = \\frac{1}{N} (\\mathbf{X} \\mathbf{w}-\\mathbf{y})^{T} (\\mathbf{X} \\mathbf{w}-\\mathbf{y})\n", - "$$\n", - "\n", - "Therefore, the gradient with respect to ${\\mathbf{w}}$ is:\n", - "\n", - "$$ \\nabla_{\\mathbf{w}} J(\\mathbf{w}) = \\frac{2}{N} \\mathbf{X}^{T} (\\mathbf{X} \\mathbf{w} - \\mathbf{y}) \n", - "$$\n", - "\n", - "**Note:** You can use http://www.matrixcalculus.org/ to compute the gradient.\n", - "\n", - "\n", - "The gradient descent formula is:\n", - "$$\\mathbf{w} := \\mathbf{w} - \\alpha \\nabla_{\\mathbf{w}} J(\\mathbf{w})$$\n", - "\n", - "where $\\nabla_{\\mathbf{w}} J(\\mathbf{w})$ is the gradient of the loss function at the current iteration, $\\mathbf{w}$ is the weight vector, and $\\alpha$ is the learning rate.\n", - "\n", - "**Hint**: Use the matrix form of the gradient and make use of NumPy operations."
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def mse_gradient(X: np.ndarray, y: np.ndarray, w: np.ndarray) -> np.ndarray:\n", - " \"\"\"Compute the gradient of the MSE\n", - " \n", - " Args:\n", - " X (np.ndarray): Dataset of shape (N, D)\n", - " y (np.ndarray): Labels of shape (N, )\n", - " w (np.ndarray): Weights of shape (D, )\n", - "\n", - " Returns:\n", - " Gradient of shape (D, )\n", - " \"\"\"\n", - " ### START CODE HERE ### (≈ 2 lines of code)\n", - " \n", - " grad = ...\n", - " ### END CODE HERE ###\n", - " return grad\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def gradient_descent(X: np.ndarray, y: np.ndarray, w: np.ndarray, alpha: float, max_iters: int) -> (np.ndarray, np.ndarray):\n", - " \"\"\"Gradient descent for linear regression.\n", - " \n", - " Args:\n", - " X (np.ndarray): Dataset of shape (N, D)\n", - " y (np.ndarray): Labels of shape (N, )\n", - " w (np.ndarray): Weights of shape (D, )\n", - " alpha (float): Learning rate\n", - " max_iters (int): Maximum number of gradient descent iterations\n", - "\n", - " Returns:\n", - " w (np.ndarray): Optimum weights of shape (D, )\n", - " losses (np.ndarray): Loss at every iteration of gradient descent. 
Shape is (max_iters, )\n", - " \"\"\"\n", - " # Define an array to store the evolution of the loss\n", - " losses = np.zeros(max_iters)\n", - " \n", - " for n_iter in range(max_iters):\n", - " ### START CODE HERE ### (≈ 2 lines of code)\n", - " # Update w using the gradient descent formula\n", - " w = ...\n", - " # Compute the loss with the updated w\n", - " loss = ...\n", - " ### END CODE HERE ###\n", - " \n", - " # Track losses\n", - " losses[n_iter] = loss\n", - " \n", - " # Print loss at some iterations\n", - " if n_iter % (max_iters / 20) == 0:\n", - " if w.shape[0] == 2: \n", - " print(f\"Iteration {n_iter}: loss={loss:.5f}, w0={w[0]:.3f}, w1={w[1]:.3f}\")\n", - " else:\n", - " print(f\"Iteration {n_iter}: loss={loss:.5f}\")\n", - "\n", - " return w, losses" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's initialize some additional variables: the learning rate `alpha` and the number of iterations to perform." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "alpha = 0.01\n", - "iters = 2000\n", - "w = np.zeros((X_train.shape[1], ))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now let's run the gradient descent algorithm to fit our parameters `w` to the training set." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "w, loss = gradient_descent(X_train, y_train, w, alpha, iters)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Note that `gradient_descent` prints the loss and the values of the weight vector, `w`. The reason is that `w` is at the core of our algorithm. Make sure to understand that the whole point of the learning algorithm is to update this `w` so that the linear regression model (described by `w`) fits the data as well as possible. As `X` and `y` are fixed, the only parameter that can be changed is `w`. 
This is why we use the gradient of the loss w.r.t. `w` in gradient descent. It enables us to get closer to the best value of `w` at every iteration." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now, play with the learning rate, `alpha`, and the number of iterations, `iters`, to see how the convergence changes. Document your findings.\n", - "\n", - "__Hint__: \n", - "- Try `alpha = 0.05`. What's happening? Try to guess why.\n", - "- Try `alpha = 0.001`. Why is the final loss bigger than when `alpha = 0.01`?\n", - "- Try `alpha = 0.001` with `iters = 20000`. Does this solve the problem of the larger final loss?" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "alpha = 0.01 # Try changing this\n", - "iters = 2000 # Try changing this\n", - "w = np.zeros((X_train.shape[1], ))\n", - "w, loss = gradient_descent(X_train, y_train, w, alpha, iters)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Finally, we can compute the loss (error) of the trained model using our fitted parameters." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_loss = mse_loss(X_train, y_train, w)\n", - "test_loss = mse_loss(X_test, y_test, w)\n", - "print(f\"Train loss: {train_loss:.5f}\")\n", - "print(f\"Test loss: {test_loss:.5f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Expected output:** with `alpha = 0.01` and `iters = 2000`.\n", - "\n", - "| | |\n", - "|---|--------------------------------------------------|\n", - "| **Train loss** | 9.34136 |\n", - "| **Test loss** | 7.57149 |" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's also look at what the regression line looks like." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "helpers.plot_linear_regression_2d(X=X_train, y=y_train, w=w, feature_name=\"population\", label_name=\"profit\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Looks pretty good! The red line is our trained model: it represents the estimated profit of our new restaurant for every possible population size. Remember that the model is 100% described by our parameters $\\mathbf{w}$ (in this case $\\mathbf{w} = [b, w_1]$). If we had chosen a poor $\\mathbf{w}$, we would have gotten a red line that doesn't fit the data." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Since the gradient descent function also outputs a vector with the loss at each training iteration, we can plot that as well. The goal of gradient descent is to get a model that fits the data well, so we hope that the loss decreases throughout the iterations of gradient descent. Minimizing the MSE in linear regression is a convex optimization problem, so if everything goes well, it should reach the global minimum." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "helpers.plot_loss(loss)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If the plot had shown a non-decreasing function, it would have raised questions about the validity of our implementation of gradient descent. In any case, it's always good practice to plot this graph to see if our algorithm works as expected. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 4. Least squares method" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It turns out that linear regression with MSE is one of those rare cases where we can compute the optimum of the loss function analytically. 
Let's see how:\n", - "\n", - "\n", - "Let's start from the loss function: \n", - "$$\n", - "\\begin{align}\n", - "J(\\mathbf{w}) = \\frac{1}{N} \\sum_{i=1}^{N} (\\hat{y}^{(i)} -y^{(i)})^{2} \\\\\n", - " = \\frac{1}{N} \\sum_{i=1}^{N} (\\mathbf{w}^T \\boldsymbol{x}^{(i)} -y^{(i)})^{2} \\\\\n", - "= \\frac{1}{N} (\\mathbf{X}\\mathbf{w}-\\mathbf{y})^{T}(\\mathbf{X}\\mathbf{w}-\\mathbf{y})\n", - "\\end{align}\n", - "$$\n", - "\n", - "This function is convex in $\\mathbf{w}$, so let's try to find its minimum.\n", - "\n", - "Take the derivative with respect to $\\mathbf{w}$ (use http://www.matrixcalculus.org/ if necessary):\n", - "$$\n", - "\\frac{\\partial J(\\mathbf{w})}{\\partial \\mathbf{w}}=\\frac{2}{N} \\mathbf{X}^{T}(\\mathbf{X} \\mathbf{w} - \\mathbf{y})\n", - "$$\n", - "Set it to 0 and solve:\n", - "$$\n", - "\\begin{align}\n", - "\\frac{2}{N} \\mathbf{X}^{T}(\\mathbf{X} \\mathbf{w} - \\mathbf{y}) = 0 \\\\\n", - "\\Leftrightarrow \\mathbf{X}^{T} \\mathbf{X} \\mathbf{w} = \\mathbf{X}^{T} \\mathbf{y}\n", - "\\end{align}\n", - "$$\n", - "\n", - "\n", - "Therefore, provided $\\mathbf{X}^{T}\\mathbf{X}$ is invertible, the linear regression model has an analytical solution given by the normal equations:\n", - "$$\\hat{\\mathbf{w}} = (\\mathbf{X}^{T}\\mathbf{X})^{-1} \\ \\mathbf{X}^{T} \\ \\mathbf{y}$$\n", - "This is known as the **least squares** method. The advantage of this method is that you can directly get the optimal weights $\\mathbf{w}$ from this short matrix expression.\n", - "\n", - "Please use this solution to complete the function `least_squares` and to obtain the weight parameters $\\mathbf{w}$. \n", - "\n", - "**Note:** Use `np.linalg.solve` to solve a linear matrix equation, as it is more stable and more accurate than computing the inverse. You can find the documentation for this method [here](https://numpy.org/doc/stable/reference/generated/numpy.linalg.solve.html)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def least_squares(X: np.ndarray, y: np.ndarray) -> np.ndarray:\n", - " \"\"\"Solves linear regression using least squares\n", - "\n", - " Args:\n", - " X: Data of shape (N, D)\n", - " y: Labels of shape (N, )\n", - "\n", - " Returns:\n", - " Weight parameters of shape (D, )\n", - " \"\"\"\n", - "\n", - " ### START CODE HERE ### (≈ 1 line of code)\n", - " w = ...\n", - " ### END CODE HERE ###\n", - " return w\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ls_w = least_squares(X_train, y_train)\n", - "print(ls_w)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_loss = mse_loss(X_train, y_train, ls_w)\n", - "test_loss = mse_loss(X_test, y_test, ls_w)\n", - "print(f\"Train loss: {train_loss:.5f}\")\n", - "print(f\"Test loss: {test_loss:.5f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Expected output:** \n", - "\n", - "| | |\n", - "|---|--------------------------------------------------|\n", - "| **Train loss** | 9.34135 |\n", - "| **Test loss** | 7.57472 |" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "helpers.plot_linear_regression_2d(X=X_train, y=y_train, w=ls_w, feature_name=\"population\", label_name=\"profit\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Question:** Compare the loss and plot obtained with least-squares to the loss and plot obtained with gradient descent. What can you say about these two methods, is the end result similar?\n", - "\n", - "**Answer:** YOUR ANSWER HERE" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 5. 
Prediction\n", - "Based on the weights ($\\mathbf{w}$) we just computed and the linear model, let's define a function `predict`, which we'll use to predict the expected restaurant profit ($\\hat{y}$) based on the population ($x$)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "def predict(X, w):\n", - " \"\"\"Predicts value using linear regression weights\n", - "\n", - " Args:\n", - " X: Dataset (with the constant term) of shape (M, D)\n", - " w: Weights (with bias term) of shape (D,)\n", - "\n", - " Returns:\n", - " Predictions of shape (M, )\n", - " \"\"\"\n", - " ### START CODE HERE (≈ 1 line of code)\n", - " y_hat = ...\n", - " ### END CODE HERE\n", - " return y_hat" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# What's the predicted profit in a city of 10'000 inhabitants?\n", - "expected_profit = predict(np.array([1, 10]), w)\n", - "print(f\"A new restaurant in a city of 10'000 inhabitants has an expected profit of {expected_profit*1000:.2f} $.\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 6. Multiple Linear Regression\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now, we're tasked with implementing linear regression with multiple features to predict the price of a house. We'll see that the code implemented in the previous parts works just as well for multiple features.\n", - "\n", - "*Background: Suppose you want to buy a new house and you want to figure out if its price is too low or too high based on the current housing market. You know the number of rooms and the size of the house you want to buy, and you are going to predict the market price of the house based on these two features.*" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6.1. 
House Dataset" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Here, we'll use a dataset containing information on houses in Portland, Oregon. This dataset consists of 47 houses with their respective size (in thousands of sqft), number of bedrooms and their respective price (in USD). Take a look at the file `house_data.csv` and see how it's loaded by running the cell below." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "house_df = pd.read_csv('data/house_data.csv')\n", - "\n", - "print(f\"There are {house_df.shape[0]} rows and {house_df.shape[1]} columns.\")\n", - "# Show the first 5 rows of the data\n", - "house_df.head(5)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We want to predict the price of a house using its size and number of bedrooms. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "# We'll use 80% of our data as training data and the remaining 20% as test data\n", - "# Here, we use a random seed to ensure that the data shuffling and splitting can be reproduced\n", - "X_train_mult, y_train_mult, X_test_mult, y_test_mult, feature_names_mult = helpers.preprocess_data(house_df, label=\"price\", train_size=0.8, seed=42)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "As in the simple linear regression case, we'll first add a constant term to our training data for the intercept." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "X_train_mult = add_constant(X_train_mult)\n", - "X_test_mult = add_constant(X_test_mult)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "print(f\"Features: {feature_names_mult}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This time, there are several features. To be exact, we have 2 features, and we can plot the target against one feature at a time to see how each feature correlates with the target variable `y`.\n", - "\n", - "Run the following cell with `feature_num = 1` and then with `feature_num = 2` (`0` is the constant term)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "feature_num = 1\n", - "plt.scatter(X_train_mult[:,feature_num], y_train_mult)\n", - "\n", - "plt.ylabel(\"price\")\n", - "if feature_num == 0:\n", - " plt.xlabel(\"constant\")\n", - " plt.title(\"constant vs price\")\n", - "else:\n", - " plt.xlabel(f\"{feature_names_mult[feature_num - 1]}\")\n", - " plt.title(f\"{feature_names_mult[feature_num - 1]} vs price\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "It is also possible to plot the target variable according to both features. \n", - "\n", - "Using `plot_data_3d` from `helpers.py`, we can generate a 3D plot that shows the training and test sets according to both features. \n", - "- You can toggle each dataset on or off by clicking on the legend (upper left). \n", - "- You can also interact with the plot, zoom in and out, and view it from different angles. Try to carefully choose the view angle in order to get the equivalent of the 2 plots above (cancel one dimension)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "helpers.plot_data_3d(X_train=X_train_mult, y_train=y_train_mult, X_test=X_test_mult, y_test=y_test_mult, feature_names=feature_names_mult, label_name=\"price\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6.2 Training" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 6.2.1. Gradient Descent" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If you have implemented the function `gradient_descent` correctly in section 1, you should be able to use it for multiple features.\n", - "\n", - "Try to call `gradient_descent(X_train_mult, y_train_mult, np.zeros((X_train_mult.shape[1], )), 0.02, 5000)`. \n", - "If it doesn't work, go back to your `gradient_descent` function in section 1, write in matrix form, rerun the function cell and try to call it again with the above parameters. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "w_mult, loss = gradient_descent(X_train_mult, y_train_mult, np.zeros((X_train_mult.shape[1], )), 0.02, 5000)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Then, as usual, we can compute the loss of our newly trained model." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_loss = mse_loss(X_train_mult, y_train_mult, w_mult)\n", - "test_loss = mse_loss(X_test_mult, y_test_mult, w_mult)\n", - "print(f\"Train loss: {train_loss:.1f}\")\n", - "print(f\"Test loss: {test_loss:.1f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Expected output:** \n", - "\n", - "| | |\n", - "|---|--------------------------------------------------|\n", - "| **Train loss** | 4320753675.3 |\n", - "| **Test loss** | 3673378116.2 |" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 6.2.2. Least squares" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If `least_squares` is implemented correctly, it should also be able to work without any modification for multiple features." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "ls_w_mult = least_squares(X_train_mult, y_train_mult)\n", - "print(ls_w_mult)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "train_loss = mse_loss(X_train_mult, y_train_mult, ls_w_mult)\n", - "test_loss = mse_loss(X_test_mult, y_test_mult, ls_w_mult)\n", - "print(f\"Train loss: {train_loss:.1f}\")\n", - "print(f\"Test loss: {test_loss:.1f}\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Expected output:** \n", - "\n", - "| | |\n", - "|---|--------------------------------------------------|\n", - "| **Train loss** | 4320753667.6 |\n", - "| **Test loss** | 3673281817.0 |" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This output is a realistic one. Taking the square root of the test loss we get an approximation of the average difference between our model prediction and the reality (the Root Mean Square Error, or RMSE)." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "average_difference = np.sqrt(test_loss)\n", - "print(\"The average difference between the predicted price and the actual price of a house (on the test set) is\",average_difference,\"$.\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "When using MSE, it is good practice to interpret the resulting loss in a tangible way in order to evaluate whether the model performs well or not. A big loss doesn't necessarily translate to a poor model. For example, here, our model has a ~18% relative error (because the average house price is $340,000). If we had obtained the same loss for a model predicting a variable of average value 65,000, it would have translated to a ~93% relative error." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 6.3. Plotting the regression surface" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that our model is trained, we can plot the regression surface using `plot_surface_3d` from `helpers`. " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "helpers.plot_surface_3d(w=w_mult,\n", - " X_train=X_train_mult, \n", - " y_train=y_train_mult, \n", - " X_test=X_test_mult, \n", - " y_test=y_test_mult,\n", - " feature_names=feature_names_mult, \n", - " label_name=\"price\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "**Question:**\n", - "What are your thoughts on this regression fit?\n", - "\n", - "**Answer:**\n", - "YOUR ANSWER HERE" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Congratulations on finishing this exercise! In the next exercise, we'll take a look at classification with logistic regression." 
- ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.8.10" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -}