<figure>
  <IMG SRC="https://raw.githubusercontent.com/mbakker7/exploratory_computing_with_python/master/tudelft_logo.png" WIDTH=250 ALIGN="right">
</figure>

# Exploratory Computing with Python
*Developed by Mark Bakker*

## Notebook 2: Arrays and basic `if`-statements
### One-dimensional arrays
In this notebook, we will do math on arrays using functions of the `numpy` package. A nice overview of `numpy` functionality can be found [here](http://wiki.scipy.org/Tentative_NumPy_Tutorial). We will also make plots. We start by importing the plotting part of the `matplotlib` package, which we call `plt`, and the `numpy` package, which we call `np`. We also tell IPython to put all graphs inline. We will add these three lines at the top of all upcoming notebooks, as we will always be using `numpy` and `matplotlib`.

In [0]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

There are many ways to create arrays. For example, you can enter the individual elements of an array:

In [0]:
np.array([1, 7, 2, 12])

Note that the `array` function takes one sequence of values between square brackets.
Another function to create an array is `ones(shape)`, which creates an array of the specified `shape` filled with the value 1.
There is an analogous function `zeros(shape)` to create an array filled with the value 0 (which can also be achieved with `0 * ones(shape)`). Next to the already mentioned `linspace` function there is the `arange(start,end,step)`
function, which creates an array starting at `start`, taking steps equal to `step`, and stopping before it reaches `end`. If you don't specify the `step`,
it is set equal to 1. If you only specify one input value, it returns a sequence starting at 0 and incrementing by 1 until the specified value is reached (but again, it stops before it reaches that value).

In [0]:
print np.arange(1, 7)  # Takes default steps of 1 and doesn't include 7
print np.arange(5)  # Starts at 0 and ends at 4, giving 5 numbers
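
The functions `ones` and `zeros`, and `arange` with an explicit `step`, are described above but not shown. The following is a minimal sketch with illustrative variable names (`print` is written with parentheses so the sketch also runs under Python 3):

```python
import numpy as np

a = np.ones(5)            # five values, all equal to 1
b = np.zeros(3)           # three values, all equal to 0
c = 0 * np.ones(3)        # same values as np.zeros(3)
d = np.arange(2, 11, 3)   # start at 2, take steps of 3, stop before 11

print(a)
print(b)
print(c)
print(d)                  # contains the values 2, 5, 8
```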

Recall that comments in Python are preceded by a `#`. 
Arrays have a dimension. So far we have only used one-dimensional arrays. 
Hence the dimension is 1. 
For one-dimensional arrays, you can also compute the length with the `len` function (which is part of Python and not `numpy`); it returns the number of values in the array.

In [0]:
x = np.array([1, 7, 2, 12])
print 'number of dimensions of x:', np.ndim(x)
print 'length of x:',len(x)

The individual elements of an array can be accessed with their index. Indices start at 0. 
This may require a bit of getting used to. It means that the first value in the array has index 0. The index of an array is specified using square brackets.

In [0]:
x = np.arange(20, 30)
print x
print x[0]
print x[5]

A range of indices may be specified using the colon syntax:
`x[start:end_before]` or `x[start:end_before:step]`. If the `start` isn't specified, 0 will be used. If the `end_before` isn't specified, the range runs through the last value of the array. If the step isn't specified, 1 will be used. You can also start at the end and count back. Generally, the index of the end is not known. You can find out how long the array is and access the last value by typing `x[len(x)-1]`, but it would be inconvenient to have to type `len(arrayname)` all the time. Luckily, there is a shortcut: `x[-1]` is the last value in the array. This all requires practice. Make sure you understand the following examples:

In [0]:
x = np.arange(20, 30)
print x
print x[0:5]
print x[:5] # same as previous one
print x[3:7]
print x[2:9:2] # step is 2
print x[-1:4:-2] # starts at last one and stops before reaching index 4 with step -2
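
A few more cases may help with the colon syntax: leaving out the end, counting back from the end, and reversing the array. This is only a sketch; the variable names are illustrative.

```python
import numpy as np

x = np.arange(20, 30)     # the values 20, 21, ..., 29
tail = x[5:]              # end omitted: runs through the last value
last_three = x[-3:]       # negative start counts back from the end
backwards = x[::-1]       # step of -1: the whole array in reverse order

print(tail)
print(last_three)
print(backwards)
print(x[len(x) - 1] == x[-1])   # both expressions give the last value
```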

You can assign one value to a range of an array by specifying a range of indices, 
or you can assign an array to a range of another array, as long as the ranges have equal length. In the second example below, the first 5 values of `x` (specified as `x[0:5]`) are given the values `[40,42,44,46,48]`.

In [0]:
x = 20 * np.ones(10)
print x
x[0:5] = 40
print x
x[0:5] = np.arange(40, 50, 2)
print x

The example below is meant to give the last 5 values of `x` the values `[50,52,54,56,58]`, but there are some errors in the code. Remove the comment markers and run the code to see the error message. Then fix the code and run it again.

In [0]:
#x = np.ones(10)
#x[5:] = np.arange(50, 62, 1)
#print x

### Exercise 1, <a name="back1"></a> Arrays and indices
Create an array of zeros with length 20. Change the first 5 values to 10. Change the next 10 values to a sequence starting at 12 and increasing with steps of 2 to 30 - do this with one command. Set the final 5 values to 30. Plot the value of the array on the y-axis vs. the index of the array on the x-axis. Draw vertical dashed lines at x=4 and x=14 (i.e., the section between the dashed lines is where the line increases from 10 to 30). Set the minimum and maximum values of the y-axis to 8 and 32 using the `ylim` command.

<a href="#ex1answer">Answer for Exercise 1</a>

### Arrays, Lists, and Tuples
A one-dimensional array is a sequence of values that you can do math on. Next to the array, Python has several other data types that can store a sequence of values. The first one is called a `list` and is entered between square brackets. The second one is a `tuple` (you are right, strange name), and it is entered with parentheses. The difference is that you can change the values of a list after you create it, and you cannot do that with a tuple. Other than that, for now you just need to remember that they exist, and that you *cannot* do math with either lists or tuples. When you do `2 * alist`, where `alist` is a list, you don't multiply all values in `alist` by the number 2. Instead, you create a new list that contains `alist` twice (back to back). The same holds for tuples. That can be very useful, but not when your intent is to multiply all values by 2. In the example below, the first value in a list is modified. Try to modify one of the values in `btuple` below and you will see that you get an error message:

In [0]:
alist = [1, 2, 3]
print 'alist', alist
btuple = (10, 20, 30)
print 'btuple', btuple
alist[0] = 7  # Since alist is a list, you can change values 
print 'modified alist', alist
#btuple[0] = 100  # Will give an error
#print 2*alist
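
The difference between multiplying a list and multiplying an array can be sketched as follows. The variable names are illustrative, and the `try`/`except` construct is used here only to show that changing a tuple raises a `TypeError` without stopping the code.

```python
import numpy as np

alist = [1, 2, 3]
doubled_list = 2 * alist              # the list repeated twice, back to back
doubled_array = 2 * np.array(alist)   # every value multiplied by 2

print(doubled_list)
print(doubled_array)

btuple = (10, 20, 30)
try:
    btuple[0] = 100                   # not allowed: tuples cannot be changed
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)
```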

Lists and tuples are versatile data types in Python. We already used lists without knowing it when we created our first array with the command `np.array([1,7,2,12])`. What we did is we gave the `array` function one input argument: the list `[1,7,2,12]`, and the `array` function returned a one-dimensional array with those values. Lists and tuples can consist of a sequence of pretty much anything, not just numbers. In the example given below, `alist` contains 5 *things*: the integer 1, the float 20, the word `python`, an array with the values 1,2,3, and finally, the function `len`. The latter means that `alist[4]` is actually the function `len`. That function can be called to determine the length of an array as shown below. This may be a bit confusing, but it is cool behavior if you take the time to think about it.

In [0]:
alist = [1, 20.0, 'python', np.array([1,2,3]), len]
print alist
print alist[0]
print alist[2]
print alist[4](alist[3])  # same as len( np.array([1,2,3]) )

### Two-dimensional arrays
Arrays may have arbitrary dimensions (as long as they fit in your computer's memory). We will make frequent use of two-dimensional arrays. They can be created with any of the aforementioned functions by specifying the number of rows and columns of the array. Note that the number of rows and columns must be a tuple (so they need to be between parentheses), as the functions expect only one input argument, which may be either one number or a tuple of multiple numbers.

In [0]:
x = np.ones((3, 4)) # An array with 3 rows and 4 columns
print x

Arrays may also be defined by specifying all the values in the array. The `array` function gets passed one list consisting of separate lists for each row of the array. In the example below the rows are entered on different lines. That may make it easier to enter the array, but it is not required. You can change the size of an array to any shape using the `reshape` function as long as the total number of entries doesn't change.

In [0]:
x = np.array([[4, 2, 3, 2],
              [2, 4, 3, 1],
              [0, 4, 1, 3]])
print x
print np.reshape(x, (6, 2))  # 6 rows, 2 columns
print np.reshape(x, (1, 12))  # 1 row, 12 columns
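
The requirement that the total number of entries stays the same can be checked with a small sketch. The variable names are illustrative; `np.shape` returns the number of rows and columns, and `reshape` raises a `ValueError` when the requested shape does not fit.

```python
import numpy as np

x = np.arange(12)                 # 12 values in a one-dimensional array
y = np.reshape(x, (3, 4))         # 3 rows, 4 columns: still 12 values
print(y)
print(np.shape(y))

try:
    np.reshape(x, (5, 2))         # asks for 10 entries, but x has 12
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```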

The index of a two-dimensional array is specified with two values, first the row index, then the column index.

In [0]:
x = np.zeros((3, 8))
x[0,0] = 100
x[1,4:] = 200  # Row with index 1, columns starting with 4 to the end
x[2,-1:4:-1] = 400  # Row with index 2, columns counting back from the end and stop before reaching index 4
print x
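
A colon by itself selects everything along that dimension, which makes it possible to pick out a whole row, a whole column, or a rectangular block. The following is a sketch on a small array; the names are illustrative.

```python
import numpy as np

x = np.reshape(np.arange(12), (3, 4))   # rows: 0-3, 4-7, 8-11
row = x[1]             # row with index 1
same_row = x[1, :]     # the same row, with the colon written out
column = x[:, 2]       # column with index 2
block = x[:2, 1:3]     # first two rows, columns with index 1 and 2

print(row)
print(column)
print(block)
```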

### Arrays are not matrices
Now that we talk about the rows and columns of an array, the math-oriented reader may think that arrays are matrices, or that one-dimensional arrays are vectors. It is crucial to understand that *arrays are not vectors or matrices*. The multiplication and division of two arrays is term by term:

In [0]:
a = np.arange(4, 20, 4)
b = np.array([2, 2, 4, 4])
print 'array a:', a
print 'array b:', b
print 'a * b  :', a * b  # term by term multiplication
print 'a / b  :', a / b  # term by term division
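
To see the difference with vector math, the term-by-term product can be compared with the dot product of two vectors. This is a sketch only: `np.dot` is a `numpy` function that is not otherwise used in this notebook.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
elementwise = a * b       # term by term: 1*4, 2*5, 3*6
dot = np.dot(a, b)        # sum of the term-by-term products

print(elementwise)        # the values 4, 10, 18
print(dot)                # 32
```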

Note that, just like for scalars, integer division gives integers (in Python 2.X; in Python 3, `/` always gives floats and `//` gives integer division). If that is not what you want (and it rarely is), make sure that at least one of the arrays is of type float by putting at least one floating point number in the array. `numpy` figures out what data type to assign to the array (called `dtype` for short). You can ask for the `dtype` of an array, or you can specify it as a keyword argument.

In [0]:
a = np.arange(4)
b = np.array([2, 2, 4, 4])
print 'array a, dtype: ', a, a.dtype
print 'array b, dtype: ', b, b.dtype
print 'a / b  :', a / b  # integer division
a = np.arange(4.)  # make array a of type float, same as np.arange(4,dtype='float')
print 'array a, dtype: ', a, a.dtype
print 'a / b  :', a / b  # float division !
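
Specifying the `dtype` as a keyword argument can be sketched as follows. The `//` operator, which is not covered in this notebook, is used to get integer division explicitly, so that the result does not depend on the Python version. The variable names are illustrative.

```python
import numpy as np

a_int = np.arange(4)                  # integers 0, 1, 2, 3
a_float = np.arange(4, dtype='float') # same values, stored as floats
b = np.array([2, 2, 4, 4])            # only integers, so an integer dtype

floor_div = a_int // b                # integer division: all zeros here
float_div = a_float / b               # float division: 0, 0.5, 0.5, 0.75

print(a_int.dtype)
print(a_float.dtype)
print(floor_div)
print(float_div)
```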

### Exercise 2, <a name="back2"></a> Two-dimensional array indices
For the array `x` shown below, write code to print: 

* the first row of `x`
* the first column of `x`
* the third row of `x`
* the last two columns of `x`
* the four values in the upper right hand corner of `x`
* the four values at the center of `x`

`x = np.array([[4, 2, 3, 2],
              [2, 4, 3, 1],
              [2, 4, 1, 3],
              [4, 1, 2, 3]])`

<a href="#ex2answer">Answer for Exercise 2</a>

### Visualizing two-dimensional arrays
Two-dimensional arrays can be visualized with the `plt.matshow` function. In the example below, the array is very small (only 4 by 4), but it illustrates the general principle. A colorbar is added as a legend showing that the value 2 corresponds to dark blue and the value 8 corresponds to dark red. The ticks in the colorbar are specified to be 2, 4, 6, and 8. Note that the first row of the array (with index 0) is plotted at the top, which corresponds to the location of the first row when the array is printed.

In [0]:
x = np.array([[8, 4, 6, 2],
              [4, 8, 6, 2],
              [4, 8, 2, 6],
              [8, 2, 4, 6]])
plt.matshow(x)
plt.colorbar(ticks=[2, 4, 6, 8])
print x

The colors that are used are the default color map, which maps the highest value to red, the lowest value to blue, and the numbers in between to colors varying between green and yellow. This color map is called `jet`; it was the default in the `matplotlib` versions this notebook was written for, while `matplotlib` 2.0 and later use `viridis` by default. If you want other colors, you can choose one of the other color maps. To find out all the available color maps, go [here](http://matplotlib.org/examples/color/colormaps_reference.html). To change the color map, you need to import the `cm` part of the matplotlib package, which contains all the color maps. After you have imported the color map package (which we call `cm` below), you can specify any of the available color maps with the `cmap` keyword. Try a few.

In [0]:
import matplotlib.cm as cm
plt.matshow(x, cmap=cm.rainbow)
plt.colorbar(ticks=np.arange(2, 9, 2));

### Exercise 3, <a name="back3"></a> Create and visualize an array
Create an array of size 10 by 10. The upper left-hand quadrant of the array should get the value 4, the upper right-hand quadrant the value 3, the lower right-hand quadrant the value 2 and the lower left-hand quadrant the value 1. First create an array of 10 by 10 using the `zeros` command, then fill each quadrant by specifying the correct index ranges. Note that the first index is the row number. The second index runs from left to right. Visualize the array using `matshow`. It should give a red, yellow, light blue and dark blue box (clock-wise starting from upper left) when you use the default `jet` colormap.

<a href="#ex3answer">Answer for Exercise 3</a>

### Exercise 4, <a name="back4"></a> Create and visualize a slightly fancier array
Consider the image shown below, which roughly shows the letters TU. You are asked to create an array that represents the same TU. First create a zeros array of 11 rows and 17 columns. Give the background value 0, the letter T value -1, and the letter U value +1.

<img src= "https://raw.githubusercontent.com/mbakker7/exploratory_computing_with_python/master/notebook2/tufig.png" width="500px" />

<a href="#ex4answer">Answer to Exercise 4</a>

### Basic `if` statements
An `if` statement lets you perform a task only when the condition of the `if` statement is true. For example:

In [0]:
avalue = 4
print avalue
if avalue < 6:
    print 'changing avalue in first if statement'
    avalue = avalue + 2
print avalue
if avalue > 20:
    print 'changing avalue in second if statement'
    avalue = 200
print avalue  # avalue hasn't changed as avalue is not larger than 20

Notice the syntax of the `if` statement. It starts with `if` followed by a statement that is either `True` or `False` and then a colon. After the colon, you need to indent and the entire indented code block (in this case 2 lines of code) is executed if the statement is `True`. Otherwise it is not executed. The following comparisons can be made. Make sure you understand them all.

In [0]:
a = 4
print a < 4
print a <= 4 # a is smaller than or equal to 4
print a == 4 # a is equal to 4. Note that there are 2 equal signs
print a >= 4 
print a >  4
print a != 4 # a is not equal to 4

It is important to understand the difference between one equal sign like `a = 4` and two equal signs like `a == 4`. One equal sign means assignment. Whatever is on the right side of the equal sign is assigned to what is on the left side of the equal sign. Two equal signs is a comparison and results in either `True` (when the left and right sides are equal) or `False`. A variable that can either be `True` or `False` is called a *boolean* variable. 

In [0]:
print 4 == 4
a = 4 == 5
print a
print type(a)

Comparisons can also be used for arrays. For example let's create an array and find out what values of the array are below 3:

In [0]:
data = np.arange(5)
print data
print data < 3

The statement `data < 3` returns an array of type `bool` that has the same length as the array `data`; each item in it is either `True` or `False`. The cool thing is that this array of `True` and `False` values can be used to specify the indices of an array:

In [0]:
a = np.arange(5)
b = np.array([ True, True, True, False, False ])
print a[b]

When the indices of an array are specified with a boolean array, only the values of the array where the boolean array is `True` are selected. This is a very powerful feature. For example, all values of an array that are less than 3 may be obtained by specifying a comparison as the indices.

In [0]:
a = np.arange(5)
print 'the total array: ',a
print 'values less than 3: ', a[a < 3]

If we want to replace all values that are less than 3 by, for example, the value 10, use the following short syntax:

In [0]:
a = np.arange(5)
print a
a[a < 3] = 10
print a
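
The comparison can also be stored in a variable first, which makes it easier to see what is selected and what is replaced. This is a sketch with made-up data values and illustrative names.

```python
import numpy as np

data = np.array([5, 1, 8, 2, 9])
small = data < 3          # boolean array, True where the value is below 3
print(small)
print(data[small])        # only the values 1 and 2 are selected

data[small] = 10          # replace only the selected values
print(data)               # the values are now 5, 10, 8, 10, 9
```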

### Exercise 5, <a name="back5"></a> Replace high and low in an array
Create an array for variable $x$ consisting of 100 points from 0 to 20. Compute $y=\sin(x)$ and plot $y$ vs. $x$ with a blue line. Next, replace all values of $y$ that are larger than 0.5 by 0.5, and all values that are smaller than $-$0.75 by $-$0.75, and plot $y$ vs. $x$ using a red line on the same graph.

<a href="#ex5answer">Answer to Exercise 5</a>

### Exercise 6, <a name="back6"></a> Change marker color based on data value
Create an array for variable $x$ consisting of 100 points from 0 to 20 and compute $y=\sin(x)$. Plot a blue dot for every $y$ that is larger than zero, and a red dot otherwise.

<a href="#ex6answer">Answer to Exercise 6</a>

### Select indices based on multiple conditions
Multiple conditions can be given as well. When two conditions both have to be true, use the `&` symbol. When at least one of the conditions needs to be true, use the `|` symbol (that is the vertical bar). For example, let's plot blue markers when $y>0.7$ or $y<-0.5$ (using one plot statement), and a red marker when $-0.5\le y\le 0.7$. When there are multiple conditions, each of them needs to be between parentheses. Note that in the example below, $x$ varies from 0 to 6$\pi$ (`pi` is part of `numpy`).

In [0]:
x = np.linspace(0, 6 * np.pi, 50)
y = np.sin(x)
plt.plot(x[(y > 0.7) | (y < -0.5)], y[(y > 0.7) | (y < -0.5)], 'bo' )
plt.plot(x[(y >= -0.5) & (y <= 0.7)], y[(y >= -0.5) & (y <= 0.7)], 'ro' )
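
The parentheses are needed because `&` and `|` are evaluated before comparisons such as `<` and `>`. A sketch that stores the two combined conditions in variables and checks that every point ends up in exactly one group is given below; the names are illustrative, and `np.sum` (not used elsewhere in this notebook) counts the number of `True` values.

```python
import numpy as np

x = np.linspace(0, 6 * np.pi, 50)
y = np.sin(x)
outside = (y > 0.7) | (y < -0.5)     # True when at least one condition holds
inside = (y >= -0.5) & (y <= 0.7)    # True when both conditions hold

n_outside = np.sum(outside)          # number of points for the blue markers
n_inside = np.sum(inside)            # number of points for the red markers
print(n_outside)
print(n_inside)
print(n_outside + n_inside == len(y))   # every point is in exactly one group
```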

### Exercise 7, <a name="back7"></a> Multiple conditions
The file `xypoints.dat` contains 1000 randomly chosen $x,y$ locations of points; both $x$ and $y$ vary between -10 and 10. Load the data using `loadtxt`, and store the first row of the array in an array called `x` and the second row in an array called `y`. First, plot a red dot for all points. On the same graph, plot a blue dot for all $x,y$ points where $x<-2$ and $-5\le y \le 0$. Finally, plot a green dot for any point that lies in the circle with center $(x_c,y_c)=(5,0)$ and with radius $R=5$. Hint: it may be useful to compute a new array for the radial distance $r$ between any point and the center of the circle using the formula $r=\sqrt{(x-x_c)^2+(y-y_c)^2}$. Use the `plt.axis('image')` command to make sure the scales along the two axes are equal and the circular area looks like a circle.

<a href="#ex7answer">Answer to Exercise 7</a>

### Answers to the exercises

<a name="ex1answer">Answer to Exercise 1</a>

In [0]:
x = np.zeros(20)
x[:5] = 10
x[5:15] = np.arange(12, 31, 2)
x[15:] = 30
plt.plot(x)
plt.plot([4, 4], [8, 32],'k--')
plt.plot([14, 14], [8, 32],'k--')
plt.ylim(8, 32)

<a href="#back1">Back to Exercise 1</a>

<a name="ex2answer">Answer to Exercise 2</a>

In [0]:
x = np.array([[4, 2, 3, 2],
              [2, 4, 3, 1],
              [2, 4, 1, 3],
              [4, 1, 2, 3]])
print 'the first row of x'
print x[0]
print 'the first column of x'
print x[:, 0]
print 'the third row of x'
print x[2]
print 'the last two columns of x'
print x[:, -2:]
print 'the four values in the upper right hand corner'
print x[:2, 2:]
print 'the four values at the center of x'
print x[1:3, 1:3]

<a href="#back2">Back to Exercise 2</a>

<a name="ex3answer">Answer to Exercise 3</a>

In [0]:
x = np.zeros((10, 10))
x[:5, :5] = 4
x[:5, 5:] = 3
x[5:, 5:] = 2
x[5:, :5] = 1
print x
plt.matshow(x)
plt.colorbar(ticks=[1, 2, 3, 4]);

<a href="#back3">Back to Exercise 3</a>

<a name="ex4answer">Answer to Exercise 4</a>

In [0]:
x = np.zeros((11, 17))
x[2:4, 1:7] = -1
x[2:9, 3:5] = -1
x[2:9, 8:10] = 1
x[2:9, 13:15] = 1
x[7:9, 10:13] = 1
print x
plt.matshow(x, interpolation='nearest');
plt.yticks(range(11))
plt.xticks(range(0, 17));
plt.ylim(10.5, -0.5)
plt.xlim(-0.5, 16.5);

<a href="#back4">Back to Exercise 4</a>

<a name="ex5answer">Answer to Exercise 5</a>

In [0]:
x = np.linspace(0, 20, 100)
y = np.sin(x)
plt.plot(x, y, 'b')
y[y > 0.5] = 0.5
y[y < -0.75] = -0.75
plt.plot(x,y,'r')

<a href="#back5">Back to Exercise 5</a>

<a name="ex6answer">Answer to Exercise 6</a>

In [0]:
x = np.linspace(0, 20, 100)
y = np.sin(x)
plt.plot(x[y > 0], y[y > 0], 'bo' )
plt.plot(x[y <= 0], y[y <= 0], 'ro' )

<a href="#back6">Back to Exercise 6</a>

<a name="ex7answer">Answer to Exercise 7</a>

In [0]:
x,y = np.loadtxt('xypoints.dat')
plt.plot(x, y, 'ro')
plt.plot(x[(x < -2) & (y >= -5) & (y <= 0)], y[(x < -2) & (y >= -5) & (y <= 0)], 'bo')
r = np.sqrt((x - 5) ** 2 + y ** 2)
plt.plot(x[r < 5], y[r < 5],'go')
plt.axis('image');

<a href="#back7">Back to Exercise 7</a>