## 1.5 Basics of `Python`

`Python` is another object-oriented language (OOL). It was created in the early 1990s but was not popularized until the 2000s. It lends itself to writing structured, easy-to-read computer code.

It is intended to be easier to understand and learn than other OOLs. One of its strengths is that it has a **massive** base of open-source modules, which allow programmers to implement very sophisticated functionality simply by making a few function calls (not unlike `R`’s packages).

More information is available from the `Python` Software Foundation, on Stack Exchange (and similar sites), and in reference manuals, such as Jake VanderPlas’ *A Whirlwind Tour of Python* or the `Python` 3 documentation.

### 1.5.1 Integrated Development Environments for `Python`

For data science purposes, Anaconda and Jupyter are popular `Python` **integrated development environments** (IDE); Rodeo, Spyder, PyCharm, Ninja (and others) also provide RStudio-like functionality for `Python`. Installation instructions are available on the respective websites.

We will not explain how to install and set up `Python` on your machine the way we did for `R` and RStudio at this stage, although we will revisit this in Data Engineering and Data Management and in Reporting and Deployment.

### 1.5.2 Introduction to `Python`

The content of this section (and the next one) is intended to help data analysts get a better sense of how `Python` could be used for data analysis. These sections are not designed to teach the ins and outs of `Python` programming. Instead, they illustrate typical tasks through examples.^{15}

#### Fundamentals

Let us start with the basics.

##### Using `Python` as a Scientific Calculator

Mathematical expressions can easily be evaluated numerically in `Python`. For scientific calculations, one should import the `math` module (package/library), which contains many mathematical functions.

It is important to note that `Python` also provides facilities for integer arithmetic, which will be covered later. In this section, only floating-point calculations are used.

Modules can be imported using the `import` statement.

We can call functions in a module by prepending the module name (with a period) to the function name: `module.function_name()` is the `Python` equivalent of `package::function_name()` in `R`.

For instance, there is a `cos` function in the `math` module: it is called using `math.cos()`.

We can evaluate \(\cos(\sqrt{\pi})\) with:

`-0.20029354112337366`

\(\arctan (2^5/3)\) with

`1.477319545636307`

and \(\ln(1+e^4)\) with

`4.0181499279178094`
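The three values above are consistent with calls such as the following. This is only a sketch: the original code cells are not shown, so the exact expressions (and the variable names `a`, `b`, and `c`) are assumptions.

```python
import math

# cos(sqrt(pi))
a = math.cos(math.sqrt(math.pi))
# arctan(2^5 / 3)
b = math.atan(2**5 / 3)
# ln(1 + e^4); math.log() is the natural logarithm
c = math.log(1 + math.exp(4))

print(a, b, c)
```

Note that `math.log()` with a single argument computes the natural logarithm; `math.log10()` is the base-10 version.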

##### Using Variables to Hold Intermediate Results

It could be helpful to break complex calculations into smaller steps. Variables can be used to store intermediate results. We will see later how variables are used in algorithmic settings.

For instance, we could break down the evaluation of \(\exp(\sin(\sqrt{2}+2))\) into three parts:

\(x=\sqrt{2}\)

\(y=\sin(x+2)\)

\(z = \exp(y)\)

In order to display the values taken by the variables, we must call on them separately, as below:

`(1.4142135623730951, -0.26925647329402774, 0.7639472984402832)`

The variables are saved even when they are not displayed, however.
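A minimal sketch of the three-step computation follows; the original cell is not shown, so the use of `print()` is an assumption.

```python
import math

x = math.sqrt(2)       # first step: sqrt(2)
y = math.sin(x + 2)    # second step: sin(x + 2)
z = math.exp(y)        # third step: exp(y)

# In a notebook, ending a cell with "x, y, z" displays the tuple of values;
# in a script, print() is needed.
print((x, y, z))
```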

##### Numbers as Formatted Strings

Quite often, we may want to control the way numbers are displayed (this can come in handy when reporting results). For example, we may wish to display no more than 4 decimal places for all real numbers, or we may want to pad numbers with zeros so that they all have a given width.

The following block illustrates a number of ways to obtain **formatted strings** of the number 12.3456789. For more details on the format specification mini-language, please consult the documentation.

Note that a string must be enclosed within double quotes or single quotes. We will discuss general string operations shortly.

We can format the number as a string of width 10, with 2 decimal places:

`'     12.35'`

or as a string with 4 decimal places:

`'12.3457'`

or as a zero-padded string of width 5, with no decimal:

`'00012'`
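The three formatted strings above can be obtained with the built-in `format()` function, as sketched below. The original cell is not shown, so the choice of `format()` (rather than f-strings or `str.format()`) and the variable names are assumptions; all three mechanisms share the same format specification.

```python
value = 12.3456789

wide = format(value, '10.2f')    # width 10, 2 decimal places (left-padded with spaces)
four = format(value, '.4f')      # 4 decimal places
padded = format(value, '05.0f')  # zero-padded, width 5, no decimals

print(repr(wide), repr(four), repr(padded))

# f-strings accept the same specification after the colon
print(f'{value:10.2f}' == wide)
```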

##### Fixed Decimals

**Floating-point numbers** are often avoided when exact decimal results are required, as they are inherently inexact. For example, we might be bewildered to find out what a sum such as `1.1 + 2.2` amounts to:

`3.3000000000000003`

The result 3.3000000000000003 is definitely not what we would expect as a sum, namely, 3.3.

The `decimal` module allows us to express decimal numbers *exactly* (see the documentation for more information).

Let’s look at a few examples of working with `decimal` and `Decimal()`.

We start by defining `x` and `y` as the **fixed decimal** values 1.1 and 2.2, respectively. Note that the numbers must be entered as strings.

These computations behave as we would expect:

```
3.3
2
1.331
```

If we do not enter the numbers as strings, they are first stored as floating-point numbers, and their (inexact) binary values are then converted to decimals, leading to unexpected results.

`3.300000000000000266453525910`

Rounding works as one would expect when variables are correctly declared as fixed decimals:

`Decimal('3.142')`
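The fixed-decimal outputs above are consistent with the following sketch. The operand values 1.1 and 2.2 are inferred from the displayed results, and the number being rounded (3.14159) and the use of `quantize()` are assumptions, since the original cells are not shown.

```python
from decimal import Decimal

x = Decimal('1.1')   # built from strings, so the values are exact
y = Decimal('2.2')

print(x + y)    # sum
print(y / x)    # quotient
print(x ** 3)   # cube

# Built from floats instead: the binary approximation errors are carried over
print(Decimal(1.1) + Decimal(2.2))

# Rounding a fixed decimal to 3 decimal places with quantize()
p = Decimal('3.14159').quantize(Decimal('0.001'))
print(repr(p))
```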

Once fixed decimals are used, we must use mathematical functions provided by the `decimal` module in order to stay within that module (unfortunately, trigonometric functions are not available).

For instance, if `x` is the fixed decimal 0.16, then its square root, natural logarithm, and base-10 logarithm are

```
0.4
-1.832581463748310130367054424
-0.7958800173440752191450444211
```

The same results could be obtained using the `math` module functions:

```
0.4
-1.8325814637483102
-0.7958800173440752
```
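Both sets of results can be reproduced with a sketch like the one below. The value 0.16 is inferred from the outputs (its square root is 0.4), and the variable names are ours.

```python
import math
from decimal import Decimal

x = Decimal('0.16')   # assumed value, inferred from the outputs above

# Decimal methods keep the results as fixed decimals
print(x.sqrt(), x.ln(), x.log10())

# math functions convert the argument to a float first
xf = float(x)
print(math.sqrt(xf), math.log(xf), math.log10(xf))
```

The `Decimal` results carry 28 significant digits by default, whereas the `math` results are ordinary floats.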

##### Exercises

- Evaluate \(\lfloor 10001/4 \rfloor\) and \(\arcsin (\pi/4)\).
- Obtain the value of \(s\) in the following: \(a=\pi(1+\ln 5)\), \(b=\frac{1}{3+\sqrt{4}}\) and \(s=a+b\).
- Obtain a formatted string of \(\sin(\pi^2)\) of width 8, with 5 decimal places.
- Turn the value of \(\sqrt{3}\) into a fixed decimal with 8 decimal places.

#### Lists and Tuples

**Lists** and **tuples** are important objects in `Python` programming. Even though we will be mostly using `numpy` arrays and certain `pandas` objects instead of lists later on, it is useful to learn the basics of lists as some of the concepts are transferable.

##### List Creation

A **list** holds a sequence of objects, which do not all have to be of the same type.

One way to create a list is to enclose the elements, separated by commas, with square brackets.

Let us illustrate this concept with a simple list containing three objects.

We can extract the elements using indices (note that the first element corresponds to index 0, the second to index 1, etc.):

```
3
'a'
5.1
```

The type of each of the elements can be found below

```
<class 'int'>
<class 'str'>
<class 'float'>
```

We can also “multiply” an element and transform it into a longer list:

`['Ho', 'Ho', 'Ho', 'Ho', 'Ho', 'Ho', 'Ho', 'Ho', 'Ho', 'Ho']`

or create a list of integers ranging from \(0\) to \(n-1\), or from \(a\) to \(b-1\):

```
[0, 1, 2, 3, 4]
[3, 4, 5, 6]
```
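The outputs in this subsection are consistent with the sketch below; the list contents are inferred from the displayed elements, and the variable names `ho`, `first`, and `second` are ours.

```python
x = [3, 'a', 5.1]            # three objects of different types

print(x[0], x[1], x[2])      # indexing starts at 0
print([type(e) for e in x])  # type of each element

ho = ['Ho'] * 10             # "multiplying" a one-element list repeats it
first = list(range(5))       # integers from 0 to 4
second = list(range(3, 7))   # integers from 3 to 6
print(ho, first, second)
```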

##### Tuples

Tuples are list-like objects, but with the following differences:

- they are defined with parentheses instead of square brackets (sometimes, the parentheses can be omitted);
- they are **immutable** (once created, they cannot be modified).

For instance, given a tuple `t` with three elements, we can obtain the length of `t` and print its 2nd element using

```
3
a
```

However, we cannot change the value of the third element of `t` or append a new value to `t`; both operations are illegal for tuples and raise an error, although the same commands applied to the list `x` would be legal:

`[3, 'a', 1, 5]`

If we know the dimension of a tuple `t`, we can also use an **extract pattern** to extract the individual components, as the following examples illustrate.

`1 two 3.0`

We could use the `_` (placeholder) to extract solely the second component, say.

`two`
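The tuple behaviour described above can be sketched as follows. The contents of `t` and `x` are assumptions chosen to match the displayed outputs, and the `try`/`except` blocks are only there so that the illegal operations do not stop the script.

```python
t = (3, 'a', 5.1)    # assumed contents
x = [3, 'a', 5.1]

print(len(t))        # number of elements
print(t[1])          # 2nd element (index 1)

# Tuples are immutable: item assignment raises TypeError,
# and tuples have no append() method (AttributeError)
errors = []
try:
    t[2] = 1
except TypeError as e:
    errors.append(type(e).__name__)
try:
    t.append(5)
except AttributeError as e:
    errors.append(type(e).__name__)
print(errors)

# The same operations are legal on a list
x[2] = 1
x.append(5)
print(x)

# Unpacking ("extract pattern"); the underscore is a conventional throwaway name
one, two, three = (1, 'two', 3.0)
_, second, _ = (1, 'two', 3.0)
print(one, two, three, second)
```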

What do you think is happening below?

```
days = [(0,"Sun"), (1, "Mon"), (2, "Tue"), (3, "Wed"), (4, "Thu"), (5, "Fri"), (6, "Sat")]
for n, d in days:
    print(d + " is represented by " + str(n))
```

```
Sun is represented by 0
Mon is represented by 1
Tue is represented by 2
Wed is represented by 3
Thu is represented by 4
Fri is represented by 5
Sat is represented by 6
```

##### List Comprehension

**List comprehension** is a powerful way to create lists, based on set notation. Before we get into the technical details, let us look at some examples.

We start by importing solely the function `sqrt()` from the `math` module (doing so means that we will not require the prefix `math.` in order to invoke `sqrt()`); we also declare an index list `x`:

`[1, 4, 9, 16]`

We can now build new lists from `x`, such as the list of the squares of the elements of `x`:

`[1, 16, 81, 256]`

the list of the square roots of the elements of `x` greater than 4:

`[3.0, 4.0]`

or the list of integers from 0 to 9 (equivalent to `range(10)`):

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]`

The most basic form of list comprehension is `[f(x) for x in l]`, where `l` is a list (or an **iterable**) and `f(x)` is an expression in `x`.

It creates a list obtained by applying `f` to each element of `l`.^{16}

An optional **conditional** (we will discuss those shortly) can also be present, giving the general form `[f(x) for x in l if g(x)]`, for some boolean expression `g` (taking on the values `True` or `False`); list elements are then generated only from the elements of `l` that satisfy the boolean expression.

Multiple lists or iterables can be specified in list comprehension. For example, the following creates a list of all possible tuples `(x,y,z)` such that `x` is `True` or `False`, `y` is from 4 to 6, and `z` is a string equal to either ‘math’ or ‘stat’.

`[(True, 4, 'math'), (True, 4, 'stat'), (True, 5, 'math'), (True, 5, 'stat'), (True, 6, 'math'), (True, 6, 'stat'), (False, 4, 'math'), (False, 4, 'stat'), (False, 5, 'math'), (False, 5, 'stat'), (False, 6, 'math'), (False, 6, 'stat')]`
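The comprehensions discussed in this subsection can be sketched as follows; the original cells are not shown, so the loop variable names and the names `squares`, `big_roots`, `digits`, and `triples` are ours.

```python
from math import sqrt

x = [1, 4, 9, 16]

squares = [v**2 for v in x]                  # squares of the elements
big_roots = [sqrt(v) for v in x if v > 4]    # square roots of elements > 4
digits = [i for i in range(10)]              # same elements as list(range(10))

# Several iterables: every combination (b, n, s), first iterable varying slowest
triples = [(b, n, s)
           for b in [True, False]
           for n in range(4, 7)
           for s in ['math', 'stat']]

print(squares, big_roots, digits, len(triples))
```

When several `for` clauses are present, they behave like nested loops written in the same order, which explains the ordering of the tuples.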

We can mimic list comprehension with the help of **loops** (to be discussed shortly), but this process is typically slower and more verbose. Whenever possible, it is preferable to use the former to generate lists.

##### List Operations

We illustrate various other operations that can be performed on lists in the blocks below; remember that list elements are **zero-indexed** (that is, the first element in the list has index 0):

- sublisting
- changing values
- sorting values
- appending values
- concatenating lists
- deleting elements

Consider a given list `x`:

`[3, 1, 7, 2, 5]`

We can find the length of the list (remember, ordinals start with 0, cardinals with 1):

`5`

or print the sublist from the second element to the fourth element, say:

`[1, 7, 2]`

We could also modify the second element of the list (index 1), say:

`[3, 4, 7, 2, 5]`

Note that `x` is now permanently changed (or at least, until it is modified again). If we want to modify the last entry without knowing the length of the list, for instance, we could use a negative index:

`[3, 4, 7, 2, 6]`

If we are looking to change the third last element as well, we could use

`[3, 4, 1, 2, 6]`

Finally, we could sort the resulting list:

`[1, 2, 3, 4, 6]`

A lot of `Python` methods are applied using the syntax `object.method()`, in contrast to the typical `R` syntax that would use `method(object)`; so it is `x.sort()` instead of `sort(x)`.

Let us create another list, this time with booleans:

`[3, True, False]`

We can append a value, say 5, at the end of this list, as below:

`[3, True, False, 5]`

It is also possible to concatenate lists, using the (somewhat confusing) addition notation:

`[1, 2, 3, 4, 6, 3, True, False, 5]`

and delete the last element of this new list:

`[1, 2, 3, 4, 6, 3, True, False]`

or delete a range of elements, say from the 3rd to the 6th, from the resulting list:

`[1, 2, True, False]`
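The sequence of list operations above can be reproduced with the sketch below. The starting lists match the displayed outputs; the specific statements (and the names `y` and `z`) are assumptions, since the original cells are not shown.

```python
x = [3, 1, 7, 2, 5]
print(len(x))        # length of the list
print(x[1:4])        # sublist: 2nd to 4th elements (indices 1, 2, 3)

x[1] = 4             # modify the second element
x[-1] = 6            # modify the last element
x[-3] = 1            # modify the third-last element
x.sort()             # sort in place
print(x)

y = [3, True, False]
y.append(5)          # append a value at the end
z = x + y            # concatenation creates a new list
print(z)

del z[-1]            # delete the last element
del z[2:6]           # delete the 3rd to 6th elements (indices 2 to 5)
print(z)
```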

##### Exercises

- Create a list of integers from −10 to 5.
- Use list comprehension to create a list of pairs `(x,y)` so that `x+y > 8`, where `x` can be any nonnegative integer at most 10 and `y` can be any positive integer at most 7.
- Use list comprehension to create a list of pairs `(x,y)` so that `y` is the square of `x` and `x` is from 1 to 10.
- Write one line of code that returns a list obtained from

  `x = ['one', 2, 3, 'four', 5, 6, 'seven', 8, 9, 10, 'eleven', 12, 13, 'fourteen']`

  by moving all the elements of type `str` to the end of the list. (Hint: Use list comprehension and concatenation. To check if `a` is of type `str`, use `type(a) is str`. To check if `a` is not of type `str`, use `type(a) is not str`.)

#### Flow Control

We will take a brief look at two ways to alter the flow of control in `Python`: **conditional statements** and **loops**.

##### Conditional Statements

`Python` supports `if-elif-else` statements in various forms.

In the following example, we let `x` be some random integer between 1 and 12 (using function `randint()` from module `random`) and see how the results are affected.

`9`

Let us agree to print the string ‘Hello’ if `x` is less than 5, like so:

Perhaps we want to print ‘Out of range’ if `x` is less than 5 or greater than 9, and ‘Within range’ otherwise?

`Within range`
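The first two conditional examples can be sketched as follows. The original cells are not shown, so the exact statements (and the helper variable `message`) are assumptions; since `x` is random, the printed result varies from run to run.

```python
from random import randint

x = randint(1, 12)   # random integer between 1 and 12, both ends included
print(x)

# A lone if statement: nothing is printed when the condition is False
if x < 5:
    print('Hello')

# if-else with a compound condition
if x < 5 or x > 9:
    message = 'Out of range'
else:
    message = 'Within range'
print(message)
```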

Finally, we might want to print ‘Small’ if `x` is positive and less than 5; otherwise, print ‘Five’ if `x` is 5; otherwise, print ‘Six’ if `x` is 6; otherwise, print `+`:

```
if 0 < x and x < 5:
    print('Small')
elif x == 5:
    print('Five')
elif x == 6:
    print('Six')
else:
    print('+')
```

`+`

Run this sequence of blocks a number of times to see the various outcomes.

**Important:** Note that the code block that follows an `if`, `elif`, or `else` statement must be **properly indented**. The custom is to use four spaces for indentation. The following example illustrates the effects of different indentations.

```
x = 4
if x < 5:
    print('Small')
else:
    print('This string will not be printed, because the else statement never triggers')
    print('Neither will this, for the same reason')
print('This will be printed no matter what x is, as it falls outside the if-else statement block')
```

```
Small
This will be printed no matter what x is, as it falls outside the if-else statement block
```

##### Loops

Loops are useful for repeatedly executing a statement or a block. We first consider the **for loop**.

Let us start with a simple example: for each value in the list `[1,3,8]`, we print its square.

```
1
9
64
```

We could also compute sums with loops, such as 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9:

`45`

Or print the first `n` even nonnegative integers:

```
0
2
4
6
8
```
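The three for loops described above can be sketched as follows; the value `n = 5` is inferred from the five numbers displayed, and the accumulator names are ours.

```python
for v in [1, 3, 8]:
    print(v**2)          # square of each value

total = 0
for i in range(1, 10):   # 1, 2, ..., 9
    total += i
print(total)

n = 5                    # assumed value, matching the five numbers shown
evens = []
for i in range(n):
    evens.append(2 * i)
    print(2 * i)
```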

If a for loop is used to create a list, it is probably best to rewrite it using list comprehension. The following time comparison (using `%%timeit`) illustrates the contrast when building a list of \(100 \times 1000\) items.

Using a loop:

Using list comprehension:
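The `%%timeit` magic is only available in IPython/Jupyter. As a rough stand-in, the comparison can be sketched in plain `Python` with the standard `timeit` module; the function names and the number of repetitions below are our own choices, and the timings depend on the machine, so no figures are quoted here.

```python
import timeit

N = 100 * 1000   # number of items to generate

def with_loop():
    out = []
    for i in range(N):
        out.append(i)
    return out

def with_comprehension():
    return [i for i in range(N)]

# Total time (in seconds) for 20 runs of each approach
t_loop = timeit.timeit(with_loop, number=20)
t_comp = timeit.timeit(with_comprehension, number=20)
print(f'loop: {t_loop:.4f} s, comprehension: {t_comp:.4f} s')
```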

**While loops** are useful for iterating until a certain condition is met. For instance, if we want to print the first 10 even positive integers, separated by a space, we could use the following block:

```
i = 0
while i < 10:  # Repeat the following block as long as i is less than 10
    i += 1  # iterated index
    print(2*i, end=' ')
```

`2 4 6 8 10 12 14 16 18 20 `

Or we could print the 26 lowercase letters of the English alphabet on one line, with no separation:

`abcdefghijklmnopqrstuvwxyz`

Note that `ord` returns the ordinal for a character; `chr` does the reverse.
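One way to produce the alphabet with a while loop is sketched below; the original cell is not shown, so this particular loop (which builds a string rather than printing letter by letter) is an assumption.

```python
letters = ''
code = ord('a')              # code point of 'a'
while code <= ord('z'):      # stop after 'z'
    letters += chr(code)     # convert the code point back to a character
    code += 1
print(letters)
```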

##### Exercises

- Write an if statement that prints `odd` if `x` is odd and prints `even` if `x` is even, where `x` is a random integer between -100 and 100, inclusive. (Hint: `x % n` returns the remainder of `x` divided by `n`.)
- Use a single while loop to print all pairs `(x,y)` such that `x+y=100` and `x` ranges from 0 to 50.

#### Functions

A **function** is a grouped sequence of code that can be called, such as `cos()` and `print()`. A function can have 0 or more **arguments**: `cos()` takes one argument, whereas `print()` accepts any number of objects to print, along with four optional keyword arguments: `sep`, `end`, `file`, and `flush` (see documentation for details).

##### Named Functions

Functions facilitate code re-use. `Python` functions are defined *via* the `def` statement. In the example below, we define a function that returns a pair consisting of the sum and the product of its arguments.

The parentheses around the tuple are optional in this context. The output for \(x=3\) and \(y=4\) can be obtained as below (once the function is defined):

`(7, 12)`

Functions can also have default argument values. In the following example, if the second argument is not supplied, it takes on the value 5.

Compare the results of the two calls below:

```
[2, 3, 4, 5]
[7, 8, 9]
```
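The two named functions discussed above might look like the sketch below. The function names `sum_prod()` and `int_range()` are hypothetical, and the body of the second function is only one definition that is consistent with the two displayed outputs.

```python
def sum_prod(x, y):
    """Return the pair (sum, product) of the two arguments."""
    return x + y, x * y      # parentheses around the tuple are optional

def int_range(a, b=5):
    """Return the integers from a to b inclusive; b defaults to 5."""
    return list(range(a, b + 1))

print(sum_prod(3, 4))
print(int_range(2))          # second argument omitted: b = 5
print(int_range(7, 9))       # second argument supplied
```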

##### Anonymous (Lambda) Functions

Another way to define a function is with a **lambda expression**. This approach is mostly used to define one-line functions.^{17}

Anonymous functions are defined using the one-line notation:

`lambda variables: output`

For instance,

We can apply a bivariate function `func` to arguments `x` and `y`, in a general context, using:

and apply in specific contexts (rule, inputs) as follows:

```
12
27
```

But we do not need to define the function prior to the call. This would also work:

```
12
27
```
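The outputs 12 and 27 are consistent with the sketch below. The helper name `apply()`, the two rules (a product and a power), and their inputs are assumptions, since the original cells are not shown.

```python
def apply(func, x, y):
    """Apply a two-argument function to x and y."""
    return func(x, y)

product = lambda x, y: x * y
power = lambda x, y: x ** y

print(apply(product, 3, 4))
print(apply(power, 3, 3))

# The lambdas can also be written directly in the call
print(apply(lambda x, y: x * y, 3, 4))
print(apply(lambda x, y: x ** y, 3, 3))
```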

##### Exercises

- Write a function `myFunc()` that returns the square of `x` if `x` is of type `int` and returns `None` otherwise (hint: `type(x) is int` is the syntax for testing if `x` is of type `int`).

  Verify that the function behaves as expected:

- Write a function `mySoS()` that accepts a list of floats as the only argument and returns the sum of squares of the numbers (assume that the argument is indeed a list of floats; no need to test if the condition is met).

  Verify that the function behaves as expected:

- What is the result of the following code?

```
def mystery(func, n):
    return [func(i) for i in range(n)]

print(mystery(lambda x: (2*x+1)**2, 5))
```

Rewrite the function using an anonymous function (a single line of code).

#### Strings

**String** (text) manipulation is an important part of data cleaning. Often, the raw data contains string fields that do not quite follow an expected format. For example, proper nouns could be incorrectly capitalized. Dates could have been entered under different conventions. Fortunately, `Python` offers many tools that make string manipulation rather painless. In this section, we look at some of the commonly-performed operations on strings.

Strings can be defined using single or double quotes; note that `Python` supports unicode strings.

`<class 'str'> <class 'str'> <class 'str'>`

We can use the multiplication syntax to define a string made up of identical copies of another string as illustrated below:

```
First stringFirst stringFirst stringFirst stringFirst stringFirst stringFirst stringFirst stringFirst stringFirst string
北京北京北京
```

Strings can be concatenated using the addition syntax:

```
First string北京
北京北京北京First stringSecond string
```

The character in position `i` (the **index**) of the string `a` can be accessed via `a[i]`. Remember that the first character’s index is 0.

Negative indices can also be used: `a[-4]` returns the fourth character from the end, say. For instance, we can print the first, seventh, last, and fourth-last characters of `a` using:

`F s g r`

We can obtain a **substring** of a string `a` using the syntax `a[i:j]`, where `i` specifies the starting index and `j-1` the ending index. Note that `a[:j]` is equivalent to `a[0:j]`, and `a[i:]` is the substring starting at index `i` and reaching until the end of `a`.

```
rs
Fir
string
```
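The string outputs shown so far are consistent with the sketch below. The three strings are inferred from the displayed results (the variable names `b` and `c` are assumptions), as are the slice bounds.

```python
a = 'First string'
b = "Second string"
c = '北京'                      # unicode characters are allowed

print(type(a), type(b), type(c))
print(a * 10)                   # ten copies of a
print(c * 3)
print(a + c)                    # concatenation
print(c * 3 + a + b)

print(a[0], a[6], a[-1], a[-4])   # first, seventh, last, fourth-last characters
print(a[2:4])                     # characters at indices 2 and 3
print(a[:3])                      # first three characters
print(a[6:])                      # from index 6 to the end
```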

For a string `x`, `x.split()` **splits** the string into a list of words separated by a space (by default). Note that a contiguous sequence of space characters including newline (`\n`), carriage return (`\r`), and tab (`\t`) is considered as one space.

We can also specify what separating characters to use for the splitting, instead of spaces. For example, `x.split(',')` splits `x` on commas and `x.split('--')` splits it on `--`.

Consider the examples below:

`['This', 'is', 'a', 'long', 'sentence', 'with', 'weird', 'spaces', 'separating', 'the', 'words.']`

`print('One,two, three ,four'.split(','))` (note that `' three '` is one of the words after separation):

`['One', 'two', ' three ', 'four']`

`['Five', 'six', 'ninety-four']`
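The three splitting results can be reproduced as sketched below. The first and third input strings are hypothetical (only their outputs are shown in the text); the second is the one quoted above.

```python
s = 'This   is a long\tsentence with\n weird  spaces \r separating the   words.'
words = s.split()                            # any run of whitespace is one separator
print(words)

print('One,two, three ,four'.split(','))     # spaces are kept when splitting on commas
print('Five--six--ninety-four'.split('--'))  # single hyphens are left alone
```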

In some cases, it is helpful to remove leading and trailing space characters (**whitespace stripping**).

```
time
time
```

It is common to combine `strip()` with `split(',')`:

`['One', 'two', 'three']`

In fact, the `strip()` method can accept a string consisting of all characters to be stripped from another string, in any combination. For instance, we can strip any leading and trailing characters contained in `['&','#','-','.','!']` from any string as follows:

`Hel#lo!?`
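The stripping examples can be sketched as follows; all input strings here are hypothetical and were chosen to reproduce the displayed outputs.

```python
print('   time   '.strip())
print('\n time \t'.strip())      # newlines and tabs count as whitespace

# Split on commas, then strip each piece
parts = [w.strip() for w in ' One, two ,three '.split(',')]
print(parts)

# Strip any leading/trailing characters among & # - . !
cleaned = '#&-Hel#lo!?.!-'.strip('&#-.!')
print(cleaned)
```

Stripping stops at the first character (from either end) that is not in the given set, which is why the inner `#` and the `!?` survive.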

The methods `upper()`, `lower()`, and `title()` are useful for **altering the case** of characters in a string. The following examples showcase their functionality.

```
GARBAGE COLLECTION
garbage collection
Garbage Collection
```
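A short sketch of the three case methods follows; the mixed-case input string is hypothetical.

```python
s = 'gArbAge cOLLecTIon'
print(s.upper())   # all letters in upper case
print(s.lower())   # all letters in lower case
print(s.title())   # first letter of each word in upper case, the rest in lower case
```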

The following example illustrates a function that takes a phrase and turns it into an acronym by concatenating the first letters of the words and capitalizing all the letters. Does the code make sense?

```
def acronymize(phrase):
    a = ''                      # start with the empty string
    for w in phrase.split():    # iterate through words in the phrase
        a += w[0]               # pick the first letter of each word and concatenate
    return a.upper()            # capitalize and return

acronymize("Be right back"), acronymize("Your mileage might vary")
```

`('BRB', 'YMMV')`

It can also be useful to **convert a string** representing a number to a number type, and vice versa. The following examples illustrate how these tasks can be achieved.

```
number = 12.345
s = str(number)
print( s, type(s))
f = float(s)
print(f, type(f))
i = int('345')
print(i, type(i))
```

```
12.345 <class 'str'>
12.345 <class 'float'>
345 <class 'int'>
```

We can also check if a string `t` is a substring of another string `s` via `t in s` (**pattern matching**).

```
True
False
```

If we want to obtain the index at which a substring begins, we can use the `find()` method. If the substring is not found, -1 is returned.

```
2
-1
```
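The substring tests and searches above can be sketched as follows; the strings being searched are hypothetical, chosen to give the same four results.

```python
s = 'First string'
print('str' in s)       # substring test
print('Str' in s)       # matching is case-sensitive
print(s.find('rs'))     # index where the first occurrence begins
print(s.find('xyz'))    # -1 when the substring is absent
```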

We shall revisit `Python` strings when we discuss Natural Language Processing.

##### Exercises

- Complete the definition of the function `myRep()` with arguments `x`, `y`, and `n` (where `x` and `y` can be assumed to be strings and `n` can be assumed to be a nonnegative integer) that returns the string `x+y` repeated `n` times.

  Verify that the function behaves as expected:

- Complete the definition of the function `posOfi()` which takes an argument `s` and returns a list of indices at which `s` contains the letter ‘i’ (hint: use the `enumerate()` function).

  Verify that the function behaves as expected:

- Complete the following function which takes a string consisting of a paragraph of sentences ending with a period and returns a list of all the sentences, with leading and trailing spaces stripped. You may assume that every period ends a proper sentence and there are no sentences not ending in a period.

  Verify that the function behaves as expected:

- What effect do the methods `upper()`, `lower()`, and `title()` have on non-alphabetical characters?
- Complete the following function which takes a list of full names as argument and returns a list of names that are not properly capitalized. For example, for the argument `['John Doe', 'JANE Kelly', 'nicole dunn', 'David Huang']`, the function returns `['JANE Kelly', 'nicole dunn']`.
- Complete the following function which takes a list `l` of strings as argument and returns a list consisting of the strings in `l` not containing the symbol `-`. For example, given the argument `['Hi', 'Good-bye', 'Ciao', 'Twenty-one']`, the function should return `['Hi', 'Ciao']`.

#### Dictionaries

A **dictionary** is a data structure for **key-value pairs** (`k:v`). To define a dictionary, simply list the key-value pairs enclosed within braces (`{`, `}`), as shown in the following examples.

The simplest dictionary is the one that is empty:

`<class 'dict'>`

A more interesting dictionary could be the one below:

`<class 'dict'>`

We can **access** the value for key `k` in dictionary `d` via `d[k]`. Note that an exception will be raised if `d` does not contain the key `k`.

We can check if a key `k` is in a dictionary `d` via `k in d`.

```
4
False
```

We can **add** a new key-value pair `k:v` to a dictionary `d` via `d[k] = v`.

`{1: (1, 2), 2: 3.45, 'three': 'string'}`

Conversely, we can delete key `k` and its associated value from dictionary `d` via `del d[k]`.

`{1: (1, 2), 'three': 'string'}`

We can also iterate over the keys in a dictionary using a **for loop**.

```
<class 'int'> <class 'tuple'>
<class 'str'> <class 'str'>
```

The following code gives the same output

```
<class 'int'> <class 'tuple'>
<class 'str'> <class 'str'>
```
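The dictionary operations above can be sketched as follows. The dictionaries used here are hypothetical: `d` is chosen to reproduce the displayed contents, while `squares` is an invented example for the lookup and membership test, whose original dictionary is not shown.

```python
empty = {}
print(type(empty))

squares = {1: 1, 2: 4, 3: 9}     # hypothetical dictionary for the lookups
print(squares[2])                # value stored under key 2
print(5 in squares)              # key membership test

d = {1: (1, 2), 2: 3.45}
d['three'] = 'string'            # add a key-value pair
print(d)
del d[2]                         # delete key 2 and its value
print(d)

for k in d:                      # iterating over a dictionary yields its keys
    print(type(k), type(d[k]))

for k, v in d.items():           # items() yields (key, value) pairs
    print(type(k), type(v))
```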

##### Exercises

- Complete the following function which takes a list of pairs as argument and returns a dictionary with the first components as keys and the second components as the corresponding values. For example, given the argument `[(1,'a'),(2,'b')]`, the function returns `{1: 'a', 2: 'b'}`.
- Complete the following function which takes a dictionary as argument and removes all the key-value pairs that do not have values of type `str`. For example, calling the function with the dictionary `{'one': 1, 'two': 'Two', 'three': 3}` will change the dictionary to `{'two': 'Two'}`.

### 1.5.3 `NumPy` and Arrays

`NumPy` is a `Python` module that supports numerical computation on multi-dimensional arrays. It comes with many useful mathematical functions.

It is the backbone of the scientific computing library `SciPy` and of the data analysis and manipulation library `pandas`. Even though it is possible to do basic statistical analysis using a comprehensive statistics package without direct manipulation of `NumPy` arrays, knowledge of `NumPy` is essential for performing custom operations.

In this section, we get a taste of `NumPy` arrays of dimension at most two. What is covered only scratches the surface of this powerful library. A handy cheat sheet can be found here.

It is customary to use the alias `np` when importing the module.

#### Arrays

Unlike lists, `NumPy` arrays cannot contain elements of different types. There are various ways to create such arrays.

We can create a 1D array from a list:

`(4,)`

`shape` is the attribute that gives the array’s dimensions. We can create a 2D array from a list of lists:

`(2, 3)`

If some of the elements are not of the “right” type, they are converted automatically:

`['n' 'u' 'm' '15']`

We can also define a `NumPy` array out of a range using the `arange()` function:

```
array([1, 2, 3, 4])
['n' 'u' 'm' '15']
```

The call `np.arange(1,5)` yields the same result as `np.array([1,2,3,4])`, but it is more efficient, from a computational perspective.

We can also obtain special arrays, composed of zeros, or composed of ones, with the functions `zeros()` and `ones()`. Here is a 3x4 2D array of 0s:

`(3, 4)`

and a 2x1x3 3D array of 1s:

`3`

Note the difference between the `shape` and `ndim` attributes: the former gives the actual dimensions (number of rows, columns, etc.), the latter, the number of dimensions (axes).

We can also define `NumPy` arrays containing random values; for instance, here is a 1D array of 10 random values sampled from the standard normal distribution, using the function `random.normal()`:

```
[-1.10501533 -0.69929125 -0.00882625 1.12738611 0.60354054 1.50509863
1.07440466 -0.86260135 1.12680367 -0.01988042]
```
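The array-creation steps above can be sketched as follows. The contents of `x` and `y` are inferred from the outputs shown in this section; the other variable names are ours, and the random values will differ from run to run.

```python
import numpy as np

x = np.array([1, 2, 3, 4])                # 1D array from a list
y = np.array([[1, 2, 3], [4, 5, 6]])      # 2D array from a list of lists
print(x.shape, y.shape)

mixed = np.array(['n', 'u', 'm', 15])     # the integer is converted to a string
print(mixed)

r = np.arange(1, 5)                       # integers from 1 to 4
zeros = np.zeros((3, 4))                  # 3x4 array of 0s
ones = np.ones((2, 1, 3))                 # 2x1x3 array of 1s
print(r, zeros.shape, ones.ndim)

# 10 draws from the standard normal distribution (mean 0, standard deviation 1)
z = np.random.normal(0, 1, 10)
print(z)
```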

#### Arithmetic

**Adding** and **subtracting** `NumPy` arrays of the same dimensions works as we would expect. Using `x` and `y` as above, and `x2` as below, we get:

`[0 0 0 0]`

`[2 4 6 8]`

```
[[ 2 4 6]
[ 8 10 12]]
```

**Multiplication by a scalar** also works as expected:

`[2 4 6 8]`

However, note that **multiplication** and **division** via `*` and `/` (resp.) are applied component-wise:

`[ -1 -4 -9 -16]`

as is **exponentiation**:

```
[[ 1 8 27]
[ 64 125 216]]
```

**Broadcasting** allows addition and subtraction (as well as other component-wise operations) to be performed between arrays that do not have the same shape. There are rules governing when such operations are valid and what the effects are. Here, we provide two simple examples:

`array([4.5, 5.5, 6.5, 7.5])`

```
array([[0, 1, 2],
[3, 4, 5]])
```

Can you determine what broadcasting does from these examples?
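The arithmetic outputs in this subsection are consistent with the sketch below. The values of `x2` are inferred from the displayed sums and products, and the two broadcasting operands (the scalar 3.5 and the row `[1, 1, 1]`) are assumptions that reproduce the two results shown.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
x2 = np.array([-1, -2, -3, -4])           # assumed values
y = np.array([[1, 2, 3], [4, 5, 6]])

print(x + x2)      # component-wise sum
print(x - x2)      # component-wise difference
print(y + y)
print(2 * x)       # multiplication by a scalar
print(x * x2)      # component-wise product
print(y ** 3)      # component-wise power

# Broadcasting: the scalar (or the shorter array) is stretched to match
print(x + 3.5)                      # 3.5 is added to every entry of x
print(y - np.array([1, 1, 1]))      # the row [1, 1, 1] is subtracted from every row of y
```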

#### Math Functions

`NumPy` contains some useful methods mapping arrays to a scalar.

For instance, `sum` adds up the elements in the array.

`10`

(the same result could have been obtained with `np.sum(x)`).

The usual statistical descriptions are also available as methods:

`1.118033988749895 1.25 2.5`

`NumPy` also has a collection of mathematical functions that can be applied **component-wise**, such as `abs()` and `exp()`:

```
[1.10501533 0.69929125 0.00882625 1.12738611 0.60354054 1.50509863
1.07440466 0.86260135 1.12680367 0.01988042]
```

```
[[ 2.71828183 7.3890561 20.08553692]
[ 54.59815003 148.4131591 403.42879349]]
```

`NumPy` functions are more efficient when it comes to array computations; they should be used whenever possible.
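The reductions and component-wise functions above can be sketched as follows, assuming `x` and `y` are the arrays used earlier in this section and `z` is a fresh random sample (so its values will differ from those displayed).

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([[1, 2, 3], [4, 5, 6]])
z = np.random.normal(0, 1, 10)

print(x.sum(), np.sum(x))            # method and function forms
print(x.std(), x.var(), x.mean())    # population standard deviation, variance, mean

print(np.abs(z))                     # component-wise absolute value
print(np.exp(y))                     # component-wise exponential
```

By default, `std()` and `var()` divide by the number of elements (population versions), which is why the variance of `x` is 1.25 here.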

#### Logic Operations

Operations over arrays of boolean values can also be performed efficiently in `NumPy`.

Let us create a boolean array `bx` of the same shape as `x`, with `bx[i] = True` if and only if `x[i] >= 2.5`, and a boolean array `by` of the same shape as `y`, with `by[i] = True` if and only if `y[i] >= 3.5`.

```
[False False True True]
[[False False False]
[ True True True]]
```

Comparison of two `NumPy` arrays of the same shape results in a boolean array, yet again of the same shape. Note that comparison is performed component-wise:

`[False False True False]`

Comparisons use the symbols `==`, `<`, and `>`:

`[False True False True]`

We can perform **boolean operations** (AND, OR, NEG) on boolean arrays:

AND is computed using `&`:

`array([False, False, True, True])`

OR with `|`:

`array([ True, False, True, True])`

NEG with `~`:

`array([False, True, False, False])`

We can also sum over the values of a boolean array (in this case, `True` is interpreted as 1 and `False` as 0):

`3`
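The logic operations above can be sketched as follows. The arrays `w` and `b2` are hypothetical: the original operands are not shown, so these values were chosen only because they reproduce the displayed results.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([[1, 2, 3], [4, 5, 6]])

bx = x >= 2.5      # True where the entry of x is at least 2.5
by = y >= 3.5
print(bx)
print(by)

w = np.array([4, 1, 3, 2])     # hypothetical array to compare with x
print(x == w)                  # component-wise equality
print(x > w)                   # component-wise "greater than"

b2 = np.array([True, False, True, True])   # hypothetical boolean array
print(bx & b2)     # AND
print(bx | b2)     # OR
print(~b2)         # NEG
print(b2.sum())    # number of True entries
```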

##### Exercises

- Complete the following code so that `sq` is a 1D `numpy` array of the squares of the first 100 positive integers. Use list comprehension.
- Obtain a `NumPy` array from the array `sq` defined above by applying the function \(\sqrt{x}+1\) to each entry `x` in `sq` (hint: use broadcasting and `np.sqrt()`).
- Complete the following definition of `myFunc()` which takes a positive integer argument `n` and a positive real number `d`, generates an array of `n` random values drawn from the standard normal distribution, and returns the number of values whose absolute values are less than or equal to `d`.

  You may assume that `n` is a positive integer and `d` is a nonnegative float when `myFunc()` is called (hint: use `numpy.random.randn()` for generating the random array).

Verify that the function behaves as expected: