This lesson is being piloted (Beta version)

# Making Choices

## Overview

Teaching: 30 min
Exercises: 0 min
Questions
• How can my programs do different things based on data values?

Objectives
• Write conditional statements including `if`, `elif`, and `else` branches.

• Correctly evaluate expressions containing `and` and `or`.

In our last lesson, we analyzed multiple files and showed temperature anomalies over several geographical areas. However, it was not easy to customize our plots and for instance label the various areas accordingly. How can we use Python to automatically recognize the different features we saw, and take a different action for each? In this lesson, we’ll learn how to write code that runs only when certain conditions are true.

## Conditionals

We can ask Python to take different actions, depending on a condition, with an `if` statement:

``````num = 37
if num > 100:
print('greater')
else:
print('not greater')
print('done')
``````
``````not greater
done
``````

The second line of this code uses the keyword `if` to tell Python that we want to make a choice. If the test that follows the `if` statement is true, the body of the `if` (i.e., the lines indented underneath it) are executed. If the test is false, the body of the `else` is executed instead. Only one or the other is ever executed:

Conditional statements don’t have to include an `else`. If there isn’t one, Python simply does nothing if the test is false:

``````num = 53
print('before conditional...')
if num > 100:
print(num,' is greater than 100')
print('...after conditional')
``````
``````before conditional...
...after conditional
``````

We can also chain several tests together using `elif`, which is short for “else if”. The following Python code uses `elif` to print the sign of a number.

``````num = -3

if num > 0:
print(num, 'is positive')
elif num == 0:
print(num, 'is zero')
else:
print(num, 'is negative')
``````
``````-3 is negative
``````

Note that to test for equality we use a double equals sign `==` rather than a single equals sign `=` which is used to assign values.

We can also combine tests using `and` and `or`. `and` is only true if both parts are true:

``````if (1 > 0) and (-1 > 0):
print('both parts are true')
else:
print('at least one part is false')
``````
``````at least one part is false
``````

while `or` is true if at least one part is true:

``````if (1 < 0) or (-1 < 0):
print('at least one test is true')
``````
``````at least one test is true
``````

## `True` and `False`

`True` and `False` are special words in Python called `booleans`, which represent truth values. A statement such as `1 < 0` returns the value `False`, while `-1 < 0` returns the value `True`.

## Checking our Data

Let’s go back to our temperature anomaly datasets and plot the monthly average, min and max for the entire period i.e. December 1978 to February 2019.

``````filenames = sorted(glob.glob('../data/uahncdc.lt-*.csv'))
composite_data = numpy.zeros((120,27))

for f in filenames:
data = numpy.loadtxt(fname = f,  skiprows=1, delimiter=',')
composite_data += data

composite_data/=len(filenames)

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes.set_ylabel('average')
axes.plot(numpy.mean(composite_data, axis=0))

fig.tight_layout()

matplotlib.pyplot.show()
``````
``````---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-e9f9fa30176a> in <module>
6 for f in filenames:
7     data = numpy.loadtxt(fname = f,  skiprows=1, delimiter=',')
----> 8     composite_data += data
9
10 composite_data/=len(filenames)

ValueError: operands could not be broadcast together with shapes (120,27) (13,27) (120,27)
``````

We get an error because the first file does not contain 10 years of data but starts from december 1978 to december 1979 (13 rows instead of 120 rows).

Now that we’ve seen how conditionals work, we can add a `if` statement to skip the first filename:

• ‘uahncdc.lt-01.csv’ contains less observations than all the other files and the reason is that it starts from december 1978 to december 1979
• ‘uahncdc.lt-05.csv’
``````import glob
import numpy
import matplotlib.pyplot
%matplotlib inline

filenames = sorted(glob.glob('data/uahncdc.lt-*.csv'))
composite_data = numpy.zeros((120,27))

nfiles = 0
for f in filenames:
data = numpy.loadtxt(fname = f,  skiprows=1, delimiter=',')
if data.shape == composite_data.shape:
composite_data += data
nfiles = nfiles + 1

composite_data/=nfiles

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes.set_ylabel('average')
axes.plot(numpy.mean(composite_data, axis=0))

fig.tight_layout()

matplotlib.pyplot.show()
``````

We can also print a warning if a file is skipped with an `else` condition:

``````else:
print("Warning: file " , f , "skipped")
``````

We could also use `elif` to test another condition:

``````    elif data.shape[0] > composite_data.shape[0]:
print("Warning: file " , f , " has too many rows so we skip it")
else:
print("Warning: file " , f , "has not enough rows so we skip it")
``````

So we check if the file has more or less rows than expected.

Let’s test that out:

``````import glob
import numpy
import matplotlib.pyplot
%matplotlib inline

filenames = sorted(glob.glob('data/uahncdc.lt-*.csv'))
composite_data = numpy.zeros((120,27))

nfiles = 0
for f in filenames:
data = numpy.loadtxt(fname = f,  skiprows=1, delimiter=',')
if data.shape == composite_data.shape:
composite_data += data
nfiles = nfiles + 1
elif data.shape[0] > composite_data.shape[0]:
print("Warning: file " , f , " has too many rows so we skip it")
else:
print("Warning: file " , f , "has not enough rows so we skip it")

composite_data/=nfiles

fig = matplotlib.pyplot.figure(figsize=(10.0, 3.0))

axes.set_ylabel('average')
axes.plot(numpy.mean(composite_data, axis=0))

fig.tight_layout()

matplotlib.pyplot.show()
``````
``````Warning: file  data/uahncdc.lt-01.csv has not enough rows so we skip it
Warning: file  data/uahncdc.lt-05.csv has not enough rows so we skip it
``````

In this way, we have asked Python to do something different depending on the condition of our data. Here we printed messages in all cases, but we could also imagine not using the `else` catch-all so that messages are only printed when something is wrong, freeing us from having to manually examine every plot for features we’ve seen before.

## How Many Paths?

Consider this code:

``````if 4 > 5:
print('A')
elif 4 == 5:
print('B')
elif 4 < 5:
print('C')
``````

Which of the following would be printed if you were to run this code? Why did you pick this answer?

1. A
2. B
3. C
4. B and C

## Solution

C gets printed because the first two conditions, `4 > 5` and `4 == 5`, are not true, but `4 < 5` is true.

## What Is Truth?

`True` and `False` booleans are not the only values in Python that are true and false. In fact, any value can be used in an `if` or `elif`. After reading and running the code below, explain what the rule is for which values are considered true and which are considered false.

``````if '':
print('empty string is true')
if 'word':
print('word is true')
if []:
print('empty list is true')
if [1, 2, 3]:
print('non-empty list is true')
if 0:
print('zero is true')
if 1:
print('one is true')
``````

## That’s Not Not What I Meant

Sometimes it is useful to check whether some condition is not true. The Boolean operator `not` can do this explicitly. After reading and running the code below, write some `if` statements that use `not` to test the rule that you formulated in the previous challenge.

``````if not '':
print('empty string is not true')
if not 'word':
print('word is not true')
if not not True:
print('not not True is true')
``````

## Close Enough

Write some conditions that print `True` if the variable `a` is within 10% of the variable `b` and `False` otherwise. Compare your implementation with your partner’s: do you get the same answer for all possible pairs of numbers?

## Solution 1

``````a = 5
b = 5.1

if abs(a - b) < 0.1 * abs(b):
print('True')
else:
print('False')
``````

## Solution 2

``````print(abs(a - b) < 0.1 * abs(b))
``````

This works because the Booleans `True` and `False` have string representations which can be printed.

## In-Place Operators

Python (and most other languages in the C family) provides in-place operators that work like this:

``````x = 1  # original value
x += 1 # add one to x, assigning result back to x
x *= 3 # multiply x by 3
print(x)
``````
``````6
``````

Write some code that sums the positive and negative numbers in a list separately, using in-place operators. Do you think the result is more or less readable than writing the same without in-place operators?

## Solution

``````positive_sum = 0
negative_sum = 0
test_list = [3, 4, 6, 1, -1, -5, 0, 7, -8]
for num in test_list:
if num > 0:
positive_sum += num
elif num == 0:
pass
else:
negative_sum += num
print(positive_sum, negative_sum)
``````

Here `pass` means “don’t do anything”. In this particular case, it’s not actually needed, since if `num == 0` neither sum needs to change, but it illustrates the use of `elif` and `pass`.

## Sorting a List Into Buckets

In our `data` folder, large data sets are stored in files whose names start with “inflammation-“ and small data sets – in files whose names start with “small-“. We also have some other files that we do not care about at this point. We’d like to break all these files into three lists called `large_files`, `small_files`, and `other_files`, respectively.

Add code to the template below to do this. Note that the string method `startswith` returns `True` if and only if the string it is called on starts with the string passed as an argument, that is:

``````"String".startswith("Str")
``````
``````True
``````

But

``````"String".startswith("str")
``````
``````False
``````

Use the following Python code as your starting point:

``````files = ['inflammation-01.csv',
'myscript.py',
'inflammation-02.csv',
'small-01.csv',
'small-02.csv']
large_files = []
small_files = []
other_files = []
``````

1. loop over the names of the files
2. figure out which group each filename belongs
3. append the filename to that list

In the end the three lists should be:

``````large_files = ['inflammation-01.csv', 'inflammation-02.csv']
small_files = ['small-01.csv', 'small-02.csv']
other_files = ['myscript.py']
``````

## Solution

``````for file in files:
if file.startswith('inflammation-'):
large_files.append(file)
elif file.startswith('small-'):
small_files.append(file)
else:
other_files.append(file)

print('large_files:', large_files)
print('small_files:', small_files)
print('other_files:', other_files)
``````

## Counting Vowels

1. Write a loop that counts the number of vowels in a character string.
2. Test it on a few individual words and full sentences.
3. Once you are done, compare your solution to your neighbor’s. Did you make the same decisions about how to handle the letter ‘y’ (which some people think is a vowel, and some do not)?

## Solution

``````vowels = 'aeiouAEIOU'
sentence = 'Mary had a little lamb.'
count = 0
for char in sentence:
if char in vowels:
count += 1

print("The number of vowels in this string is " + str(count))
``````

## Key Points

• Use `if condition` to start a conditional statement, `elif condition` to provide additional tests, and `else` to provide a default.

• The bodies of the branches of conditional statements must be indented.

• Use `==` to test for equality.

• `X and Y` is only true if both `X` and `Y` are true.

• `X or Y` is true if either `X` or `Y`, or both, are true.

• Zero, the empty string, and the empty list are considered false; all other numbers, strings, and lists are considered true.

• `True` and `False` represent truth values.