As data-processing problems have become more complex, computer scientists have devel- oped data structures to help solve them. A data structure combines several data values into a unit so they can be treated as one thing. The data elements within a data structure are usually organized in a special way that allows the programmer to access and manipulate them. As you saw in Chapter 4, a string is a data structure that organizes text as a sequence of characters. In this chapter, we explore the use of two other common data structures: the list and the dictionary. A list allows the programmer to manipulate a sequence of data val- ues of any types. A dictionary organizes data values by association with other data values rather than by sequential position.
Lists and dictionaries provide powerful ways to organize data in useful and interesting applications. In addition to exploring the use of lists and dictionaries, this chapter also introduces the definition of simple functions. These functions help to organize program instructions, in much the same manner as data structures help to organize data.
Lists
A list is a sequence of data values called items or elements. An item can be of any type.
Here are some real-world examples of lists:
• A shopping list for the grocery store
• A to-do list
• A roster for an athletic team
• A guest list for a wedding
• A recipe, which is a list of instructions
• A text document, which is a list of lines
• The names in a phone book
The logical structure of a list resembles the structure of a string. Each of the items in a list is ordered by position. Like a character in a string, each item in a list has a unique index that specifies its position. The index of the first item is 0, and the index of the last item is the length of the list minus 1. As sequences, lists and strings share many of the same operators, but they include different sets of methods. We now examine these in detail.
List Literals and Basic Operators
As you have seen, literal string values are written as sequences of characters enclosed in quote marks. In Python, a list literal is written as a sequence of data values separated by commas. The entire sequence is enclosed in square brackets ([ and ]). Here are some example list literals:
[1951, 1969, 1984] # A list of integers ["apples", "oranges", "cherries"] # A list of strings [] # An empty list
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
136
You can also use other lists as elements in a list, thereby creating a list of lists. Here is one example of such a list:
[[5, 9], [541, 78]]
It is interesting that when the Python interpreter evaluates a list literal, each of the elements is evaluated as well. When an element is a number or a string, that literal is included in the resulting list. However, when the element is a variable or any other expression, its value is included in the list, as shown in the following session:
>>> import math
>>> x = 2
>>> [x, math.sqrt(x)]
[2, 1.4142135623730951]
>>> [x + 1]
[3]
Thus, you can think of the [] delimiters as a kind of function, like print, which evaluates its arguments before using their values.
You can also build lists of integers using the range and list functions introduced in Chapter 3. The next session shows the construction of two lists and their assignment to variables:
>>> first = [1, 2, 3, 4]
>>> second = list(range(1, 5))
>>> first [1, 2, 3, 4]
>>> second [1, 2, 3, 4]
The list function can build a list from any iterable sequence of elements, such as a string:
>>> third = list("Hi there!")
>>> third
['H', 'i', ' ' , 't', 'h', 'e', 'r', 'e', '!']
The function len and the subscript operator [] work just as they do for strings:
>>> len(first) 4
>>> first[0]
1
>>> first[2:4]
[3, 4]
Concatenation (+) and equality (==) also work as expected for lists:
>>> first + [5, 6]
[1, 2, 3, 4, 5, 6]
>>> first == second True
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
137 Lists
The print function strips the quotation marks from a string, but it does not alter the look of a list:
>>> print("1234") 1234
>>> print([1, 2, 3, 4]) [1, 2, 3, 4]
To print the contents of a list without the brackets and commas, you can use a for loop, as follows:
>>> for number in [1, 2, 3, 4]:
print(number, end = " ") 1 2 3 4
Finally, you can use the in operator to detect the presence or absence of a given element:
>>> 3 in [1, 2, 3]
True
>>> 0 in [1, 2, 3]
False
Table 5-1 summarizes these operators and functions, where L refers to a list.
Operator or Function What It Does
L[<an integer expression>] Subscript used to access an element at the given index position.
L[<start>:<end>] Slices for a sublist. Returns a new list.
L1 + L2 List concatenation. Returns a new list consisting of
the elements of the two operands.
print(L) Prints the literal representation of the list.
len(L) Returns the number of elements in the list.
list(range(<upper>)) Returns a list containing the integers in the range 0 through upper - 1.
==, !=, <, >, <=, >= Compares the elements at the corresponding posi- tions in the operand lists. Returns True if all the results are true, or False otherwise.
for <variable> in L: <statement> Iterates through the list, binding the variable to each element.
<any value> in L Returns True if the value is in the list or False otherwise.
table 5-1 Some operators and functions used with lists
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
138
Replacing an Element in a List
The examples discussed thus far might lead you to think that a list behaves exactly like a string. However, there is one huge difference. Because a string is immutable, its structure and contents cannot be changed. But a list is changeable—that is, it is mutable. At any point in a list’s lifetime, elements can be inserted, removed, or replaced. The list itself maintains its identity but its internal state—its length and its contents—can change.
The subscript operator is used to replace an element at a given position, as shown in the next session:
>>> example = [1, 2, 3, 4]
>>> example [1, 2, 3, 4]
>>> example[3] = 0
>>> example [1, 2, 3, 0]
Note that the subscript operation refers to the target of the assignment statement, which is not the list but an element’s position within it. Much of list processing involves replacing each element with the result of applying some operation to that element. We now present two examples of how this is done.
The first session shows how to replace each number in a list with its square:
>>> numbers = [2, 3, 4, 5]
>>> numbers [2, 3, 4, 5]
>>> for index in range(len(numbers)):
numbers[index] = numbers[index] ** 2
>>> numbers [4, 9, 16, 25]
Note that the code uses a for loop over an index rather than a for loop over the list ele- ments, because the index is needed to access the positions for the replacements. The next session uses the string method split to extract a list of the words in a sentence. These words are then converted to uppercase letters within the list:
>>> sentence = "This example has five words."
>>> words = sentence.split()
>>> words
['This', 'example', 'has', 'five', 'words.']
>>> for index in range(len(words)):
words[index] = words[index].upper()
>>> words
['THIS', 'EXAMPLE', 'HAS', 'FIVE', 'WORDS.']
List Methods for Inserting and Removing Elements
The list type includes several methods for inserting and removing elements. These meth- ods are summarized in Table 5-2, where L refers to a list. To learn more about each method, enter Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203help(list.<method name>) in a Python shell.
139 Lists
The method insert expects an integer index and the new element as arguments. When the index is less than the length of the list, this method places the new element before the existing element at that index, after shifting elements to the right by one position. At the end of the operation, the new element occupies the given index position. When the index is greater than or equal to the length of the list, the new element is added to the end of the list. The next session shows insert in action:
>>> example = [1, 2]
>>> example [1, 2]
>>> example.insert(1, 10)
>>> example [1, 10, 2]
>>> example.insert(3, 25)
>>> example [1, 10, 2, 25]
The method append is a simplified version of insert. The method append expects just the new element as an argument and adds the new element to the end of the list. The method
extend performs a similar operation, but adds the elements of its list argument to the end of the list. The next session shows the differences between append, extend, and the + operator
>>> example = [1, 2]
>>> example [1, 2]
>>> example.append(3)
>>> example [1, 2, 3]
>>> example.extend([11, 12, 13])
>>> example
[1, 2, 3, 11, 12, 13]
>>> example + [14, 15]
[1, 2, 3, 11, 12, 13, 14, 15]
>>> example
[1, 2, 3, 11, 12, 13]
List Method What It Does
L.append(element) Adds element to the end of L.
L.extend(aList) Adds the elements of aList to the end of L.
L.insert(index, element) Inserts element at index if index is less than the length of L. Otherwise, inserts element at the end of L.
L.pop() Removes and returns the element at the end of L. L.pop(index) Removes and returns the element at index. table 5-2 List methods for inserting and removing elements
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
140
Note that the + operator builds and returns a brand new list containing the elements of the two operands, whereas append and extend modify the list object on which the methods are called.
The method pop is used to remove an element at a given position. If the position is not specified, pop removes and returns the last element. If the position is specified, pop removes the element at that position and returns it. In that case, the elements that followed the removed element are shifted one position to the left. The next session removes the last and first elements from the example list:
>>> example
[1, 2, 10, 11, 12, 13]
>>> example.pop() # Remove the last element 13
>>> example
[1, 2, 10, 11, 12]
>>> example.pop(0) # Remove the first element 1
>>> example [2, 10, 11, 12]
Note that the method pop and the subscript operator expect the index argument to be within the range of positions currently in the list. If that is not the case, Python raises an exception.
Searching a List
After elements have been added to a list, a program can search for a given element. The in operator determines an element’s presence or absence, but programmers often are more interested in the position of an element if it is found (for replacement, removal, or other use). Unfortunately, the list type does not include the convenient find method that is used with strings. Recall that find returns either the index of the given substring in a string or –1 if the substring is not found. Instead of find, you must use the method index to locate an element’s position in a list. It is unfortunate that index raises an exception when the target element is not found. To guard against this unpleasant consequence, you must first use the in operator to test for presence and then the index method if this test returns
True. The next code segment shows how this is done for an example list and target element:
aList = [34, 45, 67]
target = 45
if target in aList:
print(aList.index(target)) else:
print(-l)
Sorting a List
Although a list’s elements are always ordered by position, it is possible to impose a natural ordering on them as well. In other words, you can arrange some elements in numeric or alphabetical order. A list of numbers in ascending order and a list of names in alphabetical Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
141 Lists
order are sorted lists. When the elements can be related by comparing them for less than and greater than as well as equality, they can be sorted. The list method sort mutates a list by arranging its elements in ascending order. Here is an example of its use:
>>> example = [4, 2, 10, 8]
>>> example [4, 2, 10, 8]
>>> example.sort()
>>> example [2, 4, 8, 10]
Mutator Methods and the Value None
The functions and methods examined in previous chapters return a value that the caller can then use to complete its work. Mutable objects (such as lists) have some methods devoted entirely to modifying the internal state of the object. Such methods are called mutators. Exam- ples are the list methods insert, append, extend, pop, and sort. Because a change of state is all that is desired, a mutator method usually returns no value of interest to the caller (but note that
pop is an exception to this rule). Python nevertheless automatically returns the special value None even when a method does not explicitly return a value. We mention this now only as a warning against the following type of error. Suppose you forget that sort mutates a list, and instead you mistakenly think that it builds and returns a new, sorted list and leaves the original list unsorted.
Then you might write code like the following to obtain what you think is the desired result:
>>> aList = aList.sort()
Unfortunately, after the list object is sorted, this assignment has the result of setting the variable aList to the value None. The next print statement shows that the reference to the list object is lost:
>>> print(aList) None
Later in this book, you will learn how to make something useful out of None.
Aliasing and Side Effects
As you learned earlier, numbers and strings are immutable. That is, you cannot change their internal structure. However, because lists are mutable, you can replace, insert, or remove elements. The mutable property of lists leads to some interesting phenomena, as shown in the following session:
>>> first = [10, 20, 30]
>>> second = first
>>> first [10, 20, 30]
>>> second [10, 20, 30]
>>> first[1] = 99
>>> first
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
142
[10, 99, 30]
>>> second [10, 99, 30]
In this example, a single list object is created and modified using the subscript operator.
When the second element of the list named first is replaced, the second element of the list named second is replaced also. This type of change is what is known as a side effect. This happens because after the assignment second = first, the variables first and
second refer to the exact same list object. They are aliases for the same object, as shown in Figure 5-1. This phenomenon is known as aliasing.
Figure 5-1 Two variables refer to the same list object f irst
second 10 99 30
0 1 2
Figure 5-2 Two variables refer to different list objects
f irst 10 99 30
third 10 99 30
0 1 2
0 1 2
If the data are immutable strings, aliasing can save on memory. But as you might imagine, aliasing is not always a good thing when side effects are possible. Assignment creates an alias to the same object rather than a reference to a copy of the object. To prevent aliasing, you can create a new object and copy the contents of the original to it, as shown in the next session:
>>> third = []
>>> for element in first:
third.append(element)
>>> first [10, 99, 30]
>>> third [10, 99, 30]
>>> first[l] = 100
>>> first [10, 100, 30]
>>> third [10, 99, 30]
The variables first and third refer to two different list objects, although their contents are initially the same, as shown in Figure 5-2. The important point is that they are not aliases, so you don’t have to be concerned about side effects.
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
143 Lists
A simpler way to copy a list is to pass the source list to a call of the list function, as follows:
>>> third = list(first)
Equality: Object Identity and Structural Equivalence
Occasionally, programmers need to see whether two variables refer to the exact same object or to different objects. For example, you might want to determine whether one variable is an alias for another. The == operator returns True if the variables are aliases for the same object. Unfor- tunately, == also returns True if the contents of two different objects are the same. The first rela- tion is called object identity, whereas the second relation is called structural equivalence. The
== operator has no way of distinguishing between these two types of relations.
Python’s is operator can be used to test for object identity. It returns True if the two oper- ands refer to the exact same object, and it returns False if the operands refer to distinct objects (even if they are structurally equivalent). The next session shows the difference between == and is, and Figure 5-3 depicts the objects in question.
>>> first = [20, 30, 40]
>>> second = first
>>> third = list(first) # Or first[:]
>>> first == second True
>>> first == third True
>>> first is second True
>>> first is third False
Figure 5-3 Three variables and two distinct list objects 20 30 40
third 20 30 40
f irst second
0 1 2
0 1 2
Example: Using a List to Find the Median of a Set of Numbers
Researchers who do quantitative analysis are often interested in the median of a set of num- bers. For example, the U.S. government often gathers data to determine the median family income. Roughly speaking, the median is the value that is less than half the numbers in the set and greater than the other half. If the number of values in a list is odd, the median of the list is the value at the midpoint when the set of numbers is sorted; otherwise, the median is the average of the two values surrounding the midpoint. Thus, the median of the list
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
144
[1, 3, 3, 5, 7] is 3, and the median of the list [1, 2, 4, 4] is also 3. The following script inputs a set of numbers from a text file and prints their median:
"""
File: median.py
Prints the median of a set of numbers in a file.
"""
fileName = input("Enter the filename: ") f = open(fileName, 'r')
# Input the text, convert it to numbers, and
# add the numbers to a list numbers = []
for line in f:
words = line.split() for word in words:
numbers.append(float(word))
# Sort the list and print the number at its midpoint numbers.sort()
midpoint = len(numbers) // 2 print("The median is", end = " ") if len(numbers) % 2 == 1:
print(numbers[midpoint]) else:
print((numbers[midpoint] + numbers[midpoint - 1]) / 2)
Note that the input process is the most complex part of this script. An accumulator list,
numbers, is set to the empty list. The for loop reads each line of text and extracts a list of words from that line. The nested for loop traverses this list to convert each word to a number. The list method append then adds each number to the end of numbers, the accumulator list. The remaining lines of code locate the median value. When run with an input file whose contents are
3 2 7 8 2 1 5
the script produces the following output:
The median is 3.0
Tuples
A tuple is a type of sequence that resembles a list, except that, unlike a list, a tuple is immu- table. You indicate a tuple literal in Python by enclosing its elements in parentheses instead of square brackets. The next session shows how to create several tuples:
>>> fruits = ("apple", "banana")
>>> fruits
('apple', 'banana')
Copyright 2019 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203