O'Reilly, Learning Python, 2nd Edition, Dec 2003, ISBN 0596002815


6.4 Dictionaries in Action

As Table 6-2 suggests, dictionaries are indexed by key, and nested dictionary entries are referenced by a series of indexes (keys in square brackets). When Python creates a dictionary, it stores its items in any order it chooses; to fetch a value back, supply the key that it is associated with. Let's go back to the interpreter to get a feel for some of the dictionary operations in Table 6-2.

6.4.1 Basic Dictionary Operations

In normal operation, you create dictionaries and store and access items by key:

Here, the dictionary is assigned to variable d2; the value of the key 'spam' is the integer 2. We use the same square bracket syntax to index dictionaries by key as we did to index lists by offsets, but here it means access by key, not position.
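A minimal sketch of the kind of session described here, written as a short script in current Python 3 syntax rather than as a 2.x interpreter transcript (that choice is an assumption on our part). The 'ham' and 'eggs' entries are taken from the values that appear in later listings of this section.

```python
# Make a dictionary with a literal, then fetch a value by key.
d2 = {'spam': 2, 'ham': 1, 'eggs': 3}

spam_count = d2['spam']   # square brackets mean "access by key" here, not by position
print(spam_count)
print(d2)                 # do not rely on the order in which the keys are displayed
```

Indexing with a key that is not stored, such as d2['toast'], raises a KeyError instead of returning a value.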

Notice the end of this example: the order of keys in a dictionary will almost always be different from what you originally typed. This is on purpose: to implement fast key lookup (a.k.a. hashing), Python stores keys in whatever order suits its lookup scheme.

The built-in len function works on dictionaries too; it returns the number of items stored away in the dictionary, or equivalently, the length of its keys list. The dictionary has_key method allows you to test for key existence, and the keys method returns all the keys in the dictionary, collected in a list. The latter of these can be useful for processing dictionaries sequentially, but you shouldn't depend on the order of the keys list. Because the keys result is a normal list, however, it can always be sorted if order matters:

>>> len(d2) # Number of entries in dictionary

Notice the third expression in this listing: the in membership test used for strings and lists also works on dictionaries. It checks if a key is stored in the dictionary, like the has_key method call of the prior line. Technically, this works because dictionaries define iterators that step through their keys lists. Other types provide iterators that reflect their common uses; files, for example, have iterators that read line by line. There is more on iterators in Chapter 14 and Chapter 21.
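The operations just described can be sketched as follows. This is written in Python 3 syntax, where has_key no longer exists and the in test is the only spelling; wrapping keys() in list() is an assumption that makes the result a real list in newer Pythons as well.

```python
d2 = {'spam': 2, 'ham': 1, 'eggs': 3}

count = len(d2)            # number of entries in the dictionary
has_ham = 'ham' in d2      # key membership test
has_toast = 'toast' in d2  # False: no such key is stored

keys = list(d2.keys())     # collect the keys in a list
keys.sort()                # sort in place when order matters

print(count, has_ham, has_toast, keys)
```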

The last entry in Table 6-2 is another way to build dictionaries: passing a list of tuples to the new dict call (really, a type constructor). We'll return to it in Chapter 10 when we explore the zip function, which makes it possible to construct a dictionary from key and value lists in a single call.
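As a brief preview of that technique, here is a hedged sketch; the key and value lists are illustrative, and list() around zip is only there so the pairs can be printed in Python 3.

```python
keys = ['spam', 'ham', 'eggs']
values = [2, 1, 3]

pairs = list(zip(keys, values))   # pair up items by position: ('spam', 2), ...
d = dict(pairs)                   # dict accepts a list of (key, value) tuples
same = dict(zip(keys, values))    # the same thing in a single call

print(pairs)
print(d == same)
```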

Dictionaries are mutable, so you can change, expand, and shrink them in-place without making new dictionaries, just like lists. Simply assign a value to a key to change or create the entry. The del statement works here too; it deletes the entry associated with the key specified as an index. Notice the nesting of a list inside a dictionary in this example (the value of key "ham"); all collection data types in Python can nest inside each other arbitrarily:

>>> d2['ham'] = ['grill', 'bake', 'fry']      # Change entry.
>>> d2
{'eggs': 3, 'spam': 2, 'ham': ['grill', 'bake', 'fry']}

>>> del d2['eggs']                            # Delete entry.
>>> d2
{'spam': 2, 'ham': ['grill', 'bake', 'fry']}

>>> d2['brunch'] = 'Bacon'                    # Add new entry.
>>> d2
{'brunch': 'Bacon', 'spam': 2, 'ham': ['grill', 'bake', 'fry']}

As with lists, assigning to an existing index in a dictionary changes its associated value. Unlike lists, whenever you assign a new dictionary key (one that hasn't been assigned before), you create a new entry in the dictionary, as was done in the previous example for key 'brunch'. This doesn't work for lists, because Python considers an offset out of bounds if it's beyond the end of a list, and raises an error. To expand a list, you need to use such tools as the append method or slice assignment instead.
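The contrast between growing a list and growing a dictionary can be sketched like this (Python 3 syntax; the names L and D and their contents are illustrative):

```python
L = ['a', 'b']
D = {'a': 1}

try:
    L[5] = 'z'               # offset beyond the end of the list: rejected
    list_error = None
except IndexError as exc:
    list_error = exc

L.append('c')                # grow a list with append ...
L[len(L):] = ['d', 'e']      # ... or with slice assignment at the end

D['b'] = 2                   # assigning a new key creates a new entry
D['a'] = 10                  # assigning an existing key changes its value

print(L, D, list_error)
```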

6.4.3 More Dictionary Methods

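The dictionary values and items methods return lists of the dictionary's values and (key, value) pair tuples, respectively.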

>>> d2.values( ), d2.items( )

([3, 1, 2], [('eggs', 3), ('ham', 1), ('spam', 2)])

Such lists are useful in loops that need to step through dictionary entries one by one. Fetching a nonexistent key is normally an error, but the get method returns a default value (None, or a passed-in default) if the key doesn't exist.

>>> d2.get('spam'), d2.get('toast'), d2.get('toast', 88)
(2, None, 88)
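One common use of these two ideas together is tallying: get supplies a default for keys not yet seen, and items feeds a loop over the results. The sketch below uses an illustrative word list and Python 3 syntax; the sorted call is only there to make the output order predictable.

```python
words = ['spam', 'eggs', 'spam', 'ham']        # illustrative input

counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1     # 0 is the default for unseen keys

lines = []
for key, value in sorted(counts.items()):      # items yields (key, value) pairs
    lines.append('%s=%d' % (key, value))

print(', '.join(lines))
```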

The update method provides something similar to concatenation for dictionaries; it merges the keys and values of one dictionary into another, blindly overwriting values of the same key:
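A small sketch of that merge, in Python 3 syntax; the second dictionary d3 and its contents are hypothetical:

```python
d2 = {'spam': 2, 'ham': 1, 'eggs': 3}
d3 = {'toast': 4, 'ham': 5}      # shares the key 'ham' with d2

d2.update(d3)                    # merge d3 into d2, in place

print(d2)                        # 'ham' now carries d3's value
print(d3)                        # the argument dictionary is left unchanged
```

Note that update changes its subject in place and returns None, so there is no result to assign.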

Dictionaries also provide a copy method; more on this method in the next chapter. In fact, dictionaries come with more methods than those listed in Table 6-2; see the Python library manual or other documentation sources for a comprehensive list.

6.4.4 A Languages Table

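The following example builds a table that maps programming language names (the keys) to their creators (the values). You fetch a creator name by indexing on language name: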

>>> table = {'Python': 'Guido van Rossum',
...          'Perl':   'Larry Wall',
...          'Tcl':    'John Ousterhout'}
...
>>> for lang in table.keys( ):
...     print lang, '\t', table[lang]
...
Tcl     John Ousterhout
Python  Guido van Rossum
Perl    Larry Wall

The last command uses a for loop, which we haven't covered yet. If you aren't familiar with for loops, this command simply iterates through each key in the table and prints a tab-separated list of keys and their values. See Chapter 10 for more on for loops.

Because dictionaries aren't sequences, you can't iterate over them directly with a for statement, in the way you can with strings and lists. But if you need to step through the items in a dictionary it's easy: calling the dictionary keys method returns a list of all stored keys you can iterate through with a for. If needed, you can index from key to value inside the for loop as done in this code.
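The keys-list approach described in this paragraph can be sketched as a script (Python 3 syntax, with print as a function; sorting the keys is optional and is done here only to make the output order predictable):

```python
table = {'Python': 'Guido van Rossum',
         'Perl': 'Larry Wall',
         'Tcl': 'John Ousterhout'}

langs = list(table.keys())    # a list of all stored keys
langs.sort()                  # impose an order on the keys if one is needed

rows = []
for lang in langs:
    rows.append(lang + '\t' + table[lang])   # index from key to value

print('\n'.join(rows))
```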

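Recent Python releases also let a for loop step through a dictionary's keys without calling the keys method; this is another use of the iterators mentioned earlier, which allow the in membership test to work on dictionaries as well.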

6.4.5 Dictionary Usage Notes

Here are a few additional details you should be aware of when using dictionaries:

Sequence operations don't work. Dictionaries are mappings, not sequences; because there's no notion of ordering among their items, things like concatenation (an ordered joining) and slicing (extracting a contiguous section) simply don't apply. In fact, Python raises an error when your code runs if you try to do such things.

Assigning to new indexes adds entries. Keys can be created either when you write a dictionary literal (in which case they are embedded in the literal itself), or when you assign values to new keys of an existing dictionary object. The end result is the same.

Keys need not always be strings. Our examples used strings as keys, but any other immutable objects (not lists) work just as well. In fact, you could use integers as keys, which makes a dictionary look much like a list (when indexing, at least). Tuples are sometimes used as dictionary keys too, allowing for compound key values. And class instance objects (discussed in Part VI) can be used as keys too, as long as they have the proper protocol methods; roughly, they need to tell Python that their values won't change, or else they would be useless as fixed keys.
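The key-type rules in the last note can be illustrated with a short sketch. It uses Python 3 syntax, and the Point class is a hypothetical example of an instance that supplies the protocol methods (__hash__ and __eq__) mentioned above.

```python
D = {}
D[1] = 'integer key'            # looks like list indexing
D[(2, 3)] = 'tuple key'         # a compound key
D['name'] = 'string key'

try:
    D[[2, 3]] = 'list key'      # lists are mutable, so they are rejected as keys
    list_key_error = None
except TypeError as exc:
    list_key_error = exc


class Point:
    """Hypothetical class whose instances act as fixed dictionary keys."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __hash__(self):
        return hash((self.x, self.y))

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


D[Point(1, 2)] = 'instance key'
found = D[Point(1, 2)]          # an equal instance finds the same entry
print(found, list_key_error)
```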


When you use lists, it is illegal to assign to an offset that is off the end of the list:

>>> L = [ ]
>>> L[99] = 'spam'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list assignment index out of range

Although you could use repetition to pre-allocate as big a list as you'll need (e.g., [0]*100), you can also do something that looks similar with dictionaries, which does not require such space allocations. By using integer keys, dictionaries can emulate lists that seem to grow on offset assignment:

>>> D = { }
>>> D[99] = 'spam'

Here, D is a dictionary with a single entry; the value of key 99 is the string 'spam'. You're able to access this structure with offsets much like a list, but you don't have to allocate space for all the positions you might ever need to assign values to in the future.
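To make the space trade-off concrete, here is a hedged sketch that puts the two approaches side by side (Python 3 syntax; the second entry and the probe of offset 50 are illustrative):

```python
D = {}
D[99] = 'spam'            # no need to pre-allocate 100 slots
D[3] = 'eggs'             # illustrative second entry

stored = len(D)           # only the assigned "offsets" take up space
value = D[99]
empty = D.get(50)         # unassigned offsets are simply absent (None here)

padded = [0] * 100        # the list alternative: allocate every slot up front
padded[99] = 'spam'

print(stored, value, empty, len(padded))
```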

6.4.5.2 Using dictionaries for sparse data structures

In a similar way, dictionary keys are also commonly leveraged to implement sparse data structures: for example, multidimensional arrays in which only a few positions hold values:

>>> Matrix = { }

Here, we use a dictionary to represent a three-dimensional array that is empty except for two positions, (2,3,4) and (7,8,8). The keys are tuples that record the coordinates of nonempty slots. Rather than allocating a large and mostly empty three-dimensional matrix, we can use a simple two-item dictionary. In this scheme, accessing an empty slot triggers a nonexistent-key exception, because these slots are not physically stored:

>>> Matrix[(2,3,6)]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: (2, 3, 6)
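A self-contained sketch of the sparse matrix scheme follows, in Python 3 syntax. The two coordinates come from the text above; the stored values 88 and 99 are illustrative assumptions.

```python
Matrix = {}
Matrix[(2, 3, 4)] = 88        # only nonempty slots are stored
Matrix[(7, 8, 8)] = 99

x, y, z = 2, 3, 4
hit = Matrix[(x, y, z)]       # a tuple built from variables works as a key

try:
    Matrix[(2, 3, 6)]         # an empty slot: no such key
    miss = None
except KeyError as exc:
    miss = exc

print(hit, len(Matrix), repr(miss))
```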

If we want to fill in a default value instead of getting an error message here, there are at least three ways we can handle such cases. We can either test for keys ahead of time in if statements, use the try statement to catch and recover from the exception explicitly, or simply use the dictionary get method shown earlier to provide a default for keys that do not exist:

>>> if Matrix.has_key((2,3,6)):     # Check for key before fetch.
...     print Matrix[(2,3,6)]
... else:
...     print 0
...
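The three approaches named above can be compared side by side in one script. This sketch uses Python 3 syntax (in replaces has_key), and the matrix contents are illustrative.

```python
Matrix = {(2, 3, 4): 88, (7, 8, 8): 99}   # illustrative sparse matrix
key = (2, 3, 6)                           # an empty slot

# 1. Test for the key before fetching.
if key in Matrix:
    by_test = Matrix[key]
else:
    by_test = 0

# 2. Try to index, then catch and recover from the exception.
try:
    by_try = Matrix[key]
except KeyError:
    by_try = 0

# 3. Let get supply the default.
by_get = Matrix.get(key, 0)

print(by_test, by_try, by_get)
```

All three give the same answer; get is the most compact when the default is a simple value.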


>>> try:

As you can see, dictionaries can play many roles in Python. In general, they can replace search data structures (since indexing by key is a search operation), and represent many types of structured information. For example, dictionaries are one of many ways to describe the properties of an item in your program's domain; they can serve the same role as "records" or "structs" in other languages:
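A minimal sketch of a dictionary used as a record, in Python 3 syntax. The variable name rec and the 'job' field are illustrative assumptions; the name and age values echo the dict call shown later in this section.

```python
rec = {}                          # start with an empty record
rec['name'] = 'mel'               # each assignment adds a field
rec['age'] = 41
rec['job'] = 'trainer/writer'     # illustrative field

summary = '%s is %d' % (rec['name'], rec['age'])
print(summary)
```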

This example fills out the dictionary by assigning to new keys over time. Especially when nested, Python's built-in data types allow us to easily represent structured information:

>>> mel = {'name': 'Mark',
...        'jobs': ['trainer', 'writer'],
...        'web':  'www.rmi.net/~lutz',
...        'home': {'state': 'CO', 'zip':80501}}

This example uses a dictionary to capture object properties again, but has coded it all at once (rather than assigning to each key separately), and has nested a list and a dictionary to represent structure property values. To fetch components of nested objects, simply string together indexing operations:

>>> mel['name']
'Mark'
>>> mel['jobs']
['trainer', 'writer']
>>> mel['jobs'][1]
'writer'
>>> mel['home']['zip']
80501

Finally, note that more ways to build dictionaries may emerge over time. In Python 2.3, for example, the calls dict(name='mel', age=41) and dict([('name','bob'), ('age',30)]) also build two-key dictionaries. See Chapter 10,

Why You Will Care: Dictionary Interfaces

Besides being a convenient way to store information by key in your programs, some Python extensions also present interfaces that look and work the same as dictionaries. For instance, Python's interface to dbm access-by-key files looks much like a dictionary that must be opened; strings are stored and fetched using key indexes:

import anydbm
file = anydbm.open("filename")     # Link to file.
file['key'] = 'data'               # Store data by key.
data = file['key']                 # Fetch data by key.

Later, you'll see that we can store entire Python objects this way too, if we replace anydbm in the above with shelve (shelves are access-by-key databases of persistent Python objects). For Internet work, Python's CGI script support also presents a dictionary-like interface; a call to cgi.FieldStorage yields a dictionary-like object, with one entry per input field on the client's web page:

import cgi
form = cgi.FieldStorage( )         # Parse form data.
if form.has_key('name'):
    showReply('Hello, ' + form['name'].value)
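The shelve variant mentioned in this sidebar can be sketched as follows. This is a hedged illustration in Python 3 syntax; the temporary directory, the file name demo_shelf, and the stored record are all assumptions made so the example can run anywhere without leaving files behind in the working directory.

```python
import os
import shelve
import tempfile

# A throwaway location for the shelf file (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), 'demo_shelf')

db = shelve.open(path)                         # link to the file
db['mel'] = {'name': 'Mark',                   # store a whole object by key
             'jobs': ['trainer', 'writer']}
db.close()

db = shelve.open(path)                         # reopen later
record = db['mel']                             # fetch the object by key
keys = list(db.keys())
db.close()

print(record['jobs'][1], keys)
```

Unlike a plain dbm file, which stores only strings, the shelf stores the nested dictionary itself; the keys must still be strings.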


6.3 Dictionaries

Besides lists, dictionaries are perhaps the most flexible built-in data type in Python. If you think of lists as ordered collections of objects, dictionaries are unordered collections; their chief distinction is that items are stored and fetched in dictionaries by key, instead of positional offset.

Being a built-in type, dictionaries can replace many of the searching algorithms and data structures you might have to implement manually in lower-level languages: indexing a dictionary is a very fast search operation. Dictionaries also sometimes do the work of records and symbol tables used in other languages, can represent sparse (mostly empty) data structures, and much more. In terms of their main properties, dictionaries are:

Accessed by key, not offset

Dictionaries are sometimes called associative arrays or hashes. They associate a set of values with keys, so that you can fetch an item out of a dictionary using the key that stores it. You use the same indexing operation to get components in a dictionary, but the index takes the form of a key, not a relative offset.

Unordered collections of arbitrary objects


Variable length, heterogeneous, arbitrarily nestable

Like lists, dictionaries can grow and shrink in place (without making a copy), they can contain objects of any type, and support nesting to any depth (they can contain lists, other dictionaries, and so on).

Of the category mutable mapping

Dictionaries can be changed in place by assigning to indexes, but don't support the sequence operations that work on strings and lists. Because dictionaries are unordered collections, operations that depend on a fixed order (e.g., concatenation, slicing) don't make sense. Instead, dictionaries are the only built-in representative of the mapping type category: objects that map keys to values.

Tables of object references (hash tables)

If lists are arrays of object references, dictionaries are unordered tables of object references. Internally, dictionaries are implemented as hash tables (data structures that support very fast retrieval), which start small and grow on demand. Moreover, Python employs optimized hashing algorithms to find keys, so retrieval is very fast. Dictionaries store object references (not copies), just like lists.

Table 6-2 summarizes some of the most common dictionary operations (see the library manual for a complete list).

An empty dictionary is an empty set of braces, and dictionaries can be nested by writing one as a value inside another dictionary, or within a list or tuple.

[4] The same note about the relative rarity of literals applies here: dictionaries are often built up by assigning to new keys at runtime, rather than writing literals. But see the following section on changing dictionaries; lists and dictionaries are grown in different ways. Assignment to new keys works for dictionaries, but fails for lists (lists are grown with append instead).

Table 6-2. Common dictionary literals and operations

Operation                              Interpretation

D1 = { }                               Empty dictionary

D2 = {'spam': 2, 'eggs': 3}            Two-item dictionary

D3 = {'food': {'ham': 1, 'egg': 2}}    Nesting

D2['eggs']                             Indexing by key
D3['food']['ham']

D2.has_key('eggs')                     Methods: membership test, keys list,
'eggs' in D2                           values list, copies, defaults,
D2.keys( )                             merge, etc.
D2.values( )
D2.copy( )
D2.get(key, default)
D2.update(D1)

len(D1)                                Length (number stored entries)

D2[key] = 42                           Adding/changing, deleting
del D2[key]


Chapter 14. Advanced Function Topics


Chapter 21. Class Coding Details

Did all of Chapter 20 make sense? If not, don't worry; now that we've had a quick tour, we're going to dig a bit deeper and study the concepts we've introduced in further detail. This

Chapter 10. while and for Loops

In this chapter, we meet Python's two main looping constructs: statements that repeat an action over and over. The first of these, the while loop, provides a way to code general loops; the second, the for statement, is designed for stepping through the items in a sequence object and running a block of code for each item.

There are other kinds of looping operations in Python, but the two statements covered here are the primary syntax provided for coding repeated actions. We'll also study a few unusual


Part VI: Classes and OOP

In Part VI, we study the basics of object-oriented programming (OOP), as well as the code you write to use OOP in Python: the class statement. As you'll see, OOP is an option in Python, but a good one: no other construct in the language supports code reuse to the degree that the class statement does. Especially in larger programs, OOP's notion of programming by customizing is a powerful paradigm to apply, and can cut development time

Chapter 13. Scopes and Arguments

Chapter 12 looked at basic function definition and calls. As


Chapter 27. Common Tasks in Python

At this point in the book, you have been exposed to a fairly complete survey of the more formal aspects of the language (the syntax, the data types, etc.). In this chapter, we'll "step out of the classroom" by looking at a set of basic computing tasks and examining how Python programmers typically solve them, hopefully helping you ground the theoretical knowledge with concrete results.

Python programmers don't like to reinvent wheels when they already have access to nice, round wheels in their garage. Thus, the most important content in this chapter is the description of selected tools that make up the Python standard library: built-in functions, library modules, and their most useful functions and classes. While you most likely won't use all of these in any one program, no useful program avoids all of these. Just as Python provides a list object type because sequence manipulations occur in all programming contexts, the library provides a set of modules that will come in handy over and over again. Before designing and writing any piece of generally useful code, check to see if a similar module already exists. If it's part of the standard Python library, you can be assured that it's been heavily tested; even better, others are committed to fixing any remaining bugs, for free.

of the standard Python toolset. Three other O'Reilly books provide excellent additional information: the Python Pocket Reference, written by Mark Lutz, which covers the most important modules in the standard library, along with the syntax and built-in functions in compact form; Fredrik Lundh's Python Standard Library, which takes on the formidable task of both providing additional documentation for each module in the standard library as well as providing an example program showing how to use each module; and finally, Alex Martelli's Python in a Nutshell, which provides a thorough yet eminently readable and concise description of the language and standard library. As we'll see in Section 27.1, Python comes with tools that make self-learning easy as well.

Just as we can't cover every standard module, the set of tasks covered in this chapter is necessarily limited. If you want more, check out the Python Cookbook (O'Reilly), edited by David Ascher and Alex Martelli. This Cookbook covers many of the same problem domains we touch on here but in much greater depth and with much more discussion. That book, leveraging the collective knowledge of the Python community, provides a much broader and richer survey of Pythonic approaches to common tasks.

This chapter limits itself to tools available as part of standard Python distributions. The next two chapters expand the scope to third party modules and libraries, since many such modules can be just as valuable to the Python programmer.

This chapter starts by covering common tasks which apply to fundamental programming concepts (types, data structures, strings), moving on to conceptually higher-level topics like files and directories, Internet-related operations and process

Part IV: Functions

In Part IV, we study the Python function: a package of code that can be called repeatedly, with different inputs and outputs each time. We've already been calling functions earlier in the book: open, to make a file object, for instance. Here, the emphasis will be on coding

Chapter 20. Class Coding Basics

Now that we've talked about OOP in the abstract, let's move on to the details of how this translates to actual code. In this chapter and in Chapter 21, we fill in the syntax details behind the class model in Python.

If you've never been exposed to OOP in the past, classes can be somewhat complicated if taken in a single dose. To make class coding easier to absorb, we'll begin our detailed look at OOP by taking a first look at classes in action in this chapter. We'll expand on the details introduced here in later chapters of this part of the book; but in their basic form, Python classes are easy to understand.

Classes have three primary distinctions. At a base level, they are mostly just namespaces, much like the modules studied in


Chapter 12. Function Basics

In Part III, we looked at basic procedural statements in Python. Here, we'll move on to explore a set of additional statements that create functions of our own. In simple terms, a function is a device that groups a set of statements, so they can be run more than once in a program. Functions also let us specify parameters that serve as function inputs, and may differ each time a function's code is run. Table 12-1 summarizes the primary function-related tools we'll study in this part of the book.

Table 12-1. Function-related statements and expressions

Statement             Examples

Calls                 myfunc("spam", ham, "toast")

def, return, yield    def adder(a, b=1, *c): return a+b+c[0]

global                def function( ): global x; x = 'new'
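To show how the def row of the table behaves when called, here is a hedged sketch in Python 3 syntax. It is a variant of the table's adder, not the same function: it sums every extra argument instead of reading only c[0], so that it also works when no extras are passed.

```python
def adder(a, b=1, *c):
    """Add a and b (b defaults to 1), plus any extra positional arguments."""
    total = a + b
    for extra in c:          # c is a tuple of the arguments beyond a and b
        total += extra
    return total

one_arg = adder(2)           # b takes its default value
two_args = adder(2, 3)       # b is passed explicitly
many_args = adder(2, 3, 4, 5)

print(one_arg, two_args, many_args)
```

The same grouped statements run three times with different inputs, which is the point of packaging them as a function.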


27.1 Exploring on Your Own

Before digging into specific tasks, we should say a brief word about self-exploration. We have not been exhaustive in coverage of object attributes or module contents in order to focus on the most important aspects of the objects under discussion. If you're curious about what we've left out, you can look it up in the Library Reference, or you can poke around in the Python interactive interpreter, as shown in this section. The dir built-in function returns a list of all of the attributes of an object, and, along with the type built-in, provides a great way to learn about the objects you're manipulating. For example:

>>> dir([ ])   # What are the attributes of lists?
['__add__', '__class__', '__contains__', '__delattr__', '__delitem__',
'__delslice__', '__doc__', '__eq__', '__ge__', '__getattribute__',
'__getitem__', '__getslice__', '__gt__', '__hash__', '__iadd__', '__imul__',
'__init__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__',
'__reduce__', '__repr__', '__rmul__', '__setattr__', '__setitem__',
'__setslice__', '__str__', 'append', 'count', 'extend', 'index', 'insert',
'pop', 'remove', 'reverse', 'sort']

What this tells you is that the empty list object has a few methods: append, count, extend, index, insert, pop, remove, reverse, sort, and a lot of "special methods" that start with an underscore (_) or two (__). These are used under the hood by Python when performing operations like +. Since these special methods are not needed very often, we'll write a simple utility function that will not display them:

>>> def mydir(obj):
...     orig_dir = dir(obj)
...     return [item for item in orig_dir if not item.startswith('_')]
...

Using this new function on the same empty list yields:

>>> mydir([ ]) # What are the attributes of lists?

['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

You can then explore any Python object:

>>> mydir(( ))   # What are the attributes of tuples?
[ ]              # Note: no "normal" attributes

>>> import sys
>>> mydir(sys.stdin)   # What are the attributes of files?
['close', 'closed', 'fileno', 'flush', 'isatty', 'mode', 'name', 'read',
'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate',
'write', 'writelines', 'xreadlines']

>>> mydir(sys) # Modules are objects too.

['argv', 'builtin_module_names', 'byteorder', 'copyright', 'displayhook',

'dllhandle', 'exc_info', 'exc_type', 'excepthook', 'exec_prefix', 'executable', 'exit', 'getdefaultencoding', 'getrecursionlimit', 'getrefcount', 'hexversion', 'last_traceback', 'last_type', 'last_value', 'maxint', 'maxunicode', 'modules', 'path', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout', 'version',

'version_info', 'warnoptions', 'winver']

>>> type(sys.version)   # What kind of thing is 'version'?
<type 'str'>

>>> print repr(sys.version)   # What is the value of this string?
'2.3a1 (#38, Dec 31 2002, 17:53:59) [MSC v.1200 32 bit (Intel)]'
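The same exploration style works on dictionaries. The sketch below, in Python 3 syntax, combines mydir with getattr to list each public attribute next to the first line of its documentation string; summarize is a hypothetical helper name introduced here for illustration.

```python
def mydir(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


def summarize(obj):
    """Hypothetical helper: map each public attribute name to the first line of its docstring."""
    summary = {}
    for name in mydir(obj):
        doc = getattr(getattr(obj, name), '__doc__', None) or ''
        summary[name] = doc.strip().split('\n')[0]
    return summary


dict_summary = summarize({})          # explore the dictionary type itself
for name in sorted(dict_summary):
    print(name, '-', dict_summary[name])
```

The exact names and docstrings printed depend on the Python version in use.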

Recent versions of Python also contain a built-in that is very helpful to beginners, named (appropriately enough) help:

DESCRIPTION

This module provides access to some objects used or maintained by the interpreter and to functions that interact strongly with the interpreter.

Dynamic objects:

argv--command line arguments; argv[0] is the script pathname if known
path--module search path; path[0] is the script directory, else ''
modules--dictionary of loaded modules

displayhook--called to show results in an interactive session
excepthook--called to handle any uncaught exception other than SystemExit
  To customize printing in an interactive session or to install a custom
  top-level exception handler, assign other functions to replace these.

...

There is quite a lot to the online help system. We recommend that you start it first in its "modal" state, just by typing help( ). From then on, any string you type will yield its documentation. Type quit to leave the help mode.

>>> help( )


DESCRIPTION

This module provides socket operations and some related functions. On Unix, it supports IP (Internet Protocol) and Unix domain sockets. On other systems, it only supports IP. Functions specific for a socket are available as methods of the socket object.

Functions:

Here is a list of the Python keywords. Enter any keyword to get more help.

and elif global or

Here is a list of available topics. Enter any topic name to get more help.

ASSERTION            DEBUGGING            LITERALS         SEQUENCEMETHODS1
ASSIGNMENT           DELETION             LOOPING          SEQUENCEMETHODS2
ATTRIBUTEMETHODS     DICTIONARIES         MAPPINGMETHODS   SEQUENCES
ATTRIBUTES           DICTIONARYLITERALS   MAPPINGS         SHIFTING
AUGMENTEDASSIGNMENT  ELLIPSIS             METHODS          SLICINGS
BINARY               EXPRESSIONS          NONE             SPECIALMETHODS
BITWISE              FILES                NUMBERMETHODS    STRINGMETHODS
BOOLEAN              FLOAT                NUMBERS          STRINGS
CALLABLEMETHODS      FORMATTING           OBJECTS          SUBSCRIPTS
CALLS                FRAMEOBJECTS         OPERATORS        TRACEBACKS
CLASSES              FRAMES               PACKAGES         TRUTHVALUE
CODEOBJECTS          FUNCTIONS            POWER            TUPLELITERALS
COERCIONS            IDENTIFIERS          PRECEDENCE       TUPLES
COMPARISON           IMPORTING            PRINTING         TYPEOBJECTS
COMPLEX              INTEGER              PRIVATENAMES     TYPES
CONDITIONAL          LISTLITERALS         RETURNING        UNARY
CONVERSIONS          LISTS                SCOPING          UNICODE

help> TYPES

3.2 The standard type hierarchy

Below is a list of the types that are built into Python. Extension modules written in C can define additional types. Future versions of Python may add types to the type hierarchy (e.g., rational numbers, efficiently stored arrays of integers, etc.).

...


Part V: Modules


Part III: Statements and Syntax

In Part III, we study Python's procedural statement set: statements that select from alternative actions, repeat operations, print objects, and so on. Since this is our first formal look at statements, we will also explore Python's general syntax model. As we'll see, Python has a familiar and simple syntax model, though we often type much less in Python statements than in some other languages.

We'll also meet the boolean expressions in conjunction with conditional statements and loops, and learn about Python documentation schemes while studying the syntax of documentation strings and comments. At an abstract level, the statements we'll meet here are used to create and process the objects in Part II. By the end of this part, you will be able to code and run substantial Python


Part I: Getting Started

Part I begins by exploring some of the ideas behind


Part II: Types and Operations

In Part II, we study Python's built-in core data types, sometimes called object types. Although there are more kinds of objects in Python than we will meet in this part, the types discussed here are generally considered the core data types: the main subjects of almost every Python script you're likely to read or write.

This part of the book is organized around major core types, but watch for related topics, such as dynamic