Figure 6.9: Two drawings of a planar network. (a) A small planar network with four nodes and six edges. It is self-evident that the network is planar, since in this depiction it has no edges that cross. (b) The same network redrawn with two of its edges crossing.
Even though the edges cross, the network is still planar: a network is planar if it can be drawn without crossing edges. How you actually draw it is up to you.
data structures, there is no obvious two-dimensional surface onto which the network falls but it is planar nonetheless.
Among non-tree networks, some are planar for physical reasons. A good example is a road network. Because roads are confined to the Earth’s surface they form a roughly planar network. It does happen sometimes that roads meet without intersecting, one passing over another on a bridge, so that in fact, if one wishes to be precise, the road network is not planar. However, such instances are rare and the network is planar to a good approximation.
Another example is the network of which countries, states, or provinces are adjacent to which others (see Fig. 6.10). We can take a map depicting any set of contiguous geographic regions, represent each by a node, and draw an edge between any two that share a border. It is easy to see that the resulting network can always be drawn without crossing edges provided the regions in question are formed of contiguous landmasses.¹²
Networks of this type, representing regions on a map, have played an important role in mathematics, in the proof of the four-color theorem, which
¹² Technically, the map of the lower 48 US states in Fig. 6.10 does not quite satisfy this latter condition, since the state of Michigan is formed of two landmasses. (Several other states include offshore islands, but these are mostly too small to figure on our map.) We could get around this by having two nodes for the Upper and Lower Peninsulas of Michigan, though in that case we would no longer have exactly one node per state. In Fig. 6.10 we use just one node for Michigan, situated in the Lower Peninsula, but we do include an edge between Michigan and Wisconsin, which would not be present were it not for the Upper Peninsula, which shares a border with Wisconsin.
6.9 | Planar networks
Figure 6.10: Graph of the adjacencies of the lower 48 United States. In this network each of the lower 48 states in the US is represented as a node and there is an edge between any two nodes if the corresponding states share a border. The resulting graph is planar, and indeed any set of states, countries, or other regions on a two-dimensional map can be turned into a planar graph in this way.
states that it is possible to color any set of regions on a two-dimensional map, real or imagined, with at most four colors such that no two adjacent regions have the same color, no matter how many regions there are or of what size or shape.¹³ By constructing the network corresponding to the map in question, this problem can be converted into a problem of coloring the nodes of a planar network in such a way that no two nodes connected by an edge have the same color. The number of colors required to color a network in this way is called the chromatic number of the network and many mathematical results are known about chromatic numbers. The proof of the four-color theorem (the proof that the chromatic number of a planar network is always four or less) is one of the triumphs of traditional graph theory and was first given by Appel and Haken [24–26] in 1976 after more than a hundred years of valiant effort within the mathematics community.¹⁴
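To make the node-coloring formulation concrete, here is a minimal Python sketch. It is an illustration only and is not the method used in the proof of the four-color theorem: it applies a simple greedy heuristic, which always produces a valid coloring but is not guaranteed to use the minimum number of colors. The function names and the example network (four nodes, all six possible edges, as in Fig. 6.9) are our own choices for illustration.

```python
from itertools import combinations


def greedy_coloring(n, edges):
    """Give each node the smallest color not already used by a colored neighbor.

    This is a heuristic: the result is a valid coloring, but the number of
    colors used may exceed the chromatic number for some node orderings.
    """
    neighbors = {i: set() for i in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    color = {}
    for node in range(n):
        used = {color[nb] for nb in neighbors[node] if nb in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color


def is_proper(edges, color):
    """True if no edge joins two nodes of the same color."""
    return all(color[u] != color[v] for u, v in edges)


# Illustrative network: four nodes with every pair connected (six edges).
k4_edges = list(combinations(range(4), 2))
coloring = greedy_coloring(4, k4_edges)
num_colors = len(set(coloring.values()))
print(coloring, num_colors, is_proper(k4_edges, coloring))
```

In this example every node is adjacent to every other, so four distinct colors are unavoidable, consistent with the bound of four for planar networks.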
An important question that arises in graph theory is how to determine, given
¹³ The theorem only applies for a map on a surface with topological genus zero, such as a flat plane or a sphere. A map on a torus (which has genus 1) can require as many as seven colors.
¹⁴ Appel and Haken’s proof was controversial at the time of its publication because it made extensive use of a computer to check large numbers of special cases. On the one hand, the proof was revolutionary for being the first proof of a major mathematical result generated in this fashion.
On the other hand, a number of people questioned whether it could really be considered a proof at all, given that it was far too large for a human being to check its correctness by hand.
a particular network, whether that network is planar. For a small network it is a straightforward matter to draw a picture and play around with the positions of the nodes to see if one can find an arrangement in which no edges cross, but for a large network this is impractical and a more general method is needed.
Technically, Kuratowski’s theorem says that a non-planar network contains an expansion of K5 or UG. An expansion is a network with any number (including zero) of extra nodes added along its edges, as in the expansion of K5 shown here.
One such method makes use of Kuratowski’s theorem, which states that every non-planar network must contain, somewhere within it, at least one of two distinctive smaller networks or subgraphs, called K5 and UG, both of which are themselves non-planar. It immediately follows that a network is planar if, and only if, it contains neither of these subgraphs.
This approach is not, however, particularly useful for the analysis of real-world networks, because such networks are rarely precisely planar. (And if they are, then, as in the case of the shared border network of countries or states, it is usually clear for other reasons that they are planar and hence Kuratowski’s theorem is unnecessary.) More often, like the road network, they are very nearly planar, but have a few edge crossings somewhere in the network. For such a network, Kuratowski’s theorem would tell us, correctly, that the network was not planar, but we would be missing the point. What we would really like is some measure of the degree of planarity of a network, a measure that could tell us, for example, that the road network is 99% planar, even though there are a few bridges or tunnels here and there. One possible such measure is the minimum number of edge crossings with which the network can be drawn.
This, however, would be a difficult quantity to calculate since, at least in the simplest approach, its evaluation would require us to try every possible way of drawing the network, of which there are an impossibly large number for all but the smallest of networks. Perhaps another approach would be to look at the number of occurrences of K5 or UG in the network. So far, however, no widely accepted metric for degree of planarity has emerged. If such a measure were to gain currency it might well find occasional use in the study of real-world networks.
6.10 Degree
The degree of a node in an undirected network is the number of edges connected to it (see Fig. 6.11). In a social network of friendships between individuals, for instance, a person’s degree is the number of friends they have. Note, however, that the definition of degree is in terms of number of edges, not number of neighboring nodes. The difference is important in multigraphs: if a node has two parallel edges to the same neighbor, both contribute to the degree (see Fig. 6.11b).
Despite its simplicity, degree is one of the most useful and most widely used of
Figure 6.11: Degree of a node. (a) The central node has degree five because it has five attached edges. (b) The central node has five neighbors, but its degree is eight, because it has eight attached edges.
network concepts. It will play a large role in many of the developments in this book. Throughout the book we will denote the degree of node i by $k_i$. For a network of n nodes the degree can be written in terms of the adjacency matrix as¹⁵

$$k_i = \sum_{j=1}^{n} A_{ij}. \tag{6.12}$$
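As a quick check of Eq. (6.12), the following Python sketch computes degrees as row sums of a small adjacency matrix. The four-node matrix is a made-up example, chosen to include a double edge and a self-edge (stored as a diagonal element of 2, following the convention mentioned in the footnote), so that degree and number of neighbors differ.

```python
def degrees(A):
    """Degree of each node as the row sum of the adjacency matrix, Eq. (6.12)."""
    return [sum(row) for row in A]


def neighbor_counts(A):
    """Number of distinct neighboring nodes, ignoring multiplicity and self-edges."""
    return [
        sum(1 for j, a in enumerate(row) if a != 0 and j != i)
        for i, row in enumerate(A)
    ]


# Hypothetical example: a double edge between nodes 0 and 1, single edges
# 1-2 and 2-3, and a self-edge at node 3 (diagonal element 2).
A = [
    [0, 2, 0, 0],
    [2, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 2],
]

k = degrees(A)
nbrs = neighbor_counts(A)
print("degrees:  ", k)
print("neighbors:", nbrs)
```

Node 0 has a single neighbor but degree two, because both parallel edges count. Node 3 has one neighbor but degree three, because the self-edge contributes two edge ends.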
Every edge in an undirected network has two ends and if there are m edges in total then there are 2m ends of edges. But the number of ends of edges is also equal to the sum of the degrees of all the nodes, so

$$2m = \sum_{i=1}^{n} k_i = \sum_{ij} A_{ij}, \tag{6.13}$$
a result that we will use many times throughout this book.
The mean degree c of a node in an undirected network is

$$c = \frac{1}{n} \sum_{i=1}^{n} k_i, \tag{6.14}$$
¹⁵ Note that this expression gives the correct result even if there are multiedges in the network, so long as the adjacency matrix is defined as in Section 6.2. It also works if there are self-edges, provided each self-edge is represented by a diagonal element $A_{ii} = 2$, as discussed in Section 6.2, and not 1.
and combining this with Eq. (6.13) we get

$$c = \frac{2m}{n}. \tag{6.15}$$
This relation too will come up repeatedly throughout the book.
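The relations (6.13) to (6.15) are easy to verify numerically. The sketch below does so for a small made-up network: a four-node cycle with one extra chord. The function names are ours, and the code assumes a simple undirected network stored as a symmetric list-of-lists adjacency matrix.

```python
def edge_count(A):
    """Number of edges m, using 2m = sum over all elements of A, Eq. (6.13)."""
    return sum(sum(row) for row in A) // 2


def mean_degree(A):
    """Mean degree c = (1/n) * sum of degrees, Eq. (6.14)."""
    n = len(A)
    return sum(sum(row) for row in A) / n


# Hypothetical example: cycle 0-1-2-3-0 plus the chord 0-2 (five edges).
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]

n = len(A)
m = edge_count(A)
degree_sum = sum(sum(row) for row in A)
c = mean_degree(A)
print(f"m = {m}, sum of degrees = {degree_sum}, c = {c}, 2m/n = {2 * m / n}")
```

Here the degrees are 3, 2, 3, 2, which sum to 10 = 2m, and the mean degree is 2.5 = 2m/n.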
Occasionally we will come across networks in which all nodes have the same degree. In graph theory, such networks are called regular graphs or regular networks. A regular network in which all nodes have degree k is sometimes called k-regular. An example of a regular network is a periodic lattice such as a square or triangular lattice. On the square lattice, for instance, every node has degree four.
An infinite square lattice is an example of a 4-regular network.

6.10.1 Density and sparsity
The maximum possible number of edges in a simple network (i.e., one with no multiedges or self-edges) is $\binom{n}{2} = \tfrac{1}{2} n(n-1)$. The connectance or density ρ of a network is the fraction of those edges that are actually present:

$$\rho = \frac{m}{\binom{n}{2}} = \frac{2m}{n(n-1)} = \frac{c}{n-1}, \tag{6.16}$$
where we have made use of Eq. (6.15) in the last equality. Most of the networks we are interested in are sufficiently large that (6.16) can be safely approximated as

$$\rho = \frac{c}{n}. \tag{6.17}$$
The density lies strictly in the range 0 ≤ ρ ≤ 1. It can be thought of as the probability that a pair of nodes, picked uniformly at random from the whole network, is connected by an edge. This probability plays an important role in the random graph model discussed in Chapter 11.
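The following Python sketch illustrates Eq. (6.16) and the probability interpretation on a small made-up network. It computes the density from the edge count and compares it with the fraction of node pairs that are directly connected, found by brute-force enumeration. The function names and the example edge list are assumptions for illustration; the code assumes a simple undirected network.

```python
from itertools import combinations


def density(n, edges):
    """Density rho = 2m / (n(n-1)) of a simple undirected network, Eq. (6.16)."""
    m = len(edges)
    return 2 * m / (n * (n - 1))


def connected_pair_fraction(n, edges):
    """Fraction of all distinct node pairs that are joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    pairs = list(combinations(range(n), 2))
    connected = sum(1 for p in pairs if frozenset(p) in edge_set)
    return connected / len(pairs)


# Hypothetical example: cycle 0-1-2-3-0 plus the chord 0-2.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

rho = density(n, edges)
frac = connected_pair_fraction(n, edges)
c = 2 * len(edges) / n  # mean degree, Eq. (6.15)

print(f"rho = {rho:.4f}, connected fraction = {frac:.4f}")
print(f"c/(n-1) = {c / (n - 1):.4f}, c/n = {c / n:.4f}")
```

For this example the exact expressions agree (five of the six possible pairs are connected), while the large-n approximation c/n is noticeably off. That is expected, since Eq. (6.17) is only intended for networks with many nodes.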
Now consider a sequence of networks of increasing size n. If the density ρ remains non-zero as n becomes large the networks are said to be dense. In a dense network the fraction of non-zero elements in the adjacency matrix is non-vanishing in the limit of large n. A network where ρ → 0 in the limit of large n is said to be sparse, and the fraction of non-zero elements in the adjacency matrix tends to zero.
These definitions only apply if you can actually take the limit n → ∞, or at least extrapolate the limiting behavior from a sequence of networks of different sizes. When we are working with theoretical models of networks, as we will in later chapters of the book, we can take the limit formally and state whether a network is sparse or dense, but in practical situations involving observed
networks we cannot do this. We cannot take the limit as an empirical metabolic network or food web becomes large—we are stuck with the network nature gives us. For such networks there is no formal sense in which they are either sparse or dense.
Informally, on the other hand, one does often hear a network described, for example, as being sparse. Usually this just means that the value of ρ is small.
In this qualitative sense, “sparse” just means that most of the possible edges that could exist in the network are not present.
In some cases real-world networks do change their sizes and by making measurements for different sizes we can make a guess as to whether they are best regarded as sparse or dense. The Internet and the World Wide Web are two examples of networks whose growth over time allows us to say with some conviction that they are best regarded as sparse.
In fact, most of the networks we examine in this book are usually considered to be sparse. There are very few examples where a network can truly be said to be dense, either in the mathematical sense above or in the more informal sense of just having a lot of edges.¹⁶ For our purposes, particularly when we come to study model networks, a more important distinction than that between sparse and dense is the distinction between networks with constant and diverging average degree.
Equation (6.17) tells us that the average degree c of a network is related to the density by c = ρn, so in a dense network, where ρ is constant as n → ∞, the average degree grows linearly with n, while for sparse networks the average degree grows sublinearly. And for some networks the average does not grow at all, meaning that ρ goes as 1/n for large n and c remains constant.
Such networks will play an important role in the developments of this book.
There seems to be no universally accepted name for them, although they are occasionally called extremely sparse [71].
Friendship networks, for example, plausibly have constant average degree, since it seems unlikely that the number of a person’s friends will be determined by the population of the world as a whole. How many friends a person has is more a function of how much time they have to devote to the maintenance of friendships, which is presumably independent of world population.
Friendship networks therefore can be regarded as “extremely sparse.”
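To see how the three regimes differ, the sketch below tabulates density and mean degree for a few network sizes. All numbers are made up for illustration: a dense sequence with density fixed at 0.1, an extremely sparse sequence with mean degree fixed at 5, and an intermediate sequence whose mean degree grows as the natural logarithm of n. The code uses the exact relation c = ρ(n − 1) from Eq. (6.16), which reduces to c = ρn for large n.

```python
import math


def mean_degree_from_density(rho, n):
    """Mean degree implied by a given density: c = rho * (n - 1)."""
    return rho * (n - 1)


def density_from_mean_degree(c, n):
    """Density implied by a given mean degree: rho = c / (n - 1)."""
    return c / (n - 1)


sizes = [10**2, 10**4, 10**6]

# Hypothetical dense sequence: density held fixed, so c grows linearly with n.
dense_c = [mean_degree_from_density(0.1, n) for n in sizes]

# Hypothetical extremely sparse sequence: c held fixed, so rho falls as 1/n.
sparse_rho = [density_from_mean_degree(5.0, n) for n in sizes]

# Hypothetical slowly growing mean degree, c = ln n: still sparse.
log_rho = [density_from_mean_degree(math.log(n), n) for n in sizes]

for n, c, r1, r2 in zip(sizes, dense_c, sparse_rho, log_rho):
    print(f"n = {n:>8}  c(rho=0.1) = {c:>9.1f}  "
          f"rho(c=5) = {r1:.2e}  rho(c=ln n) = {r2:.2e}")
```

In the dense case the mean degree would reach about a hundred thousand for a million nodes, which is implausible for something like a friendship network. In both sparse cases the density tends to zero as n grows.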
Arguably, indeed, most of the networks in this book fall into the extremely sparse category. If the average degree of a node does increase with n it usually
¹⁶ A possible exception to the pattern is food webs. Studies comparing ecosystems of different sizes seem to show that the density of food webs is roughly constant, regardless of their size, indicating that food webs may be dense networks [153, 322].
does so only slowly, say as log n. This sparsity has many implications. It makes possible a number of types of calculations that would otherwise be challenging, though at the same time it makes others harder. Sparsity will be particularly important when we look at computer algorithms in Chapter 8 and when we construct mathematical models of networks in Chapters 11 to 13.
6.10.2 Directed networks
Node degrees are more complicated in directed networks. In a directed network each node has two degrees: the in-degree is the number of ingoing edges connected to a node and the out-degree is the number of outgoing edges. Bearing in mind that the adjacency matrix of a directed network has elements $A_{ij} = 1$ if there is an edge from j to i, the in- and out-degrees of node i can be written

$$k_i^{\text{in}} = \sum_{j=1}^{n} A_{ij}, \qquad k_j^{\text{out}} = \sum_{i=1}^{n} A_{ij}. \tag{6.18}$$
These expressions also work for networks with multiedges, and for networks with self-edges provided a self-edge is represented by a diagonal element $A_{ii} = 1$ in the adjacency matrix, as discussed in Section 6.4.
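The index convention in Eq. (6.18) is easy to get backwards, so a small Python sketch may help. It builds the adjacency matrix of a made-up three-node directed network, storing an edge from j to i in row i, column j, and then reads off in-degrees as row sums and out-degrees as column sums. The helper names and the edge list are our own, and the example includes one self-edge stored as a diagonal element of 1.

```python
def adjacency_from_edges(n, edges):
    """Build A with A[i][j] counting the edges from node j to node i."""
    A = [[0] * n for _ in range(n)]
    for src, dst in edges:
        A[dst][src] += 1
    return A


def in_degrees(A):
    """In-degree of node i is the sum of row i, Eq. (6.18)."""
    return [sum(row) for row in A]


def out_degrees(A):
    """Out-degree of node j is the sum of column j, Eq. (6.18)."""
    return [sum(col) for col in zip(*A)]


# Hypothetical edge list as (source, target) pairs, with a self-edge at node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 2)]
A = adjacency_from_edges(3, edges)

k_in = in_degrees(A)
k_out = out_degrees(A)
print("A     =", A)
print("k_in  =", k_in)
print("k_out =", k_out)
```

Node 2 receives edges from nodes 0 and 1 and from itself, giving in-degree three, while it sends edges to node 0 and to itself, giving out-degree two.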
The number of edges m in a directed network is equal to the total number of ingoing ends of edges at all nodes, or equivalently to the total number of outgoing ends of edges, so

$$m = \sum_{i=1}^{n} k_i^{\text{in}} = \sum_{j=1}^{n} k_j^{\text{out}} = \sum_{ij} A_{ij}. \tag{6.19}$$
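Thus the mean in-degree $c_{\text{in}}$ and the mean out-degree $c_{\text{out}}$ of every directed network are equal:

$$c_{\text{in}} = \frac{1}{n} \sum_{i=1}^{n} k_i^{\text{in}} = \frac{1}{n} \sum_{j=1}^{n} k_j^{\text{out}} = c_{\text{out}}. \tag{6.20}$$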
For simplicity we will just denote both by c and, combining Eqs. (6.19) and (6.20), we get

$$c = \frac{m}{n}. \tag{6.21}$$
Note that this differs by a factor of two from the equivalent result for an undirected network, Eq. (6.15).
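As a numerical check of Eqs. (6.19) to (6.21), the sketch below tallies in- and out-degrees directly from a made-up directed edge list and confirms that both sum to m, so that the two mean degrees coincide and equal m/n. The function name and the example data are assumptions for illustration.

```python
def directed_degree_summary(n, edges):
    """Return (m, k_in, k_out) for a directed edge list of (source, target) pairs."""
    k_in = [0] * n
    k_out = [0] * n
    for src, dst in edges:
        k_out[src] += 1  # one outgoing edge end at the source
        k_in[dst] += 1   # one ingoing edge end at the target
    return len(edges), k_in, k_out


# Hypothetical three-node directed network with five edges.
n = 3
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 2)]

m, k_in, k_out = directed_degree_summary(n, edges)
c_in = sum(k_in) / n
c_out = sum(k_out) / n

print(f"m = {m}, sum k_in = {sum(k_in)}, sum k_out = {sum(k_out)}")
print(f"c_in = {c_in:.4f}, c_out = {c_out:.4f}, m/n = {m / n:.4f}")
```

Each edge contributes exactly one ingoing end and one outgoing end, which is why the factor of two in Eq. (6.15) does not appear here.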