Basic Data Types
Logic
Proofs
Mathematical Induction
Analysis of Algorithms
Number Theory
Relations
Counting
Probability
Graphs and Trees
Why You Might Care
In information retrieval, where we try to find the document in a large collection that is most relevant to a particular search, it is common to represent each document by a vector (a sequence of numbers) based on the words used in the document, and to find the most relevant documents by identifying those “pointing in the same direction” as the query vector. Much of the basic material in this chapter may be familiar, but regardless of whether you have seen it before, it is standard content that is important to get comfortable with.
Booleans, Numbers, and Arithmetic
Booleans)
Integers, Reals, and Rationals)
Of the example real numbers above, both 1 and 99.44 are rational numbers; we can write them as 1/1 and 9944/100. We will also use standard notation for intervals of real numbers, which denote all real numbers between two specified values.
Absolute Value)
We use round brackets to mean "exclude the endpoint" and square brackets to mean "include the endpoint" when we denote a range: (a, b) denotes those real numbers x for which a < x < b.
Floor and ceiling)
Raising a number to an integer power)
The basic idea is to choose a rational number m/n that approximates x to within a small error (for example, the first k digits of its decimal expansion, which can be written as m/10^k) and to approximate b^x by b^(m/n). For example, 2^π is approximated by the sequence shown in Figure 2.4; the value of 2^π is the limit of this sequence of approximations. For an irrational exponent x, the value of b^x is approximated arbitrarily closely by choosing a rational m/n sufficiently close to x and calculating the value of b^(m/n). While basically every modern programming language supports exponentiation (including positive, fractional, and negative powers) in some form, often in a separate math library, the actual calculation behind the scenes is quite complicated.
For each base b, note that log_b x gets larger as x increases, but it grows very slowly. Logarithms appear frequently in the analysis of data structures and algorithms, including a number that we will discuss in this book. Several facts about logarithms will be useful in these analyses, and are useful in other situations as well.
We will return to these concepts of division in detail in Chapter 7, but we will start here with the formal definitions of the concepts related to remainders. By rearranging the floor-based definition from Definition 2.9 when n mod k = 0, we can see that the condition k | n is also equivalent to the condition k · ⌊n/k⌋ = n.
Even, odd, and parity)
These quantities have deep and important connections to cryptography, error-correcting codes, and other applications that we will explore later.
We will start with the summation notation that allows us to express the result of adding many numbers. We will very occasionally consider an infinite sequence of numbers x_1, x_2, . . . ;
we can write ∑_{i=1}^{∞} x_i to denote the infinite sum of these numbers. We solved Example 2.17 by first calculating ∑_{j=1}^{i} j, which is the sum of the numbers in the ith row.
Napoleon Bonaparte
Section 2.2 introduced the primitive types of objects that we will use throughout the book. In this section we begin with sets, in which objects are collected without regard for order or repetition.
Sets)
Set membership)
The cardinality of the set of bits is 2, because there are two distinct elements of that set (namely, 0 and 1). It is important to remember that the integer 2 and the set {2} are two completely different kinds of things. The colon in set-abstraction notation is read as "such that," and the set in Definition 2.18 would be read accordingly. Let X denote the set of all sets that do not contain themselves: that is, let X := {S : S ∉ S}.
The empty set ∅ )
Set complement)
Set union)
Set intersection)
We end this subsection with some pieces of notation that allow us to perform mathematical operations on the elements of a set. We will also sometimes want to represent the sum or product of the elements of a specific set (instead of a range of values such as x_1, x_2, . . . , x_n).
Set equality)
Subset)
Proper subset)
Superset and proper superset)
There are two important types of sets of sets that we will define in the rest of this section, both derived from a base set S. The first interesting use of a set of sets is to form a partition of S into a set of disjoint subsets whose union is precisely S. The idea is that two elements in the same cluster will be "similar" and two elements in different clusters will be "different". (So students might be grouped by major or dorm; restaurants might be grouped by cuisine or geography; and web pages might be grouped by the set of words that appear in them.) For more information on clustering, see the discussion at p. .
Our second important type of set of sets is the power set of S, which is the set of all subsets of S.
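To make the two "sets of sets" above concrete, here is a minimal Python sketch for small finite sets. The helper names `power_set` and `is_partition` are hypothetical, chosen for this illustration rather than taken from the text, and the partition check assumes the usual convention that every block of a partition is nonempty.

```python
from itertools import combinations

def power_set(s):
    """Return the set of all subsets of s (each subset as a frozenset)."""
    items = list(s)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    }

def is_partition(blocks, s):
    """Check that blocks are nonempty, pairwise disjoint, and have union s."""
    blocks = [set(b) for b in blocks]
    if any(len(b) == 0 for b in blocks):
        return False  # assumption: empty blocks are not allowed
    union = set().union(*blocks) if blocks else set()
    # The blocks are pairwise disjoint exactly when no element is counted twice.
    total = sum(len(b) for b in blocks)
    return union == set(s) and total == len(union)

subsets = power_set({1, 2, 3})
```

A set with n elements has 2^n subsets, so `subsets` here contains 8 frozensets, including the empty set and {1, 2, 3} itself.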
Denis Diderot, Supplément au voyage de Bougainville (1796)
In Section 2.3, we introduced sets: collections of objects in which the order of those objects does not matter. In many circumstances, however, order is important: if a Java method takes two parameters, changing the order of those parameters will usually change what the method does; if there is a point of interest at longitude x and latitude y, then showing up at longitude y and latitude x will not do.
Sequence, list, and tuple)
Taking it further: An annoyingly pedantic point: we're being sloppy with notation in Example 2.38.
In addition to the "obvious" sequences like Examples 2.37 and 2.38, we have also already seen some definitions that do not seem to involve sequences, but are implicitly about ordered tuples of values.
Sequences of elements from the same set)
For a set S and a positive integer n, we write S^n to denote S × S × · · · × S, the set of all sequences of n elements of S.
(We sum the number of "blocks" of difference in each of the dimensions; we take the absolute value of the difference in each component because we care about the size of the difference in each dimension rather than which point has the higher value in that component.)
Vector length)
Vector addition)
Scalar product)
The dot product of two student vectors represents the number of courses that the two students have in common.
Let c ∈ R^n be an n-vector indicating the number of credit hours for each class you took in your college career.
The rotation matrix can be used in computer graphics to re-render a scene from a different perspective; see p.
One particular square matrix that will appear often is the identity matrix, which has ones on the main diagonal and zeros everywhere else (see Figure 2.38). For a square matrix M ∈ R^(n×n), we may say that the size of M is n (instead of saying that its size is n-by-n). Again, as with vectors, adding two matrices that do not have the same size is meaningless.
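The vector operations named above (dot product, vector length, and the block-counting distance between two points) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: vectors are plain Python lists, the function names are hypothetical, and the two "student vectors" are made-up bit vectors in which a 1 means "took this course."

```python
import math

def dot(x, y):
    """Dot product of two vectors of the same dimension."""
    if len(x) != len(y):
        raise ValueError("dot product needs vectors of the same dimension")
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Length of a vector: the square root of its dot product with itself."""
    return math.sqrt(dot(x, x))

def manhattan_distance(x, y):
    """Sum, over all dimensions, of the absolute difference in that component."""
    if len(x) != len(y):
        raise ValueError("distance needs vectors of the same dimension")
    return sum(abs(a - b) for a, b in zip(x, y))

# Hypothetical bit vectors: entry i is 1 if the student took course i.
student_a = [1, 0, 1, 1]
student_b = [1, 1, 0, 1]
courses_in_common = dot(student_a, student_b)  # counts positions where both are 1
```

For bit vectors, each term of the dot product is 1 only when both entries are 1, which is why the result counts shared courses.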
The product of two matrices is a matrix, not a single number: the entry in the ith row and jth column of AB is derived from the ith row of A and the jth column of B.
In this section we will give formal definitions of functions and some terminology related to functions, and also discuss some special types of functions. (Functions themselves are a special case of relations, and we will revisit the definition of functions in Chapter 8 when we discuss relations.)
(And it is practical to define a function with a table only if the set of possible inputs is quite small!) To define double using a table, we specify the output corresponding to each of the 8 possible inputs, as shown in Figure 2.47. In Java, for example, you would write an isPrime function with an explicit declaration that the input is an integer (int) and the output is a boolean.
However, it is not necessary that all possible outputs are actually achieved: in other words, there may be an element b ∈ B for which there is no a ∈ A with f(a) = b. Putting together the facts from (i) and (ii), we conclude that the range of s is exactly the set of all prime numbers. We will also introduce a small extension to the set-abstraction notation from Section 2.3.1 that is related to the range of a function. The range is a bit harder to see, but it turns out to be the set of all prime numbers. (There are no arrows pointing to 14 or 15, so these two numbers are in the codomain but not in the range.)
Note that the definition of a function guarantees that each element in the first column has one and only one arrow going from it to the second column: if f : A → B is a function, then each a ∈ A is assigned a unique output f(a) ∈ B. We can read the domain, codomain, and range directly from this picture: the domain is the set of elements in the first column; the codomain is the set of elements in the second column; and the range is the set of elements in the second column for which there is at least one incoming arrow.
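The arrow-picture reading of domain, codomain, and range can be mimicked directly in Python. The sketch below is illustrative only: the function f(x) = 2x mod 8 on {0, 1, ..., 7} is a hypothetical example chosen for this note (it is not claimed to be the function in Figure 2.47), and `function_range` is a made-up helper name.

```python
def function_range(f, domain):
    """Return the range (image) of f: the outputs that are actually achieved."""
    return {f(a) for a in domain}

# Hypothetical example: f maps {0, ..., 7} to {0, ..., 7} by f(x) = 2x mod 8.
domain = set(range(8))
codomain = set(range(8))

def f(x):
    return (2 * x) % 8

rng = function_range(f, domain)   # elements with at least one incoming arrow
unhit = codomain - rng            # in the codomain, but with no incoming arrow
```

Here every output is even, so the odd numbers are in the codomain but not in the range, just as 14 and 15 are in the passage above.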
As with many function-related concepts, the visual representation of functions provides a nice way of thinking about function composition: the function g ∘ f corresponds to "short-circuiting" the pictures of the functions f and g.
An onto function f : A → B guarantees that every element b ∈ B is “hit at least once” by f; that is, b = f(a) for at least one a ∈ A. Because each element of the codomain is “hit” by double at most once, the function is one-to-one. (There is at most one such a because f is one-to-one, and at least one such a because f is onto.)
Bijection)
(The fact that each element in the right-hand column has exactly one incoming arrow is exactly what guarantees that reversing the direction of each arrow still results in the arrow diagram of a function.) On the other hand, if f is not one-to-one, then there exist a and a′ such that f(a) = b† and f(a′) = b† for a ≠ a′; thus f^(−1)(b†) would have to be both a and a′, which is forbidden by the definition of a function. A bijection f : A → B has exactly one arrow entering each element in the second column, and by definition it also has exactly one arrow exiting each element in the first column. This picture-based approach should help illustrate why a function that is not onto, or that is not one-to-one, does not have an inverse.
Degree)
Roots)
(Note that between two roots a polynomial must change direction. Draw a picture!) A polynomial of degree 0 never changes direction, so it is always zero or never zero. But p′ has at most one root, as we have just argued, and so p has at most two roots.
A vector (or n-vector) is an element of R^n, for a positive integer n ≥ 2. (An element of R^1 = R is called a scalar.) A bit vector is an element of {0, 1}^n.
Given a matrix M ∈ R^(n×m) and a real number α ∈ R, the matrix αM is specified by (αM)_{i,j} = α M_{i,j}. (The sum M + M′ is meaningless if M and M′ have different dimensions.) The product of two matrices A ∈ R^(n×m) and B ∈ R^(m×p) is a matrix AB ∈ R^(n×p) whose components are given by (AB)_{i,j} = ∑_{k=1}^{m} A_{i,k} B_{k,j}.
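The formula (AB)_{i,j} = ∑_{k=1}^{m} A_{i,k} B_{k,j} translates almost symbol for symbol into code. The following Python sketch assumes matrices are stored as lists of rows; the names `mat_mul` and `identity` are hypothetical helpers for this illustration, and the code uses 0-based indices where the formula uses 1-based ones.

```python
def mat_mul(A, B):
    """Multiply an n-by-m matrix A by an m-by-p matrix B (lists of rows)."""
    n, m = len(A), len(A[0])
    if len(B) != m:
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    # Entry (i, j) of AB combines the ith row of A with the jth column of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
        for i in range(n)
    ]

def identity(n):
    """The n-by-n identity matrix: ones on the main diagonal, zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
swap = [[0, 1], [1, 0]]
AB = mat_mul(A, swap)   # multiplying by swap on the right exchanges A's columns
```

Multiplying by `identity(2)` on either side leaves `A` unchanged, which is a quick sanity check on any implementation of the formula.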
Why You Might Care
An Introduction to Propositional Logic
That is, Goldbach's conjecture is indeed either true or false; it's just that we don't know which it is. Here's a silly (but obviously true) statement: the sentence "snow is white" is true if and only if snow is white. (Of course!)
"The University of Minnesota's mascot is the Badger" is an atomic proposition, because it cannot be conceptually divided into simpler claims. "The University of Washington's mascot is a Duck or the University of Oregon's mascot is a Duck" is a compound proposition because it is conceptually divisible into two simpler propositions: namely, "the University of Washington's mascot is a Duck" and "the University of Oregon's mascot is a Duck."
Your password is valid only if it is at least 8 characters long, you have not used it as your password before, and it contains at least three different types of characters (lowercase letters, uppercase letters, numbers, non-alphanumeric characters). The form of this compound proposition is "p, only if q and not r and at least three of {s, t, u, v} are true." (Later we will see how to write this compound proposition in standard logical notation; see Example 3.15.)
Conjunction (and): ∧ )
Consider the sentence "If you scratch my back, I'll scratch yours." The easiest way to understand this sentence is as a promise: I've promised to scratch your back as long as you scratch mine. I haven't promised anything about what I'll do if you don't scratch my back: I can refrain from scratching your back, or I can give you a generous backscratch anyway, but I've made no guarantees. (You'd be right to call me a liar if you scratched my back and I didn't scratch yours in return.)
(This statement is different from "if you signed, then the contract is valid": for example, the contract may not be valid because you are a legal minor and therefore have no legal right to sign.)
I am available for a day-long on-site interview on October 8th in Minneapolis or Hong Kong.
The "or" in "Minneapolis or Hong Kong" is exclusive because it is not physically possible to be present in Minneapolis and Hong Kong at the same time. Solution: The "or" in "email or phone" is inclusive because you can receive both an email and a phone call.
In Java, for example, the condition !p && q (that is, “not p and q” in Java syntax) will be interpreted as (!p) && q, because not/¬/! binds more tightly than and/∧/&&. The operator precedence rules tell us the order in which two different operators are applied in an expression. The choice that logical operators associate to the left (instead of associating to the right) will not matter for most of the logical connectives anyway.
Several important types of propositions are defined in terms of their truth tables: those that are always true (tautologies), sometimes true (satisfiable propositions), or never true (unsatisfiable propositions). We will begin by considering propositions that are always true. Etymologically, the word tautology comes from the Greek taut- ("same").
Peter
With the definitions from Section 3.2 in hand, we turn to a few extensions: some special types of propositions and some special ways of representing propositions. We now turn to propositions that are sometimes true and those propositions that are never true.
The last column of this truth table contains only "T"s, proving that modus ponens is a tautology.
p ∨ ¬p: Law of the excluded middle.
Solution: We will answer the question by constructing a truth table for the given proposition. Because there is at least one "T" in the last column of the truth table, the proposition is satisfiable.
Which of the converse, contrapositive, and inverse of p ⇒ q are logically equivalent to the original proposition p ⇒ q?
This basic idea, of replacing one logical connective with another (or with several others), is a crucial part of the very construction of computers; we return to this idea in Section 4.4.1.
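Truth tables are small enough to build by brute force, which gives a mechanical way to check the claims above about tautologies, satisfiability, and logical equivalence. The following Python sketch is one possible illustration; all of the function names are hypothetical, and propositions are modeled as ordinary Python functions of Boolean arguments.

```python
from itertools import product

def implies(p, q):
    """p => q is false only when p is true and q is false."""
    return (not p) or q

def truth_table(prop, num_vars):
    """List (assignment, value) pairs for every truth assignment."""
    return [(vals, prop(*vals)) for vals in product([True, False], repeat=num_vars)]

def is_tautology(prop, num_vars):
    """True in every row of the truth table."""
    return all(value for _, value in truth_table(prop, num_vars))

def is_satisfiable(prop, num_vars):
    """True in at least one row of the truth table."""
    return any(value for _, value in truth_table(prop, num_vars))

def equivalent(prop1, prop2, num_vars):
    """Two propositions are logically equivalent if their columns agree."""
    return all(
        prop1(*vals) == prop2(*vals)
        for vals in product([True, False], repeat=num_vars)
    )

modus_ponens = lambda p, q: implies(p and implies(p, q), q)
excluded_middle = lambda p: p or (not p)

original = lambda p, q: implies(p, q)
converse = lambda p, q: implies(q, p)
contrapositive = lambda p, q: implies(not q, not p)
inverse = lambda p, q: implies(not p, not q)
```

Running `equivalent(original, contrapositive, 2)` confirms that the contrapositive agrees with p ⇒ q in every row, while the converse and the inverse each differ from it in some row (though they agree with each other).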
Conjunctive normal form)
Disjunctive normal form)
All propositions are expressible in CNF)
But just because some geniuses were laughed at doesn't mean that everyone who is laughed at is a genius.
A predicate is not the kind of thing that is true or false, so predicates are different from propositions; rather, a predicate is a “proposition with blanks” waiting to be filled in. We have seen that we can form a proposition from a predicate by applying that predicate to a specific argument. But we can also form a proposition from a predicate using quantifiers, which allow us to formalize statements such as "every Java program contains at least four for loops" (false!) or "there is a proposition that cannot be expressed using only the connectives ∧ and ∨" (true! See Exercise 4.71).
Universal quantifier (“for all”): ∀ )
But the expressions in (B) mean different things, in the sense that we can construct a context in which these two statements have different truth values (for example, x = 3 and y = −2).
Recall that a tautology is a proposition that is always true: in other words, it is true regardless of what each Boolean variable in the proposition “means” (that is, whether p_i is true or false).
Note the crucial difference between Example 3.38, which says that every element of S either makes P true or makes P false, and Example 3.39, which says that either every element of S makes P true or every element of S makes P false. (Intuitively, this is the difference between "Every letter is either a vowel or a consonant" and "Every letter is a vowel or every letter is a consonant." The former is true; the latter is false.)
an array A for which your implementation of insertion sort does not correctly sort A. The equivalence you use is a statement of predicate logic.
For a game as big as chess, we can't afford to compute all the way to the bottom of the tree; instead, we estimate the quality of each position after calculating a handful of layers deep in the game tree.
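Over a finite set, quantifiers can be evaluated by simply checking every element, which makes the difference between "every element makes P true or makes P false" and "every element makes P true, or every element makes P false" easy to test. The Python sketch below uses the vowel example from the passage above; the helper names `forall` and `exists` are hypothetical, and the sketch assumes that the 26 lowercase English letters are the universe and that "y" is not a vowel.

```python
import string

LETTERS = string.ascii_lowercase
VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant

def is_vowel(c):
    return c in VOWELS

def forall(S, P):
    """Universal quantifier over a finite set S: P(x) holds for every x in S."""
    return all(P(x) for x in S)

def exists(S, P):
    """Existential quantifier over a finite set S: P(x) holds for some x in S."""
    return any(P(x) for x in S)

# Every letter either is a vowel or is not a vowel.
each_letter_decided = forall(LETTERS, lambda c: is_vowel(c) or not is_vowel(c))

# Either every letter is a vowel, or every letter is not a vowel.
all_one_kind = forall(LETTERS, is_vowel) or forall(LETTERS, lambda c: not is_vowel(c))

# "Not every letter is a vowel" says the same thing as "some letter is not a vowel."
negation_agrees = (not forall(LETTERS, is_vowel)) == exists(LETTERS, lambda c: not is_vowel(c))
```

The first expression is true and the second is false, mirroring the letter example. This brute-force approach only works because the universe is finite; it says nothing about quantification over infinite sets such as the integers.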
Similarly, in a quantified expression, an unbound variable is a variable for which the expression's meaning could change if we replaced that variable with a different name.
Your choices and the demon's choices are made in the left-to-right order (from outside to inside) of the quantifiers.
For each of the following English sentences, find as many different logical readings based on order of quantification as you can.
A theorem of predicate logic is a fully quantified expression that is true for all possible meanings of the predicates in it. Unlike with propositional logic, there is no algorithm that is guaranteed to determine whether a given fully quantified predicate logic expression is a theorem.
Why You Might Care
The goal is that, as long as corruption is limited, the decoded message is identical to the original message: in other words, that m = m′ as long as c′ ≈ c. The goal is that, as long as there is not too much corruption, the received message m′ is identical to the sent message m.
The danger in error detection is that we are sent a codeword c ∈ C that is corrupted into a bitstring c′, but we report "no error" because c′ ∈ C. (Note that we never err when we report "error".) The danger in error correction is that we report another codeword c* ∈ C because c′ is closer to c* than to c. We say that C can detect ℓ errors if, for every codeword c ∈ C and for every sequence of up to ℓ errors applied to c, we can correctly report “error” or “no error”.
Our goal with error-correcting codes is to ensure that the decoded message is identical to the original message, as long as there are not too many errors in the transmission. At a high level, we will achieve this goal by ensuring that the codewords in our code are all “very different” from each other. The only way we would fail to detect the error is if the received bitstring c′ is itself another codeword.
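The idea that codewords should be "very different" from one another can be made concrete with the Hamming distance, the number of positions in which two equal-length bitstrings differ. The Python sketch below is an illustration under stated assumptions: codewords are strings of "0" and "1", the function names are hypothetical, and the tiny two-codeword code is a made-up example rather than one of the codes analyzed in the text.

```python
from itertools import combinations

def hamming_distance(x, y):
    """Number of positions in which equal-length bitstrings x and y differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Smallest Hamming distance between two distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

def detect(code, received):
    """Report "no error" exactly when the received bitstring is a codeword."""
    return "no error" if received in code else "error"

# Hypothetical code: one message bit, repeated three times.
code = ["000", "111"]
```

With this code, flipping one or two bits of a codeword always yields a non-codeword, so `detect` reports "error"; only three flips turn one codeword into the other and slip through unnoticed, which matches the failure condition described above.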
Here are the three main theorems we will prove in the rest of this section:
Good news)
Better news)
Bad news)
For an example of decoding, suppose we receive the (possibly corrupted) bitstring c′ under the Repetition_3 code.
Error detecting and error correcting codes. The Bell System Technical Journal, XXIX, April 1950.
The Hamming code achieves the same minimum distance, while improving the rate from 1/3 to 4/7. In fact, for the Hamming code we will use several different parity bits, corresponding to different subsets of the bits of m. The Hamming code, like the Hamming distance, is named after Richard Hamming, who invented this code in 1950. The basic idea of the Hamming code is to use extra bits that, like the 16th digit of a credit card number, redundantly report a value calculated from the previous components of the message.
These received bits correspond to the message 1111 with parity bits 111, where the fourth bit of the message has been flipped. (And, relatedly, why were the parity bits of the Hamming code chosen this way?) Because all four of these signatures are different, we can distinguish which message bit is corrupted based on which set of two or more parity bits looks wrong.
In the last two sections we constructed two different codes, both for 4-bit messages with minimum distance 3: the repetition code (rate 4/12) and the Hamming code (rate 4/7).
There are n such strings x′: one that is x with the first bit flipped, one that is x with the second bit flipped; and so on.
Using packing arguments to derive bounds on error-correcting codes
Now, let's return to error-correcting codes and use the circle-packing intuition (and the last two lemmas) to prove a bound on the number of n-bit codewords that can "fit".
The “sphere-packing bound”: distance-3 version)
Oscar Wilde
In Section 4.2, we saw a number of claims about error-correcting codes and, more importantly, proofs that these claims were true.
In fact, when faced with a claim that you need to prove, a variety of strategies (including those from Section 4.2) are possible approaches you can use. We will prove many facts in this book, but at a Python-like level of proof. Almost every statement that we will prove here, or that you will ever need to prove, will be a universally quantified statement of the form ∀x ∈ S : P(x).
Note the creative demand if you choose to develop a proof by cases: you must decide which cases to use. (For example, a proof with the two cases q and ¬q corresponds to the logical equivalence p ≡ (q ⇒ p) ∧ (¬q ⇒ p).)
When trying to prove a claim ϕ, it suffices to instead prove any proposition that is logically equivalent to ϕ. A valid proof of any logically equivalent proposition can be used to prove that ϕ is true, but a few logical equivalences prove to be particularly useful.
If you are going to use a proof by contrapositive, say that you are using a proof by contrapositive. Proof by contrapositive is generally preferred to proof by contradiction when a proof by contrapositive is possible. Note that the claim in Example 4.21 was not an implication, so a proof by contrapositive was not an option.
√2 and the infinitude of primes were proofs of a "for all" statement; indeed, even these two statements could have been phrased as universal quantifications. For example, we could have phrased Example 4.21 as the following statement: for all integers n and d with d ≠ 0, we have n ≠ d · √2.
Understand what you are trying to do. Read the statement you want to prove. Now that you understand the statement you're trying to prove, it's time to actually prove it.
Properties of exponentials)
Logarithm)
Properties of logarithms)
Modulus (remainder))
Prime and composite numbers)
Summation notation)
Product notation)
Sets: Unordered Collections
Set cardinality)
Set Abstraction)
Set difference)
Disjoint sets)
Partition)
Power set)
Sequences, Vectors, and Matrices: Ordered Collections
Cartesian product)
Vector)
Dot product)
Matrix)
Identity matrix)
Matrix multiplication)
Functions
Function)
Domain/codomain)
Range/image)
Set abstraction using functions)
Function composition)
Onto functions)
One-to-one functions)
Function inverses)
Polynomial)
(Nonzero) polynomials of degree k have at most k roots)
Algorithm)
Chapter at a Glance
Propositions and Truth Values)
Atomic and compound propositions)
Negation (not): ¬ )
Disjunction (or): ∨ )
Implication: ⇒ )
Exclusive or: ⊕ )
If and only if: ⇔ )
Truth assignment)
Truth table)
Propositional Logic: Some Extensions
Tautology)
Satisfiable propositions)
Unsatisfiable propositions/contradictions)
Logical equivalence)
All propositions are expressible in DNF)
An Introduction to Predicate Logic
Predicate)
Existential quantifier (“there exists”): ∃ )
Theorems in predicate logic)
Proof by assuming the antecedent)
Gödel’s (First) Incompleteness Theorem)
Predicate Logic: Nested Quantifiers
Chapter at a Glance
Propositional Logic
An Extended Application with Proofs: Error-Correcting Codes
Hamming distance)
Minimum distance)
Rate)
Repetition code)
Distance and rate of the repetition code) The Repetition_ℓ code has rate 1/ℓ and minimum distance ℓ
Parity function)
Hamming code)
Distance and rate of the Hamming code) The Hamming code has rate 4/7 and minimum distance 3
Balls around codewords are disjoint)
The Hamming code is optimal)
Proofs and Proof Techniques
Proof)
Proof by cases)
Proof by contrapositive)
Proof by contradiction)
Disproof by counterexample)
Proof by construction)