CIS 541 Number Systems
Numbers
Decimal
The integer number 48356 can be represented as:
48356 = 6×10^0 + 5×10^1 + 3×10^2 + 8×10^3 + 4×10^4
In general an integer can be represented as:
a_n a_{n−1} … a_1 a_0 = a_0×10^0 + a_1×10^1 + … + a_{n−1}×10^{n−1} + a_n×10^n
and a fractional part as:
.b_1 b_2 b_3 … = b_1×10^{-1} + b_2×10^{-2} + b_3×10^{-3} + …
So a real number is:
a_n a_{n−1} … a_1 a_0 . b_1 b_2 b_3 … = Σ_{k=0}^{n} a_k 10^k + Σ_{k=1}^{∞} b_k 10^{-k}
Base β
a_n a_{n−1} … a_1 a_0 . b_1 b_2 b_3 … = Σ_{k=0}^{n} a_k β^k + Σ_{k=1}^{∞} b_k β^{-k}
β = 2,8,10,16 are common bases.
Nested form
N = (a_n a_{n−1} … a_0)_β = Σ_{k=0}^{n} a_k β^k
= a_0 + β(a_1 + β(a_2 + … + β(a_{n−1} + β a_n) … ))
Conversion between bases: Integer Part
Method 1: Convert to nested form, then replace each digit with its representation in the new base, carrying out the arithmetic in the new base.
(3781)_10
= 1 + 10(8 + 10(7 + 10(3)))
= (1)_2 + (1010)_2 ((1000)_2 + (1010)_2 ((0111)_2 + (1010)_2 (0011)_2))
= (111011000101)_2
Method 2: Observe that when
N = a_0 + β(a_1 + β(a_2 + … + β(a_{n−1} + β a_n) … ))
is divided by β, the remainder is a_0 and the quotient is
a_1 + β(a_2 + … + β(a_{n−1} + β a_n) … )
So successive divisions, collecting the remainders, reconstruct the digits of N, and we can use this for conversion between bases.
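Method 2's successive divisions translate directly into a short program (a sketch; the function name and digit alphabet here are illustrative, not from the notes):

```python
def to_base(n, beta):
    """Convert a non-negative integer n to base beta by successive
    division, collecting the remainders (least significant digit first)."""
    if n == 0:
        return "0"
    digits = "0123456789ABCDEF"  # enough symbols for beta <= 16
    out = []
    while n > 0:
        n, r = divmod(n, beta)   # quotient and remainder a_0, a_1, ...
        out.append(digits[r])
    return "".join(reversed(out))

print(to_base(3781, 2))   # 111011000101, matching the example above
print(to_base(2576, 8))   # 5020
```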
Conversion between bases: Integer Part Example
3781 ÷ 2 = 1890  r 1
1890 ÷ 2 =  945  r 0
 945 ÷ 2 =  472  r 1
 472 ÷ 2 =  236  r 0
 236 ÷ 2 =  118  r 0
 118 ÷ 2 =   59  r 0
  59 ÷ 2 =   29  r 1
  29 ÷ 2 =   14  r 1
  14 ÷ 2 =    7  r 0
   7 ÷ 2 =    3  r 1
   3 ÷ 2 =    1  r 1
   1 ÷ 2 =    0  r 1
Reading the remainders from bottom to top gives (111011000101)_2.
N = 1×2^0 + 0×2^1 + 1×2^2 + 0×2^3 + 0×2^4 + 0×2^5 + 1×2^6 + 1×2^7 + 0×2^8 + 1×2^9 + 1×2^10 + 1×2^11
= 1 + 2(0 + 2(1 + 2(0 + 2(0 + 2(0 + 2(1 + 2(1 + 2(0 + 2(1 + 2(1 + 2(1)))))))))))
= 3781
Conversion between bases: Fractional Part
x = Σ_{k=1}^{∞} c_k β^{-k} = (0.c_1 c_2 c_3 …)_β
note that
βx = (c_1.c_2 c_3 …)_β
thus the digit c_1 is the integer part of βx, written I(βx), and the fractional part is F(βx)
d_0 = x
c_1 = I(βd_0),  d_1 = F(βd_0)
c_2 = I(βd_1),  d_2 = F(βd_1)
...
while doing the arithmetic in decimal
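The recurrence c_k = I(βd_{k−1}), d_k = F(βd_{k−1}) can be sketched in code (the function name is mine; note that doing the arithmetic in binary floating point is exact here only when the input is exactly representable, as .35546875 is):

```python
def frac_to_base(x, beta, ndigits):
    """Digits c_1..c_n of the fraction 0 <= x < 1 in base beta,
    via c_k = I(beta*d), d = F(beta*d)."""
    cs = []
    d = x
    for _ in range(ndigits):
        d *= beta
        c = int(d)     # integer part I(beta*d)
        d -= c         # fractional part F(beta*d)
        cs.append(c)
    return cs

print(frac_to_base(0.8, 2, 8))          # [1, 1, 0, 0, 1, 1, 0, 0]
print(frac_to_base(0.35546875, 8, 3))   # [2, 6, 6], the octal example below
```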
Conversion between bases: Fractional Part Example
(.8)_10 to binary:            (.304)_10 to binary:
.8 × 2 = 1.6  → 1             .304 × 2 = 0.608 → 0
.6 × 2 = 1.2  → 1             .608 × 2 = 1.216 → 1
.2 × 2 = 0.4  → 0             .216 × 2 = 0.432 → 0
.4 × 2 = 0.8  → 0             .432 × 2 = 0.864 → 0
.8 × 2 = 1.6  → 1             .864 × 2 = 1.728 → 1
.6 × 2 = 1.2  → 1             .728 × 2 = 1.456 → 1
.2 × 2 = 0.4  → 0             .456 × 2 = 0.912 → 0
.4 × 2 = 0.8  → 0             .912 × 2 = 1.824 → 1
...                           ...
so (.8)_10 = (.11001100…)_2 and (.304)_10 = (.01001101…)_2.
Conversion between bases: Example
(2576.35546875)10 in octal would be:
Integer part (divide by 8):
2576 ÷ 8 = 322  r 0
 322 ÷ 8 =  40  r 2
  40 ÷ 8 =   5  r 0
   5 ÷ 8 =   0  r 5
so (2576)_10 = (5020)_8.
Fractional part (multiply by 8):
.35546875 × 8 = 2.84375000 → 2
.84375000 × 8 = 6.75000000 → 6
.75000000 × 8 = 6.00000000 → 6
so (.35546875)_10 = (.266)_8.
= (5020.266)8 = (101 000 010 000.010 110 110)2
So converting to octal first is faster for humans.
The best way to do conversions is 10 ↔ 8 ↔ 2 ↔ 16.
When converting (N)_α to base β, the nested method is preferred when α < β and division when α > β.
Error
approximate value = true value + signed error.
or
signed error = true value − approximate value
So the signed error can be positive, negative, or zero. Generally we are only interested in the absolute value of the error:
error = |signed error|
So the error is positive or zero.
When might we care about the sign of the error?
How about when trying to fit a part inside of another part? When the error is a direction and we want to make a course correction?
CIS 541 Roundoff errors
Relative Error
Relative error is the error relative to the answer.
relative error = |true value − approximate value| / |true value|
So for a number x that has a machine representation x̃, its relative error is
|x − x̃| / |x|
Relative error is always the difference between an approximation and the real answer divided by the real answer.
Why would we care about relative error? An error of a foot in measuring the size of a desk matters, but an error of a foot in measuring the distance to the sun probably doesn’t.
What if the true value is zero? The relative error is undefined.
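As a small sketch (the function name and the desk/sun magnitudes are mine, for illustration), with the zero case treated as undefined:

```python
def relative_error(true_value, approx):
    """|true - approx| / |true|; undefined when the true value is zero."""
    if true_value == 0:
        raise ValueError("relative error is undefined for a true value of zero")
    return abs(true_value - approx) / abs(true_value)

# One foot of error on a 3-foot desk vs. on the ~4.9e11-foot distance to the sun:
print(relative_error(3.0, 2.0))              # large, about 0.33
print(relative_error(4.9e11, 4.9e11 - 1.0))  # tiny
```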
Condition and Stability
Condition
Say we have a problem with input (data) x and output (answer) y = F(x). The problem is said to be well-conditioned if "small" changes in x lead to "small" changes in y. What if the changes in y are "large"? That's right: it is ill-conditioned.
Stability
Stability is concerned with the sensitivity of an algorithm for solving the problem. An algorithm is said to be stable if "small" changes in the input (x) lead to "small" changes in the output (y).
And if the changes in the output are large? Then the algorithm is said to be unstable.
Roundoff Errors
Let’s look at a pair of machine numbers on a number line.
x− x+
Consider the number x to be between x− and x+, where x− is the first machine number less than x and x+ is the first machine number greater than x. Rounding is choosing which machine number represents x. In this case, a correctly rounded number x can only have x− or x+ as its machine representation. The question becomes: which one is the correct one?
Different methods choose different numbers as correct.
Rounding: Round to nearest
The first method of rounding is round to nearest, which is what most people consider rounding. In this class we will also consider rounding to be round to nearest (unless otherwise stated).
As an example, consider 3 decimal digits of accuracy and a pair of numbers on a number line, say .752 and .753, which are .001 apart.
.752 .753
A number x, between .752 and .753 can be represented by one of these numbers (.752 or .753) as a closest approximation. In round to nearest or rounding we would choose the machine number closest to x as its approximation.
Rounding: Round to nearest
So if the error of an approximation is not greater than 1/2 × 10^-3, or .0005, it can be accepted as correct.
So an answer is correct if
error ≤ 1/2 × 10^-k
where k is the number of digits. Consider .001 to be ε, so
roundoff error ≤ 1/2 ε
in round to nearest (rounding).
Roundoff Errors
A number is represented by a machine with a machine number. The machine epsilon or unit roundoff error is the error when rounding to one. We can find this by determining the first number that can be represented that is greater than one. For instance if we represent numbers with 23 bits, the machine epsilon
ε = 2^-23. So for single precision ε_single = 2^-23, and for double precision, with a 52-bit mantissa, ε_double = 2^-52.
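Python floats are IEEE doubles, so the machine epsilon can be recovered by searching for the first representable number greater than one (a sketch; the loop and function name are mine):

```python
def machine_epsilon():
    """Halve eps until 1 + eps/2 rounds back to 1; what remains
    is the unit roundoff gap, 2^-52 for an IEEE double."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps

print(machine_epsilon() == 2.0 ** -52)   # True
```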
Roundoff Versus Chopping
x′   x   x″

The number x lies between two machine numbers x′ and x″. We can choose to round x when storing, which gives x″ in this case, or we can chop (truncate) it to x′. The method used affects the error in representation. For rounding the error is:
error ≤ 1/2 ε
For chopping the error is:
error ≤ ε
So which is better?
Precision
X = .256834 × 10^5
The digit 2 is the most significant digit while 4 is the least significant digit.
Accuracy is how close to a target you get, say 2” from the bullseye, and it is measured with one value.
Precision is how good your estimate of the accuracy is, say you are accurate to 2”±.2”. It also refers to how close a group of estimates are. So if you hit the bullseye it is an accurate shot. If you keep hitting it over and over it is precise and accurate. If you keep missing it but missing it in the same manner it is precise but inaccurate.
In a computer the accuracy is how close to the correct answer your machine representation is. The precision is how many digits of accuracy your machine representation has (number of bits in mantissa).
When accurate and precise, it is exact.
Loss of Significance.
x =.3721448693 y =.3720214371
What is the relative error of x −y in a computer with 5 decimal digits of accuracy?
x̃ = .37214
ỹ = .37202
x̃ − ỹ = .00012 = .12000 × 10^-3
x − y = .0001234322
The relative error is:
|(x−y) − (x̃−ỹ)| / |x−y| = .0000034322/.0001234322 ≈ 3 × 10^-2
but the relative errors of x̃ and ỹ are ≈ 1.3 × 10^-5.
When x̃ − ỹ is stored as .12000 × 10^-3, three spurious zeros are added, which represent a loss of significance.
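The 5-digit machine above can be mimicked with Python's standard decimal module (a sketch; the function name is mine, and the unary + operator is what rounds to the context precision):

```python
from decimal import Decimal, getcontext

def cancellation_demo():
    """Subtract two nearby numbers on a simulated 5-decimal-digit machine."""
    getcontext().prec = 5
    x = Decimal(".3721448693")
    y = Decimal(".3720214371")
    xt = +x   # rounds to 5 digits: 0.37214
    yt = +y   # rounds to 5 digits: 0.37202
    return xt - yt

print(cancellation_demo())   # 0.00012, with only two significant digits left
```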
Loss of precision Theorem
Let x and y be normalized floating-point numbers with x > y > 0. If 2^-p ≤ 1 − y/x ≤ 2^-q for some positive integers p and q, then at most p and at least q significant binary bits are lost in the subtraction x − y.
Basically the closer the two numbers the greater the loss of significance.
So how could we avoid loss of significance?
Avoiding loss of significance
Use double precision. Doesn’t always work.
Modify the calculations to remove subtractions of numbers close together (their difference is small).
f(x) = √(x^2 + 1) − 1
As x approaches 0, √(x^2 + 1) approaches 1, so reorder to remove the subtraction:
f(x) = (√(x^2 + 1) − 1) × (√(x^2 + 1) + 1)/(√(x^2 + 1) + 1) = x^2/(√(x^2 + 1) + 1)
f(x) = x − sin(x)
Approximate sin(x) with its Taylor series:
sin(x) = x − x^3/3! + x^5/5! − x^7/7! + …
f(x) = x − sin(x) = x − (x − x^3/3! + x^5/5! − x^7/7! + …)
= x^3/3! − x^5/5! + x^7/7! − …
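The effect of rewriting √(x^2+1) − 1 is easy to see numerically (a sketch; the function names are mine):

```python
import math

def f_naive(x):
    """sqrt(x^2 + 1) - 1: subtracts two nearly equal numbers near x = 0."""
    return math.sqrt(x * x + 1.0) - 1.0

def f_stable(x):
    """Algebraically identical form with the subtraction removed."""
    return x * x / (math.sqrt(x * x + 1.0) + 1.0)

x = 1e-8
print(f_naive(x))    # 0.0: every significant digit is lost
print(f_stable(x))   # about 5e-17, close to the true value x^2/2
```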
Fixed Point
How could we store the integer 345 in a decimal system with 8 digits?
How about 00000345?
How could we store the real number 345.643?
We’d have to know where the decimal point was.
Let’s say we chose 4 digits after the decimal point.
0345.6430
This is fixed point, since the decimal point does not move but remains fixed.
Floating Point
Instead, what if we used floating point? We can then have a fixed number of digits to represent the number (we call this the mantissa) and use the remaining digits for the exponent. So now, with our same example, let's store 345.643 with 6 digits for the mantissa and 2 digits for the exponent, or:
mmmmmmee
So let’s first normalize the number, that is put the first non-zero term after the decimal point or
345.643 = .345643 × 10^3
So we can store it as:
34564303
Why do we normalize? Because a normalized number uses the most precision available (no wasted leading zeros). Also, the representation is unique.
Fixed vs. Floating Point
Fixed Point                          Floating Point
faster                               larger range of numbers for the same number of digits
simpler (i.e. cheaper)               distance between machine nos. not always the same
great when only integers, or
numbers all the same magnitude
Example Floating Point Systems
Floating point system: 3 decimal digits, with
2 digits for the mantissa and 1 digit for the exponent.
Let’s look at the numbers that can be represented.
.00×10^0  = .00000000000
.10×10^-9 = .00000000010
.11×10^-9 = .00000000011
...
.98×10^-9 = .00000000098
.99×10^-9 = .00000000099
.10×10^-8 = .00000000100
.11×10^-8 = .00000000110
.12×10^-8 = .00000000120
Notice how when the exponent changes, the distance between successive numbers goes up by an order of magnitude (in the base). And notice how the distance between zero and the first number is large, compared to the first number and the number just larger than it.
This is the hole at zero.
Example Floating Point Systems
Floating point system 1: 8 decimal digits, with
5 digits for the mantissa and 3 digits for the exponent.
Floating point system 2: 8 hexadecimal digits, with
6 digits for the mantissa and 2 digits for the exponent.
What about x < 0? We need a negative sign for the mantissa!
What about x < 1? We need a negative sign for the exponent!
CIS 541 IEEE Floating Point
Example Floating Point Systems
Questions about the example systems.
Note: Assume a normalized floating point system, unless specified otherwise.
                              System 1 (5m 3e, dec)     System 2 (6m 2e, hex)
smallest number               −.99999 × 10^999          −.FFFFFF × 16^(FF)_16
smallest positive no.         .10000 × 10^-999          .100000 × 16^-(FF)_16
largest no.                   .99999 × 10^999           .FFFFFF × 16^(FF)_16
largest rel. round-off err    (1/2 × 10^-5)/.10000      (1/2 × 16^-6)/.100000
largest rel. round-off err
for number stored as .13
or .13_16                     (1/2 × 10^-5)/.129995     (1/2 × 16^-6)/.12FFFF8_16
IEEE single-precision floating point standard
contains ±0, ±∞, and normal and subnormal single-precision floating-point numbers, as well as NaN (Not a Number) values.
±q × 2^m
sign of q: 1 bit;  integer |m|: 8 bits;  number q: 23 bits
(−1)^s × 2^(c−127) × (1.f)_2
s = 0 means +, s = 1 means −
c is the exponent as an excess-127 code, so the stored exponent goes from −127 to 128.
f is the mantissa in 1-plus form; since the first bit is always 1, it doesn't need to be stored. So:
0 < c < (11111111)_2 = 255
0 and 255 are special, so the actual exponent is
−126 ≤ c − 127 ≤ 127
The mantissa satisfies
1 ≤ (1.f)_2 ≤ (1.11111111111111111111111)_2 = 2 − 2^-23
The largest number is
(2 − 2^-23) × 2^127 ≈ 2^128 ≈ 3.4 × 10^38
The smallest positive normalized number is 2^-126 ≈ 1.2 × 10^-38.
For example, the number −52.234375 is represented as:
(52)_10 = (64.)_8 = (110 100.)_2
(.234375)_10 = (.17)_8 = (.001 111)_2
(52.234375)_10 = (110100.001111)_2 = (1.10100001111)_2 × 2^5
c − 127 = 5, so c = 132
(132)_10 = (204)_8 = (10 000 100)_2
|1|1000 0100|101 0000 1111 0000 0000 0000|   (sign | exponent | mantissa)
= (C250F000)_16
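The worked bit pattern can be checked with Python's struct module, which exposes the raw IEEE single-precision encoding (a sketch; the helper name is mine):

```python
import struct

def float32_hex(x):
    """Pack x as a big-endian IEEE single and return its bit pattern in hex."""
    return struct.pack(">f", x).hex().upper()

print(float32_hex(-52.234375))   # C250F000, matching the derivation above
print(float32_hex(1.0))          # 3F800000: sign 0, exponent 127, mantissa 0
```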
The machine epsilon is
ε_single = 2^-23 ≈ 1.19 × 10^-7
which is 6 decimal digits of precision. For double precision,
ε_double = 2^-52 ≈ 2.22 × 10^-16
which is 15 digits of precision.
Integers use 31 bits (plus sign) for a range of
(−(2^31 − 1), 2^31 − 1) = (−2147483647, 2147483647)
which is about 9 digits of precision.
Exponent                     Numerical Representation
(00000000)_2 = 0_10          ±0 if mantissa = 0, subnormal otherwise
(00000001)_2 = 1_10          2^-126
(00000010)_2 = 2_10          2^-125
...
(11111110)_2 = (254)_10      2^127
(11111111)_2 = (255)_10      ±∞ if mantissa = 0, NaN otherwise
Can a number be stored that is less than 1.0 × 2^-126?
Yes: a subnormal number.
If the exponent field is zero, the number is zero when the mantissa is zero and subnormal when the mantissa is nonzero.
For instance, on a machine that allows subnormal numbers,
(00000001)_16 = (0.00000000000000000000001)_2 × 2^-126 = 2^-149
≈ 1.4 × 10^-45
CIS 541 Arithmetics
IEEE floating point Other details
Round(x) is the machine representation of the number x after it has been rounded. To understand rounding, consider x+ as the first machine number > x and x− as the first machine number < x, so x+ ≥ x ≥ x−. Rounding in IEEE fp can be done with 4 methods.
• Round to nearest: round(x) is either x− or x+, whichever is nearer to x. If a tie, choose the one with the least significant bit equal to 0.
• Round towards zero: round(x) is either x− or x+, whichever is between 0 and x.
• Round towards −∞/round down: round(x) = x−.
• Round towards +∞/round up: round(x) = x+.
Floating-point arithmetic
Floating-point: (sign) 0.d_1 d_2 … d_k × β^e. In normalized form d_1 ≠ 0.
In decimal normalized floating-point the mantissa r is in the interval [1/10, 1):
x = ±r × 10^n,  1/10 ≤ r < 1
In binary normalized floating-point:
x = ±q × 2^m,  1/2 ≤ q < 1
What kind of problems could we run into with floating point arithmetic? It is a general solution, so special cases can fail: limited range, limited precision, no error bound.
Variable precision floating-point arithmetic
Use as many digits in the mantissa as needed.
Of course this could be infinite so we specify a bound N on the number of digits.
This increases precision, but we don’t know how accurate our answer is.
Interval Arithmetic
Represent number as its computer representation and its maximum error.
m ± ε = [a, b]
m = (a + b)/2,  ε = (b − a)/2
So what does this do for us? It allows us to know what the error of a computation is so we know if our answer is good enough for our purposes.
Interval Arithmetic: Operations
m_1±ε_1 + m_2±ε_2 = (m_1 + m_2) ± (ε_1 + ε_2)
m_1±ε_1 − m_2±ε_2 = (m_1 − m_2) ± (ε_1 + ε_2)
m_1±ε_1 × m_2±ε_2 = (m_1 m_2) ± (ε_1|m_2| + ε_2|m_1| + ε_1 ε_2)
m_1±ε_1 ÷ m_2±ε_2 = (m_1/m_2) ± (ε_1 + |m_1/m_2| ε_2)/(|m_2| − ε_2)   if |m_2| > ε_2
                    division error                                     if |m_2| ≤ ε_2
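These propagation rules can be sketched as a tiny class (names are mine; each result carries the midpoint and the propagated error bound ε):

```python
class Interval:
    """A number stored as midpoint m with error bound e, i.e. m ± e."""
    def __init__(self, m, e):
        self.m, self.e = m, e

    def __add__(self, o):
        return Interval(self.m + o.m, self.e + o.e)

    def __sub__(self, o):
        return Interval(self.m - o.m, self.e + o.e)

    def __mul__(self, o):
        return Interval(self.m * o.m,
                        self.e * abs(o.m) + o.e * abs(self.m) + self.e * o.e)

    def __truediv__(self, o):
        if abs(o.m) <= o.e:   # the divisor interval contains zero
            raise ZeroDivisionError("division error: |m2| <= e2")
        m = self.m / o.m
        return Interval(m, (self.e + abs(m) * o.e) / (abs(o.m) - o.e))

i1, i2 = Interval(1, 1), Interval(15, 5)
s = i1 + i2
print(s.m, s.e)   # 16 6, as in the example below
```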
Interval Arithmetic: Examples

I_1 = [0, 2] = 1±1,  I_2 = [10, 20] = 15±5

I_1 + I_2 = 1±1 + 15±5 = (1 + 15) ± (1 + 5) = 16±6
I_1 − I_2 = 1±1 − 15±5 = (1 − 15) ± (1 + 5) = −14±6
I_1 × I_2 = 1±1 × 15±5 = (1×15) ± (1×|15| + 5×|1| + 1×5) = 15±25
I_1 ÷ I_2 = 1±1 ÷ 15±5 = 1/15 ± (1 + |1/15|×5)/(|15| − 5)
          = 1/15 ± (15/15 + 5/15)/10
          = 1/15 ± (20/15)/10
          = 1/15 ± 20/150
          = 1/15 ± 2/15
          ≈ .06666667 ± .13333333
Range Arithmetic
Adding variable precision to interval arithmetic lets us know when not enough digits were used in a computation (so that we can do it again with more).
How? We have a bound on the error. If we get above our tolerance, we simply increase the number of digits and start over.
A range is specified by adding a single digit r to the floating point representation.
(sign) 0.d_1 d_2 … d_n ± r × 10^e
r is the range digit; r and d_n have the same decimal significance.
.39215±3 × 10^5 specifies the range [39212, 39218]
Range Arithmetic
It is variable precision, but do we just allow infinite precision? No
Why? We can’t have an infinite representation.
So what do we do? We set the maximum precision in advance.
What happens as a calculation proceeds.
Will the error ever get smaller? No
Why? Because the error we are keeping track of is just a bound, so it only increases with uncertainty, never decreases. The real error can get smaller, but our bound does not.
What happens as the error grows? The effective number of digits of precision shrinks. That is why we only need a single range digit. So the mantissa gets smaller.
What happens to the speed of the calculation as the error grows? Since the number of digits in the mantissa is shrinking, the calculations will go faster.
Interval and Range notation
To convert between the forms m± and [a, b]:
m ± ε = [m − ε, m + ε]
[a, b] = (a + b)/2 ± (b − a)/2
So 1±1 = [1 − 1, 1 + 1] = [0, 2]
range number +0.8888±9 × 10^1:    mantissa error bound 0.0009 = 9 × 10^-4;  number error bound 9 × 10^-3
range number −0.7244666±2 × 10^-2: mantissa error bound 0.0000002 = 2 × 10^-7;  number error bound 2 × 10^-9
range number +0.200345±5 × 10^3:  mantissa error bound .000005 = 5 × 10^-6;  number error bound 5 × 10^-3
Range arithmetic: examples
Remember r is only 1 digit, and has the same significance as the last digit of the mantissa!
I_1 = 0.1±1 × 10^1,  I_2 = 0.15±5 × 10^2
I_1 + I_2 = 0.16±6 × 10^2
I_1 − I_2 = −0.14±6 × 10^2
I_1 × I_2 = 15±25 = .15±.25 × 10^2 = .1±4 × 10^2
I_1 ÷ I_2 = .06666±.13333 × 10^0 = 0±2 × 10^0

.345±4 × 10^2 + .234±5 × 10^2 = .579±9 × 10^2
.345±5 × 10^2 + .234±5 × 10^2 = .579±10 × 10^2 ⇒ .57±2 × 10^2
.345±5 × 10^2 + .234±6 × 10^2 = .579±11 × 10^2 ⇒ .57±2 × 10^2
.345±9 × 10^2 + .234±9 × 10^2 = .579±18 × 10^2 ⇒ .57±3 × 10^2
Rational Arithmetic
R_1 = p_1/q_1,  R_2 = p_2/q_2
Addition:
R_1 + R_2 = p_1/q_1 + p_2/q_2 = (p_1 q_2 + q_1 p_2)/(q_1 q_2)
Subtraction:
R_1 − R_2 = p_1/q_1 − p_2/q_2 = (p_1 q_2 − q_1 p_2)/(q_1 q_2)
Rational Arithmetic
R_1 = p_1/q_1,  R_2 = p_2/q_2
Multiplication:
R_1 × R_2 = p_1/q_1 × p_2/q_2 = (p_1 p_2)/(q_1 q_2)
Division:
R_1 ÷ R_2 = p_1/q_1 ÷ p_2/q_2 = (p_1 q_2)/(q_1 p_2)
division error if p_2 = 0
Rational Arithmetic
Let's restrict the denominator to always be positive, so we only have one sign to check.
So we modify the rule for division.
R_1 ÷ R_2 = p_1/q_1 ÷ p_2/q_2 =
  (p_1 q_2)/(q_1 p_2)      if p_2 > 0
  (−p_1 q_2)/(−q_1 p_2)    if p_2 < 0
  division error           if p_2 = 0
We also always want the rational to be in a reduced form, so that checking for equality is easier.
In order to reduce we must find the greatest common divisor (gcd), D, and replace p, q with p′, q′ where
p′ = p/D,  q′ = q/D
How do we find D?
Euclid
Euclidean Algorithm to find the gcd
Let’s use the Euclidean Algorithm to find the gcd of two positive numbers n1 & n2 by successive division.
• Divide the larger integer by the smaller to obtain an integer quotient, d_1, and an integer remainder, n_3. Let n_1 ≥ n_2 (switch if needed).
n_1 = d_1 n_2 + n_3
d_1 = ⌊n_1/n_2⌋
n_3 = n_1 (mod n_2)
The gcd of n_1 & n_2 is also a divisor of n_3. The gcd of n_2 & n_3 is also a divisor of n_1. So we can shift the problem of finding gcd(n_1, n_2) into finding gcd(n_2, n_3).
We continue this until one of the numbers is 0, then the other is the gcd.
Examples finding gcd
pair       Relation
144, 78    144 = 1×78 + 66
78, 66     78 = 1×66 + 12
66, 12     66 = 5×12 + 6
12, 6      12 = 2×6 + 0
6, 0       gcd is 6

pair       Relation
205, 55    205 = 3×55 + 40
55, 40     55 = 1×40 + 15
40, 15     40 = 2×15 + 10
15, 10     15 = 1×10 + 5
10, 5      10 = 2×5 + 0
5, 0       gcd is 5
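The successive divisions above are exactly the Euclidean algorithm; as code (a sketch, the function name is mine):

```python
def gcd(n1, n2):
    """Euclid: repeatedly replace (n1, n2) by (n2, n1 mod n2);
    when one number reaches 0, the other is the gcd."""
    while n2 != 0:
        n1, n2 = n2, n1 % n2
    return n1

print(gcd(144, 78))   # 6
print(gcd(205, 55))   # 5
```

If the arguments arrive in the wrong order, the first iteration simply swaps them, so no explicit switch is needed.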
Errors
• Classes of errors
• Types of errors
• Combating errors
CIS 541 Nonsolvable problems
Nonsolvable problems
Can we solve a = b with a computer? No.
How about any other relation? No:
a > b, a < b, a ≥ b, a ≤ b
Why? Because of error; we don't know what it is.
What about a = 0?
How about: is a within 10^-n of b?
In other words, does
|a − b| ≤ ε,  ε = 10^-n?
Yes. We can always set the precision to greater than 10^-n.
Non Solvable problems
Can we obtain a correct k decimal-place, fixed-point approximation to any real number c? No.
Consider the following approximations of c with increasing precision:
.1111150±2
.111115000±5
.1111150000000±3
.1111150000000000000±6
...
So is .11111 or .11112 the correct approximation for k = 5? We don’t know.
So do we have a problem here if we'd like to guarantee something about our solution?
Non Solvable problems
Any ways to get around the problem?
We can determine a correct k or k+1 decimal-place, fixed-point approximation to any real number c, so we avert the problem. This is a very subtle difference, and is important if implementing such a system.
So in the above, .111115 is correct for k+1 decimal places, so the approximation is correct for k or k+1 decimal places.
Ranged Relations
Let a and b be ranged approximations then:
if a overlaps b then a ≐ b
if a is completely to the left of b then a <̇ b
if a is completely to the right of b then a >̇ b
Remember the precision of a and b matters, if the precision changes, so too may the dotted relation.
Dotted Relation      Implied mathematical relation
a <̇ b               a < b
a >̇ b               a > b
a ≠̇ b               a ≠ b
a ≐ b               ?

Mathematical Relation    Implied dotted relation
a < b                    a ≤̇ b
a > b                    a ≥̇ b
a = b                    a ≐ b
a ≠ b                    ?
What does a ≥̇ b mean? How about a ≤̇ b?
CIS 541 Taylor’s Series
Taylor's Series
sin(x) = x − x^3/3! + x^5/5! − x^7/7! + …
if we graph the first few partial sums we see how the series converges to sin(x).
Notice how the series converges rapidly near the expansion point and slowly or not at all away from it.
What does this tell us about the choice of c?
The number of terms needed for the same precision increases as we go away from c.
[Figure: sin(x) and the first few partial sums of its Taylor series.]
Taylor's Series
f(x) = f(c) + f′(c)(x−c) + f″(c)/2! (x−c)^2 + f‴(c)/3! (x−c)^3 + …
     = Σ_{k=0}^{∞} f^(k)(c)/k! (x−c)^k
This is the Taylor series of f at c. If c = 0 it is also known as a Maclaurin series.
So with the Taylors series, if we know a heck of a lot of information about a function at a single point, we can use that information to reconstruct the entire function, as opposed to methods that just know a little bit about the function at a lot of points.
When did Taylor do this work? He wrote it in a letter in 1712 and published a book in 1715. Brook Taylor (18 Aug 1685, Edmonton, Middlesex, England; died 29 Dec 1731) was not quite nobility but wealthy; he was home schooled and then went to Cambridge.
Taylor created what is now called the "calculus of finite differences", invented integration by parts, and discovered the series known as Taylor's expansion (1715). He devised the basics of perspective (projective geometry); he named it linear perspective and defined the vanishing point. More impressive than just one theorem, but his work was difficult to follow, he did not elaborate enough, and he died early (at 46). He fought a lot with Bernoulli (non-English).
Taylor's Theorem
If f has continuous derivatives of order 0, 1, 2, …, n in a closed interval I = [a, b], then for any x and c in I,
f(x) = Σ_{k=0}^{n−1} f^(k)(c)/k! (x−c)^k + R_n
where
R_n = f^(n)(ξ)(x−c)^n / n!   (Lagrange form)
or
R_n = f^(n)(ξ)(x−ξ)^(n−1)(x−c) / (n−1)!   (Cauchy form)
and ξ is a point that lies between x and c.
R_n is the remainder or error term.
Taylor's Theorem: Example
What is the Taylor series for sin(x)? Let's choose c = 0.
f(x) = sin(x)        f(c) = sin(0) = 0
f′(x) = cos(x)       f′(c) = cos(0) = 1
f″(x) = −sin(x)      f″(c) = −sin(0) = 0
f‴(x) = −cos(x)      f‴(c) = −cos(0) = −1
f^(4)(x) = sin(x)    f^(4)(c) = sin(0) = 0
f^(5)(x) = cos(x)    f^(5)(c) = cos(0) = 1
f^(6)(x) = −sin(x)   f^(6)(c) = −sin(0) = 0
f^(7)(x) = −cos(x)   f^(7)(c) = −cos(0) = −1
f^(8)(x) = sin(x)    f^(8)(c) = sin(0) = 0
f^(9)(x) = cos(x)    f^(9)(c) = cos(0) = 1
f(x) = Σ_{k=0}^{∞} f^(k)(c)/k! (x−c)^k
     = f^(0)(0)/0! x^0 + f^(1)(0)/1! x^1 + f^(2)(0)/2! x^2 + f^(3)(0)/3! x^3 + f^(4)(0)/4! x^4 + f^(5)(0)/5! x^5 + …
     = 0/0! x^0 + 1/1! x^1 + 0/2! x^2 + (−1)/3! x^3 + 0/4! x^4 + 1/5! x^5 + …
     = x − x^3/3! + x^5/5! − x^7/7! + …
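The series just derived can be summed term by term; each term comes from the previous one by multiplying by −x^2/((2k+2)(2k+3)) (a sketch; the names are mine):

```python
import math

def taylor_sin(x, n_terms):
    """Partial sum of x - x^3/3! + x^5/5! - ... with n_terms terms."""
    total, term = 0.0, x
    for k in range(n_terms):
        total += term
        # next odd-power term: multiply by -x^2 / ((2k+2)(2k+3))
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

print(taylor_sin(1.0, 8))   # very close to math.sin(1.0)
```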
Taylor's Theorem: Example
What is the Taylor series for cos(x) at c = 0?
f(x) = cos(x)        f(c) = cos(0) = 1
f′(x) = −sin(x)      f′(c) = −sin(0) = 0
f″(x) = −cos(x)      f″(c) = −cos(0) = −1
f^(3)(x) = sin(x)    f^(3)(c) = sin(0) = 0
f^(4)(x) = cos(x)    f^(4)(c) = cos(0) = 1
f^(5)(x) = −sin(x)   f^(5)(c) = −sin(0) = 0
f^(6)(x) = −cos(x)   f^(6)(c) = −cos(0) = −1
f^(7)(x) = sin(x)    f^(7)(c) = sin(0) = 0
f^(8)(x) = cos(x)    f^(8)(c) = cos(0) = 1
cos(x) = 1/0! x^0 + 0/1! x^1 + (−1)/2! x^2 + 0/3! x^3 + 1/4! x^4 + 0/5! x^5 + …
       = 1 − x^2/2! + x^4/4! − x^6/6! + …
Taylor's Theorem: Example
What is the Taylor series for 1/(1−x)?
f(x) = 1/(1−x)           f(c) = f(0) = 1/(1−0) = 1
f′(x) = 1/(1−x)^2        f′(c) = 1
f″(x) = 2/(1−x)^3        f″(c) = 2
f^(3)(x) = 6/(1−x)^4     f^(3)(c) = 6
f^(4)(x) = 24/(1−x)^5    f^(4)(c) = 24
f^(5)(x) = 5!/(1−x)^6    f^(5)(c) = 5!
f^(6)(x) = 6!/(1−x)^7    f^(6)(c) = 6!
f^(7)(x) = 7!/(1−x)^8    f^(7)(c) = 7!
f^(8)(x) = 8!/(1−x)^9    f^(8)(c) = 8!
f(x) = 1/0! x^0 + 1/1! x^1 + 2/2! x^2 + 3!/3! x^3 + 4!/4! x^4 + 5!/5! x^5 + …
     = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 + …
Taylor's Theorem
What do the previous examples tell us about the choice of c? It can affect the complexity of the terms of the series, which affects the speed and accuracy of the calculation. But remember that far away from c means more terms, so we need to consider both.
What about trig functions that repeat? Can we do something like shift the domain so the argument is always, say, between −π and π?
Mean Value Theorem
If f is a continuous function on the closed interval [a, b] and possesses a derivative at each point of the open interval (a, b), then
f(b) = f(a) + (b − a)f′(ξ) for some ξ in (a, b), so
f′(ξ) = (f(b) − f(a))/(b − a)
So we have an approximation for f′(x) at any x within the interval (a, b).
Mean Value Theorem
What does the mean value theorem look like geometrically?
[Figure: a curve through (a, f(a)) and (b, f(b)); the secant between them is parallel to the tangent at some ξ in (a, b).]
There is some ξ between a and b such that the slope of the secant line between f(a) and f(b) equals the derivative at ξ, i.e. f′(ξ) = (f(b) − f(a))/(b − a).
Taylor's Theorem for f(x + h)
If f has continuous derivatives of order 0, 1, 2, …, (n+1) in a closed interval I = [a, b], then for any x in I,
f(x + h) = Σ_{k=0}^{n} f^(k)(x)/k! h^k + E_{n+1}
where h is any value such that x + h is in I, and where
E_{n+1} = f^(n+1)(ξ)/(n+1)! h^(n+1)
ξ is a point that lies between x and x + h, and E_{n+1} is the error term.
Where did this come from?
Let x ← x + h and c ← x in the earlier form of Taylor's theorem.
Alternating Series Theorem
If a_1 ≥ a_2 ≥ a_3 ≥ … ≥ 0 and lim_{n→∞} a_n = 0, then the alternating series a_1 − a_2 + a_3 − a_4 + … converges. That is:
Σ_{k=1}^{∞} (−1)^(k−1) a_k = lim_{n→∞} Σ_{k=1}^{n} (−1)^(k−1) a_k = lim_{n→∞} S_n = S
where S is its sum and S_n is the nth partial sum. Also, for all n,
|S − S_n| ≤ a_{n+1}
So if the magnitudes of the terms in an alternating series converge monotonically to zero, then the error in truncating the series is no larger than the magnitude of the first omitted term.
What does this tell us about calculating sin & cos?
We can bound the error based on the first term we drop.
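For sin this bound is directly usable: truncate the series and report the magnitude of the first omitted term as the error bound (a sketch; the names are mine, and the bound is valid once the term magnitudes are decreasing, e.g. for |x| ≤ 1):

```python
import math

def sin_with_bound(x, n_terms):
    """Truncated sine series plus the alternating-series error bound:
    the magnitude of the first omitted term."""
    total, term = 0.0, x
    for k in range(n_terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total, abs(term)   # term now holds the first omitted term

approx, bound = sin_with_bound(0.5, 4)
print(abs(math.sin(0.5) - approx) <= bound)   # True
```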