
2.6 Polynomial-Based Methods

Figure 2.12. Quadratic fit with three-point pattern.

If the function value and its derivative are known at each point, the complete information at two points can be used to fit a cubic polynomial. The basic idea is to start from an initial interval bracketing the minimum and to reduce the interval of uncertainty using polynomial information. Robust algorithms are obtained when conditioning and sectioning ideas are integrated.

Quadratic Fit Algorithm

Assume that the function is unimodal, and start with an initial three-point pattern, obtainable using the algorithm given in Section 2.3. Let the points be numbered as [x1, x2, x3] with [f1, f2, f3] being the corresponding function values, where x1 < x2 < x3 and f2 ≤ min(f1, f3). Figure 2.12 depicts the situation, where we see that the minimum can be either in [x1, x2] or in [x2, x3]. Thus, [x1, x3] is the interval of uncertainty. A quadratic can be fitted through these points as shown in Fig. 2.12. An expression for the quadratic function can be obtained by letting q(x) = a + bx + cx^2 and determining coefficients a, b, and c using [x1, f1], [x2, f2], [x3, f3]. Alternatively, the Lagrange polynomial may be fitted as

$$
q(x) = f_1\,\frac{(x-x_2)(x-x_3)}{(x_1-x_2)(x_1-x_3)} + f_2\,\frac{(x-x_1)(x-x_3)}{(x_2-x_1)(x_2-x_3)} + f_3\,\frac{(x-x_1)(x-x_2)}{(x_3-x_1)(x_3-x_2)} \tag{2.22}
$$

The minimum point, call it x4, can be obtained by setting dq/dx = 0, which yields

$$
x_4 = \frac{\dfrac{f_1}{A}\,(x_2+x_3) + \dfrac{f_2}{B}\,(x_1+x_3) + \dfrac{f_3}{C}\,(x_1+x_2)}{2\left(\dfrac{f_1}{A} + \dfrac{f_2}{B} + \dfrac{f_3}{C}\right)},
\qquad\text{where } A=(x_1-x_2)(x_1-x_3),\ B=(x_2-x_1)(x_2-x_3),\ C=(x_3-x_1)(x_3-x_2) \tag{2.23}
$$
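To make Eq. (2.23) concrete, a minimal sketch in Python follows (the function name and interface are mine, not those of Program Quadfit); it assumes the three points are distinct and not collinear in f, so that A, B, C and the denominator are nonzero.

```python
def quad_fit_min(x1, f1, x2, f2, x3, f3):
    """Minimum of the quadratic through (x1,f1), (x2,f2), (x3,f3), Eq. (2.23)."""
    A = (x1 - x2) * (x1 - x3)
    B = (x2 - x1) * (x2 - x3)
    C = (x3 - x1) * (x3 - x2)
    num = (f1 / A) * (x2 + x3) + (f2 / B) * (x1 + x3) + (f3 / C) * (x1 + x2)
    den = 2.0 * (f1 / A + f2 / B + f3 / C)
    return num / den

# For f(x) = (x - 1.5)**2 with x1, x2, x3 = 0, 1, 3, this returns exactly 1.5,
# since the function is itself quadratic.
```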

The value f4 = f(x4) is evaluated and, as per Fig. 2.8, a new interval of uncertainty is identified after comparing f2 with f4, as

[x1, x2, x3]_new =
    [x1, x2, x4]  if x4 > x2 and f(x4) ≥ f2
    [x2, x4, x3]  if x4 > x2 and f(x4) < f2
    [x4, x2, x3]  if x4 < x2 and f(x4) ≥ f2
    [x1, x4, x2]  if x4 < x2 and f(x4) < f2
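The four cases can be coded as a single update step. In the sketch below (names mine), the stored function values travel with the points so that no evaluations are repeated:

```python
def update_pattern(x1, f1, x2, f2, x3, f3, x4, f4):
    """New three-point pattern after evaluating f4 = f(x4)."""
    if x4 > x2:
        if f4 >= f2:
            return x1, f1, x2, f2, x4, f4   # minimum trapped in [x1, x4]
        return x2, f2, x4, f4, x3, f3       # minimum trapped in [x2, x3]
    if f4 >= f2:
        return x4, f4, x2, f2, x3, f3       # minimum trapped in [x4, x3]
    return x1, f1, x4, f4, x2, f2           # minimum trapped in [x1, x2]
```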

The process above is repeated until convergence; see the stopping criteria in Eqs. (2.20) and (2.21). However, the pure quadratic fit procedure just described is prone to fail [Robinson 1979; Gill et al. 1981]. For this reason, safeguards are necessary, as detailed in the following.

Safeguards in Quadratic (Polynomial) Fit

Three main safeguards are necessary to ensure robustness of polynomial fit algorithms:

(1) The point x4 must lie within the interval [x1, x3] – this will be the case for the quadratic fit above with the stated assumptions, but needs to be monitored in general.

(2) The point x4 must not be too close to any of the three existing points, else the subsequent fit will be ill-conditioned. This is especially needed when the polynomial fit algorithm is embedded in an n-variable routine (Chapter 3). This is taken care of by defining a measure δ and moving or bumping the point away from the existing point by this amount. The following scheme is implemented in Program Quadfit.

if |x4 − x1| < δ, then set x4 = x1 + δ
if |x4 − x3| < δ, then set x4 = x3 − δ
if |x4 − x2| < δ and x2 > 0.5(x1 + x3), then set x4 = x2 − δ
if |x4 − x2| < δ and x2 ≤ 0.5(x1 + x3), then set x4 = x2 + δ

where δ > 0 is small enough. For instance, δ = s · min{|x3 − x2|, |x2 − x1|}, where 0 < s < 1/2 (see the sfrac parameter in the program, default = 1/8).
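A sketch of this bump scheme, with the default s = 1/8, is given below (the function name is mine; Program Quadfit's actual source may differ in detail):

```python
def bump_away(x4, x1, x2, x3, s=0.125):
    """Safeguard (2): keep the trial point at least delta from x1, x2, x3."""
    delta = s * min(abs(x3 - x2), abs(x2 - x1))
    if abs(x4 - x1) < delta:
        x4 = x1 + delta
    elif abs(x4 - x3) < delta:
        x4 = x3 - delta
    elif abs(x4 - x2) < delta:
        # move x4 into the larger of the two sub-intervals
        x4 = x2 - delta if x2 > 0.5 * (x1 + x3) else x2 + delta
    return x4
```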

(3) The third safeguard has to do with very slow convergence as measured by reductions in the length of the intervals. Figure 2.13 illustrates the situation. The old three-point pattern is [x1, x2, x3]. Upon fitting a quadratic and minimizing, a new point x4 is obtained. If f(x4) < f(x2), then the new three-point pattern is [x2, x4, x3]. However, since x4 is close to x2, Lnew/Lold is close to unity. For instance, if this ratio equals 0.95, then only a 5% reduction in interval length has been achieved.

Figure 2.13. Slow convergence in a polynomial fit algorithm.

In fact, a golden section step would have reduced the interval by the factor τ = 0.618. This motivates the following logic: if Lnew/Lold > τ at any step, or for two consecutive steps, then a golden section step is implemented as

if x2 ≤ (x1 + x3)/2 then x4 = x2 + (1 − τ)(x3 − x2);
else x4 = x2 − (1 − τ)(x2 − x1);
endif

The reader may verify that turning off this safeguard in program quadfit leads to failure in minimizing f = exp(x)−5 x, starting from an initial interval [−10, 0, 10].
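A sketch of this fallback (names mine) is shown below; the caller would invoke it whenever the interval-reduction test on Lnew/Lold fails:

```python
TAU = 0.618  # golden section ratio, as in the text

def golden_step(x1, x2, x3):
    """Safeguard (3): golden section point in the larger sub-interval of [x1, x3]."""
    if x2 <= 0.5 * (x1 + x3):
        return x2 + (1.0 - TAU) * (x3 - x2)   # larger sub-interval is [x2, x3]
    return x2 - (1.0 - TAU) * (x2 - x1)       # larger sub-interval is [x1, x2]
```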

Stopping Criteria

(i) Based on the total number of function evaluations.

(ii) Based on the x-value: stop if the interval length is sufficiently small, as
    xtol = √εm (1. + abs(x2))
    stop if abs(x3 − x1) ≤ 2.0 xtol, for two consecutive iterations

(iii) Based on the f-value: stop if the change in the average function value within the interval is sufficiently small, as
    ftol = √εm (1. + abs(f2))
    stop if abs(f̄new − f̄old) ≤ ftol, for two consecutive iterations
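These two tests translate directly into code. The sketch below (names mine) leaves the bookkeeping of "two consecutive iterations" to the caller:

```python
import sys

EPS_M = sys.float_info.epsilon  # machine epsilon

def x_converged(x1, x2, x3):
    """Stopping test (ii): bracket [x1, x3] small relative to x2."""
    xtol = EPS_M ** 0.5 * (1.0 + abs(x2))
    return abs(x3 - x1) <= 2.0 * xtol

def f_converged(favg_new, favg_old, f2):
    """Stopping test (iii): average function value no longer changing."""
    ftol = EPS_M ** 0.5 * (1.0 + abs(f2))
    return abs(favg_new - favg_old) <= ftol
```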


Figure 2.14. Brent's quadratic fit method.

Brent’s Quadratic Fit – Sectioning Algorithm

The basic idea of the algorithm is to fit a quadratic polynomial when applicable and to accept the quadratic minimum if certain criteria are met. Golden sectioning is carried out otherwise. Brent's method starts with bracketing of an interval that has the minimum [Brent 1973]. At any stage, five points a, b, x, v, w are considered.

These points may not all be distinct. Points a and b bracket the minimum. x is the point with the least function value, w is the point with the second least function value, v is the previous value of w, and u is the point where the function has been most recently evaluated. See Fig. 2.14. A quadratic fit is tried for x, v, and w whenever they are distinct. The quadratic minimum point q is at

$$
q = x - \tfrac{1}{2}\,\frac{(x-w)^2\,[f(x)-f(v)] - (x-v)^2\,[f(x)-f(w)]}{(x-w)\,[f(x)-f(v)] - (x-v)\,[f(x)-f(w)]} \tag{2.24}
$$

The quadratic fit is accepted when the minimum point q is likely to fall inside the interval and is well conditioned; otherwise, the new point is introduced using golden sectioning. The new point introduced is called u. From the old set a, b, x, v, w and the new point u, the new set a, b, x, v, w is established and the algorithm proceeds.
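A sketch of the trial point computation in Eq. (2.24) follows (the function name is mine); when the denominator vanishes the parabola is degenerate and a golden sectioning step should be taken instead:

```python
def parabolic_step(x, fx, w, fw, v, fv):
    """Trial minimum q of the parabola through (x, fx), (w, fw), (v, fv), Eq. (2.24)."""
    num = (x - w) ** 2 * (fx - fv) - (x - v) ** 2 * (fx - fw)
    den = (x - w) * (fx - fv) - (x - v) * (fx - fw)
    if den == 0.0:
        return None            # degenerate fit: fall back to golden sectioning
    return x - 0.5 * num / den
```

Brent's method applies its acceptance tests (q likely inside a–b, adequate step-length reduction) before using such a point.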

Brent’s Algorithm for Minimum

1. Start with a three-point pattern a, b, and x, with a, b forming the interval and the least value of the function at x.

2. w and v are initialized at x.

Figure 2.15. Two-point pattern.

3. If the points x, w, and v are all distinct, then go to 5.

4. Calculate u using golden sectioning of the larger of the two intervals x–a or x–b. Go to 7.

5. Try a quadratic fit for x, w, and v. If the quadratic minimum is likely to fall inside a–b, then determine the minimum point u.

6. If the point u is close to a, b, or x, then adjust u into the larger of x–a or x–b such that u is away from x by a minimum distance tol, chosen based on machine tolerance.

7. Evaluate the function value at u.

8. From among a, b, x, w, v, and u, determine the new a, b, x, w, v.

9. If the larger of the intervals x–a or x–b is smaller than 2 tol, convergence has been achieved: exit. Else go to 3.

Brent’s original program uses convergence based on the interval size only. In the program BRENTGLD included here, a criterion based on the function values is also introduced.

Other variations on when to switch from a quadratic fit to a sectioning step have been published [Chandrupatla 1988].
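For readers without access to BRENTGLD, a quick cross-check is possible with SciPy's minimize_scalar, which implements a Brent-type quadratic fit/golden sectioning method; using SciPy here is my suggestion, not part of the book's software:

```python
import math
from scipy.optimize import minimize_scalar

# Minimize f(x) = exp(x) - 5x, bracketed by the three-point pattern (-10, 0, 10).
res = minimize_scalar(lambda x: math.exp(x) - 5.0 * x,
                      bracket=(-10.0, 0.0, 10.0), method="brent")
print(res.x, res.fun)   # expected: x = ln 5 = 1.6094..., f = -3.0472...
```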

Cubic Polynomial Fit for Finding the Minimum

If derivative information is available, a cubic polynomial fit can be used. The first step is to establish a two-point pattern, shown in Fig. 2.15, such that

$$
\left.\frac{df}{dx}\right|_1 < 0 \ \text{ and }\ \operatorname{sgn}\!\left(\left.\frac{df}{dx}\right|_1\right) = -\operatorname{sgn}\!\left(\left.\frac{df}{dx}\right|_2\right),
\qquad\text{or}\qquad
\left.\frac{df}{dx}\right|_1 < 0 \ \text{ and }\ f_2 > f_1 \tag{2.25}
$$
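A direct check of Eq. (2.25), given the function values and derivatives at the two points, might look as follows (a sketch; the name is mine):

```python
def is_two_point_pattern(f1, df1, f2, df2):
    """True if points 1 and 2 bracket a minimum per Eq. (2.25)."""
    if df1 >= 0.0:
        return False                       # derivative at point 1 must be negative
    return df1 * df2 < 0.0 or f2 > f1      # sign change, or higher value at point 2
```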


Figure 2.16. Function on unit interval.

We make use of the unit interval shown in Fig. 2.16, where the following transformation is used

$$
\xi = \frac{x - x_1}{x_2 - x_1}
$$

Denoting f′ = df/dξ, we have

$$
f' = \frac{df}{d\xi} = (x_2 - x_1)\,\frac{df}{dx}
$$

The cubic polynomial fit is then represented by

$$
f = a\,\xi^3 + b\,\xi^2 + f_1'\,\xi + f_1, \quad\text{where}\quad a = f_2' + f_1' - 2\,(f_2 - f_1),\qquad b = 3\,(f_2 - f_1) - f_2' - 2\,f_1'
$$

By equating df/dξ = 0, and checking the condition d²f/dξ² > 0 for the minimum, we obtain the location of the minimum at

$$
\xi_p = \frac{-b + \sqrt{b^2 - 3\,a\,f_1'}}{3\,a} \tag{2.26}
$$

The above expression is used when b < 0. If b > 0, the alternative expression for accurate calculation is ξp = f1′ / (−b − √(b² − 3 a f1′)). The new point is introduced at xp = x1 + (x2 − x1) ξp, making sure that it is a minimum distance away from the end points of the interval. The new point is named 1 if sgn(f′) at the new point equals sgn(f1′), and named 2 otherwise. Convergence is established by checking the interval |x2 − x1|, or two consecutive values of 0.5(f1 + f2).
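The cubic step, Eq. (2.26) together with the transformation to the unit interval, is sketched below (names mine; the minimum-distance safeguard mentioned above is omitted and a non-degenerate cubic, a ≠ 0, is assumed):

```python
import math

def cubic_fit_step(x1, f1, df1, x2, f2, df2):
    """Trial point xp from a cubic fit on [x1, x2]; df1, df2 are df/dx values."""
    h = x2 - x1
    g1, g2 = h * df1, h * df2                 # derivatives with respect to xi
    a = g2 + g1 - 2.0 * (f2 - f1)
    b = 3.0 * (f2 - f1) - g2 - 2.0 * g1
    disc = math.sqrt(max(b * b - 3.0 * a * g1, 0.0))
    if b < 0.0:
        xi_p = (-b + disc) / (3.0 * a)
    else:
        xi_p = g1 / (-b - disc)               # equivalent form, avoids cancellation
    return x1 + h * xi_p

# For f(x) = x**3 - 3x on [0, 2] (f1=0, df1=-3, f2=2, df2=9) the cubic is exact
# and the step returns xp = 1, the true minimizer.
```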

Example 2.10

Obtain the solution for the following problems using the computer programs Quadfit S and Cubic2P:

f1 = x + 1/x
f2 = e^x − 5x
f3 = x^5 − 5x^3 − 20x + 5
f4 = 8x^3 − 2x^2 − 7x + 3
f5 = 5 + (x − 2)^6
f6 = 100(1 − x^3)^2 + (1 − x^2) + 2(1 − x)^2
f7 = e^x − 2x + 0.01/x − 0.000001/x^2

Solution

Computer programs BRENTGLD and CUBIC2P are used to arrive at the results, which are tabulated as follows.

Function      x*          f(x*)
1             1.0000       2.0000
2             1.6094      −3.0472
3             2.0000      −43.000
4             0.6298      −0.2034
5             2.0046       5.0000
6             1.0011      −1.1e-3
7             0.7032       0.6280

The reader is encouraged to experiment with other starting points, step sizes, convergence parameters, and functions.
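As a starting point for such experiments, the test functions can also be checked against an independent routine; the sketch below uses SciPy's minimize_scalar (an assumption on available software, not part of the book's programs) with brackets chosen by inspection:

```python
import math
from scipy.optimize import minimize_scalar

funcs = {
    1: lambda x: x + 1.0 / x,
    2: lambda x: math.exp(x) - 5.0 * x,
    3: lambda x: x**5 - 5.0 * x**3 - 20.0 * x + 5.0,
    5: lambda x: 5.0 + (x - 2.0)**6,
}
# Each bracket (a, b, c) is a valid three-point pattern: f(b) < f(a) and f(b) < f(c).
brackets = {1: (0.1, 1.5, 5.0), 2: (-2.0, 1.0, 4.0), 3: (0.0, 1.5, 3.0), 5: (0.0, 1.0, 4.0)}

for k, f in funcs.items():
    res = minimize_scalar(f, bracket=brackets[k], method="brent")
    print(k, round(res.x, 4), round(res.fun, 4))   # compare with the table above
```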


2.7 Shubert–Piyavskii Method for Optimization of Non-unimodal Functions