Mathematics Homework Solutions
Problem
#88166

Optimization Algorithm : Prony Series

1 Definition of a Prony Series
Let GR(t) be the shear stress relaxation modulus. Define G1 and G0 by the following limits:
...
From the shear relaxation modulus we can define a dimensionless relaxation modulus from:
....
The normalized shear stress relaxation modulus is often represented by a series expansion in exponential
terms:
....
where gi and i are material parameters. This series expansion is called the Prony series.
2 Dynamic Frequency Experiments
The viscoelastic behavior of a material is often determined from dynamic vibrational experiments.
In these experiments the material is exposed to small strain vibrations and the resulting storage
modulus (G0) and loss modulus (G00) are determined as a function of the applied frequency: G0(!),
G00(!).
3 Conversion from Dynamic Frequency Data to Prony Series
Data
In most finite element packages the linear viscoelasticity model is specified by a Prony series (Eq.
4). If only dynamic frequency data is available for a material then the Prony series can be determined from the following implicit equations [see the ABAQUS theory manual for a derivation]:
.....
These equations determine the storage modulus and the loss modulus for a given Prony series. In
this case we are interested in the reverse relationship, that is, if we know G0(!) and G00(!) then
what is the Prony series (G0, N, gi, i)?
To solve this problem we can use the following approach.
1. Pick a value for N.
2. Guess values for G0, gi, i.
3. Calculate G0(!) and G00(!).
4. Calculate the residual between the calculated dynamic data and the experimental dynamic
data.
5. Optimize the parameters G0, gi, and i using any minimization algorithm (e.g. the Simplex
method). Goto 3.
.....

("A Stochastic BFGS.pdf" file has codes in the end but there is a calling function missing,i think THIS IS EXACTLY WHAT I NEED TO GET LIKE THE "Result_1_Xugang.pdf" SOLUTION FILE)

NOTE: IGNORE THE FILE "A Stochastic BFGS Based Algorithm for Estimating the Angles of Arriving Signals Given Signal Measure"

Attached file(s):
Attachments
Result_1_Xugang.pdf  View File
A Stochastic BFGS Based Algorithm for Estimating the Angles of Arriving Signals Given Signal Measure  View File
Prony_Series_Conversion.pdf  View File
A Stochastic BFGS.pdf  View File

Attachment Content Summary (Note: view attachment at the above link before purchasing. Actual attachment content may vary slightly from that shown below.)

Result_1_Xugang.pdf
Result1 Xugang Ye


Minimization Problem:

Given M, N, k , G ' ( k ) , G ' ' ( k ) , k = 1, 2, ..., M.

2
M
N
N
g 2 2
Min F(x) = G0 1 - g i + G0 i i 2 2 - G ' ( k )

i =1 1 + i
k =1 i =1
2
M
N g i i
+ G0 - G ' ' ( k )
i =1 1 + i
2 2
k =1


where x = [ g1 g 2 L g N 1 2 L N G0 ]T


Test case 1:
Data:
M=
10

N=
20 Suppose M = 10 applied frequencies are given

=
Columns 1 through 8
0 0.69813 1.3963 2.0944 2.7925 3.4907 4.1888 4.8869
Columns 9 through 10
5.5851 6.2832


x_c = Suppose the components of this vector are the
parameters of the prony series when N=20.
Columns 1 through 8
0.66916 6.3267 7.5374 7.6966 8.0804 5.4333 4.8508
4.4122


Columns 9 through 16
6.265 0.77783 5.4406 2.091 2.7 2.9168 8.2222 4.115


Columns 17 through 24
5.2696 9.2021 3.7914 6.389 1.7374 7.858 3.6563
7.7694


Columns 25 through 32
3.4215 7.7418 1.4783 1.4824 6.3905 0.93373 9.2879
4.8514




1
Result1 Xugang Ye


Columns 33 through 40
1.6325 2.7634 1.3097 0.2322 2.313 8.4526 7.2636
6.9469


Column 41
3.8311




G() = Simulated data of G() based on x_c
Columns 1 through 8
-387.65 -77.135 -34.263 -20.629 -14.119 -10.152 -7.4067 -5.3797
Columns 9 through 10
-3.8308 -2.6212


G () = Simulated data of G () based on x_c
Columns 1 through 8
0 115.2 77.107 57.966 47.085 40.045 35.021 31.189
Columns 9 through 10
28.131 25.618




Check objective value at x = x_c

>>fun(x_c)

ans =

0




Now consider the reverse question, i.e. Given M, N, k , G ' ( k ) , G ' ' ( k ) , k = 1, 2, ..., M, what


can we guess on the parameters x = [ g1 g 2 L g N 1 2 L N G0 ]T ?


Optimization Result:

Best solution obtained

x_record =

Columns 1 through 8
0.80971 3.2419 1.3103 5.0878 -1.9735 0.50753 4.6924
2.0571



2
Result1 Xugang Ye


Columns 9 through 16
1.838 2.0784 1.277 0.52145 2.5334 5.1269 2.4789 2.674


Columns 17 through 24
0.45354 1.828 2.7385 8.5508 1.0306 1.2471 4.7994
5.9635


Columns 25 through 32
-1.8594 2.3027 2.7073 3.5924 3.3681 2.8415 3.5363
-0.00050518


Columns 33 through 40
4.821 4.6636 1.1114 -7.492 -1.0718 0.22405 -4.5537 4.3787


Column 41
8.2775


Check objective value at x = x_record
>> fun(x_record)

ans =

The small objective value (the square of residual)
0.002021
Implies that the global optimization algorithm
works well.




Test case 2:
Data:
M=
15

N=
26 Suppose M = 15 applied frequencies are given

=

Columns 1 through 8
0 0.4488 0.8976 1.3464 1.7952 2.244 2.6928 3.1416


Columns 9 through 15
3.5904 4.0392 4.488 4.9368 5.3856 5.8344 6.2832



3
Result1 Xugang Ye



Suppose the components of this vector are the
parameters of the prony series when N=26.
x_c =
Columns 1 through 8
3.8906 0.73572 7.8324 1.1222 2.1429 2.3871 2.0381
0.18639


Columns 9 through 16
1.4424 7.8538 4.6308 5.9869 7.2054 4.3135 6.3832
2.2004


Columns 17 through 24
2.759 8.33 0.16537 4.8086 2.3307 5.6307 4.9674 6.6058


Columns 25 through 32
0.97998 8.7865 1.8451 5.9962 2.9588 1.8651 8.6196
1.2596


Columns 33 through 40
0.24503 9.2057 3.4914 8.6439 3.7585 4.3772 4.1459
0.50289


Columns 41 through 48
7.487 5.0111 7.425 5.1547 2.1412 9.9622 0.4211 7.5895


Columns 49 through 53
6.8866 6.0377 3.1218 9.6499 1.9937



Simulated data of G() based on x_c
G() =
Columns 1 through 8
-208.78 -50.628 -25.774 -17.372 -12.813 -9.7933 -7.615 -5.9744


Columns 9 through 15
-4.7053 -3.704 -2.9017 -2.2501 -1.7147 -1.2704 -0.89827


G () = Simulated data of G () based on x_c
Columns 1 through 8
0 69 46.533 36.243 30.373 26.43 23.501 21.19
Columns 9 through 15
19.299 17.715 16.363 15.196 14.177 13.279 12.484




4
Result1 Xugang Ye




Check objective value at x = x_c

>>fun(x_c)

ans =

0


Now consider the reverse question, i.e. Given M, N, k , G ' ( k ) , G ' ' ( k ) , k = 1, 2, ..., M, what


can we guess on the parameters x = [ g1 g 2 L g N 1 2 L N G0 ]T ?


Optimization Result:

Best solution obtained

x_record =
Columns 1 through 8
3.6697 4.7429 -0.064169 1.4438 5.8411 -0.22379 2.9063 0.20217


Columns 9 through 16
2.9689 3.0736 2.4521 1.2544 2.0037 0.41609 1.2999
-0.54987


Columns 17 through 24
0.33734 -0.94177 0.67606 -0.40146 -2.7993 -1.5278 4.8888
-0.31515


Columns 25 through 32
3.0103 0.61745 6.5988 7.1688 -7.0854 2.4913 7.3009
-1.0654


Columns 33 through 40
3.879 -0.76659 6.2466 0.91458 2.2931 4.6557 0.35059 -0.17563


Columns 41 through 48
0.88762 1.8208 4.6257 3.6108 0.012266 -0.20495 0.95176
3.3586


Columns 49 through 53
5.4285 1.8973 2.6616 -9.4438 6.1438




5
Result1 Xugang Ye


Check objective value at x = x_record
>> fun(x_record)
The small objective value (the square of residual)
ans = Implies that the global optimization algorithm
works well.
0.00042005



Test case 3:
Data:
M=
22

N=
32 Suppose M = 22 applied frequencies are given

=

Columns 1 through 8
0 0.2992 0.5984 0.8976 1.1968 1.496 1.7952 2.0944


Columns 9 through 16
2.3936 2.6928 2.992 3.2912 3.5904 3.8896 4.1888
4.488


Columns 17 through 22
4.7872 5.0864 5.3856 5.6848 5.984 6.2832


Suppose the components of this vector are the
parameters of the prony series when N=32.
x_c =

Columns 1 through 8
9.5013 2.3114 6.0684 4.8598 8.913 7.621 4.5647
0.18504


Columns 9 through 16
8.2141 4.447 6.1543 7.9194 9.2181 7.3821 1.7627
4.0571


Columns 17 through 24
9.3547 9.169 4.1027 8.9365 0.57891 3.5287 8.1317
0.098613




6
Result1 Xugang Ye


Columns 25 through 32
1.3889 2.0277 1.9872 6.0379 2.7219 1.9881 0.15274
7.4679


Columns 33 through 40
4.451 9.3181 4.6599 4.1865 8.4622 5.2515 2.0265 6.7214


Columns 41 through 48
8.3812 0.1964 6.8128 3.7948 8.318 5.0281 7.0947
4.2889


Columns 49 through 56
3.0462 1.8965 1.9343 6.8222 3.0276 5.4167 1.5087
6.979


Columns 57 through 64
3.7837 8.6001 8.5366 5.9356 4.9655 8.9977 8.2163
6.4491


Column 65
8.1797


G() = Simulated data of G() based on x_c
Columns 1 through 8
-1307.6 -484.43 -225.35 -134.1 -92.142 -69.488 -55.712 -46.506


Columns 9 through 16
-39.87 -34.788 -30.708 -27.311 -24.407 -21.875 -19.635 -17.632


Columns 17 through 22
-15.829 -14.196 -12.71 -11.355 -10.116 -8.9795


G () = Simulated data of G () based on x_c
Columns 1 through 8
0 542.49 402.93 309.32 249.27 208.84 180.26 159.18


Columns 9 through 16
143.06 130.35 120.07 111.57 104.39 98.232 92.87
88.143


Columns 17 through 22
83.931 80.143 76.709 73.576 70.701 68.048




7
Result1 Xugang Ye




Check objective value at x = x_c

>>fun(x_c)

ans =

0


Now consider the reverse question, i.e. Given M, N, k , G ' ( k ) , G ' ' ( k ) , k = 1, 2, ..., M, what


can we guess on the parameters x = [ g1 g 2 L g N 1 2 L N G0 ]T ?


Optimization Result:

Best solution obtained

x_record =

Columns 1 through 8
3.9633 2.8855 9.3555 -0.13221 6.9519 8.3211 2.8529 -3.1125


Columns 9 through 16
3.5588 7.9814 4.1634 4.8774 2.51 3.2234 -3.4464
-3.2743


Columns 17 through 24
3.2994 0.36456 5.4733 0.89861 6.1814 2.6212 0.53085
3.6368


Columns 25 through 32
5.5941 5.0407 -1.161 2.4554 2.8642 7.5711 4.4227
3.4429


Columns 33 through 40
5.5456 4.9772 5.3442 3.2425 8.4135 6.409 -7.1237
-3.7641


Columns 41 through 48
3.8005 4.417 3.5607 7.768 -2.748 3.0555 -4.9913 -2.4868


Columns 49 through 56
1.6644 0.0019356 1.6797 -2.4028 5.8827 3.1205 -2.2483 -3.9093



8
Result1 Xugang Ye




Columns 57 through 64
2.2476 6.3323 3.3689 5.3787 0.19748 7.8437 6.7071
1.6635


Column 65
12.705


Check objective value at x = x_record
>> fun(x_record)

ans = The small objective value (the square of residual)
Implies that the global optimization algorithm
0.00092688 works well.




9
Prony_Series_Conversion.pdf
Calculation of Prony Series Parameters From Dynamic
Frequency Data
Jš rgen Bergstrš m, Ph.D.
o o
jorgen@polymerfem.com
http://PolymerFEM.com


1 Definition of a Prony Series
Let GR (t) be the shear stress relaxation modulus. Define G and G0 by the following limits:

G = lim GR (t) (1)
t
G0 = GR (0). (2)

From the shear relaxation modulus we can define a dimensionless relaxation modulus from:
GR (t)
gR (t) = . (3)
G0
The normalized shear stress relaxation modulus is often represented by a series expansion in ex-
ponential terms:
N
gR (t) = 1 - gi 1 - e-t/i , (4)
i=1

where gi and i are material parameters. This series expansion is called the Prony series.


2 Dynamic Frequency Experiments
The viscoelastic behavior of a material is often determined from dynamic vibrational experiments.
In these experiments the material is exposed to small strain vibrations and the resulting storage
modulus (G ) and loss modulus (G ) are determined as a function of the applied frequency: G (),
G ().


3 Conversion from Dynamic Frequency Data to Prony Series
Data
In most finite element packages the linear viscoelasticity model is specified by a Prony series (Eq.
4). If only dynamic frequency data is available for a material then the Prony series can be deter-




c Jš rgen S. Bergstrš m, 2005.
o o 1
jorgen@polymerfem.com
mined from the following implicit equations [see the ABAQUS theory manual for a derivation]:
N N
gi 2 2

G () = G0 +G
1 - i
gi 0 (5)

i=1

i=1
1 + 2 2
i
N
gi i
G () = G0 . (6)
i=1
1 + 2 2
i

These equations determine the storage modulus and the loss modulus for a given Prony series. In
this case we are interested in the reverse relationship, that is, if we know G () and G () then
what is the Prony series (G0 , N, gi , i )?


To solve this problem we can use the following approach.

1. Pick a value for N.

2. Guess values for G0 , gi , i .

3. Calculate G () and G ().

4. Calculate the residual between the calculated dynamic data and the experimental dynamic
data.

5. Optimize the parameters G0 , gi , and i using any minimization algorithm (e.g. the Simplex
method). Goto 3.




c Jš rgen S. Bergstrš m, 2005.
o o 2
jorgen@polymerfem.com
A Stochastic BFGS.pdf
Technical Report



A Stochastic BFGS Based Algorithm for Estimating the Angles
of Arriving Signals Given Signal Measurements
Xugang Ye

Applied Mathematics, Florida State University


Abstract

In this paper we developed a stochastic optimization algorithm to solve a well-known problem in statistical
signal processing called "direction of arrival estimation". In this problem the angles of arriving signals are
to be determined to maximize the fit of the direction vectors model to the observations. Since the data
fitting turns out to be a problem of maximizing a likelihood function or, equally but simpler, minimizing an
energy function; and since the objective function is smooth but not convex, we choose derivative based
stochastic global optimization technique. In this study, a stochastic BFGS based simulated annealing
algorithm was implemented to find the near optimal solution for cases of one, two and more than two
variables. The results were compared with those of deterministic gradient methods and methods of
stochastic gradient.

Key words: Angles of Arriving Signals; Stochastic Optimization; BFGS; Metropolis-Hastings Simulated
Annealing




1.Introduction of The Problem

We look at a well-known problem in statistical signal processing called the "direction of arrival
estimation". As Figure 1.1 shows, consider a group of n sensors (electromagnetic, acoustic or
others) lying in a straight line on the ground at the equal spacing of d. Also, imagine a signal
transmitter (lying in the same plane as the array) at an unknown angular location , defined with
respect to the array. The signal transmitted by the transmitter is received at the sensors and
measured to generate observations.




Figure 1.1 Angel of arriving signal




1
Technical Report



As the wave traverses the array, it reaches different sensors at different lags. These lags are
dependent upon the angle at which the signal is arriving. Therefore, we can estimate its
direction of arrival by estimating these lags. Let n be the number of sensors in the array. Define a
n-vector d() Cn, called a direction vector, according to

d ( ) = [1 e - i e - i2 L e - i( n -1) ]T (1.1)

where [0; 2] is the angle of arrival, is the phase-lag between the signal reaching
successive sensors (given by = cos()). So, for a signal transmitted at angle , the array
measurement is given by the vector d(). If there are multiple transmitters then their cumulative
measurement is the superposition of their received signals. In addition, we assume a certain
observation noise to be present near the array; we model it as additive complex Gaussian noise w
Cn. Each element of w is independent complex normal, with i.i.d real and imaginary parts given
by N (0,2). Under these assumptions, the observation vector is modeled as:

m
y = d ( j ) + w Cn (1.2)
j =1


For this signal model, the likelihood function is given by,

2
m
1 1
L( y | ) = exp(- y - d ( j ) ) , (1.3)
(2 2 ) n 2 j =1



where = [1, 2, ..., m]. Our goal is to find the maximum likelihood estimate of :


= arg max L( y | ) = arg min f ( ) , (1.4)



where
2
m
f() = y - d ( j ) .
j =1
(1.5)



Let

y = (a1+ib1 a2+ib2 ... an+ibn)T, (1.6)

Since

d( j) = (cos(0j)-isin(0j) cos(1j)-isin(1j) ... cos((n-1)j)-isin((n-1)j))T (1.7)


So,




2
Technical Report


n m m
f() = {(ak - cos[(k - 1) cos j ]) 2 + (bk + sin[(k - 1) cos j ]) 2 }
k =1 j =1 j =1
(1.8)


The function (1.8) is the smooth objective function that we minimize.


2. Optimization Methods

For unconstrained optimization problem, say min f ( x) , although a wide spectrum of methods
m
xR
exists, they can be broadly categorized in terms of the derivative information (f(x)) that is, or is
not, used. Search methods that use only function evaluations (e.g., the simplex search of Nelder
and Mead) are most suitable for problems that are very nonlinear or have a number of
discontinuities. Gradient methods are generally more efficient when the function to be minimized
is continuous in its first derivative. Higher order methods, such as Newton's method, are only
really suitable when the second order information is readily and easily calculated since
calculation of second order information, using numerical differentiation, is computationally
expensive. Gradient methods use information about the slope of the function to determine a
direction of search where the minimum is thought to lie. The simplest of these is the method of
steepest descent in which a search is performed in a direction along negative gradient -f(x). This
method is very inefficient when the function to be minimized has long narrow valleys.

Of the methods that use gradient information, the most favored are the quasi-Newton methods.
Newton-type methods (as opposed to quasi-Newton methods) calculate Hessian matrix H(x)
directly and proceed in a direction of descent using a line search method to locate the local
minimum after a number of iterations. Calculating Hessian numerically involves a large amount
of computation. Quasi-Newton methods avoid this by using the observed behavior of objective
function and its gradient to build up curvature information to make an approximation to Hessian
using an appropriate updating technique. A large number of Hessian updating methods have been
developed. Generally, the formula of Broyden, Fletcher, Goldfarb, and Shanno (BFGS) is thought
to be the most effective for use in a general purpose method. The formula is given by:


T T T
qk qk H k sk sk H k
H k +1 = H k + T
- T (2.1)
qk sk sk H k sk

where

s k = x k +1 - xk (2.2)
q k = f ( xk +1 ) - f ( xk )
(2.3)


As a starting point, H0 can be set to any symmetric positive definite matrix, for example, the
identity matrix I. To avoid the inversion of the Hessian H, we can derive an updating method in
which the direct inversion of H is avoided by using a formula that makes an approximation of the
inverse Hessian at each update. Another well-known procedure is the DFP formula of Davidon,
Fletcher, and Powell. This uses the same formula as the above BFGS method (2.1-2.3) except that
qk is substituted for sk.



3
Technical Report



The gradient information is either supplied through analytically calculated gradients, or derived
by partial derivatives using a numerical differentiation method via finite differences. This
involves perturbing each of the design variables, xi (i=1,2,...,m), in turn and calculating the rate of
change in the objective function.

At each major iteration, k, a line search is performed in the direction

d k = - H k-1f ( x ( k ) ) (2.4)

When the step k is found through a line search method, the updated solution is given by

x ( k +1) = x ( k ) + k d k (2.5)

Although the Quasi-Newton method like BFGS or DFP finds its success in minimization of
smooth function, it is local search method. So, The quality of the solution found after satisfying
some termination criterion is determined by where the iteration starts. In other words,
Quasi-Newton method finds the local minimum. It does not tell us how to jump out of the region
around a local minimum.

A way of global search using Quasi-Newton method is to choose a population of start points then
apply the method one time for each of the initial start points. This is the well-known multi-starts
scheme. However, how to decide the initial population of start points remains a major problem to
be answered.

Another way, seemly much easier, is to add stochastic perturbation to the update scheme (2.5). A
favorable method is known as Langevin's method:

x ( k +1) = x ( k ) + k d k + 2 k k (2.6)


where k ~Nm(0,1), which denotes m dimensional normal random vector.

Simulated annealing is a commonly used metaheuristic designed to permit escaping from local
minimum. The key idea is to give an acceptance probability according to which a worse update is
accepted. The acceptance probability is generally written as:

exp( f ( x) - f ( y ))
( x, y ) = min{ ,1} (2.7)
T

Where, x is the current solution, and y is the solution calculated from an update scheme.

3. Algorithms

Look back at objective function (1.8), assume 1 m n, then, f()=f(1, 2,..., m). The algorithm
uses BFGS to generate a basic search direction; uses line search for a step along the BFGS
direction; uses Langevin's method (2.6) to generate a perturbation of the neighbor obtained by
BFGS; and use acceptance probability of simulated annealing (2.7) to confirm the solution update.
The algorithm is described as follows:



4
Technical Report



1. Initialize (0), H0=I, f( (0)) ,T0, k=0
2. Compute d k = - H k-1f ( ( k ) ) given (k), Hk,f( (k))
3. Compute step k such that f( (k)+kdk)= min f ( ( k ) + d k )


4. Compute one step update: y = (k )
+ k d k + 2 k k , k ~N(0,1)
exp( f ( ( k ) ) - f ( y ))
5. Set y with probability ( ( k ) , y ) = min{ ,1}
Tk
(k+1) =

(k) with probability 1- ( ( k ) , y )
6. Tk+1= Tk (where 0<<1), check if the maximum allowed iterations is exceeded, if not go to
step7. if yes, stop.
7. Compute s k = ( k +1) - ( k ) , f ( ( k +1) ) , q k = f ( ( k +1) ) - f ( ( k ) ) ,
T T T
qk qk H k sk sk H k
H k +1 = H k + T
- T . Set k=k+1, return to step 2.
qk sk sk H k sk


Remark: in the algorithm, steps 2-3 constitute the deterministic gradient search. Steps 2-4
constitute the stochastic gradient search (with perturbation term 2 k k added). Steps 2-7
constitute the Simulated Annealing stochastic BFGS search.


4. Numerical results

4.1. Minimize (1.8) in the case of m=1 with respect to data y1:

The given data:

n=10;

y1= [1.0469 + 0.1018i -0.5240 - 1.0591i -0.6202 + 0.7737i 0.9120 + 0.1550i
-0.1681 - 1.0776i -0.7272 + 0.4992i 0.8800 + 0.4639i -0.2039 - 1.0429i
-0.8885 + 0.4379i 0.9399 + 0.5882i];

Let:
y1 = (a1+ib1 a2+ib2 ... an+ibn)T;

Then,
10
f(x) = {(a
k =1
k - cos[(k - 1) x]) 2 + (bk + sin[(k - 1) x]) 2 }


with
x=cos, : 0 x: -

The objective function is graphically shown as Figure (4.1.1):



5
Technical Report




Figure 4.1.1 Objective function


Deterministic Gradient Search:



Starts at: Starts at:
x0=-3 x0=-1
Terminates at Terminates at
x=-1.4155 x=8.3095




Starts at: Starts at:
x0=1 x0=3
Terminates at Terminates at
x=2.0263 x=0.5366




Figure 4.1.2 Deterministic Gradient Search (1-D)




6
Technical Report



Stochastic Gradient Search:


Starts at: Starts at:
x0=-3 x0=-1
Terminates at Terminates at
x=2.0180 x=8.2936




Starts at: Starts at:
x0=1 x0=3
Terminates at Terminates at
x=2.0327 x=2.0294




Figure 4.1.3 Stochastic Gradient Search (1-D)


Simulated Annealing Stochastic Gradient Search:


Starts at: Starts at:
x0=-3 x0=-1
Terminates at Terminates at
x=-4.2484 x=8.3191




Starts at: Starts at:
x0=1 x0=3
Terminates at Terminates at
x=2.0244 x=-4.2530




Figure 4.1.4 Simulated Annealing Stochastic Gradient Search (1-D)




7
Technical Report




4.2 Minimize (1.8) in the case of m=2 with respect to data y2:

The given data:

n=10;

y2= [1.9535 + 0.1553i -0.3994 - 1.2171i 0.1925 + 0.1041i -1.0887 - 0.8207i
-0.6509 + 2.0954i 1.0196 - 0.1050i 0.2743 + 0.3010i 1.2685 - 1.1604i
-1.7885 - 0.8418i -0.1402 + 0.7339i];

Let:

y2 = (a1+ib1 a2+ib2 ... an+ibn)T;

Then,

10 2 2
f(x) = {(a -
k =1
k
j =1
cos[(k - 1) x j ]) 2 + (bk + sin[(k - 1) x j ]) 2 }
j =1


with

xj=cosj; j=1,2 j: 0 xj: -, j=1,2



The objective function is graphically shown as Figure (4.2.1):




Figure 4.2.1(a) Topology of Figure 4.2.1(b) Contour of Objective
Objective Function Function




8
Technical Report



Deterministic Gradient Search



Starts at: Starts at:
x0=(-3, -3) x0=(-1, -1)
Terminates at Terminates at
x=(2.7034, 2.7034) x=(-3.7171, -3.4235)




Starts at:
x0=(1, 1) Starts at:
Terminates at x0=(3, 3)
x=(0.9426, 1.2333) Terminates at
x=(2.7034, 2.7034)




Figure 4.2.2 Deterministic Gradient Search (2-D)


Stochastic Gradient Search




Starts at: Starts at:
x0=(-3, -3) x0=(-1, -1)
Terminates at Terminates at
x=(1.0725, 2.7304) x=(1.0867, -3.5651)




Starts at: Starts at:
x0=(1, 1) x0=(3, 3)
Terminates at Terminates at
x=(2.7264, 1.0733) x=(2.7013, 1.0873)




Figure 4.2.3 Stochastic Gradient Search (2-D)




9
Technical Report



Simulated Annealing Stochastic Gradient Search.




Starts at: Starts at:
x0=(-3, -3) x0=(-1, -1)
Terminates at Terminates at
x=(2.7094, 1.0900) x=(1.0792, -3.5812)




Starts at: Starts at:
x0=(1, 1) x0=(3, 3)
Terminates at Terminates at
x=(2.7225, 1.0616) x=(2.7236, 1.0906)




Figure 4.2.4 Simulated Annealing Stochastic Gradient Search (2-D)


4.3 Minimize (1.8) in the case of 1 m n with respect to data y3:

The given data:

n=10;

y3= [2.9459 - 0.1029i -1.5491 - 0.2630i -0.5875 - 0.5373i 0.0608 + 0.2039i
1.5035 + 1.5683i -1.3665 - 2.5519i 0.2960 + 0.3701i -0.2010 + 0.9896i
0.1270 + 0.5835i 0.7482 - 2.6276i];

Let:

y3 = (a1+ib1 a2+ib2 ... an+ibn)T;
Then
10 m m
f(x) = {(a -
k =1
k
j =1
cos[(k - 1) x j ]) 2 + (bk + sin[(k - 1) x j ]) 2 }
j =1
with

xj=cosj;j=1,2,...,m, j: 0 xj: -, j=1,2,...,m


In cases of more than two variables, we only test the Simulated Annealing Stochastic Gradient
Search. Also, we only check the decrease of the objective value vs.iterations.


10
Technical Report




x0 =(1 1 1 1 1) x0 =(1 1 1 1)




x0 =(0.96 0.23 1.33 1.58)
x0 =(-2.3, 0.72, -0.77, 0.71,
1.67, 1.75, 1.09)




x0 =(2.44 2.01 3.36) x0 =(3 3 3 3 3)




Figure 4.3.1 Several cases of Simulated Annealing
Stochastic Gradient search (>2-D)




5. Conclusion

Quasi_Newton's method (DFP, BFGS, and Broyden family) is a very good method for
unconstrained optimization problem when the objective function is near convex. However, we see
from the result that there are a lot of local minima. Each run of the search starts from a initial
point and ends at a local minimum. One of the disadvantages of deterministic search is that the
search is easily trapped into the neighbor of a local minimum. Multi-starts and parallel search are


11
Technical Report



the ways to approach to the region where the global optimum exists with better probability.
Compared with deterministic gradient search, which is easily trapped into the local minimum, the
stochastic gradient search overcomes this disadvantage by an additional perturbation term. When
search is going to be trapped in local minimum, the gradient approaches to zero, so the term of
gradient search is insignificant compared with the perturbation term. So perturbation term
"drags "the search out of the local minimum and makes it continue. For any neighbor of current
point, if its objective value is better, we accept it as new current point; if its objective value is
worse, we accept is as new current point according to a probability. By this way, when the search
gets into a region of local minimum, it can get out with a probability, it can also retain the current
solution with a probability. M-H simulated Annealing is in essence a metaheuristic for global
optimization. Although no guarantee for absolute global optimization point, the M-H simulated
annealing plus stochastic gradient search can reach a very satisfactory solution.


Acknowledge

Thanks very much for Dr. Anuj Srivastava's excellent teaching in the course Computational
Methods in Statistics II.

Reference

[1] Fletcher, R., "Practical Methods of Optimization," Vol. 1, Unconstrained Optimization, and
Vol. 2, Constrained Optimization, John Wiley and Sons.,1980.
[2] Gill, P.E., W. Murray, and M.H.Wright, Practical Optimization, Academic Press, London,
1981.
[3] Gintautas Dzemyda, Stochastic and Global Optimization, Kluwer Academic Pub, 248 pages,
2002
[4] Ingber, L. "Simulated Annealing: Practice Versus Theory." Math. Compute. Modelling 18,
29-57, 1993.
[5] James C. Spall, Introduction to Stochastic Search and Optimization : Estimation, Simulation,
and Control, John Wiley & Sons Inc, 595 pages, 2003
[6] Kirkpatrick, S.; Gelatt, C. D.; and Vecchi, M. P. "Optimization by Simulated Annealing."
Science 220, 671-680, 1983.
[7] W.X. Xing, J.X.Xie. Modern methods in optimization, Beijing, Tshinghua, 2000.
[8] W.X. Xing, J.X.Xie. Network optimization, Beijing, Tshinghua, 2000.
[9] Y.X.Yuan, W.Y.Sun Theory and method of Optimization, Chinese Academy of
Sciences,2001




12
Technical Report



Appendix: sample codes

Deterministic gradient search:
----------------------------------------------------------------------
%This subroutine is to minimize the objective
%function with Quasi-Newton's method(gradient search)
clear all;

x0=[3 3],%initial guess

k=1;
x_min=x0;
y_min(k)=f_y2_A(x_min);
for k=2:20
options=optimset('LargeScale','off','MaxIter',1);
x_min=fminunc('f_y2_A',x_min,options);
y_min(k)=f_y2_A(x_min);
end

kk=1:k;
figure(3);
plot(kk,y_min);

xlabel('iteration');ylabel('objective value');



Stochastic gradient search:
----------------------------------------------------------------------

%This subroutine is to minimize the objective
%function with stochastic gradient search
clear all;

x0=[3 3],%initial guess

k=1;
x_min=x0;
y_min(k)=f_y2_A(x_min);
record_y(k)=min(y_min);
for k=2:100
options=optimset('LargeScale','off','MaxIter',1);
x_min=fminunc('f_y2_A',x_min,options)+0.05*normrnd(0,1);
y_min(k)=f_y2_A(x_min);
record_y(k)=min(y_min);
end

kk=1:k;
figure(3);
%record_y=log(record_y);
plot(kk,record_y);


13
Technical Report




xlabel('iteration');ylabel('best objective value');
Metropolis-hasting Simulated Annealing Stochastic Gradient Search.
-----------------------------------------------------------------------------
%This subroutine is to minimize the objective
%function with stochastic gradient search
clear all;

x0=[3 3],%initial guess

T=1.0e3;%initial temprature
a=0.50;

k=1;
x_min=x0;
y_min(k)=f_y2_A(x_min);
record_y(k)=min(y_min);
for k=2:100
options=optimset('LargeScale','off','MaxIter',1);
x_propose=fminunc('f_y2_A',x_min,options)+0.05*normrnd(0,1);

U=unifrnd(0,1);
p=min(exp(f_y2_A(x_min)-f_y2_A(x_propose))/T,1);%aceptance probability
if (U x_min=x_propose;
end

T=a*T;%temprature descent

y_min(k)=f_y2_A(x_min);
record_y(k)=min(y_min);
end

kk=1:k;
figure(3);
%record_y=log(record_y);
plot(kk,record_y);

xlabel('iteration');ylabel('best objective value');




14

Solution Summary

Prony series are investigated. The solution is detailed and well presented. The response received a rating of "5/5" from the student who originally posted the question.

Solution
What is this?
By OTA - Overall OTA Rating
Purchase Cost Now
$2.19 CAD (was ~$99.75)
Included in Download
  • Plain text response
  • Attached file(s):
    • Matlab sol.zip
Why you can trust BrainMass.com
  • Your Information is Secure
  • Best Online Academic Help Service
  • Students find real academic Success
Related Solutions
  • Optimization algorithm - Could you solve the following approach "in the 2nd page of the PDF file" using matlab. Let me know if you are interesting to solve this problem.i can pay anything you want.there is an Optimization inv ...
Browse