NETWORK PERFORMANCE TUNING USING REINFORCEMENT LEARNING
G. Prem Kumar and P. Venkataram
Department of Electrical Communication Engineering
Indian Institute of Science, Bangalore - 560 012, INDIA
ABSTRACT
Performance degradation in communication networks is caused by a set of faults, called soft-failures, owing to which network resources like bandwidth cannot be utilised to the expected level. An automated solution to the performance management problem involves identifying these soft-failures, followed by the required change in a few of the performance parameters to overcome the performance degradation. The identification of soft-failures can be done using any of the fault-diagnosis models [9] and is not discussed here. In this paper, the emphasis is on providing a viable solution to network performance tuning. Since the exploration of the network performance tuning search space to bring out the required control is not well known in advance, a reinforcement control model is developed to tune the network performance. Ethernet performance management is taken up as a case study. The results obtained by the proposed approach demonstrate its effectiveness in solving the network performance management problem [11].
INTRODUCTION
Performance management is a complex part of present-day network management [6]. Any abnormal behavior of the network in the absence of a hard failure is supposed to be handled by the network performance manager. Network performance degradation is viewed to be caused by a set of faults, called soft-failures. The solution to the network performance management problem involves two steps. First, identify the soft-failures that cause the performance degradation. Second, tune the network parameters to bring its performance back to normalcy.
In another work [9], we proposed to use the realistic abductive reasoning model (Realistic-ARM) to identify the network soft-failures. Each of the soft-failures is expected to have a remedy in the form of a tuning-up of network performance parameters. Performance tuning is done by using a neural network model that applies the necessary correction to some of the parameters of the degraded communication network to bring it back to normalcy. A reinforcement learning mechanism, called the stochastic real-valued algorithm [12], is used for on-line tuning of the network. Based on the improvement in the network performance, the learning process takes place by adapting to the behavior of the communication network.

In this paper, the soft-failure identification is not discussed. Any fault-diagnosis model that arrives at a set of explanations for a given set of abnormalities, each with some assigned probability, is assumed to provide the input to this work. A model for network fault diagnosis using the realistic abductive reasoning model, which also assigns a probability to each of the explanations, is discussed in [10]. A model to identify the soft-failures in Ethernet using Realistic Abductive Reasoning is presented below. The reinforcement learning mechanism for performance tuning and the results of the performance tuning for Ethernet follow it.
SOFT-FAILURE IDENTIFICATION IN ETHERNET
We consider a restricted Ethernet model to illustrate the soft-failure identification. For details on the Ethernet operation, refer to [2][7].
We consider an Ethernet performance management model with the following assumptions.
1. The information that needs to be monitored for the purpose of performance tuning is collected from the nodes and the channel, and the information that is beyond the normal values (both above and below the normal limits) is reported as symptom(s); a small sketch of this thresholding follows the list.
2. Some monitoring information, like "load is normal" and "collisions are within the range", is included to support the diagnostic process, so as to minimize the number of explanations.
No.  Fault   Remedy
1.   (F1)    (R1) : Faulty Ethernet card; replace it
2.   (F2)    (R2) : Request the network manager to initiate fault diagnostic measures
3.   (F3)    (R3) : Ensure many packets are not above the specified size
4.   (F4)    (R4) : Report to the network manager
5.   (F5)    (R5) : Allocate more primary memory to the required nodes
6.   (F6)    (R6) : Selectively control the broadcast packets
7.   (F7)    (R7) : Report to the network manager along with the specified tap
8.   (F8)    (R8) : Ensure many packets are not below the specified size

Table 1. Suggested remedies for the soft-failures
3. There may be some missing information, and the entire information may not be available at the time of performance tuning.
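A minimal sketch of assumption 1, i.e. flagging monitored values that fall outside their normal limits as symptoms, is given below; the metric names and limits are illustrative and not taken from the paper:

# Hypothetical sketch: flag monitored metrics that fall outside their
# normal limits as symptoms (assumption 1 above). Metric names and
# limits are illustrative, not taken from the paper.
NORMAL_LIMITS = {
    "packet_loss":   (0.00, 0.02),   # fraction of packets lost
    "load":          (0.10, 0.60),   # channel utilisation
    "collisions":    (0.00, 0.05),   # collision rate
    "small_packets": (0.00, 0.30),   # fraction of packets below minimum size
}

def detect_symptoms(monitored: dict) -> list:
    """Return a symptom for every metric above or below its normal range."""
    symptoms = []
    for metric, value in monitored.items():
        low, high = NORMAL_LIMITS[metric]
        if value < low:
            symptoms.append(f"{metric} below normal")
        elif value > high:
            symptoms.append(f"{metric} above normal")
    return symptoms

print(detect_symptoms({"packet_loss": 0.04, "load": 0.2,
                       "collisions": 0.01, "small_packets": 0.55}))
# -> ['packet_loss above normal', 'small_packets above normal']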
PM Knowledge Model
The Ethernet performance management knowledge base [2][3][4][5][7] is constructed as a hyper-bipartite network (see Figure 1). This maps the Ethernet performance management knowledge onto a model suitable for the Realistic-ARM.
Table 1 describes the suggested remedies for the soft-failures of the Ethernet model.
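In an implementation, Table 1 could be kept as a simple lookup from an identified soft-failure to its suggested remedy; the sketch below encodes a few of its rows directly:

# Sketch: Table 1 as a lookup from identified soft-failure to remedy
# (a few representative rows shown).
REMEDIES = {
    "F2": "Request the network manager to initiate fault diagnostic measures",
    "F4": "Report to the network manager",
    "F5": "Allocate more primary memory to the required nodes",
    "F6": "Selectively control the broadcast packets",
    "F7": "Report to the network manager along with the specified tap",
    "F8": "Ensure many packets are not below the specified size",
}

def remedy_for(fault: str) -> str:
    return REMEDIES.get(fault, "No remedy recorded for this fault")

print(remedy_for("F8"))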
The Results

To identify the soft-failures, the algorithm Realistic-ARM is run by setting "the prespecified number of symptoms required to support an observed symptom before concluding a fault" to 1. The algorithm is run for various sets of symptoms (from layer 1 of Figure 1) and some of the results are given in Table 2. Probabilities are assigned to each of the links (not shown here); these have to be provided by an expert on the particular network.
Fig. 1. Ethernet performance management knowledge model.
Legend:
Layer #1 :
1. Packet loss below normal; 2. Packet loss normal; 3. Packet loss above normal; 4. Load below normal; 5. Load normal; 6. Load above normal; 7. Collisions below normal; 8. Collisions normal; 9. Collisions above normal; 10. Large packets below normal; 11. Large packets normal; 12. Large packets above normal; 13. Small packets below normal; 14. Small packets normal; 15. Small packets above normal; 16. Broadcast packets normal; 17. Broadcast packets above normal; 18. Packet loss on spine above normal; 19. Load on spine normal; 20. Load on spine above normal
Layer #'2 :
1. Light traffic; 2. Heavy traffic; 3. Buffers are insufficient; 4. Users are too many; 5. Preambles are too many; 6. Broadcast packets are too many; 7. Spine flooded with too many small packets; 8. Heavy traffic on spine
Layer #3 :
1. (F1) Babbling node; 2. (F2) Hardware problem; 3. (F3) Jabbering node; 4. Too many retransmissions; 5. Under-utilization of channel as many small packets are in use; 6. Attempt for too many broadcasts
Layer #4 :
1. (F4) Bridge down; 2. (F5) Network paging; 3. (F6) Broadcast storm; 4. (F7) Bad tap; 5. (F8) Runt storm

No.  Symptoms            Remedy
1.   3, 6, 12, 18, 20
2.   1, 4, 10, 15, 17
3.   3, 9, 18, 20
4.   10, 15, 16, 18      (R8)

Table 2. Sample results for the Ethernet performance model
From Table 2, it may be observed that the covers generated by the proposed model contain appropriate explanations for any given set of symptoms, without much extra guessing.

Consider the case given in S.No. 4 of Table 2. The observed symptoms in layer #1 are: Symptom 10 : number of large packets below normal; Symptom 15 : number of small packets above normal; Symptom 16 : number of broadcast packets normal; Symptom 18 : packet loss on spine above normal.

The soft-failure concluded is "Runt storm" (with some plausibility) and the remedy is to ensure that too many small packets are not injected into the network.
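As a toy illustration (not the Realistic-ARM algorithm itself), the case above can be reproduced by scoring hand-picked fault support sets against the observed symptom numbers; the support sets below are only an illustrative fragment of the Figure 1 links:

# Toy sketch (not the Realistic-ARM algorithm): pick the fault whose
# supporting symptoms best cover the observed symptoms of case No. 4.
# The symptom -> fault support sets are a hand-picked, illustrative
# fragment of the Figure 1 links.
SUPPORTS = {
    "F7 (Bad tap)":         {18},              # packet loss on spine above normal
    "F8 (Runt storm)":      {10, 15, 16, 18},  # small-packet related symptoms
    "F6 (Broadcast storm)": {17},              # broadcast packets above normal
}

observed = {10, 15, 16, 18}   # case No. 4 of Table 2
min_support = 1               # "prespecified number of symptoms" used in the run

best = None
for fault, support in SUPPORTS.items():
    hits = len(observed & support)
    if hits >= min_support and (best is None or hits > best[1]):
        best = (fault, hits)

print(best)   # -> ('F8 (Runt storm)', 4), matching remedy R8 of Table 1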
NETWORK PERFORMANCE TUNING
Network performance management cannot be complete without a control mechanism that tunes the network to overcome the performance degradation.
When there is a degradation in network performance, we first identify the soft-failures with the plausibility of their existence and then take up the performance tuning. A reinforcement learning (RL) paradigm is employed to accomplish the performance tuning. We use an RL model with two modules, similar to that discussed in [8]. The first module, called the "actor", learns the correct action for a given input, and the second module, called the "critic", learns the maximum reinforcement expected from the environment for a given input.
Reinforcement Learning
Reinforcement learning problems are usually modeled as an agent interacting with an environment (see Figure 2). The agent receives an input from the environment and outputs a suitable action. The environment accepts the action from the agent as input and outputs a scalar evaluation (reinforcement) of the action as well as the next input to the agent.
Fig. 2. The reinforcement learning mechanism.
Fig. 3. A connectionist model for network PM.
The agent's task is to learn how to react to a given input so as to maximize the reinforcement it receives from the environment.
In supervised learning, the agent is provided with the desired output, from which it can form a gradient to follow so as to minimize the error in the output. In RL, such explicit gradient information is absent. Hence the agent has to explore the output space to find the right direction.
Reinforcement learning is suitable for network performance tuning because the desired output that brings the performance back to normalcy is not known well in advance.
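The interaction of Figure 2 can be written as a generic loop; the env and agent objects below are assumed to expose a minimal, hypothetical interface (observe, step, act, learn):

# Generic agent-environment reinforcement-learning loop (Figure 2).
# 'env' and 'agent' are assumed to expose the minimal interface used here.
def run_episode(agent, env, steps: int) -> float:
    total = 0.0
    x = env.observe()                  # first input from the environment
    for _ in range(steps):
        a = agent.act(x)               # agent outputs a suitable action
        r, x = env.step(a)             # scalar reinforcement + next input
        agent.learn(r)                 # adapt so as to maximize reinforcement
        total += r
    return total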
The Neural Network Model
A connectionist model, as described in [12], is used to realize the reinforcement learning. This model uses two neural networks, each of them based on the backpropagation learning algorithm (see Figure 3).
The neural network (NN) has an input layer that is designed to accept the significant events of the environment as inputs. The output of the NN is the value on which the correction to the communication network parameter depends. The NN also has one or more hidden layers, depending on the complexity of the problem. The number of neurons in the hidden layer(s) also depends on the estimate of the number of exemplars provided for training [1]. Each layer of the NN consists of one or more neurons. The output of one layer is fed as the input to the next. Associated with each of these connections is an adaptive weight. The output of a neuron in a hidden layer or the output layer is calculated as an activation function of the weighted sum of the inputs to the neuron, with an optional bias term, which is also, in principle, adaptive.
For every pattern (or exemplar) presented to the neural network for training, two passes are used. The forward pass evaluates the output of the neural network for the given input with the existing weights. In the reverse pass, the difference between the neural network output and the desired output is computed and fed back to the neural network as an error to change the weights of the neural network.
For the input-layer neurons, the user input is applied; at the other neurons the net-input is calculated using

net-input_i = Σ_j (w_ij · input_j),

where w_ij is the weight of the link connecting neuron i in the current layer and neuron j in the preceding layer, and input_j is the output of neuron j fed to neuron i. The input to the bias neuron in the preceding layer is set to -1 and its weight is adjustable. Essentially, the weight on the bias link in the previous layer indicates the threshold of each neuron in the current layer.
The output of each neuron is determined by applying a transfer function f(.) to the net-input to the neuron. The commonly used functions are sigmoidal functions, e.g., (1 + e^{-x})^{-1}, and the hyperbolic tangent. We use the function

f(x) = (1 + e^{-x})^{-1}.   (1)
In the reverse pass, i.e., for a particular neuron i in the output layer L, suppose that the output of the neuron is y_i^L and the desired output is d_i. The backpropagated error is

δ_i^L = f'(h_i^L) [d_i - y_i^L],   (2)

where h_i^L represents the net-input to the i-th unit in the L-th layer and f'(.) is the derivative of f(.). For layers l = 1, ..., (L-1), and for each neuron i in layer l, the error is computed using

δ_i^l = f'(h_i^l) Σ_j (w_ji^(l+1) δ_j^(l+1)).   (3)
The adaptation of the weights of the link connecting neurons i and j is done using

w_ij^l ← w_ij^l + K · δ_i^l · y_j^(l-1),   (4)

where K is the learning constant, which depends on the rate at which the neural network is expected to converge. This neural network algorithm essentially minimizes the mean square error between the neural network output and the desired output using a gradient descent approach.
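As a compact illustration of equations (1)-(4), the following sketch implements the forward and reverse passes for a small fully connected network; the 5-4-1 shape and the learning constant are only examples:

import math
import random

def f(x):                      # equation (1): sigmoidal transfer function
    return 1.0 / (1.0 + math.exp(-x))

def f_prime(h):                # derivative of f, written via its output
    y = f(h)
    return y * (1.0 - y)

class Layer:
    def __init__(self, n_in, n_out):
        # one extra weight per neuron acts as the adaptive bias (input -1)
        self.w = [[random.uniform(0, 1) for _ in range(n_in + 1)]
                  for _ in range(n_out)]

    def forward(self, inputs):
        self.inputs = inputs + [-1.0]                       # bias input is -1
        self.h = [sum(wi * xi for wi, xi in zip(row, self.inputs))
                  for row in self.w]                        # net-input
        self.y = [f(h) for h in self.h]
        return self.y

class Network:
    def __init__(self, sizes, K=0.1):
        self.layers = [Layer(a, b) for a, b in zip(sizes, sizes[1:])]
        self.K = K                                          # learning constant

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def backward(self, desired):
        # equation (2): output-layer error
        out = self.layers[-1]
        out.delta = [f_prime(h) * (d - y)
                     for h, y, d in zip(out.h, out.y, desired)]
        # equation (3): propagate the error back to the hidden layers
        for l in range(len(self.layers) - 2, -1, -1):
            cur, nxt = self.layers[l], self.layers[l + 1]
            cur.delta = [f_prime(cur.h[i]) *
                         sum(nxt.w[j][i] * nxt.delta[j]
                             for j in range(len(nxt.w)))
                         for i in range(len(cur.h))]
        # equation (4): adapt the weights
        for layer in self.layers:
            for i, row in enumerate(layer.w):
                for j in range(len(row)):
                    row[j] += self.K * layer.delta[i] * layer.inputs[j]

net = Network([5, 4, 1])       # the 5-4-1 shape used later in the paper
y = net.forward([0.2, 0.7, 0.5, 0.1, 0.3])
net.backward([0.9])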
RL for Performance Tuning
The realistic abductive reasoning model evaluates the performance degradation and produces a set of soft-failures with respective plausibilities. These values are fed to the two neural-network modules. One of the neural networks is the "actor" and the other is the "critic". Let the output of the "actor" be μ and that of the "critic" be r̂, which is the goodness of μ. The correction z applied to the communication network is an instance of a Gaussian distribution with mean μ and variance σ, where σ is a non-increasing function of r̂. The variance term is used to control the exploration of the output space: it takes a small value when r̂ is close to the maximum value and a large value otherwise. Once again, the realistic abductive reasoning model evaluates the communication network to yield a new set of soft-failures and the respective percentage degradation in performance. Based on that, the reinforcement is given to both the "actor" and the "critic". The error used to adapt the weights of both networks is a function of the values of μ, r̂ and r. This process of soft-failure identification and performance tuning repeats till the network performance is brought back to normalcy.
The following are the equations for the "actor" and "critic" neural networks that are incorporated into the connectionist model.

"Actor" neural network :
input : the degradation in performance; the plausibility of the performance parameter being controlled by the neural network; the current setting of the parameter; other parameters of significance.
output : μ.
r = 1 if the performance improves and -1 if it degrades further.
backpropagated error : (r - r̂) · (z - μ) / σ, where z = f(μ, σ) and f(.) is a Gaussian with mean μ and variance σ.
"Critic" neural network :
input : x.
output : r̂.
σ = max(0, (1 - r̂)).
r = 1 if the performance improves and -1 if it degrades further.
backpropagated error = (r - r̂).
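For illustration, the sampling of the correction and the two errors above can be written as a small helper; this is only a sketch of the stochastic real-valued style update, with μ and r̂ taken as plain numbers rather than network outputs:

import random

def srv_step(mu, r_hat):
    """One SRV-style step: draw the correction z and its exploration width.

    mu    : output of the "actor" network
    r_hat : output of the "critic" network (expected reinforcement)
    """
    sigma = max(0.0, 1.0 - r_hat)                       # exploration width
    z = random.gauss(mu, sigma) if sigma > 0 else mu    # correction applied
    return z, sigma

def training_errors(r, r_hat, z, mu, sigma):
    """Backpropagated errors once the reinforcement r (+1 or -1) is known."""
    critic_error = r - r_hat
    actor_error = (r - r_hat) * (z - mu) / sigma if sigma > 0 else 0.0
    return actor_error, critic_error

z, sigma = srv_step(mu=0.4, r_hat=0.2)
print(training_errors(r=1, r_hat=0.2, z=z, mu=0.4, sigma=sigma))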
Algorithm Performance Tuning

Begin

1. Choose a neural network with layers 1, ..., L.
2. Set the input of the bias neurons to -1.0.
3. Initialize each weight w_ij, for all i, j, to a random value in the range (0, 1).
4. Feed the input parameters.
5. Propagate the signal forward through the neural network till the output layer, evaluating the output of each neuron i as f(Σ_j (w_ij · input_j)), where input_j is the output from neuron j in the preceding layer, fed to neuron i.
6. For each neuron i in the output layer (layer L), compute the error δ_i^L using the backpropagated-error expressions given above for the respective μ ("actor") and r̂ ("critic") networks.
7. For l = 1, ..., (L-1), compute the error δ_i^l = f'(h_i^l) Σ_j (w_ji^(l+1) δ_j^(l+1)).
8. Update the weights using w_ij^l ← w_ij^l + K · δ_i^l · y_j^(l-1), where K is the learning rate.
9. Repeat by going to Step 4 till PM is active.

End
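Putting the pieces together, one cycle of the tuning loop might look like the following skeleton. The names identify_soft_failures, apply_correction and measure_performance are hypothetical stand-ins for the Realistic-ARM diagnosis and the managed network; actor and critic are backpropagation networks as in the earlier sketch, assumed here to share the same input vector, and srv_step / training_errors come from the previous sketch:

# Skeleton of the performance-tuning cycle (hypothetical helper names).
def tune(actor, critic, identify_soft_failures, apply_correction,
         measure_performance, max_cycles=100, target=0.95):
    perf = measure_performance()
    for _ in range(max_cycles):
        if perf >= target:                        # back to normalcy
            break
        state = identify_soft_failures()          # degradation, plausibilities, settings
        mu = actor.forward(state)[0]
        r_hat = critic.forward(state)[0]
        z, sigma = srv_step(mu, r_hat)            # from the previous sketch
        apply_correction(z)
        new_perf = measure_performance()
        r = 1.0 if new_perf > perf else -1.0      # scalar reinforcement
        actor_err, critic_err = training_errors(r, r_hat, z, mu, sigma)
        # reuse the reverse pass: a desired output of y + error makes the
        # backpropagated (d - y) term equal to the error itself
        actor.backward([mu + actor_err])
        critic.backward([r_hat + critic_err])
        perf = new_perf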
ETHERNET PERFORMANCE TUNING
The Ethernet performance management model considered for soft-failure identification is further extended to demonstrate the learning capability, i.e., improving the communication network behavior in case of performance degradation.

The parameters considered for improving the network performance are (i) arrival probability; (ii) basic retry slot; (iii) maximum packet length; and (iv) minimum packet length. Besides the parameters' values, the tuning algorithm accepts the performance degradation and the average number of packets queued up in each of the source buffers. The output is the recommended value of the parameters.

The simulation model is developed in the C++ programming language on a Sun Sparc 20 workstation running SunOS 4.2. Each of the "actor" and "critic" neural networks is trained using the backpropagation learning procedure, with 5 neurons in the input layer, 4 neurons in the hidden layer and one neuron in the output layer. To demonstrate the proposed model for network performance tuning, the Ethernet model is simulated both in the absence of neural networks and after integrating the neural networks. The performance of the Ethernet at several instances of the simulation is presented. The improvement can be attributed to the adaptability of the neural networks, which learn the behavior of the Ethernet and make appropriate changes that result in better performance.

Fig. 4. Performance improvement on using the neural network (performance versus simulation time, with and without the NN).

From Figure 4, it can be observed that there is a substantial improvement in the performance (by over 5% on average) when the connectionist model for performance tuning is employed. The initial dip in performance occurs because the neural network is still training to reach the expected level of tuning. This may be overcome by providing expert knowledge to the neural network before it is placed on the Ethernet for performance tuning.
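For concreteness, the four tuned parameters listed earlier in this section could be held in a small record with bounds, so that applied corrections stay within legal values; the defaults and bounds below are assumptions, not values from the simulation:

# Illustrative record of the four tuned Ethernet parameters; the defaults,
# bounds and correction below are assumptions, not values from the paper.
from dataclasses import dataclass

@dataclass
class EthernetTuning:
    arrival_probability: float = 0.5     # offered load per node per slot
    basic_retry_slot: float = 51.2e-6    # seconds (one slot time assumed)
    max_packet_length: int = 1518        # bytes
    min_packet_length: int = 64          # bytes

    def apply(self, name: str, correction: float, low: float, high: float):
        """Apply a correction from the tuner, clamped to [low, high]."""
        value = getattr(self, name) + correction
        setattr(self, name, min(high, max(low, value)))

params = EthernetTuning()
params.apply("arrival_probability", correction=-0.1, low=0.05, high=0.95)
print(params.arrival_probability)   # -> 0.4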
CONCLUSION
In this paper, the network performance problem is presented in two parts. The first part identifies the causes of network performance degradation and the second provides the necessary tuning to bring the performance back to normalcy using a reinforcement learning model. The model is tested with the hypothetical Ethernet and the results show a significant improvement of over 5%.
References
[1] A. G. Barto, R. S. Sutton and C. W. Anderson, Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems, IEEE Transactions on Systems, Man, and Cybernetics, 13, pp. 835-846, 1983.

[2] D. R. Boggs, J. C. Mogul and C. A. Kent, Measured Capacity of an Ethernet: Myths and Reality, Computer Communication Review, pp. 222-234, 1988.

[3] F. E. Feather, Fault Detection in an Ethernet Network via Anomaly Detectors, Doctoral Dissertation, Carnegie Mellon University, July 1992.

[4] F. E. Feather, D. Siewiorek and R. Maxion, Fault Detection in an Ethernet Network Using Anomaly Signature Matching, Computer Communication Review, pp. 279-288, 1993.

[5] J. P. Hansen, The Use of Multi-Dimensional Parametric Behavior of a CSMA/CD Network for Network Diagnosis, Ph.D. thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University, 1992.

[6] S. Hayes, Analyzing Network Performance Management, IEEE Communications Magazine.

[7] R. M. Metcalfe and D. R. Boggs, Ethernet: Distributed Packet Switching for Local Computer Networks, Communications of the ACM, 19(7), pp. 395-404, 1976.

[8] S. S. Keerthi and B. Ravindran, A Tutorial Survey of Reinforcement Learning, to appear in: Sadhana.

[9] G. P. Kumar and P. Venkataram, Network Performance Management Using Realistic Abductive Reasoning Model, in: IEEE/IFIP International Symposium on Integrated Network Management, Santa Barbara, California, pp. 187-198, May 1995.

[10] G. P. Kumar and P. Venkataram, Network Fault and Performance Management Using Realistic Abductive Reasoning Model, Journal of the Indian Institute of Science, Special Issue on Intelligent Systems, 1997.

[11] G. P. Kumar, Integrated Network Management Using Extended Blackboard Architecture, Doctoral Dissertation, Indian Institute of Science, Bangalore, July 1996.

[12] V. Gullapalli, J. A. Franklin and H. Benbrahim, Acquiring Robot Skills via Reinforcement Learning, IEEE Control Systems, pp. 13-24, February 1994.