        # if necessary start again
        if self.k < k: self.__init__(self.points, self.metric)
        # step until we get k clusters
        while self.k > k: self.step()
        # return list of cluster members
        return self.r, self.v
Given a set of points, we can estimate the most likely number of clusters in the data: we plot the number of clusters versus the merging distance and look for a plateau in the plot. The y-coordinate of the plateau gives the number of clusters. This is done by the function cluster in the preceding algorithm, which returns the average distance between clusters and a list of clusters.
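The plateau can also be located programmatically. The sketch below is a hypothetical helper, not part of nlib.py: it assumes a list of (distance, number of clusters) pairs, in the spirit of the c.dd list used in the next listing, and returns the number of clusters just before the largest jump in merging distance:

```python
def guess_k(dd):
    """Given pairs (distance, number_of_clusters) from an agglomerative
    clustering, ordered by increasing merge distance, return the number
    of clusters just before the largest jump in distance.
    (Hypothetical helper; assumes the structure of c.dd described above.)"""
    best_gap, best_k = 0.0, 1
    for (d1, k1), (d2, k2) in zip(dd, dd[1:]):
        gap = d2 - d1
        if gap > best_gap:
            best_gap, best_k = gap, k1
    return best_k
```

For the sample data above, the big jump in merge distance happens when the five natural clusters start being merged into each other, so the helper would report five.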
For example:
Listing 3.21: in file: nlib.py
>>> def metric(a,b):
...     return math.sqrt(sum((x-b[i])**2 for i,x in enumerate(a)))
>>> points = [[random.gauss(i % 5, 0.3) for j in xrange(10)] for i in xrange(200)]
>>> c = Cluster(points, metric)
>>> r, clusters = c.find(1) # cluster all points until one cluster only
>>> Canvas(title='clustering example', xlab='distance', ylab='number of clusters'
...     ).plot(c.dd[150:]).save('clustering1.png')
>>> Canvas(title='clustering example (2d projection)', xlab='p[0]', ylab='p[1]'
...     ).ellipses([p[:2] for p in points]).save('clustering2.png')
With our sample data, we obtain the following plot (“clustering1.png”):
The location where the curve bends corresponds to five clusters. Although our points live in 10 dimensions, we can project them into two dimensions and see the five clusters (“clustering2.png”):
Figure 3.7: Number of clusters found as a function of the distance cutoff.
3.9.2 Neural network
Neurons are organized in layers: one input layer of neurons connected only with the input and the next layer; one output layer of neurons connected only with the output and the previous layer; and one or more hidden layers of neurons connected only with other neurons. Each neuron is characterized by input links and output links. Each output of a neuron is a function of its inputs. The exact shape of that function depends on the network and on parameters that can be adjusted. Usually this function is chosen to be a monotonic increasing function of the sum of the inputs, where both the inputs and the outputs take values in the [0,1] range. The inputs can be thought of as electrical signals reaching the neuron. The output is the electrical signal emitted by the neuron. Each neuron is defined by a set of parameters a, which determine the relative weight of the input signals. A common choice for this characteristic function is:
output_ij = tanh( ∑_k a_ijk input_ik )     (3.97)

where i labels the neuron, j labels the output, k labels the input, and a_ijk are characteristic parameters describing the neurons.

Figure 3.8: Visual representation of the clusters, with the point coordinates projected in 2D.
The network is trained by providing an input and adjusting the characteristic parameters a_ijk of each neuron to produce the expected output. The network is trained iteratively until its parameters converge (if they converge), and then it is ready to make predictions. We say that the network has learned from the training data set.
Listing 3.22: in file: nlib.py
class NeuralNetwork:
    """
    Back-Propagation Neural Networks
    Placed in the public domain.
    Original author: Neil Schemenauer <nas@arctrix.com>
    Modified by: Massimo Di Pierro
    Read more: http://www.ibm.com/developerworks/library/l-neural/
    """

    @staticmethod
    def rand(a, b):
        """ calculate a random number where: a <= rand < b """
        return (b-a)*random.random() + a

    @staticmethod
    def sigmoid(x):
        """ our sigmoid function, tanh is a little nicer than the standard 1/(1+e^-x) """
        return math.tanh(x)

    @staticmethod
    def dsigmoid(y):
        """ derivative of our sigmoid function, in terms of the output """
        return 1.0 - y**2

    def __init__(self, ni, nh, no):
        # number of input, hidden, and output nodes
        self.ni = ni + 1 # +1 for bias node
        self.nh = nh
        self.no = no

        # activations for nodes
        self.ai = [1.0]*self.ni
        self.ah = [1.0]*self.nh
        self.ao = [1.0]*self.no

        # create weights
        self.wi = Matrix(self.ni, self.nh, fill=lambda r,c: self.rand(-0.2, 0.2))
        self.wo = Matrix(self.nh, self.no, fill=lambda r,c: self.rand(-2.0, 2.0))

        # last change in weights for momentum
        self.ci = Matrix(self.ni, self.nh)
        self.co = Matrix(self.nh, self.no)

    def update(self, inputs):
        if len(inputs) != self.ni-1:
            raise ValueError('wrong number of inputs')

        # input activations
        for i in xrange(self.ni-1):
            self.ai[i] = inputs[i]

        # hidden activations
        for j in xrange(self.nh):
            s = sum(self.ai[i] * self.wi[i,j] for i in xrange(self.ni))
            self.ah[j] = self.sigmoid(s)

        # output activations
        for k in xrange(self.no):
            s = sum(self.ah[j] * self.wo[j,k] for j in xrange(self.nh))
            self.ao[k] = self.sigmoid(s)
        return self.ao[:]

    def back_propagate(self, targets, N, M):
        if len(targets) != self.no:
            raise ValueError('wrong number of target values')

        # calculate error terms for output
        output_deltas = [0.0] * self.no
        for k in xrange(self.no):
            error = targets[k] - self.ao[k]
            output_deltas[k] = self.dsigmoid(self.ao[k]) * error

        # calculate error terms for hidden
        hidden_deltas = [0.0] * self.nh
        for j in xrange(self.nh):
            error = sum(output_deltas[k]*self.wo[j,k] for k in xrange(self.no))
            hidden_deltas[j] = self.dsigmoid(self.ah[j]) * error

        # update output weights (N is the learning rate, M the momentum factor)
        for j in xrange(self.nh):
            for k in xrange(self.no):
                change = output_deltas[k]*self.ah[j]
                self.wo[j,k] = self.wo[j,k] + N*change + M*self.co[j,k]
                self.co[j,k] = change

        # update input weights
        for i in xrange(self.ni):
            for j in xrange(self.nh):
                change = hidden_deltas[j]*self.ai[i]
                self.wi[i,j] = self.wi[i,j] + N*change + M*self.ci[i,j]
                self.ci[i,j] = change

        # calculate error
        error = sum(0.5*(targets[k]-self.ao[k])**2 for k in xrange(len(targets)))
        return error

    def test(self, patterns):
        for p in patterns:
            print p[0], '->', self.update(p[0])

    def weights(self):
        print 'Input weights:'
        for i in xrange(self.ni):
            print self.wi[i]
        print
        print 'Output weights:'
        for j in xrange(self.nh):
            print self.wo[j]

    def train(self, patterns, iterations=1000, N=0.5, M=0.1, check=False):
        # N: learning rate
        # M: momentum factor
        for i in xrange(iterations):
            error = 0.0
            for p in patterns:
                inputs = p[0]
                targets = p[1]
                self.update(inputs)
                error = error + self.back_propagate(targets, N, M)
            if check and i % 100 == 0:
                print 'error %-14f' % error

Figure 3.9: Example of a minimalist neural network.
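The listing above depends on the book's Matrix class and on the global random state. The same tanh activation and backpropagation update rule can be exercised independently with plain lists; the following is a hedged, self-contained sketch (the function name train_xor_sketch is ours, it omits the momentum term, and its exact error values depend on the seed):

```python
import math, random

def train_xor_sketch(hidden=3, iterations=2000, rate=0.3, seed=1):
    """Self-contained sketch of tanh + backpropagation on XOR, using
    plain lists instead of the Matrix class and no momentum term.
    Returns (first_epoch_error, last_epoch_error).
    (Illustrative sketch, not part of nlib.py.)"""
    rng = random.Random(seed)
    ni, nh = 3, hidden                           # 2 inputs + 1 bias input
    wi = [[rng.uniform(-0.5, 0.5) for _ in range(nh)] for _ in range(ni)]
    wo = [rng.uniform(-0.5, 0.5) for _ in range(nh)]
    patterns = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    first = last = None
    for epoch in range(iterations):
        err = 0.0
        for x, t in patterns:
            ai = list(x) + [1.0]                 # append the bias input
            ah = [math.tanh(sum(ai[i]*wi[i][j] for i in range(ni)))
                  for j in range(nh)]            # hidden activations
            ao = math.tanh(sum(ah[j]*wo[j] for j in range(nh)))
            od = (1.0 - ao**2) * (t - ao)        # output delta: dsigmoid * error
            hd = [(1.0 - ah[j]**2) * od * wo[j]  # hidden deltas
                  for j in range(nh)]
            for j in range(nh):                  # gradient-descent updates
                wo[j] += rate * od * ah[j]
                for i in range(ni):
                    wi[i][j] += rate * hd[j] * ai[i]
            err += 0.5 * (t - ao)**2
        if first is None: first = err
        last = err
    return first, last
```

Running the sketch and comparing the first and last epoch errors shows the training loop reducing the error, which is all the back_propagate method in the listing is doing, pattern by pattern.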
In the following example, we teach the network the XOR function: we create a network with two inputs, two hidden neurons, and one output; we train it; and we check what it has learned:
Listing 3.23: in file: nlib.py
>>> pat = [[[0,0], [0]], [[0,1], [1]], [[1,0], [1]], [[1,1], [0]]]
>>> n = NeuralNetwork(2, 2, 1)
>>> n.train(pat)
>>> n.test(pat)
[0, 0] -> [0.00...]
[0, 1] -> [0.98...]
[1, 0] -> [0.98...]
[1, 1] -> [-0.00...]
Now we use our neural network to learn patterns in stock prices and predict the next-day return. We then check what it has learned by comparing the sign of the prediction with the sign of the actual return, for the same days used to train the network:
Listing 3.24: in file: test.py
>>> storage = PersistentDictionary('sp100.sqlite')
>>> v = [day['arithmetic_return']*300 for day in storage['AAPL/2011'][1:]]
>>> pat = [[v[i:i+5], [v[i+5]]] for i in xrange(len(v)-5)]
>>> n = NeuralNetwork(5, 5, 1)
>>> n.train(pat)
>>> predictions = [n.update(item[0]) for item in pat]
>>> success_rate = sum(1.0 for i,e in enumerate(predictions)
...     if e[0]*v[i+5]>0)/len(pat)
The learning process depends on the random number generator; therefore, for this small training data set, sometimes the network succeeds in predicting the sign of the next-day arithmetic return of the stock with more than 50% probability, and sometimes it does not. We leave it to the reader to study the significance of this result by using different subsets of the data for training the network and for testing its success rate.
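One way to make that out-of-sample check concrete is to measure the sign-agreement rate on returns not used for training. A minimal hypothetical helper (the name sign_success_rate is ours, not part of nlib.py or test.py):

```python
def sign_success_rate(predictions, actuals):
    """Fraction of cases where the predicted and the actual return have
    the same sign; a product <= 0 counts as a miss.
    (Hypothetical helper for the out-of-sample check described above.)"""
    hits = sum(1.0 for p, a in zip(predictions, actuals) if p*a > 0)
    return hits / len(actuals)
```

Training on the first part of the series and evaluating this rate on the remaining days, over many random initializations, gives a fairer estimate of whether the network beats the 50% baseline.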