
4.3.4 Example: ANN modeling for meat attribute prediction

Data from elastography analysis and from mechanical and chemical tests of beef samples from the longissimus dorsi (LD) muscle were available for modeling to predict beef quality attributes. Huang et al. (1998) used these data to build neural network prediction models, applying the Levenberg–Marquardt algorithm for effective training and network generalization. The data set contained 29 sample vectors. Each vector included the wavelet textural feature parameters extracted from the elastogram of a beef sample as the inputs and the corresponding mechanical and chemical measurements as the outputs. Because each elastogram was cropped to a 128 × 128 image, the number of feature inputs was 29.

The mechanical and chemical measurements were used as the indicators of beef tenderness.

The modeling work was implemented using the regular BP algorithm for training with and without a momentum term, and then using the Levenberg–Marquardt BP algorithm for training with and without weight decay. The leave-one-out procedure was built into all of the training processes. The performance of these training processes was then compared. The programs were coded using MATLAB (The MathWorks, Inc.).
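To make the leave-one-out procedure concrete, the following is a minimal, self-contained MATLAB sketch of how such a training-and-validation loop can be organized. It is an assumed reconstruction rather than the authors' original code: the random data, variable names, and the simple batch BP routine trainBP are illustrative stand-ins for the 29 real sample vectors and the actual training programs.

```matlab
% Minimal sketch of a leave-one-out BP training loop (assumed structure,
% not the authors' original MATLAB code).
rng(0);
n = 29; p = 10; nHidden = 2;       % 29 sample vectors; p and nHidden illustrative
X = rand(n, p);                    % stand-in for the wavelet feature inputs
y = rand(n, 1);                    % stand-in for one scaled quality attribute
eta = 0.025;                       % learning rate used in the chapter
targetMSE = 0.01;                  % training stopping criterion from the chapter
maxEpochs = 5000;

valErr = zeros(n, 1);
for i = 1:n
    idx = setdiff(1:n, i);                       % leave sample i out
    [W1, b1, W2, b2] = trainBP(X(idx, :), y(idx), nHidden, eta, targetMSE, maxEpochs);
    h = tanh(X(i, :) * W1 + b1);                 % hidden layer for the held-out sample
    valErr(i) = (h * W2 + b2 - y(i))^2;          % squared prediction error
end
validationMSE = mean(valErr)                     % leave-one-out validation MSE

function [W1, b1, W2, b2] = trainBP(X, y, nHidden, eta, targetMSE, maxEpochs)
% One-hidden-layer tanh network trained with plain batch BP (gradient descent).
[n, p] = size(X);
W1 = 0.1 * randn(p, nHidden);  b1 = zeros(1, nHidden);
W2 = 0.1 * randn(nHidden, 1);  b2 = 0;
for epoch = 1:maxEpochs
    H = tanh(X * W1 + repmat(b1, n, 1));         % hidden activations
    e = H * W2 + b2 - y;                         % output errors (linear output node)
    if mean(e.^2) < targetMSE, break; end        % stopping criterion
    dOut = (2 / n) * e;                          % gradient of the MSE w.r.t. the output
    dHid = (dOut * W2') .* (1 - H.^2);           % backpropagate through tanh
    W2 = W2 - eta * (H' * dOut);  b2 = b2 - eta * sum(dOut);
    W1 = W1 - eta * (X' * dHid);  b1 = b1 - eta * sum(dHid, 1);
end
end
```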

Table 4.5 Training and Validation Samples for Different Experimental Setups*

Machine Wear–Raw Material Condition   Number of Training Samples   Number of Validation Samples
A                                     700                          100
B                                     400                           50
C                                     400                           50
D                                     400                          100

* Adapted from Sayeed et al. (1995). With permission.

As the first step, the model structure of the neural networks, that is, the number of hidden nodes in the one-hidden-layer network, was identified. The identification method was to increase the number of hidden nodes from 1, 2, …, up to H, where H was 4 or 6, and to select the number giving the lowest validation MSE of the network model. Table 4.6 shows the validation MSE for 1 to 4 hidden nodes for the neural network model of WBSF, where 2 nodes was the optimal number. Table 4.7 shows the results of the determination of the number of hidden nodes for the models based on inputs of Daubechies-4 wavelet features using the leave-one-out regular BP training without a momentum term. In order to present the efficiency of the training processes completely, both the number of training epochs and the number of floating-point operations (flops) for the training computation were recorded. The learning rate was η = 0.025, and the training stopping criterion was MSE = 0.01 in order to ensure the stability of the training processes. If it is desirable to accelerate the training process, the learning rate should be increased. However, if the learning rate parameter η is increased arbitrarily, the network may become unstable.
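The hidden-node selection step described above can be sketched as follows: candidate networks with 1, 2, …, H hidden nodes are each evaluated with the leave-one-out procedure, and the number giving the lowest validation MSE is kept. The helper looValidationMSE is hypothetical; it is assumed to wrap the leave-one-out loop sketched earlier (reusing its X and y stand-ins) and return the validation MSE for a given number of hidden nodes.

```matlab
% Hidden-node selection by lowest leave-one-out validation MSE
% (looValidationMSE is a hypothetical wrapper around the earlier sketch).
H = 4;                                   % largest candidate number of hidden nodes
valMSE = zeros(H, 1);
for nHidden = 1:H
    valMSE(nHidden) = looValidationMSE(X, y, nHidden);
end
[bestMSE, bestNodes] = min(valMSE);      % e.g., 2 nodes for the WBSF model (Table 4.6)
```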

Table 4.6 Result of the Determination of the Number of Hidden Nodes for the Network Model of WBSF*

Number of Hidden Nodes   Validation MSE
1                        0.1294
2                        0.0410
3                        0.0414
4                        0.0416

* From Huang et al. (1998). With permission.

Table 4.7 Results of the Determination of the Number of Hidden Nodes for the Models with Daubechies-4 Wavelet Features in the Regular Leave-One-Out BP Training without a Momentum Term*

Model Output   No. of Hidden Nodes   Training R²   Validation MSE   No. of Training Epochs   No. of Flops
WBSF           2                     0.95          0.0207           30.48 × 10⁴              25.48 × 10⁸
Calp           3                     0.96          0.0203           44.58 × 10⁴              54.85 × 10⁸
Sarc           2                     0.91          0.0387           17.40 × 10⁶              14.54 × 10⁹
T. Coll        4                     0.96          0.0207           29.08 × 10⁴              47.25 × 10⁸
%Sol           4                     0.95          0.0218           47.27 × 10³              76.81 × 10⁷
%Mois          2                     0.95          0.0214           10.97 × 10⁵              91.70 × 10⁸
%Fat           4                     0.95          0.0205           98.89 × 10²              16.07 × 10⁷

* From Huang et al. (1998). With permission.

To overcome this problem, the learning rate of the training process can be increased while keeping the network stable by introducing a momentum term into the weight updating, with the momentum coefficient 0 < ρ < 1.
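The following fragment sketches how a momentum term modifies the weight update in the batch BP routine sketched earlier. The momentum coefficient rho and the stored previous updates dW1prev and dW2prev (initialized to zero before the epoch loop) are assumptions of this sketch, not quantities from the original programs.

```matlab
% BP weight update with a momentum term (replacing the plain update above);
% rho is the momentum coefficient, 0 < rho < 1.
dW2 = -eta * (H' * dOut) + rho * dW2prev;
dW1 = -eta * (X' * dHid) + rho * dW1prev;
W2 = W2 + dW2;      W1 = W1 + dW1;
dW2prev = dW2;      dW1prev = dW1;       % remember the updates for the next epoch
```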

Table 4.8 shows the results of leave-one-out BP training with a momentum term, and Table 4.9 shows the ratios of training epochs and flops without and with the momentum term. Training with a momentum term took more flops for the Sarc and %Fat models, where the number of training epochs remained the same. Training with a momentum term for the Calp model also increased the flops, although the number of training epochs was reduced slightly. For the remaining models, training with a momentum term reduced the number of training epochs and reduced the number of flops, with the flop ratio (without to with momentum) ranging from 1.01 up to 8.09.

Incorporating the Levenberg–Marquardt algorithm into the leave-one-out BP training is another way to accelerate the training process. Table 4.10 shows the results of the Levenberg–Marquardt leave-one-out BP training.

The factor µ was set to 1 in all cases in order to ensure that the algorithm converged effectively. All of the models converged to the given error criterion in far fewer training epochs.
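As a rough sketch of the Levenberg–Marquardt step used in place of the gradient-descent update, the network weights can be stacked into a single vector w, and each iteration solves a damped Gauss–Newton system built from the Jacobian J of the per-sample errors e with respect to w. The names w, J, e, and the routine that would compute the Jacobian are assumptions of this sketch, not part of the original text.

```matlab
% One Levenberg-Marquardt update step (assumed form):
% J is the m-by-q Jacobian of the m per-sample errors e with respect to the
% q stacked network weights w, and mu is the damping factor (1 in the chapter).
mu = 1;
dw = -(J' * J + mu * eye(size(J, 2))) \ (J' * e);   % damped Gauss-Newton step
w  = w + dw;                                        % update the stacked weights
```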

Table 4.8 Results of Convergence Acceleration Using a Momentum Term in the Regular Leave-One-Out BP Training with Daubechies-4 Wavelet Features*

Model Output   Training R²   Validation MSE   ρ       No. of Training Epochs   No. of Flops
WBSF           0.95          0.0207           0.90    29.61 × 10⁴              25.13 × 10⁸
Calp           0.96          0.0200           0.60    44.45 × 10⁴              55.55 × 10⁸
Sarc           0.91          0.0387           0.025   17.40 × 10⁶              14.77 × 10⁹
T. Coll        0.96          0.0207           0.90    10.57 × 10⁴              17.45 × 10⁸
%Sol           0.96          0.0204           0.05    29.46 × 10³              48.63 × 10⁷
%Mois          0.95          0.0206           0.90    13.35 × 10⁴              11.33 × 10⁸
%Fat           0.95          0.0205           0.025   98.89 × 10²              16.32 × 10⁷

* From Huang et al. (1998). With permission.

Table 4.9 Ratios of Training Epochs and Flops without and with a Momentum Term in the Regular Leave-One-Out BP Training Using Daubechies-4 Wavelet Features*

Model Output   Ratio of Epochs   Ratio of Flops
WBSF           1.03              1.01
Calp           1.00              0.99
Sarc           1.00              0.98
T. Coll        2.75              2.71
%Sol           1.60              1.58
%Mois          8.22              8.09
%Fat           1.00              0.98

* From Huang et al. (1998). With permission.

Table 4.11 shows the ratios of the training epochs and the number of flops before and after using the Levenberg–Marquardt algorithm in the leave-one-out BP processes. From Table 4.10, it can be seen that all models obtained higher R² values and lower validation MSE values. This indicates that these network models have better output variation accounting and generalization, which may be related to the flexibility of the Levenberg–Marquardt algorithm in the convergence space. Table 4.11 further indicates that after using the Levenberg–Marquardt algorithm for this application, the number of training epochs was reduced greatly. However, when the epoch reduction was only about 100× or less, as for the %Sol and %Fat models, the number of flops actually increased. This means that the Levenberg–Marquardt algorithm needed more operations in each iteration step and thus had a higher computational requirement per iteration. In the cases where the epoch reduction was over several hundred times, the number of flops required for training was reduced by a factor of approximately 4 to 40.

Further, on the basis of the Levenberg–Marquardt algorithm, the weight-decay algorithm can be incorporated to improve network generalization. Basically, the weight-decay algorithm, as described previously, suppresses excessively large weights in the network in order to maintain network stability and to reduce noise fitting in modeling.

Table 4.10 Results of the Leave-One-Out BP Training Using the Levenberg–Marquardt Algorithm with the Daubechies-4 Wavelet Features*

Model Output   Training R²   Validation MSE   µ      No. of Training Epochs   No. of Flops
WBSF           0.97          0.0155           1.00   580                      20.90 × 10⁷
Calp           0.98          0.0120           1.00   464                      44.06 × 10⁷
Sarc           0.95          0.0296           1.00   1827                     72.13 × 10⁷
T. Coll        0.96          0.0200           1.00   493                      89.98 × 10⁷
%Sol           0.97          0.0158           1.00   464                      83.07 × 10⁷
%Mois          0.96          0.0159           1.00   580                      22.44 × 10⁷
%Fat           0.96          0.0193           1.00   261                      47.40 × 10⁷

* From Huang et al. (1998). With permission.

Table 4.11 Ratios of Training Epochs and Flops before and after Using the Levenberg–Marquardt Algorithm in the Leave-One-Out BP Training with Daubechies-4 Wavelet Features*

Model Output   Ratio of Epochs   Ratio of Flops
WBSF           525.60            12.19
Calp           960.75            12.45
Sarc           9523.81           20.16
T. Coll        589.82            5.25
%Sol           101.88            0.92
%Mois          1891.75           40.87
%Fat           37.89             0.34

* From Huang et al. (1998). With permission.

Table 4.12 shows that incorporating weight decay into the leave-one-out Levenberg–Marquardt BP training produced better models, with higher R² and lower validation MSE values. The decay constant λ was tuned to check whether it gave better model generalization. The results indicated that these models had better output variation accounting and generalization; they were therefore evaluated as the best models, combining good output variation accounting with less noise fitting.

It was also interesting that incorporating weight decay made the training slightly more efficient compared with implementing the Levenberg–Marquardt algorithm alone.
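A possible way to fold weight decay into the Levenberg–Marquardt step sketched earlier is to penalize the squared weight norm with the decay constant λ, which adds λw to the error gradient and λ to the diagonal of the approximate Hessian. The value of lambda below is purely illustrative, as the text does not give the values actually used.

```matlab
% Levenberg-Marquardt step with weight decay (assumed form): the decay
% constant lambda shrinks large weights, improving stability and generalization.
lambda = 1e-3;                                       % illustrative value only
dw = -(J' * J + (mu + lambda) * eye(size(J, 2))) \ (J' * e + lambda * w);
w  = w + dw;
```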
