PERFORMANCE CHARTS
8.2 Statistical Analysis of Performance Data
Statistical analysis of aircraft performance data is important for understanding both the current trends and performance of a product and the future of the platform and the industry: where the product is headed, what its objective is, and whether its direction is aligned with its mission, both technically and economically, on local and global levels.
Therefore, this section and its methodology are intended not only for the technical reader but also for the technical reader with business acumen and for the Six Sigma Black Belt certified champion, whose mission is to decrease product defects and irregularities and, eventually, to enhance the company’s bottom line.
Aircraft performance data are usually given in Section 5 of the Pilot’s Operating Handbook (POH).
Aircraft performance chart data are used to calculate climb performance (e.g., required fuel, time, and distance to climb), cruise performance (e.g., brake horsepower, TAS, and rate of fuel usage), and takeoff and landing performance. These charts also provide data for converting CAS to IAS and identifying the stall speed under various conditions. These data are paramount in flight planning, especially when determining whether the runway is long enough for takeoff or landing. The latter is of particular importance under extreme atmospheric conditions, when aircraft performance decreases due to high temperature, humidity, or pressure altitude.
Performance charts given in Section 5 of the Cessna Skyhawk 1976 Model 172M Pilot’s Operating Handbook are used herein as examples. This
aircraft, introduced in 1955, has remained in production since then, and it holds the distinction of having the highest number of manufactured units (over 44,000) of any aircraft in history. It is a high-wing, four-seat, single-piston-engine aircraft with fixed tricycle landing gear and a fixed-pitch, two-blade propeller. Its performance characteristics are summarized in Table 27.
Since statistical analysis methods are extensively employed in this section, the reader is encouraged to take advantage of available software tools such as Microsoft Excel, Minitab® Statistical Software, JMP Statistical Discovery™ Software, and the MathWorks® Statistics and Machine Learning Toolbox™. Regression analysis has been heavily employed for the majority of the investigations presented herein. The reader may come across expressions such as fitted curves, significant variables, analysis summaries, and normal probability plots. The purpose of these regression analyses is to model the relationship between a performance parameter of interest (response or dependent variable) and a few selected parameters (predictors or independent variables). To some extent, the models developed here may be extended to aircraft of similar weight or category to gain a general feel for the relationship between these variables. However, the reader is reminded that only the aircraft’s POH may be used for any flight-related calculations.
TABLE 27 Performance data for Cessna 172M [157,158,159,160].
As mentioned earlier, in the majority of regression analyses, the scientist selects a measurable parameter that characterizes the performance of the system as a whole and is a function of a number of other parameters to be identified. For example, when investigating landing performance data, the parameter that describes the outcome (response) is the landing distance. One can say that this parameter is “perhaps” a function of (dependent upon) a number of other parameters (predictors). Note the use of the term perhaps: often the scientist does not really know which parameters are important factors in determining the landing distance. Fortunately, prejudices have little to do with performing statistical analyses and the results obtained. These considerations normally apply to experimental research work, where one is often not sure what effect different factors have on the outcome and performs the analysis to assess whether each factor is important. In a POH, the reader does not deal with mysterious unknown factors. The performance tables were created to inform the reader of the relationship between the landing distance, in this example, and the well-known important factors (critical few) that affect it. If the factors were not important (i.e., trivial many), the manufacturer would surely not have included them.
Assume that you select a number of parameters, believing that they are important. The parameters that prove to be important are the critical variables, and those that are not so important are the trivial variables. Ideally, based on your experiments and judgment, you would select the critical few variables. If you are flying to deliver a bride to her groom and the analysis is part of this auspicious occasion, you may categorize the tailwind and headwind as influential variables that affect your landing, while the wedding dress may matter only because of its weight, not its designer. By the same token, the shade of the bride’s eyeliner is not important at all, unless you experience cases where a certain shade has resulted in a significant difference in average landing distance. Even then, you must collect enough evidence (over 100 samples) for the statistical analysis to be accurate enough to support clear conclusions. You may then conclude that, since brides with a certain eyeliner shade may cause poor landing performance, you either refuse to deliver them to their grooms or require them to apply their makeup after being delivered. This playful example illustrates a methodology for treating dependent and independent variables and identifying the few that are critical to making sound decisions.
The variables that are important, also known as the critical few variables, are further categorized based on their effect on the dependent
or response variable (e.g., landing distance). If their significance is within the chosen confidence level, they are statistically significant. It is possible to have a few critical variables whose significance is above 99.9 percent. A linear regression assumes a linear relationship between the response parameter and the vital few predictors, while a nonlinear regression assumes a nonlinear relationship among them. Note that an error (or residual) in regression analysis is defined, for a particular combination of independent variables, as the difference between the actual observed value and the value predicted by the regression. Ideally, though rarely for experimental data, every residual would be zero, which would make the regression model (linear or nonlinear) a perfect fit. A summary of a regression analysis is usually given in the form of a table that lists the variance (mean of the squared deviations), the square root of the variance (standard deviation), the sum-squared (summation of the squared errors), and the root-mean-squared of the data.
Obviously, the smaller the sum-squared, the smaller the deviation of the data from the fitted values and, as a result, the tighter the data distribution and the smaller the error. In addition, a normal probability plot shows the distribution of the residuals; the plotted points should fall close to a straight line for a good fit.
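The summary quantities described above can be computed directly from the observed and predicted values. The following is a minimal sketch in plain Python; the function name `residual_summary` and the four sample values are hypothetical and chosen only to make the arithmetic easy to follow. Note that statistical packages often divide by the degrees of freedom rather than by the number of points, so their reported variance may differ slightly from the simple mean used here.

```python
import math


def residual_summary(observed, predicted):
    """Return basic error statistics for a fitted model.

    residual      = observed value - predicted value
    sum_squared   = summation of the squared residuals
    mean_squared  = sum_squared divided by the number of points
    root_mean_squared = square root of mean_squared
    """
    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(residuals)
    sum_squared = sum(r * r for r in residuals)
    mean_squared = sum_squared / n
    return {
        "residuals": residuals,
        "sum_squared": sum_squared,
        "mean_squared": mean_squared,
        "root_mean_squared": math.sqrt(mean_squared),
    }


# Hypothetical values for illustration only (not taken from any POH).
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [10.5, 11.5, 14.5, 15.5]

summary = residual_summary(observed, predicted)
print(summary)
```

A perfect fit would give a residual of zero at every point, so all three summary values would also be zero.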
There are sometimes unexpected variables that come into play that you had not thought about before. You may assign a percentage to them (e.g., 5 percent) and announce that the criterion for the validity of your statistical analysis, or at least what you hope to achieve, is a confidence level of 95 percent. It is paramount to critically evaluate the results of a regression analysis. Goodness-of-fit tests may be performed to evaluate the applicability of the results using different techniques and confidence levels. The data may be correlated (similar trends) or cointegrated (similar trend and stationary spread). They may or may not be interrelated in any of these scenarios. For example, although you may find a similar trend in the increasing populations of gibbons and plush aircraft, there is no real relationship between the two. The increase of the bottom line and the reduction of product waste (Transportation, Inventory, Motion, Waiting, Over-production, Over-processing, Defects, and Skills, or TIMWOODS), however, are interrelated, correlated, and often cointegrated.
In the following section, the relationships between independent variables such as the propeller’s angular speed (RPM), pressure altitude (PA), and temperature (T), for a fixed-pitch propeller scenario, and dependent variables such as horsepower (BHP), true airspeed (KTAS), and fuel usage (GPH) are described by fitted equations.
Regression analysis is employed to model these characteristics for the
Cessna 172M, and the model is presented with a confidence level above 95 percent. Tables and figures (for example, Table 28 and Figure 70) present the summary of each analysis and the probability plot for the model fit. When the effect of a few selected critical-to-quality performance variables is not statistically significant at a confidence level above 95 percent, new variables are introduced or higher-order interactions are included in the model. Except for those scenarios, the selected critical variables are statistically significant. As mentioned previously, the data may be correlated, cointegrated, interrelated, or any combination of these.
The examples that follow illustrate this relationship and show what happens when the variables are combined judiciously or injudiciously, or when one of the responses is grouped in as a predictor. The most common-sense scenario (e.g., rate of fuel usage versus engine angular speed, temperature, and pressure altitude) is the first one whose regression analysis is presented in the following sections.
The first step in these analyses is to collect the related data from the POH; for example, one can use the data for cruise performance.
The next step is to enter the data into the statistical tool in the format it expects. These data are usually imported in separate columns, each with an identifying heading. The statistical tool is then run, with the response and predictors identified along with the analysis method, the dominant mathematical correlation techniques, and the confidence level at which the results are to be reported. You may incorporate a linear or nonlinear relationship between the dependent and independent variables. You may also define a correlation method between two, three, or all of the independent variables. For example, if $F = f(x, y, z)$, $F$ may take any of the forms $ax + by + cz$, $ax^2 + by^{1/2} + cz$, $ax + bxy + cxyz$, and so on, which present linear, nonlinear, and interrelated scenarios with cross-predictor terms. You can find thirty-four combinations of the three terms ($x$, $y$, and $z$) that include the interactions through the third order, cross predictors, and terms in the model. Assuming that $F = f(x, y)$, you are able to obtain nine relations and interrelations based on the aforementioned configuration. Assuming that $F = f(x, y, z, t)$, sixty-nine combinations may be derived. Figure 69 provides a relationship between the number of critical few variables and the possible combinations. As you see, a cubic polynomial is a very good model for this relation, with a coefficient of determination ($R^2$ value) over 99.9 percent. The $R^2$ value describes the proportion of the variance in the dependent variable that is predictable by the model from the independent variables. A curve is fitted to the data so that in the future, one can predict the fitted parameter for any values between those used to estimate the model (known as interpolation). Predicting
outside the range of input data (extrapolation) is also possible, but there is greater likelihood that the model will deviate from the actual data.
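The workflow described above (columns of data, a chosen model form with cross-predictor terms, a least-squares fit, and an R² value) can be sketched in Python. This is an illustration under stated assumptions, not the author's procedure: it assumes NumPy is available, uses the model form ax + bxy + cxyz, and fits synthetic, noise-free data generated from hypothetical coefficients rather than any POH values.

```python
import numpy as np

# Synthetic predictors (hypothetical values, NOT taken from any POH).
rng = np.random.default_rng(seed=0)
x = rng.uniform(1.0, 2.0, size=30)
y = rng.uniform(1.0, 2.0, size=30)
z = rng.uniform(1.0, 2.0, size=30)

# Hypothetical "true" coefficients a, b, c used only to generate the response.
true_coeffs = np.array([2.0, -1.0, 0.5])

# Design matrix for F = a*x + b*x*y + c*x*y*z: one column per model term.
design = np.column_stack([x, x * y, x * y * z])
response = design @ true_coeffs

# Ordinary least squares: minimizes the sum of squared residuals.
coeffs, _, rank, _ = np.linalg.lstsq(design, response, rcond=None)

# Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
fitted = design @ coeffs
ss_res = float(np.sum((response - fitted) ** 2))
ss_tot = float(np.sum((response - response.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print("estimated a, b, c:", np.round(coeffs, 6))
print("R^2:", round(r_squared, 6))
```

Because the synthetic response contains no noise, the fit recovers the generating coefficients and R² is essentially one. With real tabulated data the residuals would not vanish, and predictions are most trustworthy inside the range of the predictors used for the fit.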
A diagram may then be presented to compare the results of the regression analysis to the raw data for specific sets of input variables.
For a higher number of independent variables (predictors), it is recommended that you judiciously pick both the critical few variables and the order of the cross predictors. Note that you cannot form a trend with only two data points; at least three are required. To investigate the influence of second or higher order variables and their interactions, a second or higher order analysis may be performed. Although this may increase the confidence level, it makes the process, the interpretation of its results, and its implementation more complex, which discourages the use of higher order equations. Recall the truncation and roundoff errors: decreasing one far below its optimum value increases the other significantly. The author recommends not engaging in complex relationships unless they are worth it, which is not the case in the majority of scenarios. The analyst may end up paying much more than they gain, with no substantial added benefit. In this scenario, for instance, you must tackle additional second order or higher order terms, and this requires additional resources (e.g., computer capability, human knowledge, and associated costs in the form of money and time) for dubious gain. Perfection is knowing when to stop being perfect: Keep It Simple, Stupid (KISS), or “less is more” [161].
FIGURE 69 Possible numbers of the combinations versus the number of variables for the regression analysis.
8.2.1 Pressure Altitude and Temperature
Let us investigate a relatively simple example of a regression analysis: the variation of temperature with altitude. Recall that the gradient of temperature with altitude is the atmospheric lapse rate. This investigation is independent of the aircraft type; it depends only on the atmospheric lapse rate. The raw data use the standard lapse rate of 1.98 °C (approximately 2 °C) per 1,000 ft (6.5 °C/km) of altitude with the standard sea level temperature of 15 °C (288 K, 59 °F). The lapse rate (i.e., the reduction of temperature with altitude) is directly related to the atmospheric conditions (i.e., dry, standard, saturated, or a combination of the three in different layers), so different rates are estimated for different conditions. As an example, the relationship between pressure altitude in feet above mean sea level (PA) and temperature in degrees Celsius (T) is presented by equation (135). The model fit is presented with a confidence level of 100 percent. In this scenario, the temperature is the response variable, and the pressure altitude is the predictor.
The analysis summary (Table 28) and the normal probability plot (Figure 70) are presented as well.
$$
T\,(^\circ\mathrm{C}) = 15 - 0.002\,PA\,(\mathrm{ft}) \tag{135}
$$
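As a quick check on the coefficient in equation (135), the standard lapse rate quoted above can be converted from metric to imperial units. This is a worked conversion only, assuming 1 ft = 0.3048 m (so that 1,000 ft = 0.3048 km):

$$
6.5\ \frac{^\circ\mathrm{C}}{\mathrm{km}} \times 0.3048\ \frac{\mathrm{km}}{1{,}000\ \mathrm{ft}} \approx 1.98\ \frac{^\circ\mathrm{C}}{1{,}000\ \mathrm{ft}} \approx 0.002\ \frac{^\circ\mathrm{C}}{\mathrm{ft}}
$$

The rounded value, 0.002 °C per foot, is the slope that appears in equation (135), and the intercept is the standard sea level temperature of 15 °C.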
TABLE 28 Summary of the regression analysis model for the temperature as a function of the pressure altitude presented by equation (135).
FIGURE 70 Normal probability plot for the regression analysis showing the relation between the temperature and pressure altitude presented by equation (135).
Temperature (T) versus the pressure altitude (PA) is shown in Figure 71.
It is seen that the fitted diagram closely follows the regression analysis results presented by equation (135). This method validates the regression analysis and also provides a visualization aid for understanding this relationship.
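The fit behind equation (135) can be reproduced with a few lines of Python. This is a minimal sketch rather than the author's workflow: it assumes NumPy is available, generates the raw data from the standard lapse rate of 1.98 °C per 1,000 ft and the 15 °C sea level temperature quoted above, and uses an altitude range of 0 to 12,000 ft that is an illustrative assumption, not a value from the text.

```python
import numpy as np

# Standard-atmosphere constants quoted in the text.
LAPSE_RATE_C_PER_FT = 1.98 / 1000.0   # 1.98 deg C per 1,000 ft
SEA_LEVEL_TEMP_C = 15.0               # standard sea level temperature

# Pressure altitudes in feet (assumed range, chosen for illustration only).
pa_ft = np.arange(0.0, 12001.0, 1000.0)

# Raw data: temperature falls linearly with pressure altitude.
temp_c = SEA_LEVEL_TEMP_C - LAPSE_RATE_C_PER_FT * pa_ft

# First-order (linear) least-squares fit: T = intercept + slope * PA.
slope, intercept = np.polyfit(pa_ft, temp_c, 1)

# Residuals and coefficient of determination for the fitted line.
fitted = intercept + slope * pa_ft
residuals = temp_c - fitted
ss_res = float(np.sum(residuals ** 2))
ss_tot = float(np.sum((temp_c - temp_c.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

print(f"T(degC) = {intercept:.2f} + ({slope:.5f}) * PA(ft)")
print(f"R^2 = {r_squared:.6f}")
```

Because the raw data are generated from an exactly linear rule, the residuals are zero to within floating-point precision. The fitted slope is about -0.00198 °C per foot, which rounds to the -0.002 used in equation (135).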