Statistical Analysis for Extreme Value with Applications to Hydrological Events

LIST OF ABBREVIATIONS AND SYMBOLS

INTRODUCTION

Objective of the research
Scope of the study

Scope of the study on the performance of the maximum penalized likelihood estimator compared with the maximum likelihood estimator and the L-moments estimator for four-
Scope of the study on proposing the r-largest order statistics model for the four parameter kappa distribution (r-K4D)

Expected outcome

One extreme value distribution related to the GEV distribution is the four-parameter kappa distribution (K4D) introduced by Hosking (1994). Its four parameters are a location parameter (µ), a scale parameter (σ) and two shape parameters (k, h). It can be considered a generalization of the three-parameter kappa distribution (K3D), the generalized Pareto distribution (GPD), the generalized logistic distribution (GLO), the Gumbel distribution, the exponential distribution, and the logistic distribution. This problem motivates the development of a penalized MLE method for K4D that adapts the exponential penalty function of Coles and Dixon (1999).

The four-parameter kappa distribution (K4D) is well known as a generalization of several common three-parameter distributions, in particular the GEV distribution. This study develops a new r-largest order statistics model for K4D for hydrological events.

Figure 1.1: Relationship of the four-parameter kappa distribution (K4D) to other distributions (Jeong et al.)

LITERATURE REVIEW

Order statistics

Extreme value theory

In applications, the Xi usually represent measurements on a regular time scale, perhaps hourly measurements of sea level or daily average temperatures, so that Mn represents the maximum of the process over n units of observation time. Approximating the distribution of Mn by a limiting family is similar to the common practice of approximating the distribution of sample means by the normal distribution, as justified by the central limit theorem. The difficulty that the distribution of Mn degenerates as n increases is avoided by allowing a linear renormalization of the variable Mn: Mn∗ = (Mn − bn)/an.

Appropriate choices of an and bn stabilize the location and scale of Mn∗ as n increases. The analogy with the central limit theorem (CLT) comes from the fact that the three extreme value distributions (Gumbel, Fréchet and Weibull) are the only possible limits for the distribution of Mn∗, regardless of the parent distribution F (Coles, 2001).
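The role of the normalizing constants can be illustrated numerically. The sketch below is not taken from the thesis: it assumes a standard exponential parent distribution, for which the choices an = 1 and bn = log n are known to lead to the Gumbel limit, and all function names are hypothetical.

```python
import math

def max_cdf_exponential(x, n):
    """P(M_n - log n <= x) for the maximum M_n of n i.i.d. Exp(1) variables."""
    z = x + math.log(n)           # undo the renormalization: a_n = 1, b_n = log n
    if z <= 0.0:
        return 0.0                # Exp(1) has no mass below zero
    return (1.0 - math.exp(-z)) ** n

def gumbel_cdf(x):
    """Limiting Gumbel distribution function exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def max_abs_gap(n, grid=None):
    """Largest gap between the exact and the limiting cdf on a grid of x values."""
    if grid is None:
        grid = [i / 10.0 for i in range(-30, 81)]
    return max(abs(max_cdf_exponential(x, n) - gumbel_cdf(x)) for x in grid)

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_abs_gap(n))
```

The printed gap shrinks as n grows, which is the stabilization that the renormalization is meant to achieve. Other parent distributions need other constants and may converge to a different limit family.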

Probability density function

Generalized extreme value distribution

Figures 2.2 and 2.3 present some possible shapes of the K4D probability density function (pdf) for different combinations of the shape parameters (k, h), with the location (µ) and scale (σ) parameters held fixed. The boundaries of the sample space are easily derived. In cases where h = 0, they can be calculated by noting that. The density of the K4D, when plotted, takes one of four basic shapes, determined by the values of k and h.

In summary, the density can show a maximum, a minimum, or one of two different shapes if there is no extremum. Figure 2.3 fixes k at a positive value and Figure 2.4 fixes k at a negative value; in both, varying h affects the left and right tails.
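To make the role of the two shape parameters concrete, the following sketch evaluates the K4D distribution function and quantile function. It assumes the parameterization usually attributed to Hosking (1994), F(x) = {1 − h[1 − k(x − µ)/σ]^(1/k)}^(1/h), which may differ in sign or notation from the equations of the thesis. The function names are hypothetical, and the limits k = 0 and h = 0 are handled by their logarithmic forms.

```python
import math

def k4d_quantile(p, mu, sigma, k, h):
    """Quantile function of the K4D (assumed Hosking-style parameterization)."""
    # (1 - p^h) / h tends to -log(p) as h -> 0
    w = -math.log(p) if h == 0 else (1.0 - p ** h) / h
    # (1 - w^k) / k tends to -log(w) as k -> 0
    y = -math.log(w) if k == 0 else (1.0 - w ** k) / k
    return mu + sigma * y

def k4d_cdf(x, mu, sigma, k, h):
    """Distribution function of the K4D; x is assumed to lie inside the support."""
    z = (x - mu) / sigma
    # v = (1 - k z)^(1/k), with limit exp(-z) at k = 0
    v = math.exp(-z) if k == 0 else (1.0 - k * z) ** (1.0 / k)
    # F = (1 - h v)^(1/h), with limit exp(-v) at h = 0 (the GEV case)
    return math.exp(-v) if h == 0 else (1.0 - h * v) ** (1.0 / h)

def return_level(T, mu, sigma, k, h):
    """T-year return level: the quantile with non-exceedance probability 1 - 1/T."""
    return k4d_quantile(1.0 - 1.0 / T, mu, sigma, k, h)

if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from the thesis.
    for h in (-1.0, 0.0, 1.0):    # GLO, GEV and GPD special cases
        print(h, return_level(20, 0.0, 1.0, 0.2, h))
```

Under this parameterization, h = −1, h = 0 and h = 1 give the generalized logistic, GEV and generalized Pareto distributions respectively, so the same two functions cover the special cases listed in the introduction.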

Figure 2.1: Probability density function of the GEV distribution

Four parameter kappa distribution

The r largest-order statistics model

The likelihood for this model is obtained by absorbing the unknown scaling coefficients into the location and scale parameters in the usual way and by taking products across blocks. The likelihood in Equations 2.8 and 2.9 or, more commonly, the corresponding log-likelihood, can be maximized numerically to obtain maximum likelihood estimates. In the special case of r = 1, the likelihood function reduces to the likelihood of the GEV model for block maxima.

In general, the r largest order statistics model yields a likelihood whose parameters correspond to those of the GEV distribution of block maxima, but which incorporates more of the observed extreme data. So, relative to a standard block maxima analysis, the interpretation of the parameters is unchanged, but the precision should be improved due to the inclusion of additional information.
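The reduction to the block maxima likelihood at r = 1 can be checked directly. The sketch below writes the r largest order statistics log-likelihood for the GEV in the standard form given by Coles (2001), translated to the sign convention F(x) = exp{−[1 − k(x − µ)/σ]^(1/k)} with k ≠ 0. That convention and all function names are assumptions of this example, not the notation of Equations 2.8 and 2.9.

```python
import math

def gev_logpdf(x, mu, sigma, k):
    """Log density of the GEV for k != 0 (assumed sign convention, see lead-in)."""
    if sigma <= 0:
        return -math.inf
    y = 1.0 - k * (x - mu) / sigma
    if y <= 0:
        return -math.inf          # x lies outside the support
    return -math.log(sigma) + (1.0 / k - 1.0) * math.log(y) - y ** (1.0 / k)

def r_largest_loglik(blocks, mu, sigma, k):
    """Log-likelihood of the r largest order statistics model (k != 0).

    blocks: list of lists, each holding the r largest values of one block.
    """
    if sigma <= 0:
        return -math.inf
    total = 0.0
    for block in blocks:
        z = sorted(block, reverse=True)
        ys = [1.0 - k * (x - mu) / sigma for x in z]
        if min(ys) <= 0:
            return -math.inf      # at least one value is outside the support
        # The exponential term involves only the smallest of the r values ...
        total -= ys[-1] ** (1.0 / k)
        # ... while every retained value contributes a density-like factor.
        total += sum(-math.log(sigma) + (1.0 / k - 1.0) * math.log(y) for y in ys)
    return total

if __name__ == "__main__":
    # Made-up values for illustration: two largest observations in three blocks.
    blocks = [[3.1, 2.9], [2.4, 2.0], [4.0, 3.2]]
    print(r_largest_loglik(blocks, 2.5, 1.0, -0.1))
```

With one value per block the function returns the sum of GEV log densities, which is the r = 1 special case described above.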

Estimation method

Maximum likelihood estimation

Assuming that X1, X2, ..., Xn are independent variables with a GEV distribution, the likelihood for the GEV parameters when k ≠ 0 and the log-likelihood of the GEV (Equation 2.12) are defined subject to the constraint in Equation 2.13. For a combination of parameters for which Equation 2.13 is violated, corresponding to a configuration in which at least one of the observed data points falls beyond an endpoint of the distribution, the likelihood is zero and the log-likelihood is equal to −∞. Maximizing Equation 2.12 with respect to the parameter vector (µ, σ, k) leads to the maximum likelihood estimate with respect to the entire GEV family.
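A minimal sketch of this constrained maximization follows. It assumes the sign convention F(x) = exp{−[1 − k(x − µ)/σ]^(1/k)}, which may not match Equation 2.12 exactly, and it replaces a proper numerical optimizer with a coarse grid search purely to keep the example self-contained. The data values and function names are invented for illustration.

```python
import math
from itertools import product

def gev_loglik(data, mu, sigma, k):
    """GEV log-likelihood for k != 0; returns -inf when the support constraint fails."""
    if sigma <= 0 or k == 0:
        return -math.inf          # the k = 0 (Gumbel) limit is not handled here
    total = -len(data) * math.log(sigma)
    for x in data:
        y = 1.0 - k * (x - mu) / sigma
        if y <= 0:
            return -math.inf      # an observation falls beyond the endpoint
        total += (1.0 / k - 1.0) * math.log(y) - y ** (1.0 / k)
    return total

def grid_mle(data, mus, sigmas, ks):
    """Crude stand-in for numerical maximization: best (mu, sigma, k) on a grid."""
    return max(product(mus, sigmas, ks), key=lambda t: gev_loglik(data, *t))

if __name__ == "__main__":
    # Hypothetical annual maxima, not data from the thesis.
    data = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 3.7, 2.9, 3.3, 4.6]
    mus = [2.0 + 0.1 * i for i in range(21)]
    sigmas = [0.5 + 0.1 * i for i in range(16)]
    ks = [-0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4]
    best = grid_mle(data, mus, sigmas, ks)
    print(best, gev_loglik(data, *best))
```

Returning −∞ for infeasible parameter combinations lets any maximizer discard them automatically. In practice a gradient-free or quasi-Newton optimizer would replace the grid.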

Maximum likelihood estimation for K4D distribution

Profile likelihood estimation

Basic concept of profile likelihood function
Profile likelihood estimation of parameters and quantile of extreme value distribution

Review of the generalized extreme value distribution extended to the four parameter kappa distribution
Criterion

Modified prediction absolute error
Goodness-of-fit test

Related research on the four parameter kappa distribution
Related research on the r largest order statistics model

The L-moments of K4D exist if and only if the mean of the distribution is finite. It can be shown that the L-moment ratios τ3 and τ4 are functions only of the shape parameters h and k. For each fixed value k0 of k, the likelihood function can be maximized over the remaining parameters; the resulting maximum is the value of the profile likelihood function at k0. The profile likelihood estimate k̂ of k is then obtained by maximizing the profile likelihood function.

The distributions are based on the asymptotic joint distribution of the largest r values in a single sample, and the method of estimation is numerical maximum likelihood. Thus, observations that do not agree with the proposed distribution can be identified and the validity of the model can be assessed.

Figure 2.5: L-moment ratios τ3 and τ4 for the K4D. The graph shows τ4 as a function of τ3 as k varies, for fixed h
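Sample counterparts of the L-moment ratios τ3 and τ4 are what an L-moments fit would match against a diagram such as Figure 2.5. The sketch below computes them from the unbiased probability weighted moment estimators b0 to b3, using the standard relations between probability weighted moments and L-moments. The function name is hypothetical, and the thesis may use a different estimator.

```python
def sample_lmoment_ratios(data):
    """Return (l1, l2, t3, t4): first two sample L-moments and the ratios t3, t4."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    b = [0.0, 0.0, 0.0, 0.0]
    for i, xi in enumerate(x):                # i is the rank minus one
        w = 1.0
        for r in range(4):
            b[r] += w * xi / n
            if r < 3:
                # extend the weight to i(i-1)...(i-r) / ((n-1)(n-2)...(n-1-r))
                w *= (i - r) / (n - 1 - r)
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3 / l2, l4 / l2

if __name__ == "__main__":
    # Made-up sample for illustration only.
    print(sample_lmoment_ratios([2.1, 3.4, 2.8, 4.0, 3.1, 2.5, 3.7]))
```

The function divides by the second L-moment, so it fails for a sample whose values are all equal.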

METHODOLOGY

Research methodology for proposing and comparing the efficiency of maximum penalized likelihood estimators

Maximum penalized likelihood estimation using penalty of Coles and Dixon. Penalized likelihood is a straightforward method of incorporating into an
Maximum penalized likelihood estimation using penalty of Martins and Stedinger
Steps of the simulation study
Steps of the application to hydrological data

To propose the r-largest order statistics for four parameter kappa distribution (r-K4D)

Large values of α in the penalty function correspond to a more severe relative penalty for values of k that are large but less than 1, while λ determines the overall weighting attached to the penalty. The penalty function (on a logarithmic scale) is shown in Figure 3.1 for different values of α and λ. Martins and Stedinger (2000) proposed a beta distribution as a reasonable penalty function for the shape parameter, as shown in Figure 3.3; this is the penalty function used here.
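The two penalty functions can be sketched as follows. The forms below are the ones usually quoted from the original papers: an exponential penalty on the shape parameter in the interval (0, 1) for Coles and Dixon (1999), and a beta density on [−0.5, 0.5] with parameters p = 6 and q = 9 for Martins and Stedinger (2000). The two papers use opposite sign conventions for the shape parameter, and the adapted versions developed in the thesis may differ, so the argument names and default values here are assumptions.

```python
import math

def coles_dixon_penalty(xi, alpha=1.0, lam=1.0):
    """Exponential penalty: 1 for xi <= 0, 0 for xi >= 1, decreasing in between."""
    if xi <= 0.0:
        return 1.0
    if xi >= 1.0:
        return 0.0
    return math.exp(-lam * (1.0 / (1.0 - xi) - 1.0) ** alpha)

def martins_stedinger_penalty(kappa, p=6.0, q=9.0):
    """Beta density on [-0.5, 0.5] used as a penalty (prior) on the shape parameter."""
    if not -0.5 < kappa < 0.5:
        return 0.0
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    log_dens = (p - 1.0) * math.log(0.5 + kappa) + (q - 1.0) * math.log(0.5 - kappa)
    return math.exp(log_dens - log_beta)

def penalized_loglik(loglik, penalty):
    """Penalized log-likelihood: log-likelihood plus the log of the penalty."""
    return loglik + math.log(penalty) if penalty > 0.0 else -math.inf

if __name__ == "__main__":
    for s in (-0.2, 0.0, 0.2, 0.5, 0.9):
        print(s, coles_dixon_penalty(s), martins_stedinger_penalty(s))
```

Maximizing the penalized log-likelihood instead of the plain log-likelihood gives the maximum penalized likelihood estimate; a penalty of zero rules out the corresponding shape values entirely.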

The MSP penalty function appears in Equation 3.6, and the penalty functions developed from it are displayed in Equations 3.7 to 3.8. The r-largest order statistics model for the four parameter kappa distribution is then developed and investigated.

Figure 3.2: Penalty function for various of plotted against h

RESULTS AND DISCUSSION

Results of comparing the performance of the maximum penalized likelihood estimator (MPLE) with the maximum likelihood estimator and the L-moments estimator

Simulation Study

MPLE.MS3 was found to be the most efficient method among MPLE.MS1 to MPLE.MS3 and MPLE.MSP1 to MPLE.MSP3 when the k value was negative and close to zero. The RBIAS of the MPLE.MS3 method increased when the k value was negative, but decreased as the k value approached 0. In terms of RRMSE, however, there was only a small difference between the MPLE.MS3 and MPLE.MSP3 methods.

When the k value was negative and close to 0, the MPLE.MS3 method had the smallest RRMSE compared with the LM and MLE methods. When the k value was positive, the MPLE.MSP3 method had the smallest RRMSE of all methods for the quantiles studied. In terms of RRMSE, there was no significant difference between the MPLE.MS3 and MPLE.MSP3 methods.

When the k value was negative and close to 0, MPLE.MS3 had the smallest RRMSE, and MPLE.MSP3 had a smaller RRMSE than the LM and MLE methods. When the k value was positive, MPLE.MSP3 had the smallest RRMSE of all methods for the quantiles examined. MPLE.MSP3 also had the smallest RBIAS when the h value was negative.

In terms of RRMSE, there was no significant difference between MPLE.MS3 and MPLE.MSP3. When the k value was negative and close to 0, MPLE.MS3 had the smallest RRMSE, and MPLE.MSP3 had a smaller RRMSE than the LM and MLE methods. When the k value was positive, MPLE.MSP3 had the smallest RRMSE of all methods for the quantiles studied.

When the k value was positive, MPLE.MSP3 was the most efficient, compared to that of the MLE, MPLE.MS3 and LM methods when the k value was negative. When the k value was positive, MPLE.MSP3 was the most efficient, compared to that of the MLE, MPLE.MS3 and LM methods (except when h = -1.2) when the k value was negative.
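The two comparison criteria used throughout this section can be computed as in the sketch below. It assumes the common simulation-study definitions, where RBIAS is the mean of the relative errors (estimate − true value)/true value over the replications and RRMSE is the square root of the mean squared relative error. The exact definitions in the thesis are not shown in this extract and may differ, and the function names are hypothetical.

```python
import math

def rbias(estimates, true_value):
    """Relative bias: mean of (estimate - true) / true over the replications."""
    rel = [(e - true_value) / true_value for e in estimates]
    return sum(rel) / len(rel)

def rrmse(estimates, true_value):
    """Relative root mean squared error over the replications."""
    rel_sq = [((e - true_value) / true_value) ** 2 for e in estimates]
    return math.sqrt(sum(rel_sq) / len(rel_sq))

if __name__ == "__main__":
    # Invented quantile estimates from three replications, true quantile 1.0.
    estimates = [1.1, 0.9, 1.2]
    print(rbias(estimates, 1.0), rrmse(estimates, 1.0))
```

Because both measures are scaled by the true value, they can be compared across quantiles of different magnitude, which is what allows one table to cover several quantiles and sample sizes.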

Table 4.2: The RBIAS and RRMSE values of the K4D estimates obtained by four estimation methods for k = −0.4, −1.2 ≤ h ≤ 1.2 and n = 30

Rainfall data

Study of the r-largest order statistics model

The annual maximum temperature data were measured in Surin, Thailand, over the 29 years from 1990 to 2018. Based on the goodness-of-fit criteria, the MPAE and AD results are approximately identical across all estimation methods, so the MPLE.MS3 method is equally effective for fitting the temperature data. In this case the LM estimate cannot be calculated, and the MPLE.MS2 and MPLE.MSP1 estimates can be used as alternatives.

Additionally, Figures 4.38a and 4.38b present the histogram with fitted densities and the Q-Q plot, which appear consistent with the data for the four estimation methods. Smith (1986) presents a family of statistical distributions for extreme values based on a fixed number r ≥ 1 of the largest events. Assume that normalizing constants an and bn can be found such that (Mn−bn)/an converges in distribution.

Let an and bn be as before and consider, for fixed r, the joint distribution of the r largest normalized order statistics. Then, under exactly the same condition as Equation 4.1, this joint distribution converges to that of a vector (X1, ..., Xr) for which an explicit formula is available. The four parameter kappa distribution introduced by Hosking (1994) is a generalization of the GEV, to which it reduces when h = 0.

It is a candidate to be fitted to data when the three parameter distributions provide an inadequate fit, or when the experimenter does not want to be committed to a particular three parameter distribution. In the case h → 0, the joint probability density function in Equation 4.4 reduces to the r-largest order statistics model of the GEV in Equation 4.3. The marginal pdf of X2 is derived by integrating f(x1, ..., xr) with respect to x1.

Figure 4.37: Histogram, Q–Q plot and Profile likelihood for 20-year return level of the K4D for annual maximum rainfall data

CONCLUSIONS

Performance of the MLE, all MPLE and LM methods
Application to Hydrological Data
The r-largest order statistics of K4D in case of r=2
Recommendation

MSP2, MPLE.MSP3, and LM methods for the four-parameter kappa distribution (K4D) under the RBIAS and RRMSE criteria. First, for positive values of k, comparing the performance of the MLE, the best of MPLE.CD (MPLE.CD5), the best of MPLE.MS and MPLE.MSP (MPLE.MSP3), and the LM method for all estimated quantiles and all sample sizes (n = 30, 50 and 100), MPLE.MSP3 outperforms the MLE, MPLE.CD5, MPLE.MS3 and LM estimators in terms of RRMSE for all quantiles, except when h = -1.2. Second, for negative values of k close to 0, comparing the performance of the MLE, the best of MPLE.CD (MPLE.CD5), the best of MPLE.MS and MPLE.MSP (MPLE.MS3), and the LM method for all quantiles evaluated and all sample sizes (n = 30, 50 and 100), MPLE.MSP3 and MPLE.MS3 perform better than the MLE, MPLE.CD5 and LM estimators in terms of RRMSE for all quantiles.

In the application to two sets of hydrological data, the annual maximum rainfall data were measured in Pattaya, Thailand, and the annual maximum temperature data were measured in Surin Province, Thailand. For the rainfall data, the K4D fitted by MPLE.CD5, MPLE.MS3, and MPLE.MSP3 matches the data reasonably well. For the annual maximum temperature data, the MPLE.MS3 method is equally effective for fitting the data with 95%.

Furthermore, the MPLE.MSP3 estimates suggest that K4D could serve as a generalized four-parameter distribution for heavy-tailed data. For the r-largest order statistics model of the four parameter kappa distribution in the case r = 2, the MLE, MPLE and LM parameter estimation methods can be applied to hydrological data.

APPENDIXES

APPENDIX A