1.4.2.9. Airplane Polished Window Strength
Goal | The goal of this analysis is to determine a good distributional model for these data. A secondary goal is to provide estimates for various percent points of the data. Percent points provide an answer to questions of the type "What is the polished window strength for the weakest 5% of the data?". | ||
Initial Plots of the Data |
The first step is to generate a histogram to get an overall feel for the data.
The histogram shows the following:
The normal probability plot has a correlation coefficient of 0.980. We can use this number as a reference baseline when comparing the performance of other distributional fits. |
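The baseline correlation coefficient can be obtained from any routine that pairs the sorted data with theoretical normal quantiles. The following is a minimal sketch, assuming Python with NumPy and SciPy rather than Dataplot. The `strength` array is synthetic placeholder data, not the window strength measurements, so the printed value will not equal 0.980.

```python
import numpy as np
from scipy import stats

# Placeholder data: synthetic values standing in for the 31 strength
# measurements, which are not reproduced in this section.
rng = np.random.default_rng(0)
strength = rng.normal(loc=30.0, scale=7.0, size=31)

# probplot pairs the ordered data with theoretical normal quantiles and
# fits a straight line through the points. The correlation coefficient r
# of that line is the normal PPCC.
(theoretical_q, ordered), (slope, intercept, r) = stats.probplot(
    strength, dist="norm"
)

print(f"normal PPCC                   = {r:.3f}")
print(f"location estimate (intercept) = {intercept:.2f}")
print(f"scale estimate (slope)        = {slope:.2f}")
```

The intercept and slope of the fitted line serve as the location and scale estimates, which is why the probability plot gives both a goodness-of-fit number and parameter estimates.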
||
Other Potential Distributions |
Many distributions are plausible candidate models for these data.
However, we restrict our attention to the following
distributional models because they have proven useful in
reliability studies.
|
||
Approach |
There are two basic questions that need to be addressed.
If the distribution does not have a shape parameter, we simply generate a probability plot.
If the distribution does have a shape parameter, then we are actually addressing a family of distributions rather than a single distribution. We first need to find the optimal value of the shape parameter, which the PPCC plot can be used to determine. We will use the PPCC plots in two stages: the first stage covers a broad range of parameter values, while the second stage covers the neighborhood of the value that gave the largest PPCC. Although we could go further than two stages, for practical purposes two stages are sufficient. After determining an optimal value for the shape parameter, we use the probability plot as above to obtain estimates of the location and scale parameters and to determine the PPCC value. This PPCC value can be compared to the PPCC values obtained from other distributional models. |
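The two-stage search can be sketched as follows. This is an illustration under stated assumptions, not the Dataplot procedure: it uses SciPy's `probplot`, a Weibull family (`weibull_min`), synthetic placeholder data, and grid ranges chosen arbitrarily for the example. The helper names `ppcc_for_shape` and `best_shape` are hypothetical.

```python
import numpy as np
from scipy import stats


def ppcc_for_shape(data, shape, dist="weibull_min"):
    """Return (ppcc, location, scale) from a probability plot at a fixed shape."""
    _, (slope, intercept, r) = stats.probplot(data, sparams=(shape,), dist=dist)
    return r, intercept, slope


def best_shape(data, grid, dist="weibull_min"):
    """Return the grid value whose probability plot has the largest PPCC."""
    ppcc = np.array([ppcc_for_shape(data, s, dist)[0] for s in grid])
    return grid[int(np.argmax(ppcc))]


# Placeholder data: a synthetic Weibull sample, NOT the window strength data.
rng = np.random.default_rng(1)
data = stats.weibull_min.rvs(2.0, loc=15.0, scale=17.0, size=31, random_state=rng)

# Stage 1: broad range of shape values (range is an assumption).
coarse = np.linspace(0.5, 15.0, 30)
shape_coarse = best_shape(data, coarse)

# Stage 2: fine grid in the neighborhood of the stage-1 maximum.
step = coarse[1] - coarse[0]
fine = np.linspace(max(shape_coarse - step, 0.1), shape_coarse + step, 41)
shape_fine = best_shape(data, fine)

# Probability plot at the chosen shape gives location, scale, and the PPCC.
ppcc, location, scale = ppcc_for_shape(data, shape_fine)
print(f"shape = {shape_fine:.2f}, location = {location:.2f}, "
      f"scale = {scale:.2f}, PPCC = {ppcc:.3f}")
```

The PPCC curve is often flat near its maximum, so shape estimates from small samples can vary noticeably; that is one reason further refinement stages add little.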
||
Analyses for Specific Distributions |
We analyzed the data using the approach described above for the
following distributional models:
|
||
Summary of Results |
The results are summarized below.
Estimate of location = 30.81; estimate of scale = 7.38
Estimate of shape = 2.13; estimate of location = 15.9; estimate of scale = 16.92
Estimate of shape = 0.18; estimate of location = -9.96; estimate of scale = 40.17
Estimate of shape = 11.8; estimate of location = 5.19; estimate of scale = 2.17
Estimate of shape = 0.05; estimate of location = 19.0; estimate of scale = 2.4
Estimate of shape = 0.18; estimate of location = -11.0; estimate of scale = 41.3 |
||
Percent Point Estimates |
The final step in this analysis is to compute percent point estimates
for the 1%, 2.5%, 5%, 95%, 97.5%, and 99% percent points.
A percent point estimate is an estimate of the strength below
which a given percentage of units will fall.
For example, the 5% point is the strength below which we estimate
that 5% of the units will fall.
To calculate these values, we use the Weibull percent point function with the appropriate estimates of the shape, location, and scale parameters. The Weibull percent point function can be computed in many general-purpose statistical software programs, including Dataplot. Dataplot generated the following estimates for the percent points:

Estimated percent points using Weibull distribution

PERCENT POINT    POLISHED WINDOW STRENGTH
    0.01                  17.86
    0.025                 18.92
    0.05                  20.10
    0.95                  44.21
    0.975                 47.11
    0.99                  50.53 |
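For readers without Dataplot, the same calculation can be sketched in Python. The example assumes the Weibull estimates quoted in the summary of results (shape 2.13, location 15.9, scale 16.92) and the usual closed form of the 3-parameter Weibull percent point function, location + scale × (−ln(1 − p))^(1/shape). Because those estimates are rounded, the results agree with the Dataplot values only to within a few hundredths.

```python
import numpy as np
from scipy import stats

# Weibull estimates as quoted in the summary of results (rounded values).
SHAPE, LOCATION, SCALE = 2.13, 15.9, 16.92


def weibull_percent_point(p, shape, location, scale):
    """Closed-form percent point (inverse CDF) of the 3-parameter Weibull."""
    p = np.asarray(p, dtype=float)
    return location + scale * (-np.log1p(-p)) ** (1.0 / shape)


probabilities = np.array([0.01, 0.025, 0.05, 0.95, 0.975, 0.99])
closed_form = weibull_percent_point(probabilities, SHAPE, LOCATION, SCALE)

# Cross-check against SciPy's weibull_min, which uses the same
# shape/location/scale parameterization.
from_scipy = stats.weibull_min.ppf(probabilities, SHAPE, loc=LOCATION, scale=SCALE)

for p, q in zip(probabilities, closed_form):
    print(f"{p:5.3f}  {q:6.2f}")
```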
||
Quantitative Measures of Goodness of Fit |
Although it is generally unnecessary, we can include quantitative
measures of distributional goodness-of-fit. Three of the commonly
used measures are:
|
||
Normal Anderson-Darling Output |
ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION

1. STATISTICS:
      NUMBER OF OBSERVATIONS                = 31
      MEAN                                  = 30.81142
      STANDARD DEVIATION                    = 7.253381
      ANDERSON-DARLING TEST STATISTIC VALUE = 0.5321903
      ADJUSTED TEST STATISTIC VALUE         = 0.5870153

2. CRITICAL VALUES:
      90   % POINT = 0.6160000
      95   % POINT = 0.7350000
      97.5 % POINT = 0.8610000
      99   % POINT = 1.021000

3. CONCLUSION (AT THE 5% LEVEL):
      THE DATA DO COME FROM A NORMAL DISTRIBUTION. |
||
Lognormal Anderson-Darling Output |
ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A LOGNORMAL DISTRIBUTION

1. STATISTICS:
      NUMBER OF OBSERVATIONS                = 31
      MEAN OF LOG OF DATA                   = 3.401242
      STANDARD DEVIATION OF LOG OF DATA     = 0.2349026
      ANDERSON-DARLING TEST STATISTIC VALUE = 0.3888340
      ADJUSTED TEST STATISTIC VALUE         = 0.4288908

2. CRITICAL VALUES:
      90   % POINT = 0.6160000
      95   % POINT = 0.7350000
      97.5 % POINT = 0.8610000
      99   % POINT = 1.021000

3. CONCLUSION (AT THE 5% LEVEL):
      THE DATA DO COME FROM A LOGNORMAL DISTRIBUTION. |
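The normal and lognormal tests differ only in whether the data are log-transformed first, since a lognormal test is a normality test on the logarithms. A minimal sketch, assuming SciPy's `scipy.stats.anderson` rather than Dataplot and using synthetic placeholder data: the helper `anderson_normal_5pct` is hypothetical, and SciPy's reported statistic and critical values are laid out differently from Dataplot's, so the numbers are not expected to line up with the output above.

```python
import numpy as np
from scipy import stats

# Placeholder data: a synthetic lognormal sample, NOT the window strength data.
rng = np.random.default_rng(2)
strength = rng.lognormal(mean=3.4, sigma=0.23, size=31)


def anderson_normal_5pct(values):
    """Anderson-Darling normality test; returns (statistic, 5% critical value, passes)."""
    result = stats.anderson(values, dist="norm")
    idx = list(result.significance_level).index(5.0)
    critical = float(result.critical_values[idx])
    statistic = float(result.statistic)
    return statistic, critical, statistic < critical


# Normal model: test the data directly.
normal_stat, normal_crit, normal_ok = anderson_normal_5pct(strength)

# Lognormal model: test the logarithms of the data for normality.
lognormal_stat, lognormal_crit, lognormal_ok = anderson_normal_5pct(np.log(strength))

print(f"normal:    A^2 = {normal_stat:.3f}, 5% critical = {normal_crit:.3f}")
print(f"lognormal: A^2 = {lognormal_stat:.3f}, 5% critical = {lognormal_crit:.3f}")
```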
||
Weibull Anderson-Darling Output |
ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A WEIBULL DISTRIBUTION

1. STATISTICS:
      NUMBER OF OBSERVATIONS                = 31
      MEAN                                  = 14.91142
      STANDARD DEVIATION                    = 7.253381
      SHAPE PARAMETER                       = 2.237495
      SCALE PARAMETER                       = 16.87868
      ANDERSON-DARLING TEST STATISTIC VALUE = 0.3623638
      ADJUSTED TEST STATISTIC VALUE         = 0.3753803

2. CRITICAL VALUES:
      90   % POINT = 0.6370000
      95   % POINT = 0.7570000
      97.5 % POINT = 0.8770000
      99   % POINT = 1.038000

3. CONCLUSION (AT THE 5% LEVEL):
      THE DATA DO COME FROM A WEIBULL DISTRIBUTION.

Note that for the Weibull distribution, the Anderson-Darling test is actually testing the 2-parameter Weibull distribution (based on maximum likelihood estimates), not the 3-parameter Weibull distribution. To give a more accurate comparison, we subtract the location parameter (15.9) as estimated by the PPCC plot/probability plot technique before applying the Anderson-Darling test. |
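The step from the raw to the adjusted test statistic is not spelled out in the output. The sketch below checks the printed values against two commonly used small-sample corrections: a factor of 1 + 4/n − 25/n² for the normal and lognormal cases, and 1 + 0.2/√n for the Weibull case. These factors are an assumption, since the text does not state them, but with n = 31 they reproduce the printed adjusted values to the digits shown. The adjusted statistic is then compared with the 95 % point to reach the 5 % level conclusion.

```python
import math

N = 31  # number of observations reported in each output

# (raw A^2, printed adjusted A^2, printed 95 % critical value),
# copied from the three Anderson-Darling outputs above.
printed = {
    "normal":    (0.5321903, 0.5870153, 0.735),
    "lognormal": (0.3888340, 0.4288908, 0.735),
    "weibull":   (0.3623638, 0.3753803, 0.757),
}


def adjustment_factor(name, n):
    """Assumed small-sample correction factor for the raw statistic."""
    if name == "weibull":
        return 1.0 + 0.2 / math.sqrt(n)
    return 1.0 + 4.0 / n - 25.0 / n**2


results = {}
for name, (raw, adjusted_printed, critical_95) in printed.items():
    adjusted = raw * adjustment_factor(name, N)
    not_rejected = adjusted < critical_95
    results[name] = (adjusted, not_rejected)
    print(f"{name:9s}  adjusted = {adjusted:.7f}  "
          f"printed = {adjusted_printed:.7f}  not rejected at 5% = {not_rejected}")
```

All three adjusted statistics fall below their 95 % points, which is why none of the three distributions is rejected at the 5 % level.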
||
Conclusions |
At the 5% level, the Anderson-Darling test does not reject any of these
three distributions. The test statistic is smallest for the Weibull
distribution (0.362), with the value for the lognormal distribution
only slightly larger (0.389). The test statistic for the normal
distribution (0.532) is noticeably higher than for the Weibull or lognormal.
This provides additional confirmation that either the Weibull or the lognormal distribution fits these data better than the normal distribution, with the Weibull providing a slightly better fit than the lognormal. |