CHAPTER 2 Graphics Commands


DATAPLOT supports a wide range of graphics commands.

The PLOT command is the primary command for generating 2D graphs. It supports numerous formats and options and is both flexible and powerful. In addition, DATAPLOT provides a large number of specialized graphics formats. These can be broken down into the following 9 categories.

2-D Plots

PLOT Generate a plot of variables and/or functions.

ERROR BAR PLOT Generate a plot with error bars.

VECTOR PLOT Generate a vector plot (pairs of points connected with arrows).

3-D Plots

3D-PLOT Generate a 3-dimensional plot of variables and/or functions.

CONTOUR PLOT Generate a contour plot.

Distributional Plots

... HISTOGRAM Generate a histogram (absolute or relative frequencies, cumulative frequencies).

... BIHISTOGRAM Generate a bihistogram (absolute or relative frequencies).

... FREQUENCY PLOT Generate a frequency plot (absolute or relative frequencies, cumulative frequencies).

... ROOTOGRAM Generate a rootogram.

STEM AND LEAF PLOT Generate a stem and leaf plot.

PIE CHART Generate a pie chart.

... PROBABILITY PLOT Generate a probability plot (24 distributions).

... PPCC PLOT Generate a probability plot correlation coefficient plot (9 distributions).

NORMAL PLOT Generate a Normal plot (a variation of the normal probability plot).

BOX-COX NORMALITY PLOT Generate a Box-Cox normality plot.

BOX-COX LINEARITY PLOT Generate a Box-Cox linearity plot.

BOX-COX HOMO PLOT Generate a Box-Cox homoscedasticity plot.

PERCENT POINT PLOT Generate a percent point plot (also known as a quantile plot).

QUANTILE-QUANTILE PLOT Generate a quantile-quantile plot.

SYMMETRY PLOT Generate a symmetry plot.

4 PLOT Generate a 4 plot (run sequence plot, lag plot, histogram, normal probability plot on one page).

Time Series

RUN SEQUENCE PLOT Generate a run sequence plot.

LAG ... PLOT Generate a lag plot.

... CORRELATION PLOT Generate an auto-, cross-, or partial auto-correlation plot.

... SPECTRAL PLOT Generate various types of spectral plots.

... PERIODOGRAM Generate a periodogram plot.

AV PLOT Generate an Allan variance plot.

ASD PLOT Generate an Allan standard deviation plot.

COMPLEX DEMODULATION PLOT Generate a complex demodulation plot.

Multivariate Plots

ANDREWS PLOT Generate Andrews curves.

PROFILE PLOT Generate a profile plot.

STAR PLOT Generate a star plot.

SYMBOL PLOT Generate a symbol plot (plot character attributes such as size and color set by other variables).

Analysis of Variance, Design of Experiments

... BLOCK PLOT Generate a block plot.

BOX PLOT Generate a box plot.

DEX ... PLOT Generate design of experiment plots (15 different plot formats).

I PLOT Generate an I plot.

YOUDEN PLOT Generate a Youden plot.

Statistics Plots

ANOP PLOT Generate an analysis of proportions plot.

BOOTSTRAP ... STATISTIC PLOT Generate a bootstrap plot for a given statistic (for over 30 statistics).

JACKNIFE ... STATISTIC PLOT Generate a jackknife plot for a given statistic (for over 30 statistics).

HOMOSCEDASTICITY PLOT Generate a homoscedasticity plot.

PHASE PLANE DIAGRAM Generate a phase plane diagram.

FRACTAL PLOT Generate a fractal plot.

PARETO PLOT Generate a Pareto plot.

... STATISTIC PLOT Generate a statistic (versus data subset) plot (for over 30 statistics). These plots are documented individually as the AUTOCORRELATION STATISTIC PLOT, AUTOCOVARIANCE PLOT, COUNTS PLOT, CP PLOT, CPK PLOT, DECILE PLOT, EXPECTED LOSS PLOT, EXTREME PLOT, HINGE PLOT, KURTOSIS PLOT, LINEAR CORRELATION PLOT, LINEAR INTERCEPT PLOT, LINEAR RESSD PLOT, LINEAR SLOPE PLOT, MAXIMUM PLOT, MEAN PLOT, MEDIAN PLOT, MIDMEAN PLOT, MIDRANGE PLOT, MINIMUM PLOT, NORMAL PPCC PLOT, PERCENT DEFECTIVE PLOT, PRODUCT PLOT, QUARTILE PLOT, RANGE PLOT, RELATIVE STANDARD DEVIATION PLOT, RELATIVE VARIANCE PLOT, SINE AMPLITUDE PLOT, SINE FREQUENCY PLOT, SKEWNESS PLOT, STANDARD DEVIATION PLOT, STANDARD DEVIATION OF MEAN PLOT, SUM PLOT, TAGUCHI SN0, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00, TRIMMED MEAN PLOT, VARIANCE PLOT, VARIANCE OF THE MEAN PLOT, and WINDSORIZED MEAN PLOT.

6-PLOT Generate 6 plots useful for analyzing fits.

Reliability, Extreme Value Analysis

TAIL AREA PLOT Generate a tail area plot.

WEIBULL PLOT Generate a Weibull plot.

CME PLOT Generate a conditional mean exceedance plot.

Quality Control

XBAR CHART Generate a mean control chart.

R CHART Generate a range control chart.

S CHART Generate a standard deviation control chart.

C CONTROL CHART Generate a C control chart.

U CONTROL CHART Generate a U control chart.

P CONTROL CHART Generate a P control chart.

NP CONTROL CHART Generate an NP control chart.

Q ... CONTROL CHART Generate Quesenberry type control charts.

The ... in some of the commands indicates user-defined options for the command, as in

NORMAL PROBABILITY PLOT, UNIFORM PROBABILITY PLOT, etc.

AUTOCORRELATION PLOT, CROSS-CORRELATION PLOT

MEAN CONTROL CHART, RANGE CONTROL CHART, etc.

DATAPLOT provides a great deal of flexibility in controlling the elements of a plot. For example, each plot trace can be drawn as a line, a character, a bar, or a spike. The settings for each of these are independent of the others. The chapter on Plot Control documents the commands for controlling the plot features.

The flexibility in controlling the plot attributes and features means that it is possible to create specialized chart formats not listed above. For example, various types of bar charts can be created from the standard PLOT command by setting the LINE and BAR attributes appropriately. Check the various LINE, CHARACTER, BAR, and SPIKE attribute setting commands in the Plot Control chapter. It is also straightforward to change the default appearance of the supported charts with these attribute setting commands.

There are separate chapters that discuss topics such as available line styles and plot characters.

Multiple curves per plot

DATAPLOT can generate multiple curves per plot. For example,

LINES SOLID BLANK

CHARACTER BLANK O

PLOT Y1 Y2 VS X

PLOT Y1 VS X1 AND

PLOT Y2 VS X2

The first plot command draws two curves (Y1 and Y2) against a common x coordinate while the second PLOT command plots two curves with different x coordinates.

When drawing multiple curves, DATAPLOT uses the concept of ``traces.'' A trace is a connected set of points. Points in the same trace are plotted with the same attributes. In the above example, Y1 is trace 1 and Y2 is trace 2. This is used when setting the attributes for a curve. For example, Y1 is drawn as a solid line with no character while Y2 is drawn as an O with no connected line.

The attribute setting commands (LINE, CHARACTER, LINE COLOR, LINE THICKNESS, etc.) specify the attributes for up to 100 traces. When a plot is generated, the first trace uses the first entry from each of the attribute setting commands, the second trace uses the second entries, and so on.

The ability to define traces is also useful in creating specialized chart formats. For example, it is easy to create a curve that is solid, then dashed for a certain number of points, and then solid again. This is done by creating a ``tag'' variable and then entering a command like PLOT Y X TAG. The variable TAG identifies those points in Y and X which are plotted with common attributes. The documentation for the PLOT command discusses the use of tag variables in more detail.
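
The trace bookkeeping described above can be illustrated outside DATAPLOT. The following Python sketch is not DATAPLOT code; the function names are hypothetical, and it assumes that traces are ordered by ascending tag value, which the text above does not state. It groups points by tag and pairs the i-th trace with the i-th attribute setting, as for a solid, dashed, solid curve.

```python
def split_into_traces(y, x, tag):
    """Group (x, y) points into traces, one trace per distinct tag value.

    Assumption: traces are ordered by ascending tag value.
    """
    traces = []
    for value in sorted(set(tag)):
        points = [(xi, yi) for yi, xi, ti in zip(y, x, tag) if ti == value]
        traces.append(points)
    return traces


def assign_attributes(traces, settings):
    """Pair the first trace with the first setting, the second with the second, etc."""
    return list(zip(settings, traces))


# Eight points: the first three and last three solid, the middle two dashed.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v * v for v in x]
tag = [1, 1, 1, 2, 2, 3, 3, 3]

traces = split_into_traces(y, x, tag)
styled = assign_attributes(traces, ["SOLID", "DASH", "SOLID"])
for style, points in styled:
    print(style, points)
```

Each distinct tag value yields one trace, so changing the line style of the middle segment only requires changing the second attribute entry.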

Overlaying plots

By default, DATAPLOT erases the screen at the beginning of a plot. The PRE-ERASE OFF command suppresses this initial screen erase and can be used to overlay plots on the same frame. If you do this, be sure to use the XLIMITS and YLIMITS commands to set constant scales. If you do not know your data limits in advance, the LIMITS FREEZE command (entered after the first plot) can be used instead; this assumes that the scales set for the first plot contain the data for subsequent plots, which may or may not be true. You may also want to suppress certain plot elements for subsequent plots (e.g., FRAME OFF, TIC MARKS OFF, TIC LABELS OFF) to avoid redrawing significant portions of a plot. This step is not required; it just saves plotting time.

Multiple plots per page

DATAPLOT provides two methods for positioning multiple plots on a page. The MULTIPLOT command allows you to specify an arbitrary number of rows and columns. It splits the page up into these rows and columns and each subsequent command that generates a plot simply moves to the next row and column position on the page. There is no restriction on the type of plot command that you can use. See the MULTIPLOT command in the Plot Control chapter for details.

The WINDOW CORNER COORDINATES and the FRAME CORNER COORDINATES commands let you specify the portion of the page to use for the next plot. These can be used in conjunction with the PRE-ERASE OFF command to generate multiple plots per page. These commands are documented in the Plot Control chapter.

Using the WINDOW CORNER COORDINATES (or the FRAME CORNER COORDINATES) command is not as easy as using the MULTIPLOT command, but it is more flexible in that you can position the plot anywhere. This allows you to do some things that the MULTIPLOT command does not. For example, you can draw a small plot inside the frame of a regular-size plot.

Plotting data subsets

DATAPLOT allows data subsets to be plotted by using the keywords SUBSET, EXCEPT, or FOR. The SUBSET and EXCEPT keywords select subsets based on the values of one or more variables. The variable defining the subset need not be one of the variables being plotted. The FOR keyword selects subsets based on the row number (e.g., plot the first 10 rows or plot every fifth row). The use of these keywords is documented in detail in the Keywords chapter of this volume.
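
As an illustration of the row selection just described, here is a small Python sketch (not DATAPLOT code; the function names are hypothetical). It assumes that FOR I = start increment stop selects rows start, start+increment, and so on up to stop, which matches usages such as FOR I = 1 1 500 in the sample programs but is not spelled out in this chapter.

```python
def for_rows(n, start, increment, stop):
    """1-based row numbers start, start+increment, ... up to stop (at most n rows)."""
    return list(range(start, min(stop, n) + 1, increment))


def subset_rows(values, condition):
    """1-based row numbers whose value satisfies the condition (SUBSET)."""
    return [i for i, v in enumerate(values, start=1) if condition(v)]


def except_rows(values, condition):
    """1-based row numbers whose value does NOT satisfy the condition (EXCEPT)."""
    return [i for i, v in enumerate(values, start=1) if not condition(v)]


first_ten = for_rows(20, 1, 1, 10)       # first 10 rows
every_fifth = for_rows(20, 1, 5, 20)     # every fifth row
tag = [1, 3, 5, 2]
kept = subset_rows(tag, lambda v: v > 2)
dropped = except_rows(tag, lambda v: v > 2)
print(first_ten, every_fifth, kept, dropped)
```

The variable passed to subset_rows plays the role of the subsetting variable, which need not be one of the plotted variables.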

ALLAN STANDARD DEVIATION PLOT

PURPOSE

Generates an Allan standard deviation plot in order to examine the low-frequency component of a spectrum of an equi-spaced time series, and to estimate the exponent in a low-frequency power-law spectral model.

DESCRIPTION

The Allan standard deviation plot is a graphical data analysis technique for examining the low-frequency component of a time series. The horizontal axis is the subsample size (up to N/2). The vertical axis is the Allan standard deviation (ASD(K)), which is the standard deviation of the squared deltas as defined below. For subsample size 1:

delta1 = x(1)-x(2)

delta2 = x(3)-x(4)

delta3 = x(5)-x(6)

...

deltan = x(n-1)-x(n)

For subsample size 2:

delta1 = (x(1)+x(2))-(x(3)+x(4))

delta2 = (x(5)+x(6))-(x(7)+x(8))

...

For subsample size 3:

delta1 = (x(1)+x(2)+x(3))-(x(4)+x(5)+x(6))

delta2 = (x(7)+x(8)+x(9))-(x(10)+x(11)+x(12))

...
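
The delta construction above generalizes to any subsample size k: each delta is the sum of one block of k consecutive observations minus the sum of the next block of k. The Python sketch below (not DATAPLOT code; the function name is hypothetical) computes only these deltas. It does not reproduce how DATAPLOT turns them into the plotted ASD(K) value.

```python
def subsample_deltas(x, k):
    """Differences between sums of adjacent, non-overlapping blocks of size k."""
    deltas = []
    # Each delta consumes 2*k consecutive observations; leftovers are ignored.
    for start in range(0, len(x) - 2 * k + 1, 2 * k):
        first_block = sum(x[start:start + k])
        second_block = sum(x[start + k:start + 2 * k])
        deltas.append(first_block - second_block)
    return deltas


x = list(range(1, 13))                   # x(1), ..., x(12) = 1, ..., 12
print(subsample_deltas(x, 1))            # x(1)-x(2), x(3)-x(4), ...
print(subsample_deltas(x, 2))            # (x(1)+x(2))-(x(3)+x(4)), ...
print(subsample_deltas(x, 3))
```

With 12 observations this gives 6 deltas for subsample size 1, 3 for size 2, and 2 for size 3, which is why the horizontal axis cannot extend past N/2.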

The Allan standard deviation plot is usually viewed on a loglog scale. A common frequency domain model for the spectrum S(w) of a low-frequency time series is the power-law:

S(w) = w**a

There is a one-to-one correspondence between the slope of the loglog spectrum (the a) and the slope of the loglog Allan standard deviation plot:

Time Series Model    Slope of Loglog Spectrum    Slope of Loglog ASD Plot
                     a                           (-a-1)
--------------------------------------------------------------------------
Random Walk          -2                           1
Flicker              -1                           0
White Noise           0                          -1
Super Flicker         1                          -2
Super White           2                          -3

If one has a time series with a dominant low-frequency component, then the Allan standard deviation plot is a useful tool for assessing the nature of the low-frequency component and for estimating the power (a) of the power-law spectral model. The slope of the Allan standard deviation plot indicates the nature of the underlying time series model.
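
As a worked illustration of the slope correspondence, the Python sketch below (not DATAPLOT code; the function names are hypothetical) fits a least-squares slope to loglog values and inverts the tabulated relation slope = (-a-1) to recover the exponent a. The least-squares fit is an assumption made for illustration; in practice the slope is often judged by eye from the plot.

```python
import math


def loglog_slope(k_values, stat_values):
    """Least-squares slope of log10(stat) versus log10(k)."""
    xs = [math.log10(k) for k in k_values]
    ys = [math.log10(s) for s in stat_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return sxy / sxx


def exponent_from_asd_slope(slope):
    """Invert slope = (-a-1) from the table to obtain the exponent a."""
    return -slope - 1


# Idealized values that fall exactly on a line of slope -1 (white noise row).
k = [1, 2, 4, 8]
asd = [1.0 / v for v in k]
slope = loglog_slope(k, asd)
print(slope, exponent_from_asd_slope(slope))
```

A fitted slope near 1 would instead point to the random walk row of the table (a = -2), and a slope near 0 to flicker noise (a = -1).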

The response variable must have at least 3 elements.

SYNTAX

ALLAN STANDARD DEVIATION PLOT <y1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

ALLAN STANDARD DEVIATION PLOT Y
ASD PLOT Y

NOTE 1

The Allan variance plot and the Allan standard deviation plot have equivalent information content (and differ only by a factor of 2). The Allan variance plot is more heavily used than the Allan standard deviation plot.

DEFAULT

None

SYNONYMS

ASD PLOT

RELATED COMMANDS

SPECTRAL PLOT = Generates a spectral plot.
ALLAN VARIANCE PLOT = Generates an Allan variance plot.

REFERENCE

Dave Allan, NIST in Boulder

APPLICATIONS

Frequency Time Series Analysis

IMPLEMENTATION DATE

87/1

PROGRAM 1

. THIS IS AN EXAMPLE OF AN ALLAN SD PLOT
. FOR WHITE NOISE DATA S(W) = W**0
. (THUS THE LOGLOG SPECTRUM HAS SLOPE 0
. AND THE ALLAN SD PLOT HAS SLOPE (-(0)-1) = -1)
LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 500
TITLE WHITE NOISE
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON; X3LABEL AUTOMATIC
PLOT Y; SPECTRUM Y
LOGLOG; SPECTRUM Y; ALLAN STANDARD DEVIATION PLOT Y
END OF MULTIPLOT

PROGRAM 2

. THIS IS AN EXAMPLE OF AN ALLAN SD PLOT
. FOR RANDOM WALK DATA S(W) = W**(-2)
. (THUS THE LOGLOG SPECTRUM HAS SLOPE -2
. AND THE ALLAN SD PLOT HAS SLOPE (-(-2)-1) = 1)
SKIP 25
READ RANDWALK.DAT Y
TITLE RANDOM WALK
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON
X3LABEL AUTOMATIC
PLOT Y
SPECTRUM Y
LOGLOG
SPECTRUM Y
ALLAN STANDARD DEVIATION PLOT Y
END OF MULTIPLOT

PROGRAM 3

. THIS IS AN EXAMPLE OF AN ALLAN SD PLOT
. FOR FLICKER NOISE DATA S(W) = W**(-1)
. (THUS THE LOGLOG SPECTRUM HAS SLOPE -1
. AND THE ALLAN SD PLOT HAS SLOPE (-(-1)-1) = 0)
SKIP 25
READ FLICKER.DAT Y
TITLE FLICKER DATA
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON
X3LABEL AUTOMATIC
PLOT Y
SPECTRUM Y
LOGLOG
SPECTRUM Y
ALLAN STANDARD DEVIATION PLOT Y
END OF MULTIPLOT

ALLAN VARIANCE PLOT

PURPOSE

Generates an Allan variance plot in order to examine the low-frequency component of a spectrum of an equi-spaced time series, and to estimate the exponent in a low-frequency power-law spectral model.

DESCRIPTION

The Allan variance plot is a graphical data analysis technique for examining the low-frequency component of a time series. The horizontal axis is the subsample size (up to N/2). The vertical axis is the Allan variance (AV(K)), which is the variance of the squared deltas as defined below. For subsample size 1:

delta1 = x(1)-x(2)

delta2 = x(3)-x(4)

delta3 = x(5)-x(6)

...

deltan = x(n-1)-x(n)

For subsample size 2:

delta1 = (x(1)+x(2))-(x(3)+x(4))

delta2 = (x(5)+x(6))-(x(7)+x(8))

...

For subsample size 3:

delta1 = (x(1)+x(2)+x(3))-(x(4)+x(5)+x(6))

delta2 = (x(7)+x(8)+x(9))-(x(10)+x(11)+x(12))

...


The Allan variance plot is usually viewed on a loglog scale. A common frequency domain model for the spectrum S(w) of a low-frequency time series is the power-law:

S(w) = w**a

There is a one-to-one correspondence between the slope of the loglog spectrum (the a) and the slope of the loglog Allan variance plot:

Time Series Model    Slope of Loglog Spectrum    Slope of Loglog AV Plot
                     a                           (-a-1)/2
-------------------------------------------------------------------------
Random Walk          -2                           0.5
Flicker              -1                           0
White Noise           0                          -0.5
Super Flicker         1                          -1
Super White           2                          -1.5

If one has a time series with a dominant low-frequency component, then the Allan variance plot is a useful tool for assessing the nature of the low-frequency component and for estimating the power (a) of the power-law spectral model. The slope of the Allan variance plot indicates the nature of the underlying time series model.

The response variable must have at least 3 elements.

SYNTAX

ALLAN VARIANCE PLOT <y1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

ALLAN VARIANCE PLOT Y

AV PLOT Y

NOTE 1

The Allan variance plot and the Allan standard deviation plot have equivalent information content (and differ only by a factor of 2). The Allan variance plot is more heavily used than the Allan standard deviation plot.

DEFAULT

None

SYNONYMS

AV PLOT

RELATED COMMANDS

SPECTRAL PLOT = Generates a spectral plot.
ALLAN STANDARD DEVIATION PLOT = Generates an Allan standard deviation plot.

REFERENCE

Dave Allan, NIST in Boulder

APPLICATIONS

Frequency Time Series Analysis

IMPLEMENTATION DATE

87/1

PROGRAM 1

. THIS IS AN EXAMPLE OF AN ALLAN VARIANCE PLOT
. FOR WHITE NOISE DATA S(W) = W**0
. (THUS THE LOGLOG SPECTRUM HAS SLOPE 0
. AND THE ALLAN VARIANCE PLOT HAS SLOPE (-(0)-1)/2 = -1/2)
LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 500
TITLE WHITE NOISE
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON; X3LABEL AUTOMATIC
PLOT Y; SPECTRUM Y
LOGLOG; SPECTRUM Y; ALLAN VARIANCE PLOT Y
END OF MULTIPLOT

PROGRAM 2

. THIS IS AN EXAMPLE OF AN ALLAN VARIANCE PLOT
. FOR RANDOM WALK DATA S(W) = W**(-2)
. (THUS THE LOGLOG SPECTRUM HAS SLOPE -2
. AND THE ALLAN VARIANCE PLOT HAS SLOPE (-(-2)-1)/2 = 1/2)
SKIP 25; READ RANDWALK.DAT Y
TITLE RANDOM WALK
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON; X3LABEL AUTOMATIC
PLOT Y; SPECTRUM Y
LOGLOG; SPECTRUM Y; ALLAN VARIANCE PLOT Y
END OF MULTIPLOT

PROGRAM 3

. THIS IS AN EXAMPLE OF AN ALLAN VARIANCE PLOT
. FOR FLICKER NOISE DATA S(W) = W**(-1)
. (THUS THE LOGLOG SPECTRUM HAS SLOPE -1
. AND THE ALLAN VARIANCE PLOT HAS SLOPE (-(-1)-1)/2 = 0)
SKIP 25; READ FLICKER.DAT Y
TITLE FLICKER DATA
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
GRID ON; X3LABEL AUTOMATIC
PLOT Y; SPECTRUM Y
LOGLOG; SPECTRUM Y; ALLAN VARIANCE PLOT Y
END OF MULTIPLOT

ANDREWS PLOT

PURPOSE

Generates an Andrews plot.

DESCRIPTION

An Andrews plot is a graphical data analysis technique for plotting multivariate data. An Andrews curve applies the following transformation to a set of data:

f(t) = X1/SQRT(2) + X2*SIN(t) + X3*COS(t) + X4*SIN(2t) + X5*COS(2t) + ...

where t goes from -pi to pi and X1, X2, etc. are the columns (i.e., variables) of data. One Andrews curve is generated for each row of data. As usual, the LINE, LINE COLOR, and LINE THICKNESS commands can be used to control the attributes of the curves. Andrews curves are used to distinguish which observations (i.e., rows) are most alike.
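
For readers who want to check the transformation numerically, the following Python sketch (not DATAPLOT code; the function names are hypothetical) evaluates an Andrews curve in its standard form, f(t) = X1/sqrt(2) + X2 sin(t) + X3 cos(t) + X4 sin(2t) + X5 cos(2t) + ..., for one row of data. The grid of t values assumes steps of 0.1 starting at -pi; how DATAPLOT treats the end point is not stated here.

```python
import math


def andrews_curve(row, t):
    """Evaluate the Andrews curve of one observation (row) at the point t."""
    value = row[0] / math.sqrt(2)
    for j, xj in enumerate(row[1:], start=1):
        harmonic = (j + 1) // 2            # 1, 1, 2, 2, 3, 3, ...
        if j % 2 == 1:
            value += xj * math.sin(harmonic * t)   # X2, X4, X6, ...
        else:
            value += xj * math.cos(harmonic * t)   # X3, X5, X7, ...
    return value


def t_grid(increment=0.1):
    """Points from -pi upward in steps of `increment`, not exceeding pi."""
    count = int(2 * math.pi / increment) + 1
    return [-math.pi + i * increment for i in range(count)]


row = [2.0, 1.0, 0.5, 0.25, 0.1]           # one observation with 5 variables
curve = [andrews_curve(row, t) for t in t_grid()]
print(len(curve), curve[0])
```

Because the first variables multiply the lowest-frequency terms, they dominate the overall shape of each curve, which is the order dependence that analysts need to keep in mind.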

SYNTAX

ANDREWS PLOT <y1> <y2> ... <yk> <SUBSET/EXCEPT/FOR qualification>
where <y1> through <yk> are the response variables;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

ANDREWS PLOT Y1 Y2 Y3 Y4 Y5
ANDREWS PLOT Y1 Y2 Y3 Y4 Y5 SUBSET TAG > 2

NOTE 1

The increment for t in the transformation can be set with the ANDREWS INCREMENT command. It defaults to 0.1.

NOTE 2

Andrews curves are order dependent. The first few variables tend to dominate, so it is a good idea to put the most important variables first. Some analysts recommend running a principal components analysis first and generating Andrews curves for the principal components.

A related plot which is not order dependent is the parallel coordinates plot. The plot is divided into a series of parallel axes (one for each variable). An observation is then drawn by plotting its value on each axis and connecting the values between adjacent axes with a line. Although DATAPLOT does not generate this plot directly, a macro to generate one is demonstrated in program example 2.

With both of these plots, individual cases can be difficult to follow. It can sometimes help to draw the plot for all cases and then draw the plot over various subsets of interest (with the subsets having a limited number of cases).

NOTE 3

Up to 20 variables can be used.

NOTE 4

The TO syntax is allowed on this command. For example:

ANDREWS PLOT Y1 TO Y10

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets type for plot lines.
LINE COLOR = Sets color for plot lines.
LINE THICKNESS = Sets thickness for plot lines.
PLOT = Generates a data or function plot.
STAR PLOT = Generates a star plot.
PROFILE PLOT = Generates a profile plot.
ANDREWS INCREMENT = Specify the x axis increment when generating Andrews curves.

REFERENCE

``Graphical Exploratory Data Analysis,'' du Toit, Steyn, and Stumpf, Springer-Verlag, 1986.

``Hyperdimensional Data Analysis Using Parallel Coordinates,'' E. J. Wegman, Journal of the American Statistical Association, 85, 664-675.

APPLICATION

Multivariate Analysis

IMPLEMENTATION DATE

92/12

PROGRAM 1

ROW LIMITS 26 50
COLUMN LIMITS 20 132
READ AUTO79.DAT Y1 TO Y9
MULTIPLOT 2 3; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC; TITLE SIZE 3.0
TIC LABEL SIZE 3
XLIMITS -3 3; XTIC OFFSET 0.2 0.2; MAJOR XTIC MARK NUMBER 7
YLIMITS 0 15000; YTIC OFFSET 1000 1000; MAJOR YTIC MARK NUMBER 18
ANDREWS PLOT Y1 TO Y9
LINES SOLID DASH DOT SOLID DASH DOT
LINE THICKNESS 0.1 0.1 0.1 0.3 0.3 0.3
ANDREWS PLOT Y1 TO Y9 FOR I = 1 1 5
ANDREWS PLOT Y1 TO Y9 FOR I = 6 1 10
ANDREWS PLOT Y1 TO Y9 FOR I = 11 1 15
ANDREWS PLOT Y1 TO Y9 FOR I = 16 1 20
ANDREWS PLOT Y1 TO Y9 FOR I = 21 1 25
END OF MULTIPLOT

PROGRAM 2

DIMENSION 20 COLUMNS
LET P = 9
ROW LIMITS 26 50; COLUMN LIMITS 20 132
READ AUTO79.DAT X1 TO X^P
.
LET N = SIZE X1; LET 2N = 2*N; LET TEMP = N + 1
LET Y = 0 FOR I = 1 1 N; LET Y = 1 FOR I = TEMP 1 2N
LET TAG = SEQUENCE 1 1 N FOR I = 1 1 2N
.
LOOP FOR K = 1 1 P
LET M = MEAN X^K; LET SD = STANDARD DEVIATION X^K
LET X^K = (X^K - M)/SD
END OF LOOP
.
LET TEMP = P - 1
MULTIPLOT TEMP 1; MULTIPLOT CORNER COORDINATES 5 5 95 95
FRAME CORNER COORDINATES 5 0 95 100; YFRAME OFF
TIC LABELS OFF; TIC MARKS OFF
YLIMITS 0 1; XLIMITS -3.5 3.5
LEGEND 1 COORDINATES 4.5 98; LEGEND JUSTIFICATION RIGHT; LEGEND SIZE 12
.
LOOP FOR K = P -1 2
LET A = K - 1; LET B = K; LET X = X^A
EXTEND X X^B; LEGEND 1 X^B
PLOT Y X TAG
END OF LOOP
JUSTIFICATION RIGHT; MOVE 4.5 1.5; HEIGHT 12; TEXT X1
END OF MULTIPLOT
HEIGHT; JUSTIFICATION CENTER; MOVE 50 97; TEXT PARALLEL COORDINATES PLOT

ANOP PLOT

PURPOSE

Generates an analysis of proportions plot.

DESCRIPTION

An analysis of proportions plot performs a graphical analysis of proportions. In an analysis of proportions, the values that a response variable can have are divided into two mutually exclusive groups (commonly called ``successes'' and ``failures''). The response variable is generated for various levels of a factor variable (i.e., the factor variable identifies groups). The analysis of proportions plot is formed as follows:

Horizontal axis = distinct values (i.e., levels) of the factor variable;
Vertical axis = for each level of the factor variable, the proportion of the response variable falling within user-defined limits (i.e., the proportion of successes).

In addition, a horizontal line is drawn representing the proportion of successes for the entire response variable.
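
The quantities plotted can be sketched in a few lines of Python (not DATAPLOT code; the function name is hypothetical). The sketch assumes that the limits are inclusive at both ends and reports proportions on a 0 to 100 scale; the data values are made up for illustration.

```python
def anop_points(y, tag, lower, upper):
    """Percentage of successes per factor level, plus the overall percentage.

    Assumption: a value counts as a success when lower <= value <= upper.
    """
    def percent_success(values):
        hits = sum(1 for v in values if lower <= v <= upper)
        return 100.0 * hits / len(values)

    per_level = []
    for level in sorted(set(tag)):
        group = [v for v, t in zip(y, tag) if t == level]
        per_level.append((level, percent_success(group)))
    overall = percent_success(y)
    return per_level, overall


y = [1.0, 1.02, 0.995, 0.98, 1.005, 1.0]    # illustrative measurements
tag = [1, 1, 2, 2, 3, 3]                    # factor levels
per_level, overall = anop_points(y, tag, 0.99, 1.01)
print(per_level, overall)
```

The per-level pairs correspond to the plotted points, and the overall value corresponds to the horizontal reference line.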

SYNTAX

ANOP PLOT <y1> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response variable;

<tag> is a factor variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

ANOP PLOT Y1 TAG
ANOP PLOT Y1 TAG SUBSET TAG > 3

NOTE 1

The ANOP LIMITS command is used to define the lower and upper limits for calculating the proportion.

NOTE 2

The proportion is plotted as a percentage (i.e., 0 to 100 scale rather than 0 to 1 scale).

NOTE 3

By default, the proportions are drawn as a connected line segment. Some users may prefer to draw them as distinct points. This is demonstrated in the PROGRAM section below.

NOTE 4

The ANOP PLOT only supports one factor variable at this time.

DEFAULT

None

SYNONYMS

PROPORTION PLOT

RELATED COMMANDS

ANOP LIMITS = Sets the limits for calculating the proportion.
LINE = Sets the line types.
CHARACTER = Sets the plot characters.
PLOT = Generates a data or function plot.

APPLICATION

Analysis of Proportions

IMPLEMENTATION DATE

87/6

PROGRAM

SKIP 25
READ GEAR.DAT Y1 TAG
ANOP LIMITS 0.99 1.01
CHARACTER CIRCLE BLANK
LINE BLANK SOLID
X1LABEL LEVELS OF FACTOR VARIABLE
Y1LABEL PERCENTAGE OF SUCCESSES
XTIC OFFSET 0.5 0.5
YTIC OFFSET 0.5 0.5
ANOP PLOT Y1 TAG

AUTOCORRELATION STATISTIC PLOT

PURPOSE

Generates a subsample autocorrelation versus subsample index plot.

DESCRIPTION

The subsample autocorrelation is the autocorrelation of the data in the subsample. This plot is used to answer the question: ``Does the subsample autocorrelation change over different subsamples?'' The plot consists of:

Vertical axis = subsample autocorrelation;
Horizontal axis = subsample index.

In addition, a horizontal line is drawn representing the full sample autocorrelation. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
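
The two traces can be illustrated with a short Python sketch (not DATAPLOT code; the function names are hypothetical). It assumes that "autocorrelation" means the lag 1 autocorrelation computed about the subsample mean; the description above does not state the lag, so treat that as an assumption.

```python
def lag1_autocorrelation(x):
    """Lag 1 autocorrelation about the mean (undefined for constant data)."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    numerator = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return numerator / denominator


def autocorrelation_by_subsample(y, group):
    """(subsample id, autocorrelation) pairs plus the full-sample value."""
    per_group = []
    for g in sorted(set(group)):
        values = [v for v, gi in zip(y, group) if gi == g]
        per_group.append((g, lag1_autocorrelation(values)))
    return per_group, lag1_autocorrelation(y)


y = [1, 2, 3, 4, 1, -1, 1, -1]             # trending block, then alternating block
group = [1, 1, 1, 1, 2, 2, 2, 2]
per_group, full_sample = autocorrelation_by_subsample(y, group)
print(per_group, full_sample)
```

Here the first subsample trends upward (positive autocorrelation) while the second alternates in sign (negative autocorrelation), which is the kind of change across subsamples the plot is meant to reveal.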

SYNTAX

AUTOCORRELATION STATISTICS PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;

<x> is a subsample identifier variable (this variable appears on the horizontal axis);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

AUTOCORRELATION STATISTICS PLOT Y X
AUTO STAT PLOT Y X1

NOTE

AUTOCORRELATION PLOT is a distinct command (see the documentation for the CORRELATION PLOT command for details), so the word STATISTICS is required.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the character symbols for plot characters.
LINES = Sets the line types for plot lines.
MEAN PLOT = Generates a mean versus subset plot.
STANDARD DEVIATION PLOT = Generates a standard deviation versus subset plot.
CORRELATION PLOT = Generates an auto-, cross-, or partial auto-correlation plot.
AUTOCORRELATION = Computes the autocorrelation of a variable.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ SUNSPOT.DAT Y MONTH
CHARACTER CIRCLE BLANK
LINE BLANK SOLID
XLIMITS 1 12
XTIC OFFSET 0.5 0.5
X1TIC MARK LABEL FORMAT ALPHA
X1TIC MARK LABEL CONTENTS JAN FEB MARCH APRIL MAY JUNE JULY AUG SEP ...
OCT NOV DEC
MINOR XTIC MARK NUMBER 0
Y1LABEL AUTOCORRELATION
TITLE AUTOMATIC
AUTOCORRELATION STAT PLOT Y MONTH

AUTOCOVARIANCE PLOT

PURPOSE

Generates a subsample autocovariance versus subsample index plot.

DESCRIPTION

The subsample autocovariance is the autocovariance of the data in the subsample. This plot is used to answer the question: ``Does the subsample autocovariance change over different subsamples?'' The plot consists of:

Vertical axis = subsample autocovariance;
Horizontal axis = subsample index.

In addition, a horizontal line is drawn representing the full sample autocovariance. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

AUTOCOVARIANCE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response (i.e., dependent) variable;
<x> is a subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

AUTOCOVARIANCE PLOT Y X
AUTOCOVARIANCE PLOT Y X1 SUBSET X1 > 5

DEFAULT

None

SYNONYMS

AUTOCOVARIANCE STATISTIC PLOT

RELATED COMMANDS

CHARACTER = Sets the types for plot characters.
LINE = Sets the types for plot lines.
AUTOCOVARIANCE = Computes the autocovariance of a variable.
AUTOCORRELATION STAT PLOT = Generates an autocorrelation versus subset plot.
MEAN PLOT = Generates a mean versus subset plot.
STANDARD DEVIATION PLOT = Generates a standard deviation versus subset plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ SUNSPOT.DAT Y MONTH
CHARACTER CIRCLE BLANK
LINE BLANK SOLID
XLIMITS 1 12
XTIC OFFSET 0.5 0.5
X1TIC MARK LABEL FORMAT ALPHA
X1TIC MARK LABEL CONTENTS JAN FEB MARCH APRIL MAY JUNE JULY AUG ...
SEP OCT NOV DEC
MINOR XTIC MARK NUMBER 0
Y1LABEL AUTOCOVARIANCE
AUTOCOVARIANCE STAT PLOT Y MONTH

BAR PLOT

NOTE

This command is obsolete. Although it still works, the preferred method is to use the BAR command in conjunction with the PLOT command. This is documented under the BAR command in the Plot Control chapter.

PURPOSE

Generate a bar chart.

DESCRIPTION

Bar charts are commonly used in business and presentation graphics. The following types of bar charts are commonly produced:

1. Standard bar charts;
2. Grouped bar charts (bars are drawn for 2 or more groups of data);
3. Stacked (or divided) bar charts (the bar is divided into several intervals).

The BAR PLOT command is only useful for generating standard bar charts (see the NOTE section below for an explanation).

DATAPLOT provides commands to control the various aspects of the bar (see the RELATED COMMANDS section). Many of these are also demonstrated with the sample programs. The BAR PLOT command treats observations with the same X axis value as a common trace when setting attributes.

SYNTAX

BAR PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;

<x> is a group identifier variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BAR PLOT Y X
BAR PLOT Y X SUBSET X > 2

NOTE

The BAR PLOT command and the PLOT command using the BAR switch define traces differently. For example, BAR PLOT Y X treats each distinct value of X as defining a separate trace. PLOT Y X treats Y as one trace. PLOT Y1 Y2 VS X treats Y1 as one trace and Y2 as another trace. PLOT Y X TAG treats each distinct value of TAG as defining a separate trace. This distinction between PLOT and BAR PLOT is relevant when assigning attributes to the bars (e.g., when using the BAR PATTERN command) since these commands use traces. That is, the first setting is applied to the first trace, the second setting is applied to the second trace, and so on.

Creating grouped and stacked bars is easier with the PLOT command used in conjunction with the BAR switch. The problem with grouped bars using the BAR PLOT command is that the BAR PLOT essentially functions as a histogram with a fixed width class interval. Setting up the data for groups typically violates this assumption. It can be done by using a separate BAR PLOT command for each group (and using the LIMITS and PRE-ERASE OFF commands). However, it is simpler with the standard PLOT command. The problem with divided bar charts is that it is difficult to give corresponding bars the same fill pattern (the BAR PLOT command gives bars with the same X value the same pattern).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

PLOT = Generates a data or function plot.
CHARACTERS = Sets the character type for plot points.
LINES = Sets the line type for plot points.
SPIKES = Sets the on/off switches for plot spikes.
BAR = Sets the on/off switches for plot bars.
BAR BASE = Sets the base location for bars on plots
BAR FILL = Sets the on/off switches for plot bar fills.
BAR FILL COLOR = Sets the colors of the bar fills.
BAR DIMENSION = Sets the bar dimension to 2d or 3d.
BAR DIRECTION = Sets the bar direction to horizontal or vertical.
BAR PATTERN = Sets the types for bar fill patterns.
BAR PATTERN COLOR = Sets the colors for bar fill patterns.
BAR PATTERN LINE = Sets the line types for bar fill patterns.
BAR PATTERN SPACING = Sets the line spacings for bar fill patterns.
BAR PATTERN THICK = Sets the line thicknesses for bar fill patterns.
BAR BORDER COLOR = Sets the colors for bar border lines.
BAR BORDER LINE = Sets the types for bar border lines.
BAR BORDER THICKNESS = Sets the line thicknesses for bar border lines.
BAR WIDTH = Sets the width of plot bars.

REFERENCE

``Statistical Graphics,'' Calvin Schmid, John F. Wiley and Sons, 1979.

APPLICATIONS

Presentation Graphics

IMPLEMENTATION DATE

Pre-1987

PROGRAM

ORIENTATION PORTRAIT
LET X = DATA 81 82 83 84 85
LET Y = DATA 2 5 9 15 28
.
YTIC MARK SIZE 1.2
X1TIC MARK LABEL FORMAT ALPHA
XLIMITS 81 85
XTIC OFFSET 1 1
X1TIC LABEL CONTENT 1981 1982 1983 1984 1985
X1LABEL YEAR
MINOR X1TIC MARK NUMBER 0
Y1LABEL SALES (IN MILLIONS OF DOLLARS)
YLIMITS 0 30
MAJOR YTIC MARK NUMBER 4
MINOR YTIC MARK NUMBER 1
.
MULTIPLOT 3 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
.
TITLE BAR CHART WITH NO OPTIONS
BAR PLOT Y X
.
BAR WIDTH .5 ALL
TITLE BAR CHART WITH USER DEFINED BAR WIDTH
BAR PLOT Y X
.
BAR DIMENSION 3 ALL
TITLE BAR CHART WITH 3-DIMENSIONAL EFFECT
BAR PLOT Y X
.
BAR FILL ON ALL
TITLE BAR CHART WITH 3-DIMENSIONAL EFFECT, FILLED
BAR PLOT Y X
.
TITLE DEMONSTRATE A FILL PATTERN
BAR DIMENSION 2 ALL
BAR FILL ON ALL
BAR PATTERN HORI VERT D1 D2 D1D2
BAR PATTERN SPACING 1 1 4 4 4
BAR PATTERN THICKNESS 0.1 ALL
BAR PLOT Y X
.
HORIZONTAL SWITCH ON
Y1TIC MARK LABEL FORMAT ALPHA
YLIMITS 81 85
YTIC OFFSET 1 1
Y1TIC LABEL CONTENT 1981 1982 1983 1984 1985
Y1LABEL YEAR
MAJOR Y1TIC MARK NUMBER 5
MINOR Y1TIC MARK NUMBER 0
X1LABEL SALES (IN MILLIONS OF DOLLARS)
XTIC OFFSET 0 0
X1TIC MARK LABEL FORMAT DEFAULT
MINOR X1TIC MARK NUMBER DEFAULT
XLIMITS 0 30
MAJOR XTIC MARK NUMBER 4
MINOR XTIC MARK NUMBER 1
TITLE HORIZONTAL BAR CHART
BAR PATTERN SOLID ALL
BAR FILL COLOR G15 G30 G45 G60 G75
BAR PLOT Y X
.
MULTIPLOT OFF

... BIHISTOGRAM

PURPOSE

Generates a bihistogram.

DESCRIPTION

The bihistogram is a graphical data analysis technique for summarizing and comparing the distributions of 2 data sets. It is a graphical alternative for the various classical 2-sample tests (e.g., t for location, F for dispersion). Frequencies (or relative frequencies) are plotted on the vertical axis while the response variable is plotted on the horizontal axis.

There are 2 types of bihistograms:

1. bihistogram (absolute frequencies are plotted);
2. relative bihistogram (relative frequencies are plotted).

The (relative) bihistogram is a plot consisting of 2 (relative) histograms. The (relative) histogram for data set 1 is positioned above the zero-line while the (relative) histogram for data set 2 is positioned below the zero-line. The advantage of the bihistogram is 2-fold:

2. many distributional aspects may be simultaneously tested--shifts in location, shifts in dispersion, changes in symmetry/skewness, outliers, etc.
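
The construction described above can be sketched outside DATAPLOT. The following Python fragment is only an illustration, not DATAPLOT's implementation: the function name bihistogram_counts and its arguments are hypothetical, and class_width and class_lower merely play the role of the CLASS WIDTH and CLASS LOWER settings. Both data sets are binned into the same classes, and the frequencies for the second data set are negated so that they fall below the zero-line.

```python
from collections import Counter


def bihistogram_counts(y1, y2, class_width, class_lower, relative=False):
    """Bin two samples into common classes; the second sample is negated."""

    def class_index(value):
        # Index of the class containing value, counted from class_lower.
        return int((value - class_lower) // class_width)

    counts1 = Counter(class_index(v) for v in y1)
    counts2 = Counter(class_index(v) for v in y2)
    first = min(min(counts1), min(counts2))
    last = max(max(counts1), max(counts2))
    # Relative frequencies divide each count by its own sample size.
    scale1 = len(y1) if relative else 1
    scale2 = len(y2) if relative else 1
    edges, above, below = [], [], []
    for k in range(first, last + 1):
        edges.append(class_lower + k * class_width)
        above.append(counts1.get(k, 0) / scale1)   # plotted above the zero-line
        below.append(-counts2.get(k, 0) / scale2)  # plotted below the zero-line
    return edges, above, below


edges, above, below = bihistogram_counts([1, 2, 2, 3], [2, 3, 3, 4, 4], 1.0, 0.0)
```

A shift in location between the two data sets shows up as the bars above and below the zero-line being offset horizontally from each other.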

SYNTAX 1

BIHISTOGRAM <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;

<y2> is the second response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

RELATIVE BIHISTOGRAM <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;

<y2> is the second response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BIHISTOGRAM Y1 Y2
RELATIVE BIHISTOGRAM Y1 Y2
BIHISTOGRAM Y1 Y2 SUBSET AUTO 4
BIHISTOGRAM Y1 Y2 SUBSET STATE < 25

NOTE 1

The bihistogram is automatically plotted with the bar switch ON. The CHARACTERS and LINES command settings are ignored. The appearance of the bars (e.g., solid filled or filled with a cross-hatch pattern) can be set with the various BAR attribute setting commands. See the example program for the HISTOGRAM command for some examples.

NOTE 2

As with a standard histogram, the class width and the upper and lower class limits can be controlled with the CLASS WIDTH, CLASS LOWER, and CLASS UPPER commands.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

HISTOGRAM = Generates a histogram.
QUANTILE-QUANTILE PLOT = Generates a quantile-quantile plot.
BOX PLOT = Generates a box plot.
YOUDEN PLOT = Generates a Youden plot.
T-TEST = Carries out a 2 sample t test.
ANOVA = Carries out an ANOVA.
PLOT = Generates a data or function plot.
MULTIPLOT = Allows multiple plots per page.

APPLICATION

Exploratory Data Analysis

IMPLEMENTATION DATE

88/9

PROGRAM

SKIP 25
READ AUTO83B.DAT Y1 Y2
.
LEGEND 1 COMPARING 2 DISTRIBUTIONS
LEGEND 2 BIHISTOGRAM
.
DELETE Y2 SUBSET Y2 < 0
BIHISTOGRAM Y1 Y2

... BLOCK PLOT

PURPOSE

Generates a block plot.

DESCRIPTION

A block plot is a graphical method for representing an analysis of variance problem. The first variable is a response variable while the remaining variables (there must be at least two) represent levels of factors. These levels are typically coded as indices (e.g., 1 for process A, 2 for process B). The <x1> ... <xn> sequence of variables defines where and how many blocks are drawn. If n1, n2, ..., nk represent the number of levels of these variables, there will be n1*n2*...*nk blocks. For example, if n1=2, n2=2, and n3=3, there are 12 blocks along the X axis (the first example program demonstrates this layout).

The groups of block plots are centered around the numeric values for the levels of the x1 variable. Within each block, the levels of the <char> variable are plotted as distinct traces at the values of the corresponding response variable. The levels of <char> are identified by using the CHARACTER command (e.g., CHAR 1 2 3; LINE BL BL BL). A box is drawn around the <char> levels for each unique combination of factor levels (this is where the term block plot comes from). The command BAR EXPANSION controls the height and width of the boxes.

SYNTAX 1

BLOCK PLOT <y> <x1> ... <xn> <char> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable in an analysis of variance problem;
<x1> ... <xn> is a sequence of factor variables that define the X axis (there must be at least one, and typically will be between one and three);
<char> represents the levels of an additional factor variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax case is used for the no replication case or the case when the replicates are averaged into a single value. Although it can also be used for the replication case, the second syntax is more often used with replication.

SYNTAX 2

<stat> BLOCK PLOT <y> <x1> .. <xn> <char> <SUBSET/EXCEPT/FOR qualification>
where <stat> is one of the following statistics:

MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
NUMBER, SUM, PRODUCT, MINIMUM, MAXIMUM,
SD, VARIANCE, RANGE, RELATIVE STANDARD DEVIATION, MIDRANGE,
AVERAGE ABSOLUTE DEVIATION (AAD), MEDIAN ABSOLUTE DEVIATION (MAD),
VARIANCE OF MEAN, STANDARD DEVIATION OF MEAN,
LOWER QUARTILE, UPPER QUARTILE, LOWER HINGE, UPPER HINGE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE,
SKEWNESS, KURTOSIS,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
TAGUCHI SN0, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
<y> is the response variable in an analysis of variance problem;
<x1> ... <xn> is a sequence of factor variables that define the X axis (there must be at least one, and typically will be between one and three);
<char> represents the levels of an additional factor variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax can be used when there is replication at each of the combinations of factor levels. The requested statistic is calculated for all the response values with the same levels of the factor variables. The <char> variable is plotted at the computed statistic on the vertical axis. MEAN BLOCK PLOT is the most commonly used.
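
As an illustration of what the <stat> form computes, the following Python sketch groups the response values by each combination of factor levels and takes the mean of each group. It is not DATAPLOT code; the function name block_means and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def block_means(y, factors, char):
    """Mean response for every (factor levels, char level) combination.

    factors is a list of equal-length columns (x1 ... xn); char is the
    last variable on the command, plotted as a character within each block.
    """
    groups = defaultdict(list)
    for i, value in enumerate(y):
        block = tuple(column[i] for column in factors)  # identifies one block
        groups[(block, char[i])].append(value)
    return {key: mean(values) for key, values in groups.items()}


# Two replicates at each of 2 x 2 combinations (made-up numbers).
y = [10, 12, 20, 22, 30, 34, 40, 44]
x1 = [1, 1, 1, 1, 2, 2, 2, 2]
proc = [1, 1, 2, 2, 1, 1, 2, 2]
means = block_means(y, [x1], proc)
```

Each key pairs one block (a tuple of factor levels) with one level of the <char> variable; the value is the vertical position at which that character would be drawn.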

EXAMPLES

BLOCK PLOT Y X1 X2
BLOCK PLOT Y X1 X2 X3
BLOCK PLOT Y X1 X2 X3 X4
MEAN BLOCK PLOT Y X1 X2 X3

NOTE 1

When there are multiple factor variables, it can sometimes be beneficial to repeat the block plot using a different variable as the <char> variable.

NOTE 2

The BLOCK PLOT command saves the internal parameters HEADS, FACES, TRIALS, TAILPROB, AVEDEL, and SDAVEDEL.
These parameters are primarily useful if the <char> variable (i.e., the last variable on the BLOCK PLOT command) has exactly 2 levels. For convenience, call these levels 1 and 2 respectively. DATAPLOT looks at the pattern of the first block (either 12 or 21 where 12 means level 1 is greater than level 2 and 21 means level 2 is greater than level 1). This pattern is designated as heads and is treated as a binomial probability. The parameter TRIALS is the number of boxes and the parameter TAILPROB is the binomial probability of obtaining the number of 12 and 21 patterns that was found in the BLOCK PLOT. The parameter AVEDEL is the average difference between level 1 and level 2 in each of the boxes. The parameter SDAVEDEL is the corresponding standard deviation. AVEDEL is in fact a least squares estimate of the difference between the levels of the factors. The parameter FACES is the number of levels of the factor corresponding to the <char> variable. If FACES is greater than 2, then a multinomial rather than a binomial probability is calculated.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
BAR EXPANSION = Sets the bar expansion factors.
BOX PLOT = Generates a box plot.
YOUDEN PLOT = Generates a Youden plot.
ANOVA = Carries out an ANOVA.
PLOT = Generates a data or function plot.
DEX PLOT = Generates a design of experiments plot.

APPLICATIONS

Analysis of Variance

IMPLEMENTATION DATE

92/5

PROGRAM 1

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
CHARACTERS A B
LINE BLANK BLANK
.
XLIMITS 1 2
XTIC OFFSET 0.5 0.5
MAJOR XTIC MARK NUMBER 2
MINOR XTIC MARK NUMBER 0
XTIC LABEL FORMAT ALPHA
XTIC LABEL CONTENT PLANTSP()=SP()1 PLANTSP()=SP()2
.
LEGEND JUSTIFICATION CENTER
LEGEND SIZE 1.8
LEGEND 1 SPEED = 1; LEGEND 1 COORDINATES 25 32
LEGEND 2 SPEED = 2; LEGEND 2 COORDINATES 39 32
LEGEND 3 SHIFT = 1, 2, 3; LEGEND 3 COORDINATES 25 88
LEGEND 4 A = WELD PROCESS 1; LEGEND 4 COORDINATES 16 25
LEGEND 5 B = WELD PROCESS 2; LEGEND 5 COORDINATES 16 22
LEGEND 4 JUSTIFICATION LEFT
LEGEND 5 JUSTIFICATION LEFT
SEGMENT 1 COORDINATES 20 87 30 87
Y1LABEL DEFECTIVE LEAD WIRES
BLOCK PLOT Y PLANT SPEED SHIFT PROC

PROGRAM 2

. STEP 1--READ IN THE DATA
.
SKIP 25
READ BOXCAKE.DAT Y X1 X2 X3 X4 X5
DELETE Y X1 X2 X3 X4 X5 FOR I = 1 1 5
.
MULTIPLOT CORNER COORDINATES 0 0 100 100; MULTIPLOT 2 2
CHAR BLANK ALL
CHAR 1 2
LINES SOLID ALL; LINES BL BL
CHAR SIZE 4 ALL
BAR EXPANSION FACTOR 2 1.5
MEAN BLOCK PLOT Y X4 X5 X1
MEAN BLOCK PLOT Y X4 X5 X2
MEAN BLOCK PLOT Y X4 X5 X3
MULTIPLOT OFF

BOOTSTRAP ... PLOT

PURPOSE

Generates a bootstrap plot for one of 30+ statistics.

DESCRIPTION

The bootstrap is a non-parametric method for estimating the sampling distribution of a statistic. Given a sample data set and a desired statistic (e.g., the mean), the bootstrap draws a resample from the data set and computes the desired statistic for that resample; this is repeated for each bootstrap sample. The resampling is done with replacement and each resample has the same size as the original data set (since the sampling is done with replacement, a resample can contain duplicates). The BOOTSTRAP SAMPLE command (documented under the SUPPORT COMMANDS chapter) specifies the number of bootstrap samples taken. To calculate a bootstrap for a statistic not listed below, check the documentation for the BOOTSTRAP SAMPLE command (in Volume II under the LET subcommands).

For the bootstrap plot, the vertical axis contains the computed value of the statistic and the horizontal axis contains the sample number (for k = 1, 2, ..., N). The number of response variables depends on the number of variables required to compute the statistic (e.g., the MEAN uses one while the LINEAR SLOPE uses two). The bootstrap plot is typically followed by some type of distributional plot such as a histogram.
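
The resampling scheme can be sketched in a few lines of Python. This is only an illustration of the technique, not DATAPLOT's implementation; the function name bootstrap_statistic, the default of 100 samples, and the sample data are assumptions made for the example.

```python
import random
from statistics import mean


def bootstrap_statistic(y, stat=mean, n_samples=100, seed=None):
    """Return the statistic computed on n_samples resamples of y."""
    rng = random.Random(seed)  # a fixed seed makes the run repeatable
    n = len(y)
    values = []
    for _ in range(n_samples):
        # Draw n observations with replacement, so duplicates can occur.
        resample = [y[rng.randrange(n)] for _ in range(n)]
        values.append(stat(resample))
    return values


values = bootstrap_statistic([2.0, 4.0, 4.0, 5.0, 9.0], n_samples=50, seed=1)
```

Plotting values against its index (1, 2, ..., 50) corresponds to the bootstrap plot; a histogram of values approximates the sampling distribution of the statistic.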

SYNTAX 1

BOOTSTRAP <stat> PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;

<stat> is one of the following statistics:

MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
AVERAGE ABSOLUTE DEVIATION (AAD), MEDIAN ABSOLUTE DEVIATION (MAD),
STANDARD DEVIATION, VARIANCE, STANDARD DEVIATION OF MEAN, VARIANCE OF MEAN,
RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE,
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE,
SKEWNESS, KURTOSIS,
AUTOCORRELATION, AUTOCOVARIANCE, SINE FREQUENCY, SINE AMPLITUDE,
TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL), TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics requiring one response variable.

SYNTAX 2

BOOTSTRAP <stat> PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;

<y2> is the second response variable;

<stat> is one of the following statistics:

LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics requiring two response variables to compute.

EXAMPLES

BOOTSTRAP MEAN PLOT Y
BOOTSTRAP LINEAR SLOPE PLOT Y1 X1

NOTE 1

The SEED command can be used to specify the seed for generating the random bootstrap samples.

NOTE 2

The jackknife is a similar technique that uses a different resampling scheme. See the JACKNIFE PLOT command for details. The bootstrap is generally considered to provide better results than the jackknife, but it involves more computation.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINE = Sets the type for plot lines.
HISTOGRAM = Generates a histogram.
BOOTSTRAP SAMPLE = Sets the number of bootstrap samples.
JACKNIFE PLOT = Generates a jackknife plot.
PLOT = Generates a data or function plot.

REFERENCE

``A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation,'' Efron and Gong, The American Statistician, February, 1983.

APPLICATIONS

Sampling Distribution of a Statistic

IMPLEMENTATION DATE

89/2

PROGRAM 1

. PURPOSE--CARRY OUT BOOTSTRAP ANALYSIS OF HARVEY MARSHAK
. NUCLEAR THERMOMETRY DATA. (LOCATION ESTIMATION ANALYSIS)
SKIP 25; READ MARSHAK.DAT Y
TITLE NUCLEAR THERMOMETRY; X3LABEL AUTOMATIC
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
BOOTSTRAP MEAN PLOT Y
LET YPLOT2 = YPLOT
HISTOGRAM YPLOT2
BOOTSTRAP MEDIAN PLOT Y
LET YPLOT3 = YPLOT; HISTOGRAM YPLOT3
END OF MULTIPLOT

PROGRAM 2

. PURPOSE--CARRY OUT BOOTSTRAP ANALYSIS OF BERGER ALASKA PIPELINE DATA
. LINEAR FIT SLOPE ANALYSIS
.
SKIP 25
READ BERGER1.DAT Y X
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE ALASKA PIPELINE
CHAR X
LINES
X3LABEL AUTOMATIC
PLOT Y X
LET Y2 = LOG(Y)
LET X2 = LOG(X)
PLOT Y2 X2
BOOTSTRAP LINEAR SLOPE PLOT Y2 X2
LET YPLOT2 = YPLOT
LET S = STANDARD DEVIATION YPLOT2
X2LABEL SD(SLOPE) = 0.022983
HIST YPLOT2
END OF MULTIPLOT

BOX-COX HOMOSCEDASTICITY PLOT

PURPOSE

Generates a Box-Cox homoscedasticity plot.

DESCRIPTION

Many statistical procedures (e.g., regression) make assumptions of constant variance relative to the value of an independent variable. For example, in regression it is assumed that the variance of the residuals does not depend on the value of the independent variable. This assumption is generally referred to as homogeneous variances or as homoscedasticity.

A Box-Cox homoscedasticity plot is a graphical technique for determining the Box-Cox transformation that yields the most constant variance of one variable relative to the values of a second variable.

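The Box-Cox family is essentially the power-transformation family (adjusted to include log transformations). The form of the family is:

\[ T(y) = \frac{y^{\lambda} - 1}{\lambda} \quad (\lambda \neq 0), \qquad T(y) = \ln(y) \quad (\lambda = 0) \]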

There are various methods for measuring constant variance. The particular method DATAPLOT uses is to divide the first variable into groups with the same value for the second variable. For a given value of lambda, the first variable is transformed and the standard deviation is computed for each group. The statistic used is the ratio of the minimum standard deviation to the maximum standard deviation (this ratio will always be between 0 and 1). The plot then consists of this statistic on the vertical axis versus the lambda parameter on the horizontal axis. The lambda corresponding to the highest ratio is the appropriate transformation to use to provide the most constant variance.

This command only applies if there is replication in the second variable.
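
The homoscedasticity measure can be illustrated with a short Python sketch. It is not DATAPLOT code: the function names, the lambda grid, and the data are hypothetical, and the transformation is the one applied in the example program below, (Y**LAMBDA - 1)/LAMBDA, with the log taken when lambda is 0.

```python
import math
from collections import defaultdict
from statistics import stdev


def box_cox(value, lam):
    # Power transformation; the log is used at lambda = 0.
    return math.log(value) if lam == 0 else (value ** lam - 1.0) / lam


def homoscedasticity_ratio(y, x, lam):
    """Minimum group standard deviation divided by the maximum."""
    groups = defaultdict(list)
    for yi, xi in zip(y, x):
        groups[xi].append(box_cox(yi, lam))  # group y by the value of x
    # Groups need replication (at least 2 values) to have a standard deviation.
    sds = [stdev(g) for g in groups.values() if len(g) > 1]
    return min(sds) / max(sds)


# Made-up data whose spread grows in proportion to its level.
y = [9.0, 10.0, 11.0, 90.0, 100.0, 110.0]
x = [1, 1, 1, 2, 2, 2]
lambdas = [k / 2 for k in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
best_lambda = max(lambdas, key=lambda lam: homoscedasticity_ratio(y, x, lam))
```

For these data the ratio is largest at lambda = 0, i.e., the log transformation, which is what one expects when the standard deviation is proportional to the mean. The sketch does not guard against a group whose transformed values are all identical.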

SYNTAX

BOX-COX HOMOSCEDASTICITY PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<y2> is the independent variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BOX-COX HOMOSCEDASTICITY PLOT Y1 Y2

NOTE

The number of observations in the 2 response variables must be equal.

DEFAULT

None

SYNONYMS

BOX-COX HOMOGENITY PLOT
BOX COX HOMOGENITY PLOT
BOX COX HOMOSCEDASTICITY PLOT

RELATED COMMANDS

LINES = Sets the types for plot lines.
CHARACTERS = Sets the types for plot characters.
BOX COX LINEARITY PLOT = Generates a Box-Cox linearity plot.
BOX COX NORMALITY PLOT = Generates a Box-Cox normality plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

94/2 (although earlier versions supported this command, the method used did not give informative results)

PROGRAM

SKIP 25
READ NELSON.DAT Y X1 X2
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
FIT Y X2
LINE SOLID BLANK; CHARACTER BLANK X
XTIC OFFSET 5
TITLE LINEAR FIT OF RAW DATA
PLOT PRED Y VS X2
.
TITLE BOX-COX HOMOSCEDASTICITY PLOT
X1LABEL LAMBDA; Y1LABEL HOMOSCEDASTICITY MEASURE
XTIC OFFSET 0
BOX-COX HOMOSCEDASTICITY PLOT Y X2
.
LET YTEMP = MAXIMUM YPLOT
RETAIN XPLOT SUBSET YPLOT = YTEMP
LET LAMBDA = XPLOT(1)
LET Y2 = (Y**LAMBDA - 1)/LAMBDA
FIT Y2 X2
TITLE LINEAR FIT OF TRANSFORMED DATA
X1LABEL; Y1LABEL; XTIC OFFSET 5
PLOT PRED Y2 VS X2
END OF MULTIPLOT

BOX-COX LINEARITY PLOT

PURPOSE

Generates a Box-Cox linearity plot.

DESCRIPTION

A Box-Cox linearity plot is a graphical technique for determining the Box-Cox transformation that yields the maximum correlation between two variables. The Box-Cox transformation family is essentially the power-transformation family (adjusted to include log transformations). The form for the family is:

\[ T(y) = \frac{y^{\lambda} - 1}{\lambda} \quad (\lambda \neq 0), \qquad T(y) = \ln(y) \quad (\lambda = 0) \]

The horizontal axis is the lambda parameter. The vertical axis is the computed correlation coefficient between <y1> and the transformed <y2>. The lambda corresponding to the highest correlation is the appropriate transformation to use in linearizing the relationship between <y1> and <y2>.
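
The curve drawn by this plot can be sketched in Python as follows. This is an illustration under stated assumptions rather than DATAPLOT's implementation: the function names, the lambda grid, and the data are hypothetical, and the second variable is the one transformed, as in the description above.

```python
import math


def box_cox(value, lam):
    # Power transformation; the log is used at lambda = 0.
    return math.log(value) if lam == 0 else (value ** lam - 1.0) / lam


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def linearity_curve(y1, y2, lambdas):
    """(lambda, correlation of y1 with the transformed y2) for each lambda."""
    return [(lam, correlation(y1, [box_cox(v, lam) for v in y2]))
            for lam in lambdas]


lambdas = [k / 2 for k in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
y1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y2 = [1.0, 4.0, 9.0, 16.0, 25.0]        # y2 is the square of y1
curve = linearity_curve(y1, y2, lambdas)
best_lambda, best_corr = max(curve, key=lambda point: point[1])
```

Because y2 is the square of y1 here, the square-root member of the family (lambda = 0.5) makes the relationship exactly linear and gives the highest correlation.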

SYNTAX

BOX-COX LINEARITY PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;

<y2> is the second response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BOX-COX LINEARITY PLOT Y1 Y2

NOTE

The number of observations in the 2 response variables must be equal.

DEFAULT

None

SYNONYMS

BOX COX LINEARITY PLOT

RELATED COMMANDS

LINES = Sets the types for plot lines.
CHARACTERS = Sets the types for plot characters.
BOX-COX NORMALITY PLOT = Generates a Box-Cox normality plot.
BOX-COX HOMOSCED PLOT = Generates a Box-Cox homoscedasticity plot.
PLOT = Generates a data or function plot.

APPLICATION

Exploratory Data Analysis

IMPLEMENTATION DATE

87/5

PROGRAM

SKIP 25
READ BERGER1.DAT Y X
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
FIT Y X
LINE SOLID BLANK
CHARACTER BLANK X
TITLE LINEAR FIT OF RAW DATA
PLOT PRED Y VS X
.
TITLE BOX-COX LINEARITY PLOT
X1LABEL LAMBDA
Y1LABEL CORRELATION COEFFICIENT
BOX-COX LINEARITY PLOT Y X
.
LET LAMBDA = 0.5
LET Y2 = (Y**LAMBDA - 1)/LAMBDA
FIT Y2 X
TITLE LINEAR FIT OF TRANSFORMED DATA
X1LABEL
Y1LABEL
PLOT PRED Y2 VS X
END OF MULTIPLOT

BOX-COX NORMALITY PLOT

PURPOSE

Generates a Box-Cox normality plot.

DESCRIPTION

A Box-Cox normality plot is a graphical data analysis technique for determining the transformation (from the Box-Cox transformation family) that yields a transformed variable that is ``closest'' to being normally distributed. The Box-Cox transformation family is essentially the power-transformation family (adjusted to include log transformations). The form for the family is:

\[ T(y) = \frac{y^{\lambda} - 1}{\lambda} \quad (\lambda \neq 0), \qquad T(y) = \ln(y) \quad (\lambda = 0) \]

For each of a set of selected members of the Box-Cox family, the transformation is carried out, a normal probability plot is computed, and the linearity of the normal probability plot is summarized via the correlation coefficient. The resulting normality plot thus consists of:

Vertical axis = normal probability plot correlation coefficient;
Horizontal axis = Box-Cox lambda parameter.
The value of lambda (on the horizontal axis) at which the normal probability plot correlation coefficient curve (on the vertical axis) reaches its maximum is the quantity of interest: it identifies the member of the family that best normalizes the data. The normality plot technique is applicable to general transformation families, but the current DATAPLOT implementation supports only the Box-Cox family (the most important and common of the various transformation families).
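
The procedure can be sketched in Python as shown below. This is not DATAPLOT's implementation. In particular, the plotting positions (i - 0.375)/(n + 0.25) used to approximate the normal order statistics are an assumption of this sketch, and DATAPLOT may use a different formula. The function names, lambda grid, and data are hypothetical.

```python
import math
from statistics import NormalDist


def box_cox(value, lam):
    # Power transformation; the log is used at lambda = 0.
    return math.log(value) if lam == 0 else (value ** lam - 1.0) / lam


def normal_scores(n):
    # Approximate expected normal order statistics (assumed formula).
    return [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
            for i in range(1, n + 1)]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def normality_curve(y, lambdas):
    """(lambda, probability plot correlation coefficient) for each lambda."""
    scores = normal_scores(len(y))
    curve = []
    for lam in lambdas:
        transformed = sorted(box_cox(v, lam) for v in y)
        curve.append((lam, correlation(scores, transformed)))
    return curve


# Made-up positive data whose logarithms are exactly the normal scores.
y = [math.exp(z) for z in normal_scores(20)]
lambdas = [k / 2 for k in range(-4, 5)]  # -2.0, -1.5, ..., 2.0
best_lambda, best_corr = max(normality_curve(y, lambdas), key=lambda p: p[1])
```

For these lognormal-like data the maximum falls at lambda = 0, the log transformation.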

SYNTAX

BOX-COX NORMALITY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BOX-COX NORMALITY PLOT Y

DEFAULT

None

SYNONYMS

BOX COX NORMALITY PLOT

RELATED COMMANDS

LINES = Sets the types for plot lines.
CHARACTERS = Sets the types for plot characters.
BOX-COX LINEARITY PLOT = Generates a Box-Cox linearity plot.
BOX-COX HOMOSCED PLOT = Generates a Box-Cox homoscedasticity plot.
LET = Transforms variables (and many other options).
PROBABILITY PLOT = Generates a probability plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
PLOT = Generates a data or function plot.

APPLICATION

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SKIP 25
READ AUTO83B.DAT Y1
.
TITLE AUTOMATIC
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
HISTOGRAM Y1
Y1LABEL CORRELATION COEFFICIENT
X1LABEL LAMBDA
BOX-COX NORMALITY PLOT Y1
X1LABEL ; Y1LABEL
LET TEMP = XPLOT
LET ATEMP = MAXIMUM YPLOT
RETAIN TEMP SUBSET YPLOT = ATEMP
LET LAMBDA = TEMP(1)
LET YNEW = (Y1**LAMBDA - 1)/LAMBDA
HISTOGRAM YNEW
END OF MULTIPLOT

... BOX PLOT

PURPOSE

Generates a box plot.

DESCRIPTION

A box plot is a graphical data analysis technique for determining if differences exist between the various levels of a 1-factor model. The box plot is a graphical alternative to 1-factor ANOVA. It is also a useful technique for summarizing and comparing data from 2 or more samples. The box plot consists of:

Vertical axis = response variable;
Horizontal axis = level identification.
The bottom x is the data minimum; the bottom of the box is the estimated 25% point; the middle x in the box is the data median; the top of the box is the estimated 75% point; the top x is the data maximum. The box plot has 24 components (characters and lines) which can be individually controlled. For the box plot to appear as it should, the BOX PLOT command is usually preceded by 2 commands:

CHARACTERS BOX PLOT

LINES BOX PLOT
These commands automatically define the proper values for the 24 components of the box plot. After the box plot is formed, the analyst should redefine plot characters and lines via the usual CHARACTERS and LINES commands.

An alternate form of the box plot, called a mean box plot, is based on means and standard deviations rather than medians and percentiles. The bottom x is the data minimum; the bottom of the box is the mean minus 2 times the standard deviation; the middle x in the box is the mean of the data; the top of the box is the mean plus 2 times the standard deviation; the top x is the data maximum.

For box plots with more than one group, the width of each box is proportional to the number of elements in the group.

SYNTAX

BOX PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;

<x> is a group identifier variable;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

BOX PLOT Y X
BOX PLOT Y TAG SUBSET TAG > 3

NOTE 1

Outliers can be identified by entering the FENCES ON command. If the inter-quartile range (i.e., the difference between the 25% point and the 75% point) is IQ, then values that are between 1.5 and 3.0 times the IQ above the 75% point (or below the 25% point) are drawn as circles, and values that are more than 3.0 times the IQ above the 75% point (or below the 25% point) are drawn as large circles.

For mean box plots 4 times the standard deviation (the distance from the top of the box to the bottom of the box in the mean box plot) is used in the above formulas instead of the interquartile range.
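
The fence rule for the standard box plot can be sketched in Python as follows. This is an illustration only: the function name classify_outliers is hypothetical, and the quartile estimate (the "inclusive" method of the Python statistics module) is an assumption that may differ from the estimate DATAPLOT uses for the 25% and 75% points.

```python
from statistics import quantiles


def classify_outliers(values):
    """Split values into mild (circle) and extreme (large circle) outliers."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iq = q3 - q1
    mild, extreme = [], []
    for v in values:
        # Distance beyond the box; negative or zero for values inside it.
        distance = max(v - q3, q1 - v)
        if distance > 3.0 * iq:
            extreme.append(v)
        elif distance > 1.5 * iq:
            mild.append(v)
    return mild, extreme


mild, extreme = classify_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40])
```

With these made-up data the quartiles are 3.5 and 8.5, so IQ is 5; the value 20 lies between 1.5 and 3.0 times IQ above the 75% point and 40 lies more than 3.0 times IQ above it.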

NOTE 2

The width of the box is proportional to the number of data points in that box.

NOTE 3

An alternate form of the box plot can be generated by entering the commands CHARACTERS TUFTE BOX PLOT and LINES TUFTE BOX PLOT. You can also define your own plot symbols with the standard CHARACTER and LINE commands (e.g., you may prefer to use a dash (-) rather than the default X).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
I PLOT = Generates an I plot.
ANOVA = Carries out an ANOVA.
MEDIAN POLISH = Carries out a median polish.
CONTROL CHART = Generates a control chart.
PLOT = Generates a data or function plot.

APPLICATION

Exploratory Data Analysis, Comparing Distributions

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983.

``Exploratory Data Analysis,'' Tukey, Addison-Wesley, 1977.

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
LEGEND 1 1-FACTOR MODELING; LEGEND 2 BOX PLOT
CHARACTERS BOX PLOT; LINES BOX PLOT
XLIMITS 0 15; YMINIMUM .995
FENCES ON
BOX PLOT WV MONTH

C CONTROL CHART

PURPOSE

Generates a (Poisson) counts control chart.

DESCRIPTION

A C chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The C chart is sensitive to changes in the number of defective items in the measurement process. The ``C'' in C CONTROL CHART stands for ``counts'' as in defectives per lot. The C control chart consists of:

Vertical axis = the number defective for each sub-group;
Horizontal axis = sub-group designation.
The C chart assumes that each sub-group has an equal sample size (this sample size does not need to be specified). A sub-group is typically a time sequence (e.g., the number of defectives in a daily production run where each day is considered a sub-group). If the times are equally spaced, the horizontal axis variable can be generated as a sequence (e.g., LET X = SEQUENCE 1 1 N where N is the number of sub-groups).

In addition, horizontal lines are drawn at the mean number of defectives and at the upper and lower control limits. The control limits are calculated as:

\[ \mathrm{UCL} = c + 3\sqrt{c}, \qquad \mathrm{LCL} = c - 3\sqrt{c} \]

where c is the mean number of defectives. Also, zero serves as a lower bound on the LCL.
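
As a worked illustration, the following Python sketch computes the center line and control limits from a list of counts. It assumes the usual 3-sigma limits for Poisson counts, c plus or minus 3 times the square root of c, with the LCL bounded below by zero as noted above. The function name c_chart_limits and the counts are hypothetical.

```python
import math


def c_chart_limits(counts):
    """Return (LCL, center line, UCL) for a C chart."""
    c = sum(counts) / len(counts)       # mean number of defectives
    half_width = 3.0 * math.sqrt(c)     # Poisson: variance equals the mean
    lcl = max(0.0, c - half_width)      # zero is a lower bound on the LCL
    return lcl, c, c + half_width


lcl, center, ucl = c_chart_limits([3, 5, 4, 4, 6, 2, 4, 4])
```

Here the mean count is 4, so the limits are 4 - 6 (raised to 0) and 4 + 6 = 10; a sub-group with more than 10 defectives would plot above the upper control limit.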

SYNTAX

C CONTROL CHART <y1> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the number of defective items in each sub-group;

<x> is a variable containing the sub-group identifier (usually 1, 2, 3, ...);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

C CONTROL CHART Y X
C CONTROL CHART D X SUBSET X > 2

NOTE 1

The distribution of the number of defective items is assumed to be Poisson. This assumption is the basis for calculating the upper and lower control limits.

NOTE 2

The U CONTROL CHART is similar to the C CONTROL CHART. The distinction is that the C CONTROL CHART is used when the material being measured is constant in area and the sub-groups have equal size. The U CONTROL CHART is used when either of these assumptions is not valid.

NOTE 3

The attributes of the 4 traces that make up the C control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BAR commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits. Some analysts prefer to draw the response variable as a character or a spike rather than a connected line. The example program demonstrates setting the line attributes (the control lines are drawn as dotted lines).

DEFAULT

None

SYNONYMS

C CHART for C CONTROL CHART

RELATED COMMANDS

U CHART = Generates a U control chart.
P CHART = Generates a P control chart.
NP CHART = Generates an Np control chart.
CONTROL CHART = Generates a mean, standard deviation, or range control chart.
Q CONTROL CHART = Generates Quesenberry style control charts.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
PLOT = Generates a data or function plot.

REFERENCE

``Guide to Quality Control,'' Kaoru Ishikawa, Asian Productivity Organization, 1982 (Chapter 8).

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ CCC.DAT X NUMDEF SIZE
TITLE AUTOMATIC
LINES SOLID SOLID DOT DOT
Y1LABEL NUMBER OF DEFECTIVES
XLABEL SAMPLE ID
XLIMITS 0 20
XTIC OFFSET 0 1
YLIMITS 0 15
YTIC OFFSET 2 0
C CONTROL CHART NUMDEF X

CME PLOT

PURPOSE

Generates a conditional mean exceedance plot (also known as the mean residual life plot, the mean life expectancy plot, or the Yang plot).

DESCRIPTION

The conditional mean exceedance function (also referred to as the mean residual life function or the expectation of life at t) is defined for a non-negative random variable X as:

\[ e(t) = E[X - t \mid X > t] \]

It can be interpreted as the expected remaining life after time t given that a unit or individual is of age t.

An exact functional form for the CME function can often be obtained if the distribution function of the original variable is known. However, what DATAPLOT calculates is the empirical CME function. Given a variable, the following steps are performed:

1. sort the data;
2. at each data point, subtract that value from the remaining data points;
3. calculate the mean of these remaining adjusted data points;
4. plot this series of mean points.

The paper by Guess and Proschan (see the REFERENCE section below) gives the mathematical formulas. It also includes the formulas for handling ties in the data.
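
The empirical calculation can be sketched in Python as follows. This is a simplified illustration, not DATAPLOT's implementation: it sorts the data first, treats the "remaining" points as those after the current point in sorted order, and does not apply the special handling of ties given by Guess and Proschan. The function name empirical_cme and the data are hypothetical.

```python
from statistics import mean


def empirical_cme(x):
    """Return (t, mean excess over t) for each data point except the largest."""
    data = sorted(x)
    points = []
    for i, t in enumerate(data[:-1]):
        # Subtract the current value from every remaining (larger) point.
        remaining = [v - t for v in data[i + 1:]]
        points.append((t, mean(remaining)))
    return points


points = empirical_cme([7, 1, 4, 2])
```

For the made-up data above, the point at t = 2 is the mean of 4 - 2 and 7 - 2, i.e., 3.5. Replacing mean with the median gives the conditional median exceedance variation.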

DATAPLOT supports several variations of this plot. In step 3 above, the median or midmean can be substituted for the mean. In addition, steps 3 and 4 can be replaced by simply plotting the adjusted remaining data points at the given point.

SYNTAX 1

CME PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

CONDITIONAL MEDIAN EXCEEDANCE PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax computes the median rather than the mean.

SYNTAX 3

CONDITIONAL MIDMEAN EXCEEDANCE PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax computes the midmean rather than the mean.

SYNTAX 4

CONDITIONAL EXCEEDANCE PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax plots the remaining adjusted data points.

EXAMPLES

CME PLOT Y1
CME PLOT Y1 SUBSET TAG > 1
CONDITIONAL MEDIAN EXCEEDANCE PLOT Y1

NOTE

DATAPLOT internally fits a line to the resulting plot and saves the following internal parameters:

CMECC - the correlation coefficient of the fitted line
CMEA0 - the intercept term
CMEA1 - the slope term
SDCMEA0 - the standard deviation of the intercept term
SDCMEA1 - the standard deviation of the slope term
CMERESSD - the residual standard deviation
CMERESDF - the residual degrees of freedom
Entering STATUS PARAMETERS is a quick way to see the values of all of these parameters. This is a linear fit, so these parameters may not be meaningful if a straight line is not a good fit to the data.

DEFAULT

None

SYNONYMS

The following are synonyms for CME PLOT:

CONDITIONAL MEAN EXCEEDANCE PLOT
LIFE EXPECTANCY PLOT
MEAN LIFE EXPECTANCY PLOT
MEAN RESIDUAL LIFE PLOT
YANG PLOT
CONDITIONAL SCATTER EXCEEDANCE PLOT is a synonym for CONDITIONAL EXCEEDANCE PLOT.

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTERS = Sets the type for plot characters.
TAIL AREA PLOT = Generates a tail area plot.
PLOT = Generates a data or function plot.

REFERENCE

``Handbook of Statistics, Vol. 7,'' Krishnaiah and Rao, eds. Elsevier Science Publishers B. V., 1988, pp. 215-224. Specifically, the article is ``Mean Residual Life: Theory and Applications,'' Frank Guess and Frank Proschan.

APPLICATIONS

Reliability, Extreme Value Analysis

IMPLEMENTATION DATE

94/2

PROGRAM

SKIP 25
READ WASHDC.DAT Y
.
TITLE AUTOMATIC
LINES BLANK ALL
CHARACTER X ALL
MULTIPLOT 2 2; MULTIPLOT CORNER COORD 0 0 100 100
CONDITIONAL EXCEEDANCE PLOT Y
CME PLOT Y
CONDITIONAL MEDIAN EXCEEDANCE PLOT Y
CONDITIONAL MIDMEAN EXCEEDANCE PLOT Y
END OF MULTIPLOT

COMPLEX DEMODULATION ... PLOT

PURPOSE

Generates a complex demodulation plot.

DESCRIPTION

A complex demodulation plot is a graphical data analysis technique for determining if the amplitude or generating frequency changes over the course of a single-frequency time series. Complex demodulation attempts to model a time series with the following equation:

In this equation, A is the amplitude, f is the phase shift, and W0 is the complex demodulation frequency. Note that A and f vary with time while the complex demodulation frequency is constant. Since A and f are allowed to vary, complex demodulation is sometimes referred to as local harmonic analysis. The goal of complex demodulation is to estimate A and f at each time point. The mathematical derivations for these estimates can be found in the books listed in the REFERENCE section below (DATAPLOT uses the Granger and Hatanaka derivation).
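
The general idea can be sketched in Python as follows. This is a minimal illustration, not DATAPLOT's algorithm: it assumes a cosine model with frequency in cycles per observation, and it uses a simple moving average as the low-pass filter, whereas DATAPLOT follows the Granger and Hatanaka derivation. The function name complex_demodulate and the window argument are hypothetical. Because of the moving average, the output here is shorter than the input series.

```python
import cmath
import math

def complex_demodulate(y, freq, window):
    """Return (amplitude, phase) lists from a moving-average demodulation.

    y      : time series values, taken to be observed at t = 1, 2, ..., n
    freq   : demodulation frequency in cycles per observation
    window : moving-average length used as the low-pass filter
    """
    # Shift the frequency of interest down to zero frequency.
    shifted = [yt * cmath.exp(-2j * math.pi * freq * t)
               for t, yt in enumerate(y, start=1)]
    amplitude, phase = [], []
    for start in range(len(shifted) - window + 1):
        # Averaging removes the component at twice the demodulation frequency.
        z = sum(shifted[start:start + window]) / window
        amplitude.append(2.0 * abs(z))
        phase.append(cmath.phase(z))
    return amplitude, phase

# Synthetic check: constant amplitude 3 and phase 0.5 at frequency 0.1.
series = [3.0 * math.cos(2 * math.pi * 0.1 * t + 0.5) for t in range(1, 101)]
amp, ph = complex_demodulate(series, freq=0.1, window=10)
```

For this synthetic series the estimated amplitude and phase are flat, which is the pattern expected when the demodulation frequency is correct and the amplitude is constant. A steady drift in the phase estimates would instead suggest that the frequency needs adjusting.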

A complex demodulation plot consists of:

Vertical axis = estimated local amplitude or estimated local phase;
Horizontal axis = dummy index 1 to n where n is the number of observations.
The following 2 types of complex demodulation plots are available:

1. complex demodulation amplitude plot
2. complex demodulation phase plot

SYNTAX 1

COMPLEX DEMODULATION AMPLITUDE PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable that contains the time series observations to be analyzed;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

COMPLEX DEMODULATION PHASE PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable that contains the time series observations to be analyzed;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

COMPLEX DEMODULATION AMPLITUDE PLOT Y
COMPLEX DEMODULATION PHASE PLOT Y

NOTE 1

Complex demodulation plots are typically drawn with no connected lines and with some type of character. For example,

LINES BLANK
CHARACTER X

NOTE 2

The DEMODULATION FREQUENCY command is required before entering the COMPLEX DEMODULATION PLOT command. The demodulation frequency is the W0 parameter. A spectral plot is typically generated to get an initial estimate for the demodulation frequency.

NOTE 3

The complex demodulation plot can be used to generate a non-linear fit of the single cycle model. For example,

DEMODULATION FREQUENCY FREQ
LET N = SIZE Y
LET T = SEQUENCE 1 1 N
LET CONST = MEAN Y
COMPLEX DEMODULATION PHASE PLOT Y
LET FREQ = <best estimate>
COMPLEX DEMODULATION AMPLITUDE PLOT Y
LET AMP = <best estimate>
FIT Y = CONST + AMP*SIN(2*3.14159*FREQ*T + PHASE)

NOTE 4

Complex demodulation is typically done iteratively as demonstrated with the following flowchart:

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

DEMOD FREQUENCY = Sets the demodulation frequency for a complex demodulation plot.
DEMODF = A parameter where the updated demodulation frequency is stored.
SPECTRUM = Generates a spectral plot.
PERIODOGRAM = Generates a periodogram.
CORRELATION PLOT = Generates a correlation plot.
LAG PLOT = Generates a lag plot.
PLOT = Generates a data or function plot.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
LET = Generates sine or cosine transformations (and much more).
FIT = Carries out a least squares fit.

APPLICATIONS

Frequency based time series analysis

IMPLEMENTATION DATE

Pre-1987

REFERENCE

``Spectral Analysis of Economic Time Series,'' Granger and Hatanaka, Princeton University Press, 1964.

``Fourier Analysis of Time Series: An Introduction,'' Peter Bloomfield, John Wiley and Sons, 1976 (Chapter 6).

PROGRAM

SKIP 25
READ LEW.DAT Y
TITLE AUTOMATIC
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
LINE BLANK; SPIKE ON; LET A = MEAN Y; SPIKE BASE A
PLOT Y
SPIKE OFF; LINE SOLID
SPECTRUM Y
DEMODULATION FREQUENCY .3
COMPLEX DEMODULATION AMPLITUDE PLOT Y
DEMODULATION FREQUENCY .3
COMPLEX DEMODULATION PHASE PLOT Y
END OF MULTIPLOT

CONTOUR PLOT

PURPOSE

Generates a contour plot.

DESCRIPTION

A contour plot is a graphical technique for representing a 3-dimensional z = f(x,y) surface by plotting constant-z ``slices'' (contours) on a 2-dimensional format. The contour plot is used heavily in geographic contouring, topography, and mapping.

SYNTAX

CONTOUR PLOT <z> <x> <y> <z0> <SUBSET/EXCEPT/FOR qualification>
where <z> is the response (= dependent) variable;

<x> is one horizontal axis variable;

<y> is the other horizontal axis variable;

<z0> is the variable of desired contour values;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

CONTOUR PLOT Z X Y Z0
CONTOUR PLOT Z X Y Z0 SUBSET DAY 6 TO 10
CONTOUR PLOT Z X Y Z0 SUBSET Z < 10

NOTE

The contour command has the following limitations:

2. There is no capability for color or pattern filling between contour levels.
3. The user must specify the desired contour levels (DATAPLOT will not automatically set any).
4. The CONTOUR PLOT command does not handle irregularly gridded data. Sometimes the data can be interpolated to form a grid. However, since there are many ways for the data to be irregular, there are various approaches to doing this type of interpolation. The best method to use depends on the particular data set. DATAPLOT currently supports one method for this type of interpolation. See the documentation for the 2D INTERPOLATION command for details.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the types for plot lines.
3D-PLOT = Generates a 3-d data or function plot.
PLOT = Generates a data or function plot.

REFERENCE

The contouring code was adapted from code provided by David Behringer of NOAA/AOML. NOAA is the National Oceanic and Atmospheric Administration.

APPLICATIONS

3-Dimensional Analysis

IMPLEMENTATION DATE

88/9

PROGRAM

LET X = SEQUENCE -4 1 4 FOR I = 1 1 81
LET Y = SEQUENCE -4 9 1 4
LET Z = X**2+Y**2-X*Y
LET Z0 = SEQUENCE 5 5 40
TITLE AUTOMATIC
CONTOUR PLOT Z X Y Z0
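
For readers less familiar with the SEQUENCE forms used in this program, the following Python sketch builds what is assumed to be the equivalent 9 x 9 grid. The reading of the two SEQUENCE commands given in the comments is an assumption rather than something stated in this section, and the variable names are illustrative.

```python
# Assumed reading of the DATAPLOT program (not confirmed in this section):
#   X cycles through -4, -3, ..., 4 until 81 values have been generated;
#   Y repeats each of -4, -3, ..., 4 nine times in a row.
values = list(range(-4, 5))
x = [values[i % 9] for i in range(81)]
y = [values[i // 9] for i in range(81)]
z = [xi ** 2 + yi ** 2 - xi * yi for xi, yi in zip(x, y)]
z0 = list(range(5, 45, 5))   # requested contour levels 5, 10, ..., 40

# A level outside the range of z cannot produce a contour line.
drawable = [level for level in z0 if min(z) <= level <= max(z)]
```

Under this reading every (x, y) pair on the grid occurs exactly once, which is the regular gridding that the CONTOUR PLOT command requires.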

... CONTROL CHART

PURPOSE

Generates a mean, standard deviation, range, C, U, P, or NP control chart.

DESCRIPTION

A control chart is a data analysis technique for determining if a measurement process has gone out of statistical control. It consists of:

Vertical axis = the mean, range, or standard deviation for each sub-group;
Horizontal axis = sub-group designation.
In addition, horizontal lines are drawn at the mean (i.e., the mean of the means, ranges, or standard deviations) and at the upper and lower control limits.

There are 7 types of control charts available:

1. mean control chart;
2. standard deviation control chart;
3. range control chart;
4. C control chart;
5. U control chart;
6. P control chart;
7. NP control chart.
The first 3 types are used for a measurement process, while the remaining 4 are used when the data are counts of defects or defective units.

SYNTAX 1

XBAR CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a mean control chart.

SYNTAX 2

R CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a range control chart.

SYNTAX 3

S CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a standard deviation control chart.

SYNTAX 4

C CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a C control chart.

SYNTAX 5

U CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a U control chart.

SYNTAX 6

P CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates a P control chart.

SYNTAX 7

NP CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);

<x> is an independent variable (containing the sub-group identifications);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax generates an NP control chart.

EXAMPLES

XBAR CONTROL CHART Y X
R CONTROL CHART Y X
S CONTROL CHART Y X

NOTE 1

For the mean, range, and standard deviation control charts, the distribution of the response variable is assumed to be Normal. For the C and U control charts, the distribution of the response variable is assumed to be Poisson. For the P and NP control charts, the distribution of the response variable is assumed to be binomial. These assumptions are the basis for calculating the upper and lower control limits. Most books on statistical quality control provide the details on calculating the control limits.
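
To show how the distributional assumption drives the limits, the following Python sketch computes the usual textbook 3-sigma limits for a C chart (Poisson counts) and a P chart (binomial proportions with a constant subgroup size). The function names and sample counts are hypothetical, and DATAPLOT's own calculation may differ in detail.

```python
import math

def c_chart_limits(counts):
    """3-sigma limits for a C chart: Poisson variance equals the mean."""
    center = sum(counts) / len(counts)
    half_width = 3.0 * math.sqrt(center)
    return max(0.0, center - half_width), center, center + half_width

def p_chart_limits(defectives, subgroup_size):
    """3-sigma limits for a P chart: binomial variance is p*(1-p)/n."""
    pbar = sum(defectives) / (len(defectives) * subgroup_size)
    half_width = 3.0 * math.sqrt(pbar * (1.0 - pbar) / subgroup_size)
    return max(0.0, pbar - half_width), pbar, min(1.0, pbar + half_width)

# Hypothetical defect counts for 5 sub-groups.
c_lcl, c_center, c_ucl = c_chart_limits([4, 6, 5, 3, 7])

# Hypothetical numbers of defectives in 4 sub-groups of 50 units each.
p_lcl, p_center, p_ucl = p_chart_limits([2, 4, 3, 3], subgroup_size=50)
```

In both cases the lower limit is truncated at zero, since a count or a proportion cannot be negative.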

NOTE 2

The attributes of the 4 traces that make up the control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BAR commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits. Some analysts prefer to draw the response variable as a character or a spike rather than a connected line.

DEFAULT

None

SYNONYMS

X CONTROL CHART, MEAN CONTROL CHART, AVERAGE CONTROL CHART for XBAR CONTROL CHART

S CHART, STANDARD DEVIATION CONTROL CHART for S CONTROL CHART

R CHART, RANGE CONTROL CHART for R CONTROL CHART

C CHART for C CONTROL CHART

U CHART for U CONTROL CHART

P CHART for P CONTROL CHART

NP CHART for NP CONTROL CHART

RELATED COMMANDS

Q ... CONTROL CHART = Generates Quesenberry style control charts.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
4-PLOT = Generates a 4-plot for univariate analysis.
ANOP PLOT = Generates an ANOP plot.

REFERENCE

``Guide to Quality Control,'' Kaoru Ishikawa, Asian Productivity Organization, 1982 (Chapter 7).
Control charts are described in just about any text on statistical quality control.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE AUTOMATIC
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER
LINE SOLID SOLID DOT DOT
TITLE MEAN CONTROL CHART
MEAN CONTROL CHART DIAMETER BATCH
TITLE RANGE CONTROL CHART
RANGE CONTROL CHART DIAMETER BATCH
TITLE S CONTROL CHART
STANDARD DEVIATION CONTROL CHART DIAMETER BATCH
END OF MULTIPLOT

... CORRELATION PLOT

PURPOSE

Generates an auto-, cross-, or partial auto-correlation plot.

DESCRIPTION

A correlation plot is a graphical data analysis technique for determining if correlation exists between:

1. lags for a single time series (an autocorrelation plot);
2. lags for a single time series after removing the linear dependence of intermediate lags (a partial autocorrelation plot);
3. lags for 2 time series (a cross-correlation plot).
The correlation plot consists of:

Vertical axis = correlation coefficient (a value between -1 and 1);
Horizontal axis = lag (an integer between 1 and n/4 where n is the number of observations).
In addition, horizontal lines are drawn at zero and at levels indicating statistically significant correlation.

The autocorrelation plot is used in time series modeling. It can be used in conjunction with the LET and FIT commands to generate Box-Jenkins Auto Regressive (AR) models. It is also used to test for randomness (e.g., autocorrelation plots are often generated for the residuals from a least squares fit). The partial autocorrelation plot is used (typically with the autocorrelation plot as well) in the model identification stage when developing Box-Jenkins ARMA models. Cross-correlation is used for the more complex case of analyzing two distinct time series to see if they are related.
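
The quantity plotted on the vertical axis of an autocorrelation plot can be sketched in Python as follows. This uses the common sample autocorrelation estimator and an approximate 95% significance band of plus or minus 1.96/sqrt(n); both are assumptions for illustration, and the exact estimator and significance levels used by DATAPLOT are not given in this section. The function name autocorrelation is hypothetical.

```python
import math

def autocorrelation(y, max_lag):
    """Sample autocorrelations r(1), ..., r(max_lag)."""
    n = len(y)
    ybar = sum(y) / n
    denom = sum((v - ybar) ** 2 for v in y)
    return [
        sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

# An alternating series is strongly negatively correlated at lag 1.
series = [1.0, -1.0] * 10
r = autocorrelation(series, max_lag=len(series) // 4)

# Approximate 95% band for a random series (an assumption, see above).
band = 1.96 / math.sqrt(len(series))
significant = [k for k, rk in enumerate(r, start=1) if abs(rk) > band]
```

For a random series, most of the values in r would be expected to fall inside the band; here every lag falls outside it, which signals strong non-randomness.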

SYNTAX 1

AUTOCORRELATION PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable containing the time series observations to be analyzed;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

CROSS-CORRELATION PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the observations for the first time series;

<y2> is a variable containing the observations for the second time series;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 3

PARTIAL AUTOCORRELATION PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable containing the time series observations to be analyzed;

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

AUTOCORRELATION PLOT Y
CROSS-CORRELATION PLOT Y1 Y2

NOTE 1

The partial autocorrelation of lag k is the autocorrelation between z(t) and z(t+k) with the linear dependence of z(t+1) through z(t+k-1) removed.

NOTE 2

DATAPLOT writes some conclusions derived from the correlation plot to the file DPCONF.TEX (the name may vary depending on the operating system). This file is opened when DATAPLOT is initially started. However, only a few commands actually write any information to this file.

NOTE 3

After generating an autocorrelation or cross-correlation plot, the variable YPLOT contains the numerical values for the correlations. If you need to do additional analysis on the correlations, or you simply wish to print the numeric values, copy this variable to a user variable (e.g., LET CORR = YPLOT). The YPLOT variable is overwritten when the next plot is generated.

NOTE 4

The LINE, CHARACTER, SPIKE, and BAR commands can be used to control the appearance of the correlation plot. For example, some analysts prefer to draw correlation plots as spikes from a zero baseline. Trace 1 is the correlation line, trace 2 is the line at zero, and traces 3 and 4 are the lines indicating statistical significance. This is demonstrated in the sample program below.

NOTE 5

The number of lags is determined automatically. If you want to over-ride the default value, enter one of the following commands:

LET LAGS = <value>
LET LAG = <value>
LET NUMLAG = <value>

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

SPECTRUM = Generates a spectral plot.
PERIODOGRAM = Generates a periodogram.
COMPLEX DEMODULATION PLOT = Generates a complex demodulation plot.
LAG PLOT = Generates a lag plot.
PLOT = Generates a data or function plot.
4-PLOT = Generates a 4-plot for univariate analysis.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
SUMMARY = Generates a table of summary statistics.
LET = Generates sine/cosine transformations (and much more).
FIT = Carries out a least squares fit.

REFERENCE

``Time Series Analysis: Forecasting and Control,'' Box and Jenkins, Holden-Day, 1976.

APPLICATIONS

Time Series Analysis

IMPLEMENTATION DATE

Pre-1987 (the PARTIAL AUTOCORRELATION PLOT command was implemented 93/7)

PROGRAM 1

. THIS SAMPLE PROGRAM READS THE FILE LEW.DAT IN THE
. DATAPLOT REFERENCE DIRECTORY. THESE DATA ARE
. BEAM DEFLECTION DATA.
.
SKIP 25
READ LEW.DAT DEFLECT
.
LINE BLANK SOLID DOT DOT
SPIKE ON
SPIKE BASE 0
XLABEL LAG
Y1LABEL AUTOCORRELATION
TITLE AUTOMATIC
.
AUTOCORRELATION PLOT DEFLECT

PROGRAM 2

. THIS SAMPLE PROGRAM READS THE FILE HAYES1.DAT IN THE
. DATAPLOT REFERENCE DIRECTORY. THESE DATA ARE
. FIRE RESEARCH SMOKE OBSCURATION.
.
SKIP 25
READ HAYES1.DAT JUNK Y1 Y2
.
LINE BLANK SOLID DOT DOT
SPIKE ON
SPIKE BASE 0
XLABEL LAG
Y1LABEL CROSS-CORRELATION
TITLE AUTOMATIC
.
CROSS-CORRELATION PLOT Y1 Y2

PROGRAM 3

. THIS SAMPLE PROGRAM READS THE FILE LEW.DAT IN THE
. DATAPLOT REFERENCE DIRECTORY. THESE DATA ARE
. BEAM DEFLECTION DATA.
.
SKIP 25
READ LEW.DAT DEFLECT
.
LINE BLANK SOLID DOT DOT DOT DOT
SPIKE ON
SPIKE BASE 0
XLABEL LAG
Y1LABEL PARTIAL AUTOCORRELATION
.
PARTIAL AUTOCORRELATION PLOT DEFLECT

COUNTS PLOT

PURPOSE

Generates a subsample count versus subsample index plot.

DESCRIPTION

The subsample count is simply the number of observations in the subsample. The counts plot is used to answer the question: ``Does the subsample size change over different subsamples?'' The counts plot consists of:

Vertical axis = subsample size;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample size. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

COUNTS PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;

<x> is the subsample identifier variable (this variable appears on the horizontal axis);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

COUNTS PLOT Y X
COUNTS PLOT Y X SUBSET X > 1

DEFAULT

None

SYNONYMS

SIZE PLOT

RELATED COMMANDS

CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
SD PLOT = Generates a standard deviation plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ RIPKEN.DAT BA HORI VERT TYPE HAND
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
XLIMITS 1 3
MAJOR XTIC MARK NUMBER 3
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 0.5 0.5
X1TIC MARK LABEL FORMAT ALPHA
LINE BLANK SOLID
CHARACTER CIRCLE BLANK
YLIMITS 0 35
YTIC OFFSET 0 3
.
X1TIC MARK LABEL CONTENT INSIDE MIDDLE OUTSIDE
TITLE COUNTS FOR HORIZONTAL LOCATION
COUNTS PLOT BA HORI
X1TIC MARK LABEL CONTENT LOW MIDDLE HIGH
TITLE COUNTS FOR VERTICAL LOCATION
COUNTS PLOT BA VERT
X1TIC MARK LABEL CONTENT FASTBALL CURVE SP()
TITLE COUNTS FOR PITCH TYPE
COUNTS PLOT BA TYPE
X1TIC MARK LABEL CONTENT LEFT RIGHT SP()
TITLE COUNTS FOR PITCH HAND
COUNTS PLOT BA HAND
.
END OF MULTIPLOT

CP PLOT

PURPOSE

Generates a subsample Cp versus subsample index plot.

DESCRIPTION

The subsample Cp index is the Cp index of the data in the subsample. The Cp plot is used to answer the question: ``Does the subsample Cp index change over different subsamples?'' The plot consists of:

Vertical axis = subsample Cp index;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample Cp value. As usual, the appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

CP PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

CP PLOT Y X
CP PLOT Y X1 SUBSET X1 > 2

NOTE 1

The process capability index measures the performance (i.e., the capability) of an industrial process and is defined as follows:

CP = (USL - LSL)/(6*S)

where S is the sample standard deviation and where USL and LSL are user specified upper and lower specification limits. The specification limits define the range within which a product is considered acceptable (values outside this range indicate that a product is defective). The denominator 6S spans plus or minus 3 standard deviations. For example, if the specification limits are symmetric about the mean and the calculated CP is exactly 1, the specification limits fall at plus and minus 3 standard deviations from the mean (and almost all the data will fall within these limits). Values greater than 1 indicate that the specification limits lie more than 3 standard deviations from the mean, while values less than 1 indicate that they lie less than 3 standard deviations from the mean, so some defectives can be expected.
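
As a small numerical illustration, the following Python sketch takes Cp to be (USL - LSL)/(6S), the usual definition and the one consistent with the description above, and computes it for each subsample and for the full sample. The function names and the measurements are hypothetical.

```python
import statistics
from collections import defaultdict

def cp_index(values, lsl, usl):
    """Cp = (USL - LSL) / (6*S), with S the sample standard deviation."""
    return (usl - lsl) / (6.0 * statistics.stdev(values))

def cp_by_subsample(y, x, lsl, usl):
    """Return {subsample id: Cp} for the groups defined by x."""
    groups = defaultdict(list)
    for yi, xi in zip(y, x):
        groups[xi].append(yi)
    return {g: cp_index(v, lsl, usl) for g, v in sorted(groups.items())}

# Hypothetical measurements for two batches.
y = [0.99, 1.00, 1.01, 1.00, 1.00, 0.98, 1.00, 1.02, 1.00, 1.00]
x = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
per_batch = cp_by_subsample(y, x, lsl=0.98, usl=1.02)
overall = cp_index(y, lsl=0.98, usl=1.02)
```

The per-batch values correspond to the plotted points and the overall value to the horizontal reference line. In this made-up data the second batch is twice as variable as the first, so its Cp is half as large.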

NOTE 2

Recall that Chebyshev's theorem states that at least 75% of the data must fall within plus or minus 2 standard deviations of the mean and that at least 88.9% must fall within plus or minus 3 standard deviations. This holds for any distribution. For a normal distribution, these numbers are approximately 95.4% and 99.7% respectively.
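
These percentages can be checked with a few lines of Python. The Chebyshev bound for k standard deviations is 1 - 1/k**2, and the normal coverage follows from the error function. The function names are illustrative.

```python
import math

def chebyshev_lower_bound(k):
    """Minimum fraction of any distribution within k standard deviations."""
    return 1.0 - 1.0 / k ** 2

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2.0))

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 3), round(normal_coverage(k), 3))
```

The gap between the two columns shows how much the normality assumption tightens the interpretation of a capability index.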

NOTE 3

The upper and lower specification limits must be specified by the user as follows:

LET LSL = <value>
LET USL = <value>

NOTE 4

If your specification limits are not symmetric about the mean, the Cpk statistic may be a better choice than the CP statistic. It is an alternate calculation of CP that adjusts for possibly non-symmetric specification limits.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
CAPABILITY ANALYSIS = Generates a capability analysis.
CP = Computes the CP statistic.
CPK PLOT = Generates a Cpk plot.
EXPECTED LOSS PLOT = Generates an expected loss plot.
PERCENT DEFECTIVE PLOT = Generates a percent defective plot.
BOX PLOT = Generates a box plot.
CONTROL CHART = Generates various types of control charts.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

93/10

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
TITLE CASE ASIS
LABEL CASE ASIS
TITLE Gear Diameter Analysis
Y1LABEL CP; X1LABEL Batch
LEGEND 1 Process Capability
LEGEND 2 CP Plot
XTIC OFFSET 0.5 0.5
CHARACTER X BLANK
LINE BLANK SOLID
LET LSL = 0.98
LET USL = 1.02
CP PLOT Diameter Batch

CPK PLOT

PURPOSE

Generates a subsample Cpk versus subsample index plot.

DESCRIPTION

The subsample Cpk index is the Cpk index of the data in the subsample. The Cpk plot is used to answer the question: ``Does the subsample Cpk index change over different subsamples?'' The plot consists of:

Vertical axis = subsample Cpk index;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample Cpk value. As usual, the appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

The Cpk statistic is used as an alternative to the CP statistic when the specification limits are not symmetric about the mean.

SYNTAX

CPK PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

CPK PLOT Y X
CPK PLOT Y X1 SUBSET X1 > 3

NOTE 1

The process capability index measures the performance (i.e., the capability) of an industrial process and is defined as follows:

CPK = MIN(USL - M, M - LSL)/(3*S)

where M is the sample mean, S is the sample standard deviation, and where USL and LSL are user specified upper and lower specification limits. The specification limits define the range within which a product is considered acceptable (values outside this range indicate that a product is defective). Values less than 1 indicate that there are still some defectives.
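
The following Python sketch illustrates the calculation, taking Cpk to be MIN(USL - M, M - LSL)/(3S), the usual definition and the one consistent with the description above. The function name and the measurements are hypothetical.

```python
import statistics

def cpk_index(values, lsl, usl):
    """Cpk = min(USL - M, M - LSL) / (3*S)."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return min(usl - m, m - lsl) / (3.0 * s)

# Hypothetical measurements centered at 1.01, i.e., off-center
# relative to the specification limits 0.98 and 1.02.
values = [1.00, 1.01, 1.02, 1.01, 1.01]
cpk = cpk_index(values, lsl=0.98, usl=1.02)

# For comparison, the symmetric index (USL - LSL)/(6*S) on the same data.
cp = (1.02 - 0.98) / (6.0 * statistics.stdev(values))
```

Because the mean sits closer to the upper limit, Cpk is only half of Cp for this data, which is the kind of asymmetry the Cpk statistic is meant to expose.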

NOTE 2

Recall that Chebyshev's theorem states that at least 75% of the data must fall within plus or minus 2 standard deviations of the mean and that at least 88.9% must fall within plus or minus 3 standard deviations. This holds for any distribution. For a normal distribution, these numbers are approximately 95.4% and 99.7% respectively.

NOTE 3

The upper and lower specification limits must be specified by the user as follows:

LET LSL = <value>
LET USL = <value>

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
CAPABILITY ANALYSIS = Performs a capability analysis.
CPK = Computes the Cpk statistic.
CP PLOT = Generates a Cp plot.
EXPECTED LOSS PLOT = Generates an expected loss plot.
PERCENT DEFECTIVE PLOT = Generates a percent defective plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

93/10

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
TITLE CASE ASIS
LABEL CASE ASIS
TITLE Gear Diameter Analysis
Y1LABEL CPK
X1LABEL Batch
LEGEND 1 Process Capability
LEGEND 2 CPK Plot
XTIC OFFSET 0.5 0.5
CHARACTER X BLANK
LINE BLANK SOLID
LET LSL = 0.98
LET USL = 1.02
.
CPK PLOT Diameter Batch

... DECILE PLOT

PURPOSE

Generates a subsample decile versus subsample index plot.

DESCRIPTION

A subsample decile is the estimated 10%, 20%, 30%, ..., 80%, or 90% point of the subsample. For example, the 30% point is the value for which 30% of the data are below that value and the remaining 70% are above it. The decile plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' The decile plot consists of:

Vertical axis = subsample decile;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample decile value. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
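
The values behind such a plot can be sketched in Python as follows. This illustration relies on the standard library's statistics.quantiles with its default interpolation rule, which is an assumption; DATAPLOT's percentile estimator may give slightly different values. The function name subsample_decile and the data are hypothetical.

```python
import statistics
from collections import defaultdict

def subsample_decile(y, x, which):
    """Return {subsample id: decile} for decile number which (1 to 9)."""
    groups = defaultdict(list)
    for yi, xi in zip(y, x):
        groups[xi].append(yi)
    # statistics.quantiles(..., n=10) returns the 9 decile cut points.
    return {g: statistics.quantiles(v, n=10)[which - 1]
            for g, v in sorted(groups.items())}

# Hypothetical response values for two subsamples of 9 points each.
y = list(range(1, 10)) + list(range(11, 20))
x = [1] * 9 + [2] * 9
third = subsample_decile(y, x, which=3)

# The horizontal reference line uses the same decile of the full sample.
full_sample_third = statistics.quantiles(y, n=10)[2]
```

Each subsample needs at least 2 observations for statistics.quantiles to return a value.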

SYNTAX

<keyword> DECILE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <keyword> is FIRST, SECOND, THIRD, FOURTH, FIFTH, SIXTH, SEVENTH, EIGHTH, or NINTH;
<y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

FIRST DECILE PLOT Y X
NINTH DECILE PLOT Y X1 SUBSET X1 > 2

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
LOWER QUARTILE PLOT = Generates a lower quartile plot.
UPPER QUARTILE PLOT = Generates an upper quartile plot.
MINIMUM PLOT = Generates a minimum plot.
MAXIMUM PLOT = Generates a maximum plot.
RANGE PLOT = Generates a range plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

. PURPOSE--GENERATE A DECILE PLOT OF POINT BARROW FREON-11 DATA
SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
CHARACTER X BLANK
LINE BLANK SOLID
XLIMITS 0 15
MULTIPLOT 3 3; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC
FIRST DECILE PLOT WV MONTH
SECOND DECILE PLOT WV MONTH
THIRD DECILE PLOT WV MONTH
FOURTH DECILE PLOT WV MONTH
FIFTH DECILE PLOT WV MONTH
SIXTH DECILE PLOT WV MONTH
SEVENTH DECILE PLOT WV MONTH
EIGHTH DECILE PLOT WV MONTH
NINTH DECILE PLOT WV MONTH
END OF MULTIPLOT

DEX SCATTER PLOT

PURPOSE

Generates a dex scatter plot.

DESCRIPTION

A dex scatter plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. Qualitative levels are coded as indices (e.g., 1 for process A, 2 for process B). A separate subplot is drawn for each factor with the subplot for factor k centered horizontally at x=k. Each subplot has a given horizontal width (defined by the DEX WIDTH command, defaults to 0.5). For example, the subplot for factor 2 ranges from 1.75 to 2.25 on the horizontal axis. The levels of the factor are assigned an x coordinate within this range (from lowest to highest). Then within each subplot:

Vertical axis = value of the response variable;
Horizontal axis = value of the level of a given factor.
This plot graphically shows the following:

2. How the response variable varies between factors.
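
The placement of levels within a subplot can be sketched in Python as follows. The sketch assumes that the distinct levels of a factor are spread evenly, from lowest to highest, across the subplot width; the even spacing is an assumption, since the description above only states that the levels are assigned x coordinates within the range. The function name dex_x_coordinates is hypothetical.

```python
def dex_x_coordinates(factor_index, levels, width=0.5):
    """Map each distinct level of one factor to a horizontal coordinate.

    Assumption: levels are evenly spaced, lowest to highest, across
    [factor_index - width/2, factor_index + width/2].
    """
    distinct = sorted(set(levels))
    if len(distinct) == 1:
        # A single level is simply centered on the factor index.
        return {distinct[0]: float(factor_index)}
    left = factor_index - width / 2.0
    step = width / (len(distinct) - 1)
    return {lvl: left + i * step for i, lvl in enumerate(distinct)}

# Factor 2 of a two-level design coded as -1 and +1.
coords = dex_x_coordinates(2, [-1, 1, -1, 1])
```

With the default width of 0.5, the two levels of factor 2 land at 1.75 and 2.25, matching the range quoted in the description.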

SYNTAX

DEX SCATTER PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> is a sequence of variables representing factors in a designed experiment;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

DEX SCATTER PLOT Y X1 X2
DEX SCATTER PLOT Y X1 X2 X3
DEX SCATTER PLOT Y X1 X2 X3 X4
DEX SCATTER PLOT Y X1 TO X4

NOTE 1

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 2

The CHARACTER and LINE settings can be used to control the appearance of the plot. The first trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference). The second trace draws a horizontal line at the overall mean. It is typically drawn with a blank character and a solid line (some analysts may prefer a dashed or dotted line). In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ BOXYIEL2.DAT Y X1 X2
.
TITLE AUTOMATIC
CHARACTERS X BLANK
LINE BLANK SOLID
YLIMITS 75 90
YTIC OFFSET 0 2
Y1LABEL CHEMICAL YIELD
XLIMITS 1 2
XTIC OFFSET 0.5 0.5
MAJOR XTIC MARK NUMBER 2
MINOR XTIC MARK NUMBER 0
XTIC MARK LABEL FORMAT ALPHA
XTIC MARK LABEL CONTENT TIME TEMPERATURE
X1LABEL FACTORS
DEX SCATTER PLOT Y X1 TO X2

DEX SIGN PLOT

PURPOSE

Generates a dex sign plot.

DESCRIPTION

A dex sign plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. Qualitative levels are coded as indices (e.g., 1 for process A, 2 for process B). A separate subplot is drawn for each factor with the subplot for factor k plotted horizontally at x=k. Then within each subplot, the plot consists of:

Vertical axis = value of the response variable with each level within a factor plotted as a separate trace.
Horizontal axis = factor id (i.e., factor 1 at x=1, factor 2 at x=2, etc.).
This plot graphically shows the following:

2. How the response variable varies between factors.
The most common use of this plot is when all factor variables have exactly two levels. The lower level is plotted with a minus sign and the upper level is plotted with a plus sign (this is where the term DEX SIGN PLOT comes from).

SYNTAX

DEX SIGN PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> is a sequence of variables representing factors in a designed experiment;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

DEX SIGN PLOT Y X1 X2
DEX SIGN PLOT Y X1 X2 X3
DEX SIGN PLOT Y X1 X2 X3 X4
DEX SIGN PLOT Y X1 TO X4

NOTE 1

This command is similar to the DEX SCATTER PLOT command. There are two distinctions. First, the levels of the factor variables are plotted with different traces. This allows the levels to be clearly identified. Second, the levels of a factor are plotted at the same horizontal axis value.

NOTE 2

The CHARACTER and LINE settings can be used to control the appearance of the plot. If NLEVELS is the maximum number of levels in a factor, then the first NLEVELS traces are typically drawn with a blank line and with a unique character identifier. Typically, if there are exactly 2 levels for each factor, the characters are set to a minus and plus sign respectively. If there are more than 2 levels, the characters 1, 2, 3, and so on work well. Trace NLEVELS+1 is a horizontal line at the overall mean. This is typically drawn with a blank character and a solid line (some analysts may prefer a dashed or dotted line). In any event, the user must explicitly set the character and line settings (by default all lines are solid and all characters are blank).

NOTE 3

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ BOXCHEM.DAT Y X1 X2 X3 X4
CHARACTERS - + BLANK; LINE BLANK BLANK SOLID
YLIMITS 50 90
Y1LABEL PERCENT CONVERSION
YTIC OFFSET 5 5
XLIMITS 1 4
XTIC OFFSET 0.5 0.5
X1LABEL FACTORS
MAJOR XTIC MARK NUMBER 4
MINOR XTIC MARK NUMBER 0
XTIC MARK LABEL FORMAT ALPHA
XTIC MARK LABEL CONTENT CATALYST TEMPERATURE PRESSURE CONCENTRATION
DEX SIGN PLOT Y X1 TO X4

DEX ... ABSOLUTE EFFECTS PLOT

PURPOSE

Generates a dex absolute effects plot for a given statistic.

DESCRIPTION

A dex absolute effects plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The value of some user specified statistic is calculated for the response variable. This plot consists of:

Vertical axis = if there are exactly 2 levels, the effect is the value of the statistic for the lower level subtracted from the value of the statistic for the higher level (and then the absolute value is taken). If there are more than 2 levels, the lowest value of the statistic is subtracted from the highest value of the statistic (this value is always positive);
Horizontal axis = the factor id (i.e., 1 for factor 1, 2 for factor 2 and so on).
This plot graphically shows the following:

1. The maximum difference in the value of the statistic between the levels of each factor.
2. How these maximum differences vary between factors.

SYNTAX 1

DEX <stat> ABSOLUTE EFFECTS PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> ABSOLUTE EFFECTS PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION,
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.
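
To make the two-variable case concrete, here is a minimal Python sketch of how an absolute effect might be formed from a LINEAR SLOPE type statistic: a least-squares slope of the response on the second variable is computed within each level of a factor, and the effect is the largest slope minus the smallest. The function names and data are hypothetical, and this is only an illustration of the idea, not DATAPLOT's code.

```python
def linear_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx

def slope_absolute_effect(y, x, factor):
    """Largest minus smallest per-level slope of y on x for one factor.

    With exactly two levels this equals |slope(high) - slope(low)|.
    """
    slopes = []
    for lev in sorted(set(factor)):
        xs = [xi for xi, f in zip(x, factor) if f == lev]
        ys = [yi for yi, f in zip(y, factor) if f == lev]
        slopes.append(linear_slope(xs, ys))
    return max(slopes) - min(slopes)

# Hypothetical data: the slope is 2 at level 1 and 5 at level 2.
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0, 5.0, 10.0, 15.0]
f = [1, 1, 1, 2, 2, 2]
effect = slope_absolute_effect(y, x, f)
```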

EXAMPLES

DEX MEAN ABSOLUTE EFFECTS PLOT Y X1 X2
DEX SD ABSOLUTE EFFECTS PLOT Y X1 X2 X3
DEX RANGE ABSOLUTE EFFECTS PLOT Y X1 X2 X3 X4
DEX RANGE ABSOLUTE EFFECTS PLOT Y X1 TO X4
DEX LINEAR SLOPE ABSOLUTE EFFECTS PLOT Y X X1 TO X4

NOTE 1

This plot is similar to the DEX ... EFFECTS PLOT. The distinction is that the absolute effects plot takes the absolute value of the difference between the high level and the low level while the standard effects plot keeps the sign. For the case with more than 2 levels, the two plots are identical.

NOTE 2

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 3

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 4

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. The trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference). However, you can draw the trace as a bar, a connected line, or a spike if you prefer. In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

DEX ... EFFECTS ABSOLUTE PLOT

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
CHARACTERS X ALL
LINES BLANK ALL
LET NFACT = 4
XLIMITS 1 NFACT
MAJOR XTIC MARK NUMBER NFACT
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 1 1
XTIC LABEL FORMAT ALPHA
XTIC LABEL CONTENT PROCESS PLANT SPEED SHIFT
X1LABEL FACTORS
TITLE DEX ABSOLUTE EFFECTS PLOT
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
DEX MEAN ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
DEX MEDIAN ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL STANDARD DEVIATION
DEX SD ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
DEX RANGE ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... EFFECTS PLOT

PURPOSE

Generates a dex effects plot for a given statistic.

DESCRIPTION

A dex effects plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The value of some user specified statistic is calculated for the response variable. The effects plot consists of:

Vertical axis = if there are exactly 2 levels, the effect is the value of the statistic for the lower level subtracted from the value of the statistic for the higher level (this can be positive or negative). If there are more than 2 levels, the lowest value of the statistic is subtracted from the highest value of the statistic (this value is always positive);
Horizontal axis = the factor id (i.e., 1 for factor 1, 2 for factor 2 and so on).
This plot graphically shows the following:

1. The maximum difference in the value of the statistic between the levels of each factor.
2. How these maximum differences vary between factors.
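
The vertical-axis rule above can be sketched in a few lines of Python. This is an illustrative reading of the description, with a hypothetical function name (effect) and made-up data. It shows that the two-level effect keeps its sign while the multi-level effect cannot be negative.

```python
from statistics import mean

def effect(y, factor, stat=mean):
    """Effect of one factor on y for the chosen statistic.

    Two levels: stat(high level) - stat(low level), which may be negative.
    More than two levels: largest stat minus smallest stat (never negative).
    """
    levels = sorted(set(factor))
    stats = [stat([yi for yi, f in zip(y, factor) if f == lev])
             for lev in levels]
    if len(levels) == 2:
        return stats[1] - stats[0]
    return max(stats) - min(stats)

# Hypothetical responses with one 2-level and one 3-level factor.
y = [10.0, 12.0, 8.0, 6.0, 4.0, 2.0]
two_level = [1, 1, 1, 2, 2, 2]
three_level = [1, 2, 3, 1, 2, 3]
e2 = effect(y, two_level)      # mean at level 2 minus mean at level 1
e3 = effect(y, three_level)    # largest level mean minus smallest
```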

SYNTAX 1

DEX <stat> EFFECTS PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> EFFECTS PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION,
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN EFFECTS PLOT Y X1 X2
DEX SD EFFECTS PLOT Y X1 X2 X3
DEX RANGE EFFECTS PLOT Y X1 X2 X3 X4
DEX RANGE EFFECTS PLOT Y X1 TO X4

NOTE 1

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 2

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 3

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. The trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference). However, you can draw the trace as a bar, a connected line, or a spike if you prefer. In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
CHARACTERS X ALL
LINES BLANK ALL
LET NFACT = 4
XLIMITS 1 NFACT
MAJOR XTIC MARK NUMBER NFACT
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 1 1
XTIC LABEL FORMAT ALPHA
XTIC LABEL CONTENT PROCESS PLANT SPEED SHIFT
X1LABEL FACTORS
TITLE DEX EFFECTS PLOT
.
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
DEX MEAN EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
DEX MEDIAN EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL STANDARD DEVIATION
DEX SD EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
DEX RANGE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... PARETO PLOT

PURPOSE

Generates a dex Pareto plot for a given statistic.

DESCRIPTION

A dex statistic Pareto plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The user specified statistic is computed for each level of each factor. These statistics are then ordered from high to low (this is where the Pareto term comes from). The plot consists of:

Vertical axis = values of the computed statistic for each level of each factor ordered from high to low;
Horizontal axis = an index value (1 for the highest value of the statistic, N for the lowest value of the statistic where N is the total number of levels in all factors).
This plot graphically shows the magnitude and spread of the given statistic for the various levels of factors.
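
The ordering step can be illustrated with a short Python sketch. It computes a statistic (here the mean) for every level of every factor and sorts the results from high to low, pairing each with its index on the horizontal axis. The function name pareto_level_stats and the data are hypothetical. This follows the description above and is not taken from DATAPLOT's source.

```python
from statistics import mean

def pareto_level_stats(y, factors, stat=mean):
    """Statistic for every level of every factor, sorted from high to low.

    Returns (index, value) pairs, where index 1 is the largest value and
    index N is the smallest (N = total number of levels over all factors).
    """
    values = []
    for levels in factors:
        for lev in sorted(set(levels)):
            values.append(stat([yi for yi, f in zip(y, levels) if f == lev]))
    values.sort(reverse=True)
    return list(enumerate(values, start=1))

# Hypothetical 2x2 design.
y = [60.0, 72.0, 54.0, 68.0]
x1 = [-1, 1, -1, 1]
x2 = [-1, -1, 1, 1]
points = pareto_level_stats(y, [x1, x2])
```

Note that the sorted values no longer carry the factor or level they came from, which is the loss of connection mentioned in NOTE 1.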

SYNTAX 1

DEX <stat> PARETO PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> PARETO PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION,
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN PARETO PLOT Y X1 X2
DEX SD PARETO PLOT Y X1 X2 X3
DEX RANGE PARETO PLOT Y X1 X2 X3 X4
DEX RANGE PARETO PLOT Y X1 TO X4

NOTE 1

This plot is similar to the DEX ... PLOT command. The distinction is that the Pareto version sorts by the value of the statistic while the DEX ... PLOT command does not. The Pareto version loses the connection between a given value and the factor to which it belongs, so it is typical to do a DEX ... PLOT command before a DEX ... PARETO PLOT command. The Pareto version can sometimes show more distinctly the difference in magnitude and the amount of variation in the calculated statistic.

NOTE 2

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 3

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 4

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. Trace 1 contains the calculated value of the statistic and trace 2 contains the value of the statistic for all values of the response variable. Pareto charts are typically drawn as bars. However, you can draw them as spikes, individual points, or a connected line if you prefer. The program example below demonstrates drawing them as bars. The various plot control commands for characters, lines, spikes, and bars can be used as usual.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, John Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
BAR ON
BAR WIDTH 0.5
LINES BLANK SOLID
TITLE DEX PARETO PLOT
.
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
LET A = MEAN Y
BAR BASE A
DEX MEAN PARETO PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
LET A = MEDIAN Y
BAR BASE A
DEX MEDIAN PARETO PLOT Y PROC PLANT SPEED SHIFT
YLIMITS
Y1LABEL STANDARD DEVIATION
LET A = STANDARD DEVIATION Y
BAR BASE A
DEX SD PARETO PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
LET A = RANGE Y
BAR BASE A
DEX RANGE PARETO PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... PARETO ABSOLUTE EFFECTS PLOT

PURPOSE

Generates a dex Pareto absolute effects plot for a given statistic.

DESCRIPTION

A dex Pareto absolute effects plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The user specified statistic is computed for each level of each factor. This plot consists of:

Vertical axis = if there are exactly 2 levels, the effect is the value of the statistic for the lower level subtracted from the value of the statistic for the higher level (this can be positive or negative). The absolute value of this number is taken. If there are more than 2 levels, the lowest value of the statistic is subtracted from the highest value of the statistic (this value is always positive). The values are then sorted from highest to lowest.
Horizontal axis = the rank of the factor (i.e., 1 for the factor with the highest value, 2 for the factor with the second highest value, and so on).
This plot graphically shows the following:

1. The maximum difference in the value of the statistic between the levels of each factor.
2. How these maximum differences vary between factors.

SYNTAX 1

DEX <stat> PARETO ABSOLUTE EFFECTS PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> PARETO ABSOLUTE EFFECTS PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION,
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN PARETO ABSOLUTE EFFECTS PLOT Y X1 X2
DEX SD PARETO ABSOLUTE EFFECTS PLOT Y X1 X2 X3
DEX RANGE PARETO ABSOLUTE EFFECTS PLOT Y X1 X2 X3 X4
DEX RANGE PARETO ABSOLUTE EFFECTS PLOT Y X1 TO X4

NOTE 1

This plot is similar to the DEX ... PARETO EFFECTS PLOT. The distinction is that the Pareto absolute effects plot takes the absolute value of the effect while the Pareto effects plot does not. This only applies if there are exactly two levels (for more than two levels, the two plots are identical). The DEX ... ABSOLUTE EFFECTS PLOT is also similar (the effects are not ordered). The unordered version is often generated first so that the ordered plot can be labeled with the correct factors on the horizontal axis (as was done in the example program below).
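
The manual step of working out the factor order for the axis labels can be sketched in Python. The example below ranks factors by absolute effect and joins their names in that order, which is the order one would then supply as tic label content. The function name ranked_absolute_effects, the factor names, and the data are hypothetical. It illustrates the bookkeeping only, under the reading of the description given above.

```python
from statistics import mean

def ranked_absolute_effects(y, factors, stat=mean):
    """Return (name, absolute effect) pairs sorted from largest to smallest.

    `factors` maps a factor name to its list of levels. The effect is the
    largest per-level statistic minus the smallest, which equals
    |high - low| when a factor has exactly two levels.
    """
    effects = []
    for name, levels in factors.items():
        stats = [stat([yi for yi, f in zip(y, levels) if f == lev])
                 for lev in set(levels)]
        effects.append((name, max(stats) - min(stats)))
    return sorted(effects, key=lambda pair: pair[1], reverse=True)

# Hypothetical 2x2 design with named factors.
y = [60.0, 72.0, 54.0, 68.0]
factors = {"TEMPERATURE": [-1, -1, 1, 1], "TIME": [-1, 1, -1, 1]}
ranked = ranked_absolute_effects(y, factors)
labels = " ".join(name for name, _ in ranked)
```

Here TIME has the larger absolute effect, so it comes first in labels even though it was listed second.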

NOTE 2

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 3

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 4

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. The trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference) or with no character and a bar. In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

DEX ... EFFECTS PARETO ABSOLUTE PLOT
DEX ... EFFECTS ABSOLUTE PARETO PLOT
DEX ... PARETO EFFECTS ABSOLUTE PLOT
DEX ... ABSOLUTE PARETO EFFECTS PLOT
DEX ... ABSOLUTE EFFECTS PARETO PLOT

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates an effects dex plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
BAR ON
BAR WIDTH 0.5
LINE BLANK
TITLE DEX PARETO ABSOLUTE EFFECTS PLOT
LET NUMFAC = 4
XLIMITS 1 NUMFAC
MAJOR XTIC MARK NUMBER NUMFAC
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 1 1
XTIC LABEL FORMAT ALPHA
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
XTIC LABEL CONTENT SHIFT SPEED PLANT PROCESS
DEX MEAN PARETO ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
XTIC LABEL CONTENT SHIFT SPEED PLANT PROCESS
DEX MEDIAN PARETO ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL STANDARD DEVIATION
XTIC LABEL CONTENT SHIFT PLANT SPEED PROCESS
DEX SD PARETO ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
XTIC LABEL CONTENT SHIFT PLANT SPEED PROCESS
DEX RANGE PARETO ABSOLUTE EFFECTS PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... PARETO EFFECTS PLOT

PURPOSE

Generates a dex Pareto effects plot for a given statistic.

DESCRIPTION

A dex Pareto effects plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The user specified statistic is computed for each level of each factor. This plot consists of:

Vertical axis = if there are exactly 2 levels, the effect is the value of the statistic for the lower level subtracted from the value of the statistic for the higher level (this can be positive or negative). If there are more than 2 levels, the lowest value of the statistic is subtracted from the highest value of the statistic (this value is always positive). The values are then sorted from highest to lowest.
Horizontal axis = the rank of the factor (i.e., 1 for the factor with the highest value, 2 for the factor with the second highest value, and so on).
This plot graphically shows the following:

1. The maximum difference in the value of the statistic between the levels of each factor.
2. How these maximum differences vary between factors.

SYNTAX 1

DEX <stat> PARETO EFFECTS PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> PARETO EFFECTS PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION,
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN PARETO EFFECTS PLOT Y X1 X2
DEX SD PARETO EFFECTS PLOT Y X1 X2 X3
DEX RANGE PARETO EFFECTS PLOT Y X1 X2 X3 X4
DEX RANGE PARETO EFFECTS PLOT Y X1 TO X4

NOTE 1

This plot is similar to the DEX ... EFFECTS PLOT. The distinction is that the Pareto effects plot orders the effects from high to low while the standard effects plot shows them in the order in which the factors are given on the command. The advantage of the standard plot is that you can trace a value to a specific factor, while the Pareto plot loses this connection. However, the ordering of the Pareto plot can make it easier to read. A common approach is to generate a standard effects plot first and then generate the Pareto version. The TIC LABEL CONTENT command can be used to label the factors on the Pareto version (the proper order is determined manually from the standard effects plot). This was done for the example program below.

NOTE 2

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 3

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 4

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. The trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference) or with no character and a bar. In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

DEX ... EFFECTS PARETO PLOT

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates an effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
BAR ON
BAR WIDTH 0.5
LINE BLANK
TITLE DEX PARETO EFFECTS PLOT
LET NUMFAC = 4
XLIMITS 1 NUMFAC
MAJOR XTIC MARK NUMBER NUMFAC
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 1 1
XTIC LABEL FORMAT ALPHA
.
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
XTIC LABEL CONTENT SHIFT SPEED PLANT PROCESS
DEX MEAN PARETO EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
XTIC LABEL CONTENT SHIFT SPEED PLANT PROCESS
DEX MEDIAN PARETO EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL STANDARD DEVIATION
XTIC LABEL CONTENT SHIFT PLANT SPEED PROCESS
DEX SD PARETO EFFECTS PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
XTIC LABEL CONTENT SHIFT PLANT SPEED PROCESS
DEX RANGE PARETO EFFECTS PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... PLOT

PURPOSE

Generates a dex plot for a given statistic.

DESCRIPTION

A dex statistic plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. The user specified statistic is computed for each level of each factor. A separate subplot is drawn for each factor with the subplot for factor k centered horizontally at x=k. Each subplot has a given horizontal width (defined by the DEX WIDTH command, defaults to 0.5). For example, the subplot for factor 2 ranges from 1.75 to 2.25 on the horizontal axis. The levels of the factor are assigned an x coordinate within this range (from lowest to highest). Then within each subplot:

Vertical axis = value of the computed statistic from the response variable for a given level of the factor;
Horizontal axis = value of the level of a given factor.
This plot graphically shows the following:

1. How the statistic for the response variable varies across the levels within a factor;
2. How the statistic for the response variable varies between factors.
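
The layout described above can be sketched outside DATAPLOT. The Python fragment below is an illustration only, not DATAPLOT's implementation. The function name dex_points and the toy data are hypothetical, and the even spacing of the levels across each subplot is an assumption (the text only says the levels are placed from lowest to highest within the subplot width).

```python
from statistics import mean

def dex_points(y, factors, stat=mean, width=0.5):
    """Return the (x, statistic) pairs of a dex statistic plot and the overall statistic.

    y       : list of response values
    factors : list of factor columns, each the same length as y
    stat    : function that computes the statistic from a list of values
    width   : horizontal width of each subplot (the DEX WIDTH value, 0.5 by default)
    """
    points = []
    for k, levels in enumerate(factors, start=1):
        distinct = sorted(set(levels))
        left = k - width / 2.0  # subplot for factor k is centered at x = k
        for i, level in enumerate(distinct):
            # Assumption: levels are spread evenly from the left to the right edge.
            if len(distinct) > 1:
                x = left + width * i / (len(distinct) - 1)
            else:
                x = float(k)
            subset = [yi for yi, li in zip(y, levels) if li == level]
            points.append((x, stat(subset)))
    # The second trace is a horizontal line at the statistic of the full response.
    return points, stat(y)

# Hypothetical 2-factor, 2-level experiment.
y = [10.0, 12.0, 20.0, 22.0]
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
points, overall = dex_points(y, [x1, x2])
```

With the default width of 0.5, the two levels of factor 2 land at x = 1.75 and x = 2.25, which matches the range quoted in the description.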

SYNTAX 1

DEX <stat> PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> is a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> is a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN PLOT Y X1 X2
DEX MEDIAN PLOT Y X1 X2
DEX SD PLOT Y X1 X2 X3
DEX RANGE PLOT Y X1 X2 X3 X4
DEX SD PLOT Y X1 TO X3
DEX RANGE PLOT Y X1 TO X4

NOTE 1

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 2

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 3

The CHARACTER and LINE settings can be used to control the appearance of the plot. The first trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference). The second trace draws a horizontal line at the value for the specified statistic for the entire response variable. This is typically drawn with a blank character and a solid line (some analysts may prefer a dashed or dotted line). In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... YOUDEN PLOT = Generates a Youden dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ SHEESLE2.DAT Y PROC PLANT SPEED SHIFT PROC
.
CHARACTERS X BLANK
LINES SOLID SOLID
LET NFACT = 4
XLIMITS 1 NFACT
MAJOR XTIC MARK NUMBER NFACT
MINOR XTIC MARK NUMBER 0
XTIC OFFSET 1 1
XTIC LABEL FORMAT ALPHA
XTIC LABEL CONTENT PROCESS PLANT SPEED SHIFT
X1LABEL FACTORS
YLIMITS 20 28
.
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
Y1LABEL MEAN
DEX MEAN PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL MEDIAN
DEX MEDIAN PLOT Y PROC PLANT SPEED SHIFT
YLIMITS
Y1LABEL STANDARD DEVIATION
DEX SD PLOT Y PROC PLANT SPEED SHIFT
Y1LABEL RANGE
DEX RANGE PLOT Y PROC PLANT SPEED SHIFT
END OF MULTIPLOT

DEX ... YOUDEN PLOT

PURPOSE

Generates a dex Youden plot for a given statistic.

DESCRIPTION

A dex Youden plot is a graphical method for representing a design of experiment problem. The first variable is a response variable while the remaining variables (must be at least one) represent levels of factors. For the Youden plot, all factors must have exactly 2 levels. The Youden plot computes the given statistic for each level of each factor. The plot then consists of:

Vertical axis = value of the computed statistic for the lower level in the factor;
Horizontal axis = value of the computed statistic for the higher level in the factor.
This plot graphically shows if the value of the computed statistic is dependent on the level of the factor and additionally if it is dependent on the factor.
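
As a rough illustration of how the plotted points arise, the Python sketch below computes the statistic at the low and high level of each two-level factor and returns one (horizontal, vertical) pair per factor. It is not DATAPLOT's code. The function name youden_points and the toy data are hypothetical.

```python
from statistics import mean

def youden_points(y, factors, stat=mean):
    """Return one (high-level statistic, low-level statistic) pair per factor."""
    points = []
    for levels in factors:
        distinct = sorted(set(levels))
        if len(distinct) != 2:
            raise ValueError("each factor must have exactly 2 levels")
        low, high = distinct
        low_stat = stat([yi for yi, li in zip(y, levels) if li == low])
        high_stat = stat([yi for yi, li in zip(y, levels) if li == high])
        # Horizontal axis = statistic at the higher level,
        # vertical axis   = statistic at the lower level.
        points.append((high_stat, low_stat))
    return points

# Hypothetical 2-factor, 2-level experiment.
y = [10.0, 12.0, 20.0, 22.0]
x1 = [-1, -1, 1, 1]
x2 = [-1, 1, -1, 1]
points = youden_points(y, [x1, x2])
```

A factor whose point lies far from the diagonal (where the low-level and high-level values are equal) changes the statistic strongly. In this toy data that is factor 1.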

SYNTAX 1

DEX <stat> YOUDEN PLOT <y> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x1> ... <xn> are a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
MEAN (or AVERAGE), MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION (or SD), VARIANCE,
STANDARD DEVIATION OF MEAN (or SDM), VARIANCE OF MEAN (or VM),
RELATIVE STANDARD DEVIATION (or RELSD),
RELATIVE VARIANCE (or RELV or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE (or 1DEC, 2DEC,
3DEC, 4DEC, 5DEC, 6DEC, 7DEC, 8DEC, 9DEC),
SKEWNESS, KURTOSIS, PROPORTION,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN, TAGUCHI SN+, TAGUCHI SN-, TAGUCHI SN00;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that only require a single variable to compute.

SYNTAX 2

DEX <stat> YOUDEN PLOT <y> <x> <x1> ... <xn> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<x> is a second variable used in calculating the statistic (e.g., a linear fit is computed between <y> and <x>);
<x1> ... <xn> is a sequence of variables representing factors in a designed experiment;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute.

EXAMPLES

DEX MEAN YOUDEN PLOT Y X1 X2
DEX SD YOUDEN PLOT Y X1 X2 X3
DEX RANGE YOUDEN PLOT Y X1 X2 X3 X4
DEX RANGE YOUDEN PLOT Y X1 TO X4

NOTE 1

This plot is normally done for a location parameter (typically the mean or median) or a spread parameter (typically the standard deviation or range). The other statistics are less often used.

NOTE 2

The TO syntax is allowed for the list of factor variables (see the EXAMPLES above).

NOTE 3

The following program example shows how to put the factor labels on the X axis.

NOTE 4

The CHARACTER, BAR, SPIKE, and LINE settings can be used to control the appearance of the plot. The first trace is typically drawn with a blank line and some type of character set (the choice of character is a matter of user preference). The second trace draws a diagonal line on the plot. This is typically drawn with a blank character and a solid line (some analysts may prefer a dashed or dotted line). In any event, the user must explicitly set character and line settings (they default to all lines solid and all characters blank).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
PLOT = Generates a data or function plot.
DEX SCATTER PLOT = Generates a dex scatter plot.
DEX SIGN PLOT = Generates a dex sign plot.
DEX ... PLOT = Generates a dex plot for a statistic.
DEX ... PARETO PLOT = Generates a Pareto dex plot for a statistic.
DEX ... EFFECTS PLOT = Generates a dex effects plot for a statistic.
DEX ... PARETO EFFECTS PLOT = Generates a Pareto effects dex plot for a statistic.
DEX ... ABSOLUTE EFFECTS PLOT = Generates an absolute effects dex plot for a statistic.
DEX ... PARE ABSO EFFECTS PLOT = Generates a Pareto absolute effects dex plot for a statistic.
DEX WIDTH = Specifies the width of levels in a dex plot.

REFERENCE

``Statistics for Experimenters,'' Box, Hunter, and Hunter, Wiley and Sons, 1978.

APPLICATIONS

Design of Experiments

IMPLEMENTATION DATE

89/12

PROGRAM

SKIP 25
READ BOXCHEM.DAT Y X1 X2 X3 X4
.
CHARACTERS 1 2 3 4
LINE BLANK BLANK BLANK BLANK DOTTED
X1LABEL HIGH LEVEL
Y1LABEL LOW LEVEL
.
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE MEAN
YTIC OFFSET 5 0
DEX MEAN YOUDEN PLOT Y X1 TO X4
TITLE MEDIAN
YTIC OFFSET 5 0
DEX MEDIAN YOUDEN PLOT Y X1 TO X4
TITLE STANDARD DEVIATION
YTIC OFFSET 0 0
DEX SD YOUDEN PLOT Y X1 TO X4
TITLE RANGE
YTIC OFFSET 0 5
DEX RANGE YOUDEN PLOT Y X1 TO X4
END OF MULTIPLOT

ERROR BAR PLOT

PURPOSE

Generates an error bar plot.

DESCRIPTION

An error bar plot is a graphical data analysis technique for showing the error in the dependent variable and, optionally, the independent variable in a standard x-y plot. As in a standard x-y plot, the vertical axis contains a dependent variable while the horizontal axis contains an independent variable. In addition, it contains error bars in the vertical direction and, optionally, the horizontal direction. The error bars can be either symmetric or asymmetric about the point. The number of arguments on the command determines which type of error bars are produced (see the SYNTAX section below).

By using the CHARACTERS and LINES commands, the analyst has a great deal of flexibility in formatting the appearance of the error bars. The appearance of the error bars is controlled by the following character and line traces.

Trace 1 = the point (x,y)
Trace 2 = the point (x,y+error)
Trace 3 = the point (x,y-error)
Trace 4 = the point (x-error,y)
Trace 5 = the point (x+error,y)
Trace 6 = the line between (x,y+error) and (x,y-error)
Trace 7 = the line between (x-error,y) and (x+error,y)

The line setting for trace 1 is used if you want the plot points to be connected. The line settings for traces 2 through 5 are usually set to BLANK (some analysts prefer connected lines for traces 2 and 3). Most analysts prefer to use solid lines for trace 6 (which connects the vertical errors) and trace 7 (which connects the horizontal errors). However, you can use a different line style or leave them blank if you prefer. The character setting for trace 1 is the original data point. Use character traces 2 through 5 if you want the 4 end points to have unique settings. Use character traces 6 and 7 to use the same symbol for matching end points. The program below demonstrates how to set the line and character traces.
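
To make the trace numbering concrete, the following Python sketch computes the coordinates behind each trace for a single data point. It is an illustration under stated assumptions, not DATAPLOT's code. The function name error_bar_traces is hypothetical, and the errors are assumed to be given as non-negative magnitudes.

```python
def error_bar_traces(x, y, ypos, yneg=None, xpos=None, xneg=None):
    """Return the coordinates for traces 1 to 7 of one error-bar point.

    ypos/yneg : error above/below y (yneg defaults to ypos, i.e. symmetric)
    xpos/xneg : error to the right/left of x (omit both for no horizontal bar)
    """
    if yneg is None:
        yneg = ypos
    traces = {
        1: (x, y),                               # the data point
        2: (x, y + ypos),                        # upper end point
        3: (x, y - yneg),                        # lower end point
        6: ((x, y + ypos), (x, y - yneg)),       # vertical bar
    }
    if xpos is not None:
        if xneg is None:
            xneg = xpos
        traces[4] = (x - xneg, y)                # left end point
        traces[5] = (x + xpos, y)                # right end point
        traces[7] = ((x - xneg, y), (x + xpos, y))  # horizontal bar
    return traces

# Hypothetical point with asymmetric vertical and symmetric horizontal errors.
traces = error_bar_traces(2.0, 4.0, ypos=1.0, yneg=0.5, xpos=0.25)
```

Traces 1 through 5 are single points, so only their character settings normally matter. Traces 6 and 7 are two-point segments, which is why their line settings control whether the bars themselves are drawn.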

SYNTAX 1

ERROR BAR PLOT <y1> <ypos> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<ypos> is the error for <y1> in both the positive (up) and the negative (down) direction;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

ERROR BAR PLOT <y1> <ypos> <x1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<ypos> is the error for <y1> in both the positive (up) and the negative (down) direction;
<x1> is the independent variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 3

ERROR BAR PLOT <y1> <ypos> <yneg> <x1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<ypos> is the error for <y1> in the positive (up) direction;
<yneg> is the error for <y1> in the negative (down) direction;
<x1> is the independent variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 4

ERROR BAR PLOT <y1> <ypos> <yneg> <x1> <xpos> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<ypos> is the error for <y1> in the positive (up) direction;
<yneg> is the error for <y1> in the negative (down) direction;
<x1> is the independent variable;
<xpos> is the error for <x1> in both the positive (right) and the negative (left) direction;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 5

ERROR BAR PLOT <y1> <ypos> <yneg> <x1> <xpos> <xneg> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the dependent variable;
<ypos> is the error for <y1> in the positive (up) direction;
<yneg> is the error for <y1> in the negative (down) direction;
<x1> is the independent variable;
<xpos> is the error for <x1> in the positive (right) direction;
<xneg> is the error for <x1> in the negative (left) direction;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

ERROR BAR PLOT Y1 YDELTA
ERROR BAR PLOT Y1 YDELTA X1
ERROR BAR PLOT Y1 YDELPOS YDELNEG X1
ERROR BAR PLOT Y1 YDELPOS YDELNEG X1 XDELTA
ERROR BAR PLOT Y1 YDELPOS YDELNEG X1 XDELPOS XDELNEG
ERROR BAR PLOT Y1 YDELTA X1 SUBSET X1 < 10

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the symbol for plot characters.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis, Presentation Graphics

IMPLEMENTATION DATE

88/11

PROGRAM

LET FUNCTION F = X1**2
LET X1 = SEQUENCE -5 1 5
LET Y1 = F; LET N = SIZE Y1
LET DELTA = NORMAL RANDOM NUMBERS FOR I = 1 1 N
LET DELTA = 2*DELTA
LET YPOS = ABS(DELTA)
LET YNEG = UNIFORM RANDOM NUMBERS FOR I = 1 1 N
LET YNEG = 3*YNEG
LET XPOS = PATTERN .2 .4 .6 FOR I = 1 1 N
LET XNEG = PATTERN .4 .8 1.2 FOR I = 1 1 N
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
XLIMITS -5 5; XTIC OFFSET 1 1
YLIMITS 0 30; YTIC OFFSET 2 1
CHARACTER FONT SIMPLEX ALL; CHARACTER CIRCLE - -
CHARACTER SIZE 2.0 3.0 3.0 3.0 3.0 3.0 3.0
CHARACTER FILL ON
.
TITLE AUTOMATIC
PLOT Y1 VS X1
TITLE SYMMETRIC ERROR BARS
ERROR BAR PLOT Y1 DELTA X1
TITLE ASYMMETRIC ERROR BARS
ERROR BAR PLOT Y1 YPOS YNEG X1
CHARACTER CIRCLE BLANK BLANK BLANK BLANK - |
LINE BLANK BLANK BLANK BLANK BLANK SOLID SOLID
TITLE X AND Y ERROR BARS
ERROR BAR PLOT Y1 YPOS YNEG X1 XNEG XPOS
END OF MULTIPLOT

EXPECTED LOSS PLOT

PURPOSE

Generates a subsample expected loss versus subsample index plot.

DESCRIPTION

The subsample expected loss index is the expected loss of the data in the subsample. The expected loss computes the number of defectives for a variable (i.e., the number of values that fall outside of some user specified tolerance limits) and multiplies that by some user specified cost. The expected loss plot is used to answer the question: ``Does the subsample expected loss index change over different subsamples?'' The plot consists of:

Vertical axis = subsample expected loss index;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample expected loss value. As usual, the appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

EXPECTED LOSS PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

EXPECTED LOSS PLOT Y X
EXPECTED LOSS PLOT Y X SUBSET X = 4 TO 10

NOTE

The upper and lower specification limits must be specified by the user as follows:

LET USL = <value>
LET LSL = <value>
The cost value must be specified as follows:

LET USLCOST = <value>
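
The calculation described for this plot can be sketched in Python as follows. This follows the wording of the DESCRIPTION literally (count the values outside the specification limits and multiply by the cost). DATAPLOT's exact formula may differ, so treat it as an assumption. The function names and the sample diameters are hypothetical.

```python
def expected_loss(values, lsl, usl, cost):
    """Number of values outside [lsl, usl] multiplied by the cost per defective."""
    defectives = sum(1 for v in values if v < lsl or v > usl)
    return defectives * cost

def expected_loss_by_subsample(y, x, lsl, usl, cost):
    """Expected loss for each distinct subsample identifier in x."""
    groups = {}
    for yi, xi in zip(y, x):
        groups.setdefault(xi, []).append(yi)
    return {g: expected_loss(vals, lsl, usl, cost) for g, vals in sorted(groups.items())}

# Hypothetical gear diameters in two batches.
diameter = [1.00, 1.03, 0.97, 1.01, 0.99, 1.05]
batch = [1, 1, 1, 2, 2, 2]
per_batch = expected_loss_by_subsample(diameter, batch, lsl=0.98, usl=1.02, cost=15)
full_sample = expected_loss(diameter, lsl=0.98, usl=1.02, cost=15)
```

The per-batch values correspond to the first trace of the plot and the full-sample value to the horizontal reference line.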

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
CP PLOT = Generates a Cp plot.
CPK PLOT = Generates a Cpk plot.
PERCENT DEFECTIVE PLOT = Generates a percent defective plot.
CAPABILITY ANALYSIS = Performs a process capability analysis.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates a mean control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

93/10

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE CASE ASIS
LABEL CASE ASIS
TITLE Gear Diameter Analysis
Y1LABEL EXPECTED LOSS
X1LABEL Batch
LEGEND 1 Process Capability
LEGEND 2 EXPECTED LOSS Plot
XTIC OFFSET 0.5 0.5
CHARACTER X BLANK
LINE BLANK SOLID
.
LET LSL = 0.98
LET USL = 1.02
LET USLCOST = 15
.
EXPECTED LOSS PLOT Diameter Batch

EXTREME PLOT

PURPOSE

Generates a subsample extreme versus subsample index plot.

DESCRIPTION

The subsample extreme is the data value with the largest absolute value in the subsample. The extreme plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' The plot consists of:

Vertical axis = subsample extreme;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample extreme value. As usual, the appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
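
A minimal Python sketch of the subsample extreme is given below. It is an illustration rather than DATAPLOT's code. The function names and data are hypothetical, and returning the signed value (instead of its absolute value) is an assumption.

```python
def extreme(values):
    """Return the value with the largest absolute value (first one wins on ties)."""
    return max(values, key=abs)

def extreme_by_subsample(y, x):
    """Extreme of y within each distinct subsample identifier in x."""
    groups = {}
    for yi, xi in zip(y, x):
        groups.setdefault(xi, []).append(yi)
    return {g: extreme(vals) for g, vals in sorted(groups.items())}

# Hypothetical response values in three subsamples.
y = [-3.0, 1.0, 2.0, -0.5, 4.0, -5.0]
x = [1, 1, 2, 2, 3, 3]
per_group = extreme_by_subsample(y, x)
full_sample = extreme(y)
```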

SYNTAX

EXTREME PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

EXTREME PLOT Y X
EXTREME PLOT Y X1 SUBSET X1 > 3

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MINIMUM PLOT = Generates a minimum plot.
MAXIMUM PLOT = Generates a maximum plot.
RANGE PLOT = Generates a range plot.
DECILE PLOT = Generates a decile plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
MEAN PLOT = Generates a mean plot.
BOX PLOT = Generates a box plot.
RANGE CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
LET A = MEAN WV
LET WV = WV - A
.
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0 15
Y1LABEL ABSOLUTE VALUE FROM MEAN
X1LABEL GROUP ID
TITLE AUTOMATIC
EXTREME PLOT WV MONTH

FRACTAL PLOT

PURPOSE

Generates a fractal plot.

DESCRIPTION

DATAPLOT generates Iterated Function Systems fractals as defined by Michael Barnsley. Barnsley defines an affine transformation as follows:

x' = a*x + b*y + e
y' = c*x + d*y + f

Fractal plots are generated by applying one or more affine transformations in an iterative fashion to an initial starting point (DATAPLOT uses (0,0) as the starting point). The points a, b, c, and d define rotation and scaling operations to be applied to the point. The e and f points define a translation to be applied to the point. An additional value is the probability weighting. These weights are applied to a uniform random number generator to determine which of the affine transformations (if there is more than one) to apply at a given step. The a, b, c, and d points are commonly expressed as follows:

a = r1*cos(alpha1)
b = -r2*sin(alpha2)
c = r1*sin(alpha1)
d = r2*cos(alpha2)

This form makes the nature of the scaling and rotation more explicit. DATAPLOT can generate fractals expressed in either of these formats. In addition, DATAPLOT supports an alternate form for specifying the rotation and scaling (this algorithm is due to William Withers of the US Naval Academy). It performs the following rotation and scaling to obtain the a, b, c, and d points:

This form specifies an initial rotation, a scaling, then a final rotation.

DATAPLOT currently supports each of these methods for specifying fractals. Note that both forms given in angles and scaling factors transform easily to the Barnsley form (i.e., a, b, c, d). If you have the a, b, c, and d points, you can get r1, r2, alpha1, and alpha2 (i.e., the alternate form for Barnsley's definition) as follows:

For each of the 3 formats, one row of the input variables defines a single affine transformation. A fractal can be generated from one or more affine transformations. The columns specify one of the elements (e.g., a, b, etc.).

If you do not wish to specify a probability factor, simply specify all probability weights to be 1 or leave it off. The translation variables are specified the same way for all 3 forms.
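
The iteration described above (often called the random iteration algorithm or chaos game) can be sketched in Python. This is a minimal illustration, not DATAPLOT's implementation. The function names, the seed handling, and the Sierpinski-triangle coefficients are hypothetical. The angle-based conversion uses the standard form from Barnsley's book and is assumed, not confirmed, to be the one DATAPLOT uses. Withers' format is not covered.

```python
import math
import random

def barnsley_from_angles(r1, r2, alpha1, alpha2):
    """Convert scalings r1, r2 and rotation angles (radians) to (a, b, c, d).

    Assumed form: a = r1*cos(alpha1), b = -r2*sin(alpha2),
                  c = r1*sin(alpha1), d = r2*cos(alpha2).
    """
    return (r1 * math.cos(alpha1), -r2 * math.sin(alpha2),
            r1 * math.sin(alpha1), r2 * math.cos(alpha2))

def ifs_points(transforms, weights=None, n=1000, seed=0):
    """Generate n points of an iterated function system.

    transforms : list of (a, b, c, d, e, f) rows, one affine map per row
    weights    : probability weightings (all equal when omitted)
    """
    rng = random.Random(seed)
    if weights is None:
        weights = [1.0] * len(transforms)
    x, y = 0.0, 0.0  # starting point (0,0), as in the description
    points = []
    for _ in range(n):
        # Pick one affine map at random according to the weights.
        a, b, c, d, e, f = rng.choices(transforms, weights=weights, k=1)[0]
        # Apply x' = a*x + b*y + e, y' = c*x + d*y + f.
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points

# Hypothetical example: three half-scale maps that generate a Sierpinski triangle.
sierpinski = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]
points = ifs_points(sierpinski, n=2000)
```

Each row of the transforms list plays the role of one row of the input variables, and the weights correspond to the optional probability weighting variable.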

SYNTAX 1 (Withers' format)

FRACTAL PLOT <y1> <y2> <y3> <y4> <y5> <y6> <y7> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the variable containing the initial rotations (i.e., a);
<y2> is the variable containing the X scalings (i.e., p);
<y3> is the variable containing the Y scalings (i.e., q);
<y4> is the variable containing the final rotations (i.e., b);
<y5> is the variable containing the X translations (i.e., e);
<y6> is the variable containing the Y translations (i.e., f);
<y7> is a variable containing the probability weightings;
and where the <SUBSET/EXCEPT/FOR qualification> is optional and rarely used in this context.

SYNTAX 2 (Barnsley's format)

FRACTAL PLOT <a> <b> <c> <d> <e> <f> <y7> <SUBSET/EXCEPT/FOR qualification>
where <a> is the variable containing the a values;
<b> is the variable containing the b values;
<c> is the variable containing the c values;
<d> is the variable containing the d values;
<e> is the variable containing the e (i.e., X translation) values;
<f> is the variable containing the f (i.e., Y translation) values;
<y7> is a variable containing the probability weightings;
and where the <SUBSET/EXCEPT/FOR qualification> is optional and rarely used in this context.

SYNTAX 3 (Barnsley's rotation matrix format)

FRACTAL PLOT <y1> <y2> <y3> <y4> <y5> <y6> <y7> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the variable containing the a1 values;
<y2> is the variable containing the r1 values;
<y3> is the variable containing the r2 values;
<y4> is the variable containing the a2 values;
<y5> is the variable containing the X translations (i.e., e);
<y6> is the variable containing the Y translations (i.e., f);
<y7> is a variable containing the probability weightings;
and where the <SUBSET/EXCEPT/FOR qualification> is optional and rarely used in this context.

EXAMPLES

FRACTAL PLOT Y1 Y2 Y3 Y4 Y5 Y6
FRACTAL PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7

NOTE 1

The FRACTAL TYPE command is used to specify which of the three formats is used for the fractal data. The default is Barnsley's format (i.e., SYNTAX 2 above).

NOTE 2

The following sample data files in the DATAPLOT reference directory contain examples of fractal data sets. Just replace the READ section in the program example below. Two of these are shown in the sample programs. These data files are in Withers' format.

FRACBRAN.DAT - generate a branch
FRACCHRI.DAT - generate a Christmas tree
FRACCLOU.DAT - generate a cloud
FRACFERN.DAT - generate a fern
FRACFRON.DAT - generate a frond
FRACGALA.DAT - generate a galaxy
FRACPENT.DAT - generate a pentagon
FRACSPIR.DAT - generate a spiral
FRACSQUA.DAT - generate a square
FRACTRIA.DAT - generate a triangle

NOTE 3

The appearance of the plot is controlled by the LINE and CHARACTER settings. Typically, you want to set the LINE to blank and the CHARACTER to a ``.'' or some other character. It is also recommended that you set the character small. This is demonstrated in the example programs below.

NOTE 4

DATAPLOT continues to generate the fractal plot until the maximum number of points for a plot has been reached. This is 20,000 or 40,000 on most current implementations. The FRACTAL ITERATIONS command can be used to set this to a smaller number (it is currently not possible to set it to a larger number).

NOTE 5

The 2 angle variables can be given in either radians or degrees. If they are given in degrees, be sure to enter an ANGLE UNITS DEGREES command before the FRACTAL PLOT command.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTER = Sets the type for plot characters.
CHARACTER FONT = Sets the font for plot characters.
ANGLE UNITS = Specifies whether angles are given in degrees or radians.
PLOT = Generates a data or function plot.
MULTIPLOT = Allows multiple plots per page.
FRACTAL (LET) = Generate fractal data (of the kind used to create Koch snowflakes).

REFERENCE

DATAPLOT uses an algorithm provided by Douglass Withers of the U.S. Naval Academy.

``Fractals Everywhere,'' Michael Barnsley, Academic Press, 1988.

``Chaos, Fractals, and Dynamics: Computer Experiments in Mathematics,'' Robert Devaney, Addison-Wesley, 1990.

``Chaos and Fractals: New Frontiers of Science,'' Peitgen, Jürgens, and Saupe, Springer-Verlag, 1993.

APPLICATIONS

Fractals

IMPLEMENTATION DATE

88/12

PROGRAM 1

. Generate a fractal fern
READ Y1 TO Y7
180.000 0.160 0.001 180.000 0.000 0.000 1
0.000 0.850 0.850 -2.500 1.600 0.000 15
180.000 0.340 0.300 229.000 1.600 0.000 2
109.709 -0.288 0.379 235.233 0.440 0.000 2
END OF DATA
FRAME OFF
FRAME COORDINATES 5 5 95 95
ANGLE UNITS DEGREES
CHARACTER FONT SIMPLEX
CHARACTER .
LINE BLANK
FRACTAL TYPE WITHERS
FRACTAL PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7

PROGRAM 2

READ Y1 TO Y7
0.000 0.400 -0.400 90.000 0.000 0.000 1
-35.578 0.111 0.759 -28.559 0.070 0.070 1
35.578 0.759 -0.111 241.441 -0.070 -0.070 1
-17.912 0.641 1.956 -17.951 0.000 0.000 1
END OF DATA
FRAME OFF
FRAME COORDINATES 5 5 95 95
ANGLE UNITS DEGREES
CHARACTER JUSTIFICATION LEBO
CHARACTER .
LINE BLANK
FRACTAL TYPE WITHERS
FRACTAL PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7

... FREQUENCY PLOT

PURPOSE

Generates a frequency plot.

DESCRIPTION

A frequency plot is a graphical data analysis technique for summarizing the distributional information of a variable. The response variable is divided into equal sized intervals (or bins). The number of occurrences of the response variable is calculated for each bin. The frequency plot then consists of:

Vertical axis = frequencies or relative frequencies;
Horizontal axis = response variable (i.e., the mid-point of each interval).
There are 4 types of frequency plots:

1. frequency plot (absolute counts);
2. relative frequency plot (convert counts to proportions);
3. cumulative frequency plot;
4. cumulative relative frequency plot.
The frequency plot and the histogram have the same information except the frequency plot has lines connecting the frequency values whereas the histogram has bars at the frequency values.
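
The four kinds of frequencies can be illustrated with a short Python sketch. It is not DATAPLOT's binning code. The function name frequency_table and the sample data are hypothetical, and starting the first class at the data minimum is an assumption. The default class width of 0.3 times the standard deviation is taken from NOTE 2 below.

```python
from statistics import stdev

def frequency_table(data, class_width=None):
    """Bin raw data into equal-width classes and return the four frequency types."""
    if class_width is None:
        class_width = 0.3 * stdev(data)  # default class width per NOTE 2
    lower = min(data)  # assumption: the first class starts at the minimum
    nbins = int((max(data) - lower) / class_width) + 1
    counts = [0] * nbins
    for v in data:
        counts[int((v - lower) / class_width)] += 1
    n = len(data)
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running)
    return {
        "midpoint": [lower + (i + 0.5) * class_width for i in range(nbins)],
        "frequency": counts,
        "relative": [c / n for c in counts],
        "cumulative": cumulative,
        "cumulative_relative": [c / n for c in cumulative],
    }

# Hypothetical raw data with an explicit class width of 1.
table = frequency_table([1.0, 1.2, 1.9, 2.1, 2.5, 3.0], class_width=1.0)
```

The midpoint list gives the horizontal-axis values and each of the other lists gives the vertical-axis values for one of the four plot types.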

SYNTAX 1

FREQUENCY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
RELATIVE FREQUENCY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE FREQUENCY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE FREQUENCY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of raw data values which will appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data only.

SYNTAX 2

FREQUENCY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
RELATIVE FREQUENCY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE FREQUENCY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE FREQUENCY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies to appear on the vertical axis;
<x> is the variable of distinct values to appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

This syntax is used when you have pre-computed frequencies at each horizontal axis value.

EXAMPLES

FREQUENCY PLOT TEMP
RELATIVE FREQUENCY PLOT TEMP
CUMULATIVE FREQUENCY PLOT TEMP
CUMULATIVE RELATIVE FREQUENCY PLOT TEMP
FREQUENCY PLOT COUNTS STATE
RELATIVE FREQUENCY PLOT COUNTS STATE
CUMULATIVE FREQUENCY PLOT COUNTS STATE
CUMULATIVE RELATIVE FREQUENCY PLOT COUNTS STATE

NOTE 1

Although DATAPLOT does not have a FREQUENCY TABLE command, one can be generated with the following commands:

FREQUENCY PLOT Y
LET YFREQ = YPLOT
LET XVAL = XPLOT
Then the variables YFREQ and XVAL essentially contain a frequency table. There is a LET subcommand called FREQUENCY. However, it does not generate a frequency table in the sense that a frequency plot does.

NOTE 2

By default, DATAPLOT uses a class width of 0.3 times the standard deviation of the variable. Use the CLASS WIDTH command to override this default. DATAPLOT also tends to generate a large number of zero frequency classes at the lower and upper tails. This tends to compress the frequency plot on the horizontal axis. Use the XLIMITS command or the CLASS LOWER and CLASS UPPER commands to avoid plotting these zero frequency classes.

NOTE 3

If you want to overlay several frequency plots, specify the axis limits via the XLIMITS and YLIMITS commands before the first FREQUENCY PLOT command. Enter a PRE-ERASE OFF command after the first FREQUENCY PLOT command.

DEFAULT

None

SYNONYMS

A synonym for CUMULATIVE RELATIVE FREQUENCY PLOT is RELATIVE CUMULATIVE FREQUENCY PLOT.

RELATED COMMANDS

HISTOGRAM = Generates a histogram.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PROBABILITY PLOT = Generates a probability plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
CLASS LOWER = Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER = Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH = Sets the class width for histograms, frequency plots, and pie charts.
MINIMUM = Sets the frame minima for all plots.
MAXIMUM = Sets the frame maxima for all plots.
LIMITS = Sets the frame limits for all plots.
PLOT = Generates a data or function plot.

REFERENCE

Most introductory statistics books discuss frequency polygons and histograms.

``Multivariate Density Estimation,'' David Scott, John Wiley, 1992 (chapter 4). This book discusses frequency polygons as ``density estimators'' and gives optimal criteria for selecting the class width.

APPLICATIONS

Exploratory Data Analysis, Distributional Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SET READ FORMAT F10.1
SKIP 25
READ SUNSPOT.DAT Y
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC
XLIMITS 0 200
XTIC OFFSET 10 40
MAJOR XTIC MARK NUMBER 6
MINOR XTIC MARK NUMBER 3
FREQUENCY PLOT Y
RELATIVE FREQUENCY PLOT Y
CUMULATIVE FREQUENCY PLOT Y
CUMULATIVE RELATIVE FREQUENCY PLOT Y
END OF MULTIPLOT

... HINGE PLOT

PURPOSE

Generates a subsample lower or upper hinge versus subsample index plot.

DESCRIPTION

The lower hinge is a pseudo lower quartile and the upper hinge is a pseudo upper quartile. Specifically, a hinge is the median of the points between one of the extreme points (i.e., the minimum point or the maximum point) and the median. The subsample hinge is the hinge of the data in the subsample. The hinge plot is used to answer the question: ``Does the subsample spread change over different subsamples?'' It consists of:

Vertical axis = subsample hinge;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample hinge value. As usual, the appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
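
The hinge definition above can be sketched in Python. This is an illustration, not DATAPLOT's code. The function name hinges is hypothetical, and including the median itself in both halves when the sample size is odd (Tukey's usual convention) is an assumption.

```python
from statistics import median

def hinges(values):
    """Return (lower hinge, upper hinge) of a list of values."""
    s = sorted(values)
    n = len(s)
    # Assumption: for odd n the middle value belongs to both halves.
    half = (n + 1) // 2
    lower = median(s[:half])        # median of the points from the minimum to the median
    upper = median(s[n - half:])    # median of the points from the median to the maximum
    return lower, upper

# Hypothetical samples with odd and even sizes.
odd_lower, odd_upper = hinges([9, 1, 8, 2, 7, 3, 6, 4, 5])
even_lower, even_upper = hinges([1, 2, 3, 4, 5, 6, 7, 8])
```

Applying this function to the data of each subsample gives the points of the first trace. Applying it to the full response gives the horizontal reference line.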

SYNTAX

<LOWER/UPPER> HINGE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
LOWER specifies a lower hinge plot and UPPER specifies an upper hinge plot;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

LOWER HINGE PLOT Y X
UPPER HINGE PLOT Y X1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates a mean control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

. PURPOSE--GENERATE A HINGE PLOT OF POINT BARROW FREON-11 DATA
SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
TITLE HINGE PLOT
XLIMITS 0 15
YLIMITS 1 1.01
CHARACTER U U
LINE BLANK SOLID
UPPER HINGE PLOT WV MONTH
PRE-ERASE OFF
CHARACTER L L
LOWER HINGE PLOT WV MONTH

... HISTOGRAM

PURPOSE

Generates a histogram.

DESCRIPTION

A histogram is a graphical data analysis technique for summarizing the distributional information of a variable. The response variable is divided into equal sized intervals (or bins). The number of occurrences of the response variable is calculated for each bin. The histogram consists of:
Vertical axis = frequencies or relative frequencies;
Horizontal axis = response variable (i.e., the mid-point of each interval).
There are 4 types of histograms:

1. histogram (absolute counts);
2. relative histogram (converts counts to proportions);
3. cumulative histogram;
4. cumulative relative histogram.
The histogram and the frequency plot have the same information except the histogram has bars at the frequency values, whereas the frequency plot has lines connecting the frequency values.

SYNTAX 1

HISTOGRAM <y> <SUBSET/EXCEPT/FOR qualification>
RELATIVE HISTOGRAM <y> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE HISTOGRAM <y> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE HISTOGRAM <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of raw data values which will appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data only.

SYNTAX 2

HISTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
RELATIVE HISTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE HISTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE HISTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies to appear on the vertical axis;
<x> is the variable of raw data values which will appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have pre-computed frequencies at each horizontal axis value.

EXAMPLES

HISTOGRAM TEMP
RELATIVE HISTOGRAM TEMP
CUMULATIVE HISTOGRAM TEMP
CUMULATIVE RELATIVE HISTOGRAM TEMP
HISTOGRAM COUNTS STATE
RELATIVE HISTOGRAM COUNTS STATE
CUMULATIVE HISTOGRAM COUNTS STATE
CUMULATIVE RELATIVE HISTOGRAM COUNTS STATE

NOTE 1

The appearance of the bars on the histogram (i.e., whether they are filled or not, the line width of the bar border, etc.) is controlled by the various bar attribute commands. A few are listed in the RELATED COMMANDS section below. See the documentation for the BAR command for a complete list of the bar attribute commands. This is demonstrated with the sample program below.

NOTE 2

Although DATAPLOT does not have a FREQUENCY TABLE command, one can be generated with the following commands:
HISTOGRAM Y
LET YFREQ = YPLOT
LET XVAL = XPLOT
Then the variables YFREQ and XVAL essentially contain a frequency table. There is a LET subcommand called FREQUENCY. However, it does not generate a frequency table in the sense that a histogram or a frequency plot does.

NOTE 3

By default, DATAPLOT uses a class width of 0.3 times the standard deviation of the variable. Use the CLASS WIDTH command to override this default. DATAPLOT also tends to generate a large number of zero frequency classes at the lower and upper tails. This tends to compress the histogram on the horizontal axis. Use the XLIMITS command or the CLASS LOWER and CLASS UPPER commands to avoid plotting these zero frequency classes.
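
As a rough illustration of how the default class width leads to a frequency table, the following Python sketch bins a variable using a width of 0.3 times the standard deviation. It is not DATAPLOT's implementation. It assumes the sample (n-1) standard deviation and, for simplicity, starts the first class at the data minimum; DATAPLOT's actual class limits may differ. The names default_class_width and frequency_table are hypothetical.

```python
import math
from statistics import stdev


def default_class_width(y):
    """Class width of 0.3 times the standard deviation (n-1 divisor assumed)."""
    return 0.3 * stdev(y)


def frequency_table(y, width=None, lower=None):
    """Return rows of (class mid-point, count, relative, cumulative, cumulative relative).

    Assumption: classes start at `lower` (the data minimum if not given).
    """
    if width is None:
        width = default_class_width(y)
    if lower is None:
        lower = min(y)
    counts = {}
    for v in y:
        k = math.floor((v - lower) / width)  # index of the class holding v
        counts[k] = counts.get(k, 0) + 1
    n = len(y)
    table = []
    cumulative = 0
    for k in range(min(counts), max(counts) + 1):
        c = counts.get(k, 0)  # empty classes are kept with a zero count
        cumulative += c
        mid = lower + (k + 0.5) * width
        table.append((mid, c, c / n, cumulative, cumulative / n))
    return table
```

The four count columns correspond to the four histogram types (absolute, relative, cumulative, and cumulative relative).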

DEFAULT

None

SYNONYMS

A synonym for CUMULATIVE RELATIVE HISTOGRAM is RELATIVE CUMULATIVE HISTOGRAM.

RELATED COMMANDS

FREQUENCY PLOT = Generates a frequency plot.
HISTOGRAM = Generates a histogram.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PROBABILITY PLOT = Generates a probability plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
CLASS LOWER = Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER = Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH = Sets the class width for histograms, frequency plots, and pie charts.
MINIMUM = Sets the frame minima for all plots.
MAXIMUM = Sets the frame maxima for all plots.
LIMITS = Sets the frame limits for all plots.
PLOT = Generates a data or function plot.
BARS = Sets the on/off switches for plot bars.
BAR WIDTH = Sets the widths for plot bars.
BAR FILL = Sets the on/off switches for plot bar fills.
BAR PATTERN = Sets the types for bar fill patterns.
BAR BORDER LINE = Sets the types for bar border lines.

REFERENCE

Most introductory statistics books discuss frequency polygons and histograms.
``Multivariate Density Estimation,'' David Scott, John Wiley, 1992 (chapter 3). This book discusses histograms as ``density estimators'' and gives optimal criteria for selecting the class width.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SET READ FORMAT F10.1
SKIP 25
READ SUNSPOT.DAT Y
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC
XLIMITS 0 200
XTIC OFFSET 10 40
MAJOR XTIC MARK NUMBER 6
MINOR XTIC MARK NUMBER 3
HISTOGRAM Y
BAR FILL ON
RELATIVE HISTOGRAM Y
BAR FILL OFF
BAR BORDER THICKNESS 0.3
CUMULATIVE HISTOGRAM Y
BAR FILL ON
BAR PATTERN D1
BAR PATTERN SPACING 3
CUMULATIVE RELATIVE HISTOGRAM Y
END OF MULTIPLOT

HOMOSCEDASTICITY PLOT

PURPOSE

Generates a homoscedasticity plot.

DESCRIPTION

A homoscedasticity plot is a graphical data analysis technique for assessing the assumption of constant variance across subsets of the data. The first variable is a response variable and the second variable identifies subsets of the data. The mean and standard deviation are calculated for each of these subsets. The following plot is generated:

Vertical axis = subset standard deviations;
Horizontal axis = subset means.
The interpretation of this plot is that the greater the spread on the vertical axis, the less valid is the assumption of constant variance. A common pattern is for the spread (i.e., the standard deviation) to increase as the location (i.e., the mean) increases. This indicates the need for some type of transformation such as a log or square root.
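
The plotted points can be sketched in Python as follows. This is an illustration of the computation described above, not DATAPLOT's code. It assumes the sample (n-1) standard deviation and at least 2 observations per subset; the name homoscedasticity_points is hypothetical.

```python
from statistics import mean, stdev


def homoscedasticity_points(y, tag):
    """Return one (subset mean, subset standard deviation) pair per subset.

    Assumption: the n-1 divisor is used, so each subset needs >= 2 values.
    """
    groups = {}
    for yi, ti in zip(y, tag):
        groups.setdefault(ti, []).append(yi)
    # Horizontal coordinate = mean, vertical coordinate = standard deviation.
    return [(mean(v), stdev(v)) for _, v in sorted(groups.items())]
```

In the output for a response whose spread grows with its level, the second coordinate increases along with the first, which is the pattern that suggests a log or square root transformation.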

SYNTAX

HOMOSCEDASTICITY PLOT <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<tag> identifies the subsets;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

HOMOSCEDASTICITY PLOT Y1 TAG
HOMOSCEDASTICITY PLOT Y1 TAG SUBSET TAG > 2

NOTE 1

One limitation of the homoscedasticity plot is that it does not give a convenient way to label the groups on the plot. This can be done by using the SUBSET command as in this example (assume Y is the response variable, X the group-id variable):

X1LABEL MEANS
Y1LABEL STANDARD DEVIATIONS
CHARACTER X; LINE BLANK
XLIMITS 0 5; YLIMITS 0 4
CHARACTER NORM; TITLE HOMOSCEDASTICITY PLOT
HOMOSCEDASTICITY PLOT Y X SUBSET X = 1
PRE-ERASE OFF
CHARACTER T
HOMOSCEDASTICITY PLOT Y X SUBSET X = 2
CHARACTER CHIS
HOMOSCEDASTICITY PLOT Y X SUBSET X = 3
CHARACTER UNIF
HOMOSCEDASTICITY PLOT Y X SUBSET X = 4
CHARACTER F
HOMOSCEDASTICITY PLOT Y X SUBSET X = 5

NOTE 2

Bartlett's test is an analytic test for the assumption of constant variance. See the documentation for the BARTLETT TEST command in the Analysis Commands chapter for more details.

NOTE 3

The spread-location plot (or s-l plot) recommended by Bill Cleveland (see REFERENCE section) is an alternative to the HOMOSCEDASTICITY PLOT. In this plot, a median is computed for each group and residuals are formed by taking the absolute value of the response variable minus the corresponding group median. The square roots of these absolute values are plotted against the medians (this is a similar concept to plotting the standard deviations against the means). A line connects the medians of the residuals for each group. Variations of this plot can be obtained by using different fits (e.g., trimmed means instead of medians) and residuals. Program example 2 demonstrates a macro for generating s-l plots in DATAPLOT.
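
For readers who want to check the s-l computation outside DATAPLOT, a minimal Python sketch is given below. It follows the description above and, for the connecting line, uses the square root of the median absolute residual in each group (the same choice made in the macro of Program example 2). The name spread_location is hypothetical.

```python
from math import sqrt
from statistics import median


def spread_location(y, tag):
    """Return (points, line) for a spread-location plot.

    points: (group median, sqrt(|y - group median|)) for every observation.
    line:   (group median, sqrt(median |y - group median|)) for every group.
    """
    groups = {}
    for yi, ti in zip(y, tag):
        groups.setdefault(ti, []).append(yi)
    points, line = [], []
    for _, vals in sorted(groups.items()):
        med = median(vals)
        abs_res = [abs(v - med) for v in vals]
        points.extend((med, sqrt(r)) for r in abs_res)
        line.append((med, sqrt(median(abs_res))))
    return points, line
```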

DEFAULT

None

SYNONYMS

HOMOGENITY PLOT

RELATED COMMANDS

LINES = Sets the type for plot lines.
HISTOGRAM = Generates a histogram.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.
BARTLETT TEST = Performs a Bartlett test.

REFERENCE

``Visualizing Data,'' William Cleveland, Hobart Press, 1993.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
X1LABEL MEANS; Y1LABEL STANDARD DEVIATIONS
CHARACTER X
LINE BLANK
HOMOSCEDASTICITY PLOT WV MONTH

PROGRAM 2

. PURPOSE--GENERATE A S-L PLOT OF POINT BARROW FREON-11 DATA
DIMENSION 20 VARIABLES
SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET TAG=INT(DAY/30.25)+1
LET Y = WV
.
LET N = SIZE Y
LET MED = 0 FOR I = 1 1 N
LET RES = 0 FOR I = 1 1 N
LET TEMP = DISTINCT TAG
LET NGROUP = SIZE TEMP
LOOP FOR K = 1 1 NGROUP
LET TAGID = TEMP(K)
LET ATEMP = MEDIAN Y SUBSET TAG = TAGID
LET MED = ATEMP SUBSET TAG = TAGID
LET RES = ABS(Y-MED) SUBSET TAG = TAGID
LET GROUPMD(K) = ATEMP
LET ATEMP = MEDIAN RES SUBSET TAG = TAGID
LET MAD(K) = SQRT(ATEMP)
END OF LOOP
LET RES = SQRT(RES)
TITLE SPREAD-LOCATION PLOT
Y1LABEL SQUARE ROOT ABSOLUTE RESIDUAL WV; X1LABEL MEDIAN WV
CHARACTER CIRCLE BLANK; CHARACTER SIZE 1.2; LINE BLANK SOLID
PLOT RES MED AND
PLOT MAD GROUPMD

I PLOT

PURPOSE

Generates an I plot.

DESCRIPTION

An I plot is used in 2 ways:

1. As a plot of the minimum, median, and maximum of the response at each level of a factor. In this case, the plot consists of:
Vertical axis = response variable;
Horizontal axis = level identification.
The bottom of the bar is the data minimum; the middle x in the bar is the data median; the top of the bar is the data maximum.
2. As an uncertainty chart in which the analyst wishes to plot estimated values of a certain quantity and to also illustrate the uncertainty bars associated with each estimate. In this case, a given value of the horizontal axis variable is typically accompanied by exactly 3 values for the vertical axis variable:
1) the estimate - the uncertainty;
2) the estimate;
3) the estimate + the uncertainty.

For both applications, the resulting plot is similar in appearance. Namely, it shows a target value, which appears as an X, and a vertical bar with small horizontal bars at the extremes (hence the name ``I plot'').

The I plot has 3 components (characters and lines) which can be individually controlled. For the I plot to appear as it should, the I PLOT command is usually preceded by 2 commands:

CHARACTERS I PLOT

LINES I PLOT

These commands automatically define proper values for the 3 components of the I plot. After the I plot is formed, the analyst should redefine plot characters and lines via the usual CHARACTERS and LINES commands.

SYNTAX

I PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the horizontal axis (= independent) variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

I PLOT Y X
I PLOT Y X SUBSET X > 2

NOTE

The second use of the I plot (to draw error bars) has been supplanted by an explicit ERROR BAR PLOT command.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
BOX PLOT = Generates a box plot.
ANOVA = Carries out an ANOVA.
MEDIAN POLISH = Carries out a median polish.
CONTROL CHART = Generates a control chart.
ERROR BAR PLOT = Generates a plot with error bars.
PLOT = Generates a data or function plot.

APPLICATIONS

Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

LET X = SEQUENCE 1 100 1 10
TITLE AUTOMATIC
LET Z = NORMAL RANDOM NUMBERS FOR I = 1 1 1000
CHARACTERS I PLOT
LINES I PLOT
CHARACTER SIZE 3 ALL
CHARACTER FONT SIMPLEX ALL
XTIC OFFSET 0.5 0.5
I PLOT Z X

JACKNIFE ... PLOT

PURPOSE

Generates a jackknife plot for a given statistic.

DESCRIPTION

The jackknife is a non-parametric method for estimating the sampling distribution of a statistic. Given a sample data set and a desired statistic (e.g., the mean), the jackknife works by computing the desired statistic with one element deleted. This is done for each element of the data set. The collection of these statistics is used as an estimate of the sampling distribution. For the jackknife plot, the vertical axis contains the computed value of the statistic and the horizontal axis contains the sample number (for k = 1, 2, ..., N). The number of response variables depends on the number of variables required to compute the statistic (e.g., the MEAN uses one while the LINEAR INTERCEPT uses two). The jackknife plot is typically followed by some type of distributional plot such as a histogram.
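
The delete-one procedure can be sketched in a few lines of Python. This is an illustration of the idea for a one-variable statistic, not DATAPLOT's implementation; the function name jackknife is hypothetical and the mean is used only as a default example.

```python
from statistics import mean


def jackknife(values, stat=mean):
    """Return the statistic recomputed N times, each time with one element deleted.

    The k-th entry (k = 0, 1, ..., N-1) omits the k-th observation.
    """
    data = list(values)
    return [stat(data[:k] + data[k + 1:]) for k in range(len(data))]
```

Plotting the returned list against its index gives the same kind of trace as the jackknife plot, and a histogram of the list gives the follow-up distributional plot.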

SYNTAX 1

JACKNIFE <stat> PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
<stat> is one of the following statistics:
MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINDSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER or COUNT), MINIMUM, MAXIMUM,
STANDARD DEVIATION, VARIANCE, STANDARD DEVIATION OF MEAN, VARIANCE OF MEAN,
AVERAGE ABSOLUTE DEVIATION (AAD), MEDIAN ABSOLUTE DEVIATION (MAD),
RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE (or COEFFICIENT OF VARIATION),
RANGE, MIDRANGE, LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH> DECILE,
SKEWNESS, KURTOSIS,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL), TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics requiring one response variable to compute.

SYNTAX 2

JACKNIFE <stat> PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics requiring two response variables to compute.

EXAMPLES

JACKNIFE MEAN PLOT Y
JACKNIFE LINEAR SLOPE PLOT Y1 X1

NOTE

The bootstrap is similar to the jackknife. However, in the bootstrap the sampling is done with replacement.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
HISTOGRAM = Generates a histogram.
BOOTSTRAP SAMPLE = Sets the sample size for the jackknife.
PLOT = Generates a data or function plot.
BOOTSTRAP PLOT = Generates a bootstrap plot.

REFERENCE

``A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation,'' Efron and Gong, The American Statistician, February, 1983.

APPLICATIONS

Sampling Distribution of a Statistic

IMPLEMENTATION DATE

89/2

PROGRAM

TITLE AUTOMATIC
MULTIPLOT 2 1
MULTIPLOT CORNER COORDINATES 0 0 100 100
LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 1000
JACKNIFE MEAN PLOT Y1
LET YPLOT2 = YPLOT
HISTOGRAM YPLOT2
END OF MULTIPLOT

KURTOSIS PLOT

PURPOSE

Generates a subsample kurtosis versus subsample index plot.

DESCRIPTION

The subsample kurtosis is the kurtosis of the data in the subsample. The kurtosis plot is used to answer the question: ``Does the subsample distribution change over different subsamples?'' It consists of:

Vertical axis = subsample kurtosis;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample kurtosis. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

KURTOSIS PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

KURTOSIS PLOT Y X
KURTOSIS PLOT Y X SUBSET X = 2 TO 10

NOTE

The kurtosis is the standardized fourth central moment. It is a measure of the ``peakedness'' of a distribution.
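
As a concrete reading of ``standardized fourth central moment,'' the following Python sketch divides the fourth central moment by the square of the second central moment. It assumes the divisor N for both moments and does not subtract 3; DATAPLOT's exact convention may differ. The function name kurtosis is hypothetical.

```python
def kurtosis(values):
    """Fourth central moment divided by the squared second central moment.

    Assumptions: divisor N for both moments, no "excess" correction (-3).
    """
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2
```

Under this convention a normal distribution has a kurtosis of 3, so subsample values well away from 3 indicate a non-normal shape.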

DEFAULT

None

SYNONYMS

K PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
SKEWNESS PLOT = Generates a skewness plot.
VARIANCE PLOT = Generates a variance plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
XLIMITS 0 15
CHARACTER X BLANK
LINE BLANK SOLID
Y1LABEL KURTOSIS
X1LABEL GROUPS
TITLE AUTOMATIC
KURTOSIS PLOT WV MONTH

LAG PLOT

PURPOSE

Generates a lag plot.

DESCRIPTION

For a single time series, a lag plot is a graphical data analysis technique for determining if an autocorrelation structure exists within the time series. For two time series, a lag plot is a graphical technique for determining if cross-correlation structure exists between the two time series. Ideally (for a white noise time series or for 2 uncorrelated time series), the lag plot should have the appearance of a random shotgun pattern. Any kind of a structured pattern in a lag plot indicates an underlying auto/cross-correlation model, the nature of which may be inferred from the type of lag plot structure.

In time series analysis, a lag is a fixed time displacement. For example, y(2) and y(7) would be said to have a lag of 5 (= 7-2). For a lag plot, the lag is fixed at some value specified by the analyst. The default value is a lag of 1.

For a lag plot on a single time series, the lag plot consists of:

Vertical axis = x(i)
Horizontal axis = x(i+lag)
For a lag plot for 2 time series, the lag plot consists of:

Vertical axis = y(i)
Horizontal axis = x(i+lag)
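
The pairing of points in a lag plot can be sketched in Python as follows. This is an illustration of the axis definitions above for a positive lag, not DATAPLOT's implementation; the function name lag_pairs is hypothetical.

```python
def lag_pairs(y, x=None, lag=1):
    """Return (horizontal, vertical) pairs (x[i+lag], y[i]) for a lag plot.

    With x omitted, the single-series form (x is the same series as y) is used.
    Only positive lags from 1 to N-1 are handled in this sketch.
    """
    if x is None:
        x = y
    n = min(len(x), len(y))
    if not 1 <= lag <= n - 1:
        raise ValueError("lag must be between 1 and N-1")
    return [(x[i + lag], y[i]) for i in range(n - lag)]
```

A series of N observations yields N-lag plotted points, so large lags leave few points on the plot.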

SYNTAX 1

LAG <n> PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <n> is an integer number or parameter between 1 and N-1 (where N is the number of observations) that specifies the lag;
<x> is the variable of raw data values which is being analyzed for autocorrelation structure;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for a single time series.

SYNTAX 2

LAG <n> PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <n> is an integer number or parameter between 1 and N-1 (where N is the number of observations) that specifies the lag;
<y1> is the first variable of raw data values which is being analyzed for cross-correlation structure;
<y2> is the second variable of raw data values which is being analyzed for cross-correlation structure;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for two time series.

EXAMPLES

LAG 3 PLOT X
LAG PLOT X
LAG -12 PLOT Y X
LAG PLOT Y X

DEFAULT

If <n> is omitted, the default lag is 1.

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
PLOT = Generates a data or function plot.
4-PLOT = Generates a 4-plot for univariate analysis.
CORRELATION PLOT = Generates an auto or cross-correlation plot.
SPECTRUM = Generates a spectral plot.
SUMMARY = Generates a table of summary statistics.
LET = Computes various statistics (and many other capabilities).
FIT = Carries out a least squares fit.

APPLICATIONS

Time Series Analysis, Regression

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

. THIS SAMPLE PROGRAM READS THE FILE LEW.DAT IN THE DATAPLOT
. REFERENCE DIRECTORY. THE DATA IS BEAM DEFLECTION DATA.
SKIP 25
READ LEW.DAT Y
.
LEGEND 1 AUTOCORRELATION ANALYSIS
LEGEND 2 LAG PLOT
TITLE AUTOMATIC
Y1LABEL X(LC()i)
CHAR X
LINES
YMAX 500
XMAX 500
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
LOOP FOR K = 1 1 4
X1LABEL X(LC()I+^K)
LAG ^K PLOT Y
END OF LOOP
END OF MULTIPLOT

PROGRAM 2

. THIS SAMPLE PROGRAM READS THE FILE HAYES1.DAT IN THE DATAPLOT
. REFERENCE DIRECTORY. THIS IS FIRE RESEARCH SMOKE OBSCURATION DATA.
.
SKIP 25
READ HAYES1.DAT JUNK Y1 Y2
.
TITLE AUTOMATIC
LEGEND 1 CROSS-CORRELATION ANALYSIS
LEGEND 2 LAG PLOT
Y1LABEL X(LC()i)
TIC OFFSET 0.2 0.2
.
CHAR X
LINES
YLIMITS 0 3
XLIMITS 0 3
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
LOOP FOR K = 1 1 4
X1LABEL Y(LC()I+^K)
LAG ^K PLOT Y1 Y2
END OF LOOP
END OF MULTIPLOT

LINEAR CORRELATION PLOT

PURPOSE

Generates a subsample (linear) correlation versus subsample index plot.

DESCRIPTION

The subsample correlation is the usual Pearson product-moment correlation coefficient between 2 user-specified variables for the data in the subsample. The linear correlation plot is used to answer the question: ``Does the correlation between 2 variables hold equally well from one subsample to the next? In other words, does the linear relatedness of the 2 variables change from one subsample to the next?'' It consists of:

Vertical axis = subsample correlation from a linear fit;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample linear correlation. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
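
The quantity on the vertical axis can be sketched in Python as follows. This is a plain implementation of the Pearson product-moment formula applied group by group, given for illustration only; the names pearson and correlation_by_subsample are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson product-moment correlation coefficient between a and b."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / sqrt(saa * sbb)


def correlation_by_subsample(y1, y2, x):
    """Correlation of y1 and y2 within each subsample identified by x."""
    groups = {}
    for p, q, g in zip(y1, y2, x):
        pair = groups.setdefault(g, ([], []))
        pair[0].append(p)
        pair[1].append(q)
    return {g: pearson(a, b) for g, (a, b) in sorted(groups.items())}
```

Each subsample needs at least 2 distinct values in both variables; otherwise the denominator is zero.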

SYNTAX

LINEAR CORRELATION PLOT <y1> <y2> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response (= dependent variable in the linear fit) variable;
<y2> is another response (= independent variable in the linear fit) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

LINEAR CORRELATION PLOT Y1 Y2 X
LINEAR CORRELATION PLOT CONC YEAR MONTH SUBSET MONTH = 1 TO 10

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINEAR SLOPE PLOT = Generates a linear slope plot.
LINEAR INTERCEPT PLOT = Generates a linear intercept plot.
LINEAR RESSD PLOT = Generates a linear residual standard deviation plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
FIT = Carries out a least squares fit.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

SKIP 25
READ BERGER1.DAT Y X BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL CORRELATION
X1LABEL SAMPLE ID
TITLE AUTOMATIC
LINEAR CORRELATION PLOT Y X BATCH

LINEAR INTERCEPT PLOT

PURPOSE

Generates a subsample linear intercept versus subsample index plot.

DESCRIPTION

The subsample intercept is the intercept resulting from a least squares linear fit (between 2 user-specified variables) of the data in the subsample. The linear intercept plot is used to answer the question: ``Does the y-intercept of a fitted line between 2 variables change from one subsample to the next?'' The plot consists of:

Vertical axis = subsample intercept from linear fit;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample linear intercept. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

LINEAR INTERCEPT PLOT <y1> <y2> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response (= dependent variable in the linear fit) variable;
<y2> is another response (= independent variable in the linear fit) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

LINEAR INTERCEPT PLOT PRES TEMP DAY
LINEAR INTERCEPT PLOT CONC YEAR MONTH SUBSET MONTH = 2 TO 11

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINEAR SLOPE PLOT = Generates a linear slope plot.
LINEAR CORRELATION PLOT = Generates a linear correlation plot.
LINEAR RESSD PLOT = Generates a linear residual standard deviation plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
FIT = Carries out a least squares fit.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

SKIP 25
READ BERGER1.DAT Y X BATCH
LINE BLANK DASH
CHARACTER X BLANK
Y1LABEL INTERCEPT
X1LABEL SAMPLE ID
TITLE AUTOMATIC
XTIC OFFSET 0.2 0.2
LINEAR INTERCEPT PLOT Y X BATCH

LINEAR RESSD PLOT

PURPOSE

Generates a subsample linear residual standard deviation versus subsample index plot.

DESCRIPTION

The subsample residual standard deviation is the residual standard deviation resulting from a least squares linear fit (between 2 user-specified variables) of the data in the subsample. The linear residual standard deviation plot is used to answer the question: ``Does the residual standard deviation of a fitted line between 2 variables change from one subsample to the next? In other words, does the quality and goodness of the linear fit change from one subsample to the next?'' The plot consists of:

Vertical axis = subsample residual standard deviation from a linear fit;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample linear residual standard deviation. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
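
The least squares quantities behind the LINEAR INTERCEPT, LINEAR SLOPE, and LINEAR RESSD plots can be sketched in Python as follows. This is an illustration rather than DATAPLOT's code. It assumes the residual standard deviation uses N-2 degrees of freedom, so each subsample needs at least 3 points; the function name linear_fit is hypothetical.

```python
from math import sqrt


def linear_fit(y, x):
    """Least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, ressd), where ressd is the residual standard
    deviation computed with N-2 degrees of freedom (an assumption).
    """
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ressd = sqrt(sse / (n - 2))
    return intercept, slope, ressd
```

Applying linear_fit to the data of each subsample in turn, and once to the full sample, gives the points and the reference line of the corresponding plot.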

SYNTAX

LINEAR RESSD PLOT <y1> <y2> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response (= dependent variable in the fit) variable;
<y2> is another response (= independent variable in the fit) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

LINEAR RESSD PLOT PRES TEMP DAY
LINEAR RESSD PLOT CONC YEAR MONTH SUBSET MONTH > 1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINEAR SLOPE PLOT = Generates a linear slope plot.
LINEAR CORRELATION PLOT = Generates a linear correlation plot.
LINEAR INTERCEPT PLOT = Generates a linear intercept plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
FIT = Carries out a least squares fit.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

SKIP 25
READ BERGER1.DAT Y X BATCH
LINE BLANK DASH
CHARACTER X BLANK
Y1LABEL RESSD
X1LABEL SAMPLE ID
TITLE AUTOMATIC
XTIC OFFSET 0.2 0.2
LINEAR RESSD PLOT Y X BATCH

LINEAR SLOPE PLOT

PURPOSE

Generates a subsample linear slope versus subsample index plot.

DESCRIPTION

The subsample slope is the slope resulting from a least squares linear fit (between 2 user-specified variables) of the data in the subsample. The linear slope plot is used to answer the question: ``Does the slope of a fitted line between 2 variables change from one subsample to the next?'' The plot consists of:

Vertical axis = subsample slope from linear fit;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample linear slope. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

LINEAR SLOPE PLOT <y1> <y2> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response (= dependent variable in the fit) variable;
<y2> is another response (= independent variable in the fit) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

LINEAR SLOPE PLOT PRES TEMP DAY
LINEAR SLOPE PLOT CONC YEAR MONTH SUBSET MONTH > 1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINEAR CORRELATION PLOT = Generates a linear correlation plot.
LINEAR INTERCEPT PLOT = Generates a linear intercept plot.
LINEAR RESSD PLOT = Generates a linear residual standard deviation plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
FIT = Carries out a least squares fit.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

SKIP 25
READ BERGER1.DAT Y X BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL SLOPE
X1LABEL SAMPLE ID
TITLE AUTOMATIC
LINEAR SLOPE PLOT Y X BATCH

MAXIMUM PLOT

PURPOSE

Generates a subsample maximum versus subsample index plot.

DESCRIPTION

The subsample maximum is the largest data value in the subsample. The maximum plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' The plot consists of:

Vertical axis = subsample maximum;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample maximum. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
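
The two traces of a subsample statistic plot of this kind can be sketched in Python as follows. The sketch is generic: the maximum is the default, but any one-variable statistic (e.g., a mean or median function) can be passed in. It is an illustration only, and the function name subsample_statistic is hypothetical.

```python
def subsample_statistic(y, x, stat=max):
    """Return (trace, reference) for a subsample statistic plot.

    trace:     list of (subsample index, statistic of that subsample).
    reference: the statistic of the full sample (the horizontal line).
    """
    groups = {}
    for yi, xi in zip(y, x):
        groups.setdefault(xi, []).append(yi)
    trace = [(g, stat(v)) for g, v in sorted(groups.items())]
    reference = stat(list(y))
    return trace, reference
```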

SYNTAX

MAXIMUM PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MAXIMUM PLOT Y X
MAXIMUM PLOT Y X SUBSET X = 2 TO 12

DEFAULT

None

SYNONYMS

MAX PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MINIMUM PLOT = Generates a minimum plot.
RANGE PLOT = Generates a range plot.
EXTREME PLOT = Generates an extreme plot.
DECILE PLOT = Generates a decile plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
RANGE CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0 15
Y1LABEL MAXIMUM
X1LABEL GROUP ID
TITLE AUTOMATIC
MAXIMUM PLOT WV MONTH

MEAN PLOT

PURPOSE

Generates a subsample mean versus subsample index plot.

DESCRIPTION

The subsample mean is the mean of the data in the subsample. The mean plot is used to answer the question: ``Does the subsample location change over different subsamples?'' The plot consists of:

Vertical axis = subsample mean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample mean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

MEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MEAN PLOT Y X
MEAN PLOT Y X SUBSET X = 2 TO 20

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEDIAN PLOT = Generates a median plot.
MIDMEAN PLOT = Generates a midmean plot.
MIDRANGE PLOT = Generates a midrange plot.
TRIMMED MEAN PLOT = Generates a trimmed mean plot.
WINDSORIZED MEAN PLOT = Generates a Windsorized mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL MEAN
X1LABEL SAMPLE ID
TITLE AUTOMATIC
MEAN PLOT Y X

MEDIAN PLOT

PURPOSE

Generates a subsample median versus subsample index plot.

DESCRIPTION

The subsample median is the middle of the ordered data from the subsample. The median plot is used to answer the question: ``Does the subsample location change over different subsamples?'' The plot consists of:

Vertical axis = subsample median;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample median. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

MEDIAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MEDIAN PLOT Y X
MEDIAN PLOT Y X SUBSET X > 1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
MIDMEAN PLOT = Generates a midmean plot.
MIDRANGE PLOT = Generates a midrange plot.
TRIMMED MEAN PLOT = Generates a trimmed mean plot.
WINDSORIZED MEAN PLOT = Generates a Windsorized mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL MEDIAN
X1LABEL SAMPLE ID
TITLE AUTOMATIC
MEDIAN PLOT Y X

MIDMEAN PLOT

PURPOSE

Generates a subsample midmean versus subsample index plot.

DESCRIPTION

The subsample midmean is the mean of the middle 50% of the ordered data in the subsample. The midmean plot is used to answer the question: ``Does the subsample location change over different subsamples?'' The plot consists of:

Vertical axis = subsample midmean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample midmean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
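
As a rough illustration of the midmean definition given above, the following Python sketch drops one quarter of the ordered points from each tail and averages the rest. How the 25% cut is rounded when the sample size is not a multiple of 4 is an assumption made here; DATAPLOT may handle that case differently.

```python
def midmean(values):
    """Mean of the central half of the sorted data (25% trimmed from each end)."""
    ordered = sorted(values)
    k = len(ordered) // 4  # assumed rounding: drop floor(n/4) points per tail
    middle = ordered[k:len(ordered) - k]
    return sum(middle) / len(middle)


# Hypothetical subsample with one wild value in the upper tail.
data = [1, 2, 3, 4, 5, 6, 7, 100]
print(midmean(data))  # 4.5
```

Because the extreme value 100 falls in the trimmed tail, it does not affect the result, which is why the midmean is a robust location estimate.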

SYNTAX

MIDMEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MIDMEAN PLOT Y X
MIDMEAN PLOT Y X SUBSET X > 1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
MIDRANGE PLOT = Generates a midrange plot.
TRIMMED MEAN PLOT = Generates a trimmed mean plot.
WINDSORIZED MEAN PLOT = Generates a Windsorized mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL MIDMEAN
X1LABEL SAMPLE ID
TITLE AUTOMATIC
MIDMEAN PLOT Y X

MIDRANGE PLOT

PURPOSE

Generates a subsample midrange versus subsample index plot.

DESCRIPTION

The midrange is the arithmetic average of the minimum and the maximum points. The midrange plot is used to answer the question: ``Does the subsample location change over different subsamples?'' The plot consists of:

Vertical axis = subsample midrange;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample midrange. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

MIDRANGE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MIDRANGE PLOT Y X
MIDRANGE PLOT Y X SUBSET X < 19

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
MIDMEAN PLOT = Generates a midmean plot.
TRIMMED MEAN PLOT = Generates a trimmed mean plot.
WINDSORIZED MEAN PLOT = Generates a Windsorized mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL MIDRANGE
X1LABEL SAMPLE ID
TITLE AUTOMATIC
MIDRANGE PLOT Y X

MINIMUM PLOT

PURPOSE

Generates a subsample minimum versus subsample index plot.

DESCRIPTION

The subsample minimum is the smallest data value in the subsample. The minimum plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' The plot consists of:

Vertical axis = subsample minimum;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample minimum. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

MINIMUM PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

MINIMUM PLOT Y X
MINIMUM PLOT Y X SUBSET X = 1 TO 10

DEFAULT

None

SYNONYMS

MIN PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MAXIMUM PLOT = Generates a maximum plot.
RANGE PLOT = Generates a range plot.
LOWER QUARTILE PLOT = Generates a lower quartile plot.
UPPER QUARTILE PLOT = Generates an upper quartile plot.
LOWER HINGE PLOT = Generates a lower hinge plot.
UPPER HINGE PLOT = Generates an upper hinge plot.
DECILE PLOT = Generates a decile plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
RANGE CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0 15
Y1LABEL MINIMUM
X1LABEL GROUP ID
TITLE AUTOMATIC
MINIMUM PLOT WV MONTH

NORMAL PLOT

PURPOSE

Generates a normal plot.

DESCRIPTION

A normal plot is a normal probability plot, but with the data on the horizontal axis and neat probability values on the vertical axis. The plot consists of the following 4 components:

1. The raw data;
2. A fitted line to the raw data;
3. A horizontal 50% line;
4. A vertical 50% line.
The characteristics of these components are controlled through the LINE and CHARACTER commands.

SYNTAX 1

NORMAL PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

NORMAL PLOT <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<tag> is a censoring variable (values equal to 0 are omitted from the plot);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

NORMAL PLOT Y1

NOTE

The following internal parameters are saved after a NORMAL PLOT. These parameters can be used like any user created parameter by the analyst.

SIGMA - the slope of the fitted line
MU - the intercept of the fitted line
SDSIGMA - the standard deviation of SIGMA
SDETA - the standard deviation of MU
BPT1 - the 0.1% point of the best fit distribution
BPT5 - the 0.5% point of the best fit distribution
BP1 - the 1% point of the best fit distribution
BP5 - the 5% point of the best fit distribution
BP10 - the 10% point of the best fit distribution
BP20 - the 20% point of the best fit distribution
BP50 - the 50% point of the best fit distribution
BP80 - the 80% point of the best fit distribution
BP90 - the 90% point of the best fit distribution
BP95 - the 95% point of the best fit distribution
BP99 - the 99% point of the best fit distribution
BP995 - the 99.5% point of the best fit distribution
BP999 - the 99.9% point of the best fit distribution

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
NORMAL PROBABILITY PLOT = Generates a normal probability plot.
HISTOGRAM = Generates a histogram.
QUANTILE-QUANTILE PLOT = Generates a quantile-quantile plot.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

90/5

PROGRAM

LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 100
LINE SOLID DASH DOT DOT
TITLE AUTOMATIC
NORMAL PLOT Y1

NORMAL PPCC PLOT

PURPOSE

Generates a subsample normal ppcc versus subsample index plot.

DESCRIPTION

The subsample normal ppcc is the correlation coefficient of the straight line fitted to a normal probability plot (see the documentation for PPCC PLOT for details) in the subsample. The normal ppcc plot is used to answer the question: ``Does the subsample normality change over different subsamples?'' The plot consists of:

Vertical axis = subsample normal ppcc;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample normal ppcc value. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
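
The statistic plotted for each subsample can be sketched as follows. This Python fragment assumes Filliben's approximation for the uniform order statistic medians and an ordinary Pearson correlation; the function name is hypothetical, and DATAPLOT's internal computation may differ in detail.

```python
from math import sqrt
from statistics import NormalDist


def normal_ppcc(values):
    """Correlation between the ordered data and normal order statistic medians."""
    n = len(values)
    ordered = sorted(values)
    # Assumed plotting positions: Filliben's uniform order statistic medians.
    medians = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    medians[-1] = 0.5 ** (1.0 / n)
    medians[0] = 1.0 - medians[-1]
    theoretical = [NormalDist().inv_cdf(m) for m in medians]
    # Pearson correlation between theoretical and observed order statistics.
    mean_t = sum(theoretical) / n
    mean_y = sum(ordered) / n
    sxy = sum((t - mean_t) * (v - mean_y) for t, v in zip(theoretical, ordered))
    sxx = sum((t - mean_t) ** 2 for t in theoretical)
    syy = sum((v - mean_y) ** 2 for v in ordered)
    return sxy / sqrt(sxx * syy)


# Hypothetical subsamples: one roughly symmetric, one with a large outlier.
print(normal_ppcc([-2, -1, 0, 1, 2]))
print(normal_ppcc([1, 1, 1, 1, 50]))
```

Values close to 1 indicate that the subsample is consistent with normality; the subsample containing the outlier yields a noticeably smaller value.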

SYNTAX

NORMAL PPCC PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

NORMAL PPCC PLOT Y X
NORMAL PPCC PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
MEAN PLOT = Generates a mean plot.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

94/2

PROGRAM

SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0 15
Y1LABEL NORMAL PPCC
X1LABEL GROUP ID
TITLE AUTOMATIC
NORMAL PPCC PLOT WV MONTH

NP CONTROL CHART

PURPOSE

Generates a (binomial) counts control chart.

DESCRIPTION

An NP chart is a data analysis technique for determining if a measurement process has gone out of statistical control. It is sensitive to changes in the number of defective items in the measurement process. The ``NP'' in NP charts stands for the np (the mean number of successes) of a binomial distribution. The NP control chart consists of:

Vertical axis = the number of defectives for each sub-group;
Horizontal axis = the sub-group designation.
A sub-group is frequently a time sequence (e.g., the number of defectives in a daily production run where each day is considered a sub-group). If the times are equally spaced, the horizontal axis variable can be generated as a sequence (e.g., LET X = SEQUENCE 1 1 N where N is the number of sub-groups).

In addition, horizontal lines are drawn at the mean number of defectives and at the upper and lower control limits. The distribution of the number of defective items is assumed to be binomial. This assumption is the basis for calculating the upper and lower control limits. The control limits are calculated as:

\[ UCL = np + 3\sqrt{np(1-p)}, \qquad LCL = np - 3\sqrt{np(1-p)} \]

where n is the number of items and p is the proportion of defective items. Also, zero serves as a lower bound on the LCL value.
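
As a numerical illustration, the following Python sketch computes the center line and control limits under the conventional three-sigma binomial assumption, np plus or minus 3*sqrt(np(1-p)), with the lower limit bounded at zero as stated above. The function name and the sample values are hypothetical.

```python
from math import sqrt


def np_chart_limits(n, p):
    """Return (LCL, center line, UCL) for an NP chart with sub-group size n."""
    center = n * p
    half_width = 3.0 * sqrt(n * p * (1.0 - p))  # assumed three-sigma limits
    lcl = max(0.0, center - half_width)          # zero is a lower bound on LCL
    ucl = center + half_width
    return lcl, center, ucl


# Hypothetical process: sub-groups of 100 items, 4% defective on average.
lcl, center, ucl = np_chart_limits(n=100, p=0.04)
print(lcl, center, round(ucl, 3))
```

In this example the unbounded lower limit would be negative, so the zero bound applies.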

SYNTAX

NP CHART <y> <size> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable containing the number of defective items in each sub-group;

<size> is a variable containing the sample size for each sub-group;

<x> is a variable containing the sub-group identifier (usually 1, 2, 3, ...);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

NP CHART Y SIZE X
NP CHART NUMDEF NUMTOT X

NOTE 1

The P CONTROL CHART is similar to the NP CONTROL CHART. The distinction is that the P CONTROL CHART plots the percentage of defectives while the NP CONTROL CHART plots the number of defectives. The NP CONTROL CHART is typically used for equal sample sizes and P CONTROL CHART is typically used for unequal sample sizes.

NOTE 2

The attributes of the 4 traces that make up the NP control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BAR commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits. Some analysts prefer to draw the response variable as a character or a spike rather than a connected line. The example program demonstrates setting the line attributes (the control lines are drawn as dotted lines).

DEFAULT

None

SYNONYMS

NP CHART for NP CONTROL CHART

RELATED COMMANDS

U CHART = Generates a U control chart.
C CHART = Generates a C control chart.
P CHART = Generates a P control chart.
XBAR CHART = Generates an xbar control chart.
R CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
Q CHART = Generates a Quesenberry style control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
PLOT = Generates a data or function plot.

REFERENCE

``Guide to Quality Control,'' Kaoru Ishikawa, Asian Productivity Organization, 1982 (Chapter 8).

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ CCPN.DAT X NUMDEF SIZE
LINES SOLID SOLID DOT DOT
XLIMITS 0 25
XTIC OFFSET 0 1
YLIMITS 0 15
YTIC OFFSET 2 0
TITLE AUTOMATIC
X1LABEL GROUP-ID
Y1LABEL NUMBER OF DEFECTIVES
NP CONTROL CHART NUMDEF SIZE X

P CONTROL CHART

PURPOSE

Generates a (binomial) proportion control chart.

DESCRIPTION

A P chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The P chart is sensitive to changes in the proportion of defective items in the measurement process. The ``P'' in P chart stands for the p (the proportion of successes) of a binomial distribution. The P control chart consists of:

Vertical axis = the percentage of defectives for each sub-group;
Horizontal axis = the sub-group designation.
A sub-group is frequently a time sequence (e.g., the number of defectives in a daily production run where each day is considered a sub-group). If the times are equally spaced, the horizontal axis variable can be generated as a sequence (e.g., LET X = SEQUENCE 1 1 N where N is the number of sub-groups).

In addition, horizontal lines are drawn at the overall proportion of defectives and at the upper and lower control limits. The distribution of the number of defective items is assumed to be binomial. This assumption is the basis for calculating the upper and lower control limits. The control limits are calculated as:

\[ UCL = p + 3\sqrt{p(1-p)/N}, \qquad LCL = p - 3\sqrt{p(1-p)/N} \]

where p is the total number of defects divided by the total number of items and N is the number of items in a given sub-group. Note that this means that the control limits can vary with the sub-group. Also, zero serves as a lower bound on the LCL value.
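
The dependence of the limits on sub-group size can be seen in a short Python sketch. It assumes the conventional three-sigma binomial limits, p plus or minus 3*sqrt(p(1-p)/N), with the lower limit bounded at zero; the function name and the data are hypothetical.

```python
from math import sqrt


def p_chart_limits(defectives, sizes):
    """Return (p, [(LCL, UCL) for each sub-group])."""
    # p is the total number of defectives divided by the total number of items.
    p = sum(defectives) / sum(sizes)
    limits = []
    for n in sizes:
        half_width = 3.0 * sqrt(p * (1.0 - p) / n)  # assumed three-sigma limits
        limits.append((max(0.0, p - half_width), p + half_width))
    return p, limits


# Hypothetical sub-groups of unequal size.
defectives = [2, 3, 5]
sizes = [50, 100, 150]
p, limits = p_chart_limits(defectives, sizes)
for n, (lcl, ucl) in zip(sizes, limits):
    print(n, round(lcl, 4), round(ucl, 4))
```

Larger sub-groups produce narrower limits, which is why the control lines on a P chart are generally step functions rather than straight lines.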

SYNTAX

P CHART <y> <size> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable containing the number of defective items in each sub-group;

<size> is a variable containing the sample size for each sub-group;

<x> is a variable containing the sub-group identifier (usually 1, 2, 3, ...);

and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

P CHART Y SIZE X

NOTE 1

The P CONTROL CHART is similar to the NP CONTROL CHART. The distinction is that the P CONTROL CHART plots the percentage of defectives while the NP CONTROL CHART plots the number of defectives. The NP CONTROL CHART is typically used for equal sample sizes and P CONTROL CHART is typically used for unequal sample sizes.

NOTE 2

The attributes of the 4 traces that make up the P control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BAR commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits. Some analysts prefer to draw the response variable as a character or a spike rather than a connected line. The example program demonstrates setting the line attributes (the control lines are drawn as dotted lines).

DEFAULT

None

SYNONYMS

P CHART for P CONTROL CHART

RELATED COMMANDS

U CHART = Generates a U control chart.
C CHART = Generates a C control chart.
NP CHART = Generates an NP control chart.
XBAR CHART = Generates an xbar control chart.
R CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
PLOT = Generates a data or function plot.

REFERENCE

``Guide to Quality Control,'' Kaoru Ishikawa, Asian Productivity Organization, 1982 (Chapter 8).

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ CCP.DAT X NUMDEF SIZE
LINES SOLID SOLID DOT DOT
XLIMITS 0 20
XTIC OFFSET 0 1
TITLE AUTOMATIC
Y1LABEL PERCENTAGE OF DEFECTIVES
X1LABEL GROUP-ID
P CONTROL CHART NUMDEF SIZE X

PARETO PLOT

PURPOSE

Generates a Pareto plot.

DESCRIPTION

A Pareto plot is an ordered (largest to smallest) histogram with carry-along tags. The Pareto plot is used to answer the question: ``Which data values are most important, and which are least important?'' The Pareto plot consists of:

Vertical axis = ordered response value;
Horizontal axis = dummy index (1 to n) where n is the number of response values.
The appearance of the trace is controlled by the first setting of the LINES, CHARACTERS, SPIKES, BARS, and their related attribute setting commands.
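
The ordering step behind a Pareto plot can be sketched in Python as follows. The values and tags are taken from the sample program for this command; the function name is hypothetical, and the treatment of ties (original order preserved) is an assumption.

```python
def pareto_order(values, tags):
    """Return (index, value, tag) triples sorted from largest to smallest value."""
    # sorted() is stable, so tied values keep their original relative order.
    pairs = sorted(zip(values, tags), key=lambda pair: pair[0], reverse=True)
    return [(index, value, tag) for index, (value, tag) in enumerate(pairs, start=1)]


values = [9, 4, 13, 11, 19, 8, 11, 10]
tags = ["CA", "FL", "IL", "MA", "NE", "NY", "OR", "TX"]
for index, value, tag in pareto_order(values, tags):
    print(index, value, tag)
```

The first column is the dummy index on the horizontal axis, the second is the ordered response on the vertical axis, and the third is the carry-along tag.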

SYNTAX

PARETO PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

PARETO PLOT Y
PARETO PLOT Y SUBSET LAB 2

NOTE 1

The carry-along tags are normally generated by specifying them in the CHARACTERS command. This is demonstrated in the program example below. Be aware that tags are limited to 4 characters with this method. If more than 4 characters are required, use the MOVEDATA and TEXT commands to plot them. The following provides an example (assume the carry-along tags are stored in strings S1 through SN):

PARETO PLOT Y
LET N = SIZE Y
LOOP FOR K = 1 1 N
LET XCOOR = XPLOT(K)
LET YCOOR = YPLOT(K)
MOVEDATA XCOOR YCOOR
TEXT ^S^K
END OF LOOP

NOTE 2

Many analysts prefer to draw Pareto plots as bar charts. This can be accomplished by entering the following:

LINE BLANK
BAR ON
The attributes of the bars can be set with the various bar attribute setting commands. The Pareto plot can just as easily be drawn with spikes (SPIKE ON) as well.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
CHARACTER OFFSET = Sets the horizontal and vertical offset for characters.
CHARACTER ANGLE = Sets the angle for characters.
LINES = Sets the type for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
BAR FILL = Sets the on/off switches for plot bar fills.
BAR DIMENSION = Sets the dimension (2 or 3) for bars.
BAR WIDTH = Sets the widths for plot bars.
FONT = Sets the font.
PLOT = Generates a data or function plot.
CONTROL CHART = Generates various types of control charts.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/9

PROGRAM

LET Y = DATA 9 4 13 11 19 8 11 10
CHARACTERS CA FL IL MA NE NY OR TX
CHARACTERS OFFSET 2 2 ALL
SPIKE ON ALL
TITLE 1986 AUTO GAS TAX
PARETO PLOT Y

PERCENT DEFECTIVE PLOT

PURPOSE

Generates a subsample percent defective versus subsample index plot.

DESCRIPTION

The subsample percent defective index is the percent defective of the data in the subsample. The percent defective plot is used to answer the question: ``Does the subsample percent defective index change over different subsamples?'' The plot consists of:

Vertical axis = subsample percent defective index;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample percent defective value. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

PERCENT DEFECTIVE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

PERCENT DEFECTIVE PLOT Y X
PERCENT DEFECTIVE PLOT Y X SUBSET X > 6

NOTE 1

The percent defective statistic is the number of defectives for a variable (i.e., the number of values that fall outside user-specified tolerance limits) expressed as a percentage of the number of values.

NOTE 2

The upper and lower specification limits must be specified by the user as follows:

LET USL = <value>
LET LSL = <value>
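
For illustration, the percent defective computation can be sketched in Python. Whether a value lying exactly on a specification limit counts as defective is an assumption here (it is treated as acceptable); the function name and data are hypothetical.

```python
def percent_defective(values, lsl, usl):
    """Percentage of values falling outside the interval [lsl, usl]."""
    # Assumption: values exactly equal to a limit are not counted as defective.
    outside = sum(1 for v in values if v < lsl or v > usl)
    return 100.0 * outside / len(values)


# Hypothetical diameters checked against LSL = 0.99 and USL = 1.01.
diameters = [0.985, 0.995, 1.000, 1.012]
print(percent_defective(diameters, lsl=0.99, usl=1.01))  # 50.0
```

Applying this function to each subsample, and once to the full sample, yields the two traces of the percent defective plot.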

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
CP PLOT = Generates a Cp plot.
CPK PLOT = Generates a Cpk plot.
EXPECTED LOSS PLOT = Generates an expected loss plot.
CAPABILITY ANALYSIS = Performs a capability analysis.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

93/10

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE CASE ASIS
LABEL CASE ASIS
TITLE Gear Diameter Analysis
Y1LABEL PERCENT DEFECTIVE
X1LABEL Batch
LEGEND 1 Process Capability
LEGEND 2 PERCENT DEFECTIVE Plot
XTIC OFFSET 0.5 0.5
YTIC OFFSET 2.0 2.0
CHARACTER CIRCLE BLANK
CHARACTER FILL ON
CHARACTER SIZE 1.2
SPIKE ON
SPIKE DOTTED
LINE BLANK SOLID
.
LET LSL = 0.99
LET USL = 1.01
.
PERCENT DEFECTIVE PLOT DIAMETER BATCH

PERCENT POINT PLOT

PURPOSE

Generates a percent point plot (commonly referred to as quantile plot in the statistical graphics literature).

DESCRIPTION

A percent point plot is a graphical data analysis technique for summarizing the distributional information of a variable. The X percent point of a data set is the point where X% of the data is below that point and (100-X)% of the data is above that point. The plot consists of:

Vertical axis = percent point;
Horizontal axis = percent (0 to 100).
If the value of 50 is chosen on the horizontal axis, then the corresponding value on the vertical axis is the estimated 50% point (that is, the median) from the data. The attributes of the plot can be set by the first setting of the LINE, CHARACTER, SPIKE, and BAR commands (and their corresponding attribute setting commands). This is demonstrated in the sample program below.
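
The idea of pairing each percent with its percent point can be sketched in Python. This fragment uses the (i - 0.5)/n plotting positions common in the quantile plot literature, applied to the raw ordered data. That choice is an assumption made for illustration; DATAPLOT computes percent points from histogram classes instead, so its output will differ.

```python
def percent_point_pairs(values):
    """Return (percent, percent point) pairs from the ordered raw data."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed plotting position for the i-th ordered value: 100*(i - 0.5)/n.
    return [(100.0 * (i - 0.5) / n, v) for i, v in enumerate(ordered, start=1)]


# Hypothetical data set of four values.
print(percent_point_pairs([7, 1, 5, 3]))
# [(12.5, 1), (37.5, 3), (62.5, 5), (87.5, 7)]
```

Plotting the second element of each pair against the first gives a percent point (quantile) plot of the data.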

SYNTAX 1

PERCENT POINT PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of raw data;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data only.

SYNTAX 2

PERCENT POINT PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable of pre-computed frequencies;
<x> is a variable of distinct values;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have pre-computed frequencies at each data value.

EXAMPLE

PERCENT POINT PLOT Y
PERCENT POINT PLOT Y X
PERCENT POINT PLOT Y X SUBSET X > 2

NOTE 1

DATAPLOT divides the original data variable into classes in the same manner as it does for a histogram or frequency polygon. The percent points are calculated at the mid-points of these histogram classes. The defaults are the same as for histograms (the class width is 0.3*standard deviation, 6 classes above and 6 classes below the mean). This tends to leave a wide gap in the middle of the distribution and a large number of 0 and 100 percent point values. This can be improved by using the CLASS LOWER, CLASS UPPER, and CLASS WIDTH commands. A simple method for doing this is demonstrated in the sample program below.

NOTE 2

The residual-fitted spread plot (or r-f spread plot) recommended by Bill Cleveland (see REFERENCE section) is based on percent point plots. It plots the percent point plot of the fitted values minus their mean and the percent point plot of the residuals from a fit side by side. A common scale is used on the vertical axis. This plot gives a visual representation of how influential the fit is. If the spread of the fitted-value distribution is large compared to the spread of the residual distribution, then the fitted variables are influential. The second program example demonstrates this plot. In addition to linear fits, it can be applied to various other types of fits (e.g., nonlinear fits, lowess fits, spline fits).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

QUANTILE-QUANTILE PLOT = Generates a quantile-quantile plot.
HISTOGRAM = Generates a histogram.
FREQUENCY PLOT = Generates a frequency plot.
PROBABILITY PLOT = Generates a probability plot.
PIE CHART = Generates a pie chart.
PLOT = Generates a data or function plot.
CLASS LOWER = Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER = Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH = Sets the class width for histograms, frequency plots, and pie charts.

APPLICATIONS

Distributional Analysis

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983.

``Visualizing Data,'' William Cleveland, Hobart Press, 1993.

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

SKIP 25
READ SUNSPOT2.DAT Y
LET ALOW = MINIMUM Y
LET AHIGH = MAXIMUM Y
CLASS LOWER ALOW
CLASS UPPER AHIGH
CLASS WIDTH 1.0
CHARACTER CIRCLE
CHARACTER FILL ON
CHARACTER SIZE 1.2
X1LABEL PERCENT POINT; Y1LABEL DATA VALUE
TITLE AUTOMATIC
PERCENT POINT PLOT Y

PROGRAM 2 (R-F SPREAD PLOT)

SKIP 25
READ BERGER1.DAT Y X BATCH
FIT Y X
LET M = MEAN PRED; LET Y2 = PRED - M
LET ALOW1 = MINIMUM Y2; LET ALOW2 = MINIMUM RES
LET ALOW = MIN(ALOW1,ALOW2)
LET AHIGH1 = MAXIMUM Y2; LET AHIGH2 = MAXIMUM RES
LET AHIGH = MAX(AHIGH1,AHIGH2)
CLASS WIDTH 1.0
.
MULTIPLOT 1 2; MULTIPLOT CORNER COORDINATES 0 0 100 95
FRAME CORNER COORDINATES 15 20 98 90
.
XLIMITS 0 100; XTIC OFFSET 2 2
MAJOR XTIC MARK NUMBER 6; MINOR XTIC MARK NUMBER 1
YLIMITS -25 30; YTIC OFFSET 2 2
LINE BLANK DASH; CHARACTER O
X1LABEL SIZE 2; X1LABEL PERCENT POINT
TITLE OFFSET 2; TITLE SIZE 3; TITLE FITTED VALUES
CLASS LOWER ALOW1; CLASS UPPER AHIGH1
PERCENT POINT PLOT Y2
.
TITLE RESIDUAL VALUES
CLASS LOWER ALOW2; CLASS UPPER AHIGH2
PERCENT POINT PLOT RES
.
END OF MULTIPLOT
HEIGHT 4; JUSTIFICATION CENTER
MOVE 50 96; TEXT R-F SPREAD PLOT

PERIODOGRAM

PURPOSE

Generates an auto-periodogram.

DESCRIPTION

A periodogram is a graphical data analysis technique for examining frequency-domain models of an equi-spaced time series. The periodogram is the Fourier transform of the autocovariance function. An equi-spaced time series is one in which the distance between adjacent points is constant. The periodogram (or spectrum) for a time series xt is:

where f is the frequency, n is the number of observations in the time series, D is (n+1)/2 for n odd and (n+2)/2 for n even. The periodogram then consists of:

Vertical axis = the spectrum estimate at the given frequency;
Horizontal axis = Fourier frequencies (1/n, 2/n, 3/n, ..., (n/2)/n) where n is the number of observations in the time series.
The frequency is measured in cycles per unit time where unit time is defined to be the distance between adjacent points. A frequency of 0 corresponds to an infinite cycle while a frequency of 0.5 corresponds to a cycle of 2 data points. Equi-spaced time series are inherently limited to detecting frequencies between 0 and 0.5.
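
A minimal Python sketch of a periodogram evaluated at the Fourier frequencies is given here for orientation. It uses a direct discrete Fourier transform of the mean-centered series and divides the squared magnitude by n. This normalization is an assumption chosen for simplicity and is not necessarily the Jenkins and Watts form used by DATAPLOT; the function name and test series are hypothetical.

```python
import cmath
from math import cos, pi


def periodogram(x):
    """Return (frequency, power) pairs at the Fourier frequencies 1/n ... (n/2)/n."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    result = []
    for k in range(1, n // 2 + 1):
        f = k / n  # cycles per unit time
        dft = sum(c * cmath.exp(-2j * pi * f * t) for t, c in enumerate(centered))
        result.append((f, abs(dft) ** 2 / n))  # assumed scaling: |DFT|^2 / n
    return result


# Hypothetical series: a pure cosine with a period of 4 points (frequency 0.25).
series = [cos(2 * pi * 0.25 * t) for t in range(32)]
peak = max(periodogram(series), key=lambda pair: pair[1])
print(peak[0])  # 0.25
```

The largest ordinate occurs at frequency 0.25, matching the period of 4 data points, and the highest frequency evaluated is 0.5.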

From a data analysis point of view, the type of structure in the autocovariance function indicates the location of peaks in the periodogram. The variance at a given frequency is also referred to as the average power. Peaks in the periodogram also indicate the dominant frequency for underlying cyclic models. Smooth time series tend to generate periodograms with most of their variance (or power) in the low frequencies while rapidly oscillating series tend to generate periodograms with most of their variance in the higher frequencies. Once the dominant peaks have been identified, the next step is typically to use other time series analysis techniques (such as complex demodulation phase plots) to determine if this frequency is constant over the entire domain of the data, or to carry out a nonlinear fit with an underlying cyclic model (see the documentation for the COMPLEX DEMODULATION PLOT). As a simple example of fitting a nonlinear cyclic model, a time series can be modeled as:

where m is the mean of the time series, R is the amplitude, f is the phase shift, w is the frequency (which can be estimated with the periodogram), and et is the residual.

SYNTAX

PERIODOGRAM <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

PERIODOGRAM Y
PERIODOGRAM Y2 SUBSET Y2 > 2

NOTE 1

The spectral plot is a refinement of the periodogram (it smooths the spectrum estimate) and is generally recommended in its place since it has better statistical properties. See the SPECTRAL PLOT command for details.

NOTE 2

Missing values are not allowed. It is also common to remove trends by differencing (x_t - x_{t-1}) or to apply some other type of filter before generating the periodogram.
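
First differencing, mentioned above as a common way to remove trends, can be sketched in a few lines of Python. The function name and the data are hypothetical.

```python
def first_difference(x):
    """Return the series of differences x[t] - x[t-1] (one point shorter than x)."""
    return [current - previous for previous, current in zip(x, x[1:])]


# Hypothetical series with a linear trend of slope 2.
print(first_difference([3, 5, 7, 9]))  # [2, 2, 2]
```

A linear trend becomes a constant after differencing, so it no longer dominates the low-frequency end of the periodogram.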

NOTE 3

Different time series texts present the equation for the periodogram in slightly different forms. However, these forms should be mathematically equivalent. DATAPLOT uses the definition from the Jenkins and Watts book (see below). The Bloomfield text (see below) uses a slightly different, although mathematically equivalent, definition.

NOTE 4

The spectral density is the spectrum divided by the variance of the time series. The spectral density is the Fourier transform of the autocorrelation function rather than the autocovariance function. This form of the periodogram allows comparisons of the spectra of time series that may have different scales. The program example below shows how to plot this form of the periodogram.

NOTE 5

The CROSS-PERIODOGRAM, CO-PERIODOGRAM, and QUADRATURE PERIODOGRAM commands are recognized. However, these variations are not implemented at this time. These commands will generate a page erase, but no plot is generated. These options are recognized for the SPECTRUM command.

NOTE 6

The appearance of the periodogram can be controlled with the proper settings of the LINE, CHARACTER, SPIKES, and BAR commands. The most typical choices are a connected line or spikes. Some analysts prefer to turn the log scale on for the vertical axis (YLOG ON). Log scales cannot represent zero (or negative) values, which typically causes a problem. This can be circumvented by replacing such values with a small positive number using the following commands:

PERIODOGRAM Y
LET YJUNK = YPLOT
LET YJUNK = 0.000001 SUBSET YPLOT <= 0
LET XJUNK = XPLOT
YLOG ON
PLOT YJUNK XJUNK

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

SPECTRUM = Generates a spectral plot.
CORRELATION PLOT = Generates a correlation plot.
COMPLEX DEMOD PLOT = Generates a complex demodulation plot.
LAG PLOT = Generates a lag plot.
PLOT = Generates a data or function plot.
4-PLOT = Generates 4-plot univariate analysis.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
SUMMARY = Generates a table of summary statistics.
LET = Generates sine/cosine transformations (plus much more).
FIT = Carries out a least squares fit.

REFERENCE

``Spectral Analysis and Its Applications,'' Jenkins and Watts, Holden-Day, 1968 (chapters 6 and 7, page 211 for the equation for the spectrum).

``Fourier Analysis of Time Series: An Introduction,'' Bloomfield, Wiley and Sons, 1976 (section 2.3 for the periodogram).

APPLICATIONS

Frequency Time Series Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

. THIS SAMPLE PROGRAM READS THE FILE LEW.DAT IN THE
. DATAPLOT REFERENCE DIRECTORY. THESE DATA ARE
. BEAM DEFLECTION DATA.
.
SKIP 25
READ LEW.DAT DEFLECT
.
TITLE AUTOMATIC
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
LET A = MEAN DEFLECT
LINE BLANK; SPIKE ON; SPIKE BASE A
PLOT DEFLECT
.
LINE SOLID; SPIKE OFF
YLIMITS 0 40000; YTIC OFFSET 0 2000
Y1LABEL POWER; X1LABEL FREQUENCY
PERIODOGRAM DEFLECT
.
SPIKE ON; LINE BLANK
PERIODOGRAM DEFLECT
.
LET A = VARIANCE DEFLECT; LET TEMP = YPLOT/A; LET X = XPLOT
YLIMITS; YTIC OFFSET 0 0
TITLE SPECTRAL DENSITY PLOT
SPIKE BASE 0
PLOT TEMP X
.
END OF MULTIPLOT

PHASE PLANE DIAGRAM

PURPOSE

Generates a phase plane diagram.

DESCRIPTION

A first order differential equation is one of the form:

y' = f(t, y)

where t is an independent variable (usually time), y is a dependent variable, and y' is the derivative of y. A second order differential equation is one of the form:

y'' = f(t, y, y')

where y'' is the second derivative of y.

When the functional form of the differential equation is known (e.g., y'' = t**2 + y**2 - y'), there are two common ways to show the solution graphically. The first method is to plot y(i) (and y'(i) for the second order case) against t(i), where t(i), y(i), and y'(i) represent the values at time i. A variation is a 3d-plot of (t(i), y(i), y'(i)). The second method is to plot the phase diagram of the solution curves. Phase space is defined to be the coordinate system consisting of the dependent variable (y in our case) and each of its derivatives. The independent variable (t in our case) is only plotted implicitly. For the second order case, the phase diagram is a plot of (y(i), y'(i)). Phase diagrams are not normally drawn for the first order case since there is only one axis (the first method is used instead). They can be extended to third order differential equations by drawing a 3d-plot of (y(i), y'(i), y''(i)). Although the idea of phase space extends to higher dimensions, there is no obvious way to graph the phase diagram in those cases.

If the functional equation is known for a second order differential equation, the RUNGE KUTTA command can be used to find numerical estimates for y and y'. Then the PLOT command can be used to plot the solution using either method. For example, enter PLOT YPRIME Y VS T for the first method and PLOT YPRIME VS Y for the phase diagram.

The PHASE PLANE DIAGRAM is used for the case where the functional form of the differential equation is unknown. It works on a set of data points (i.e., values for y and optionally for time) to give a graphical estimate of the phase diagram. It uses the fact that y'=dy/dx (i.e., the change in y divided by the change in x). The one variable case (i.e., y only) plots the following:

Vertical axis = Y(i+1) - Y(i);
Horizontal axis = Y(i).
The two variable case plots the following:

Vertical axis = (Y(i+1) - Y(i))/(X(i+1)-X(i));
Horizontal axis = Y(i).
For the one variable case, the X values are assumed to be equally spaced and equal to 1 (that is, dx=1) and Y(i+1) - Y(i) is an estimate of dy. For the two variable case, X(i+1) - X(i) is an estimate of dx and Y(i+1) - Y(i) is an estimate of dy. In either case, the vertical axis is an estimate of the derivative of Y.
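The two sets of plot coordinates described above can be written out as a short Python sketch. The function name phase_plane_points is hypothetical, and the assumption that n input points yield n-1 plotted points follows from the difference formulas rather than from any statement about DATAPLOT's internals.

```python
def phase_plane_points(y, x=None):
    """Return (horizontal, vertical) coordinates for an estimated phase diagram.

    horizontal[i] = y[i]
    vertical[i]   = (y[i+1] - y[i]) / (x[i+1] - x[i]), with dx = 1 when x is omitted.
    """
    horizontal, vertical = [], []
    for i in range(len(y) - 1):
        dx = 1.0 if x is None else x[i + 1] - x[i]
        horizontal.append(y[i])
        vertical.append((y[i + 1] - y[i]) / dx)
    return horizontal, vertical


# y = x**2 sampled at x = 0, 0.5, 1.0, 1.5, 2.0
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [v * v for v in xs]
h_one, v_one = phase_plane_points(ys)        # one-variable case (dx = 1)
h_two, v_two = phase_plane_points(ys, xs)    # two-variable case
```

With unequal or non-unit spacing the one-variable form gives only a rescaled estimate of the derivative, which is why the two-variable form is preferable when the X values are available.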

SYNTAX 1

PHASE PLANE DIAGRAM <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

PHASE PLANE DIAGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<x> is the independent variable (usually time);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

PHASE PLANE DIAGRAM Y1 X
PHASE PLANE DIAGRAM Y1
PHASE PLANE DIAGRAM Y1 Y2 SUBSET TAG > 3

NOTE 1

A related technique is to plot y(t+1) versus y(t). This can be done in DATAPLOT with the LAG PLOT command.

NOTE 2

Most differential equations textbooks give a slightly different derivation for the phase diagram. They use the fact that second (and higher order) differential equations can be rewritten as a system of first order differential equations. For example, the differential equation

y'' = t**2 + y**2 - y'

can be transformed into the two equations

y1' = y2
y2' = t**2 + y1**2 - y2

Then y1 and y2 are used as the coordinate system for the phase diagram. Since y2 = y1', the phase diagram is equivalent to what was derived earlier.
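The system form is also what a numerical solver works with. The Python sketch below integrates y1' = y2, y2' = t**2 + y1**2 - y2 with the classical fixed-step fourth-order Runge-Kutta method. This is an illustration only: the function name, step size, and initial conditions are assumptions, and no claim is made that DATAPLOT's RUNGE KUTTA command uses the same scheme.

```python
def rk4_second_order(f, t0, y0, yp0, h, steps):
    """Integrate y'' = f(t, y, y') as the system y1' = y2, y2' = f(t, y1, y2).

    Classical fourth-order Runge-Kutta with fixed step h.
    Returns lists of t, y1 (= y) and y2 (= y').
    """
    t, y1, y2 = t0, y0, yp0
    ts, y1s, y2s = [t], [y1], [y2]
    for i in range(steps):
        k1_1, k1_2 = y2, f(t, y1, y2)
        k2_1 = y2 + 0.5 * h * k1_2
        k2_2 = f(t + 0.5 * h, y1 + 0.5 * h * k1_1, y2 + 0.5 * h * k1_2)
        k3_1 = y2 + 0.5 * h * k2_2
        k3_2 = f(t + 0.5 * h, y1 + 0.5 * h * k2_1, y2 + 0.5 * h * k2_2)
        k4_1 = y2 + h * k3_2
        k4_2 = f(t + h, y1 + h * k3_1, y2 + h * k3_2)
        y1 += h * (k1_1 + 2.0 * k2_1 + 2.0 * k3_1 + k4_1) / 6.0
        y2 += h * (k1_2 + 2.0 * k2_2 + 2.0 * k3_2 + k4_2) / 6.0
        t = t0 + (i + 1) * h
        ts.append(t)
        y1s.append(y1)
        y2s.append(y2)
    return ts, y1s, y2s


def rhs(t, y, yp):
    """Right-hand side of the example equation y'' = t**2 + y**2 - y'."""
    return t ** 2 + y ** 2 - yp


# Assumed initial conditions y(0) = 0, y'(0) = 0 on the interval [0, 1].
ts, ys, yps = rk4_second_order(rhs, 0.0, 0.0, 0.0, 0.01, 100)
phase_diagram = list(zip(ys, yps))   # points (y1, y2) = (y, y')
```

Plotting the second coordinate of phase_diagram against the first gives the phase diagram; plotting ys and yps against ts gives the first method described earlier.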

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the types for plot lines.
CHARACTER = Sets the types for plot characters.
LAG PLOT = Generates a lag plot.
PLOT = Generates a data or function plot.
MULTIPLOT = Allows multiple plots per page.
RUNGE KUTTA = Numerically solve a first or second order differential equation.

REFERENCE

``Differential Equations, Dynamical Systems, and Linear Algebra,'' Hirsch and Smale, Academic Press, 1974 (pp. 2-4).

``Chaos,'' James Gleick, Penguin Press, 1987 (pp. 264-266).

APPLICATIONS

Differential Equations

IMPLEMENTATION DATE

88/9

PROGRAM

. Y = DEFLECTION FROM HORIZONTAL (IN INCHES)
. X = DISTANCE OUT ONTO BEAM (X = 0, 2, 4, 6, ..., 100 INCHES)
. P = LOAD (IN POUNDS) AT FREE END (HERE = 64)
. L = LENGTH (IN INCHES) OF BEAM (HERE = 100)
. E = YOUNG'S MODULUS (= 6 * 10**6)
. I = MOMENT OF INERTIA OF CROSS SECTION (HERE = 0.128 INCHES**4)
. -----START POINT-----------------------------------
.
. STEP 1--DEFINE THE PHYSICAL PARAMETERS OF THE BEAM
LET P = 64
LET L = 100
LET E = 6*10**6
LET I = 0.128
. STEP 2--DEFINE THE RIGHT-SIDE FUNCTION OF Y''(X) = F
LET FUNCTION F1 = P*(L-X)/(E*I)
LET FUNCTION F2 = (1+YP**2)**(3/2)
LET FUNCTION F = F1*F2
. STEP 3--DEFINE INITIAL CONDITIONS AND DEFINE THE DESIRED SEQUENCE
. OF POINTS AT WHICH TO COMPUTE THE SOLUTION CURVE.
LET Y(1) = 0
LET YP(1) = 0
LET X = SEQUENCE 0 2 100
. STEP 4--SOLVE THE DIFFERENTIAL EQUATION
LET Y YP = RUNGE-KUTTA F X
.
X1LABEL Y
Y1LABEL Y' DERIVED FROM DATA
TITLE AUTOMATIC
PHASE PLANE DIAGRAM Y X

PIE CHART

PURPOSE

Generates a pie chart.

DESCRIPTION

A pie chart is a graphical data analysis technique for summarizing the distributional information of a variable. It is a circular plot consisting of wedges where the size of each wedge is proportional to the frequency (= number of observations) in that wedge. The plot is to be read clockwise (where the first wedge is at 9 o'clock).

If a single variable is specified, DATAPLOT divides the values into frequency classes in the same manner as for a histogram. The histogram and the pie chart have the same information except the histogram has bars at the data values (where the height of the bar is proportional to the number of observations in the class), whereas the pie chart has wedges (where the area of the wedge is proportional to the number of observations in the class).

If two variables are specified, the first variable contains pre-computed frequencies and the second variable is a group identifier. This second form is more commonly used.
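For the pre-computed frequency form, the wedge sizes follow directly from the proportions. The Python sketch below (the function name is hypothetical and the frequencies are illustrative) converts a list of frequencies to wedge angles in degrees.

```python
def wedge_angles(frequencies):
    """Return the angle in degrees of each wedge, proportional to its frequency."""
    total = sum(frequencies)
    return [360.0 * f / total for f in frequencies]


# Pre-computed frequencies for five groups.
counts = [2, 5, 9, 15, 28]
angles = wedge_angles(counts)
```

The angles sum to 360 degrees, and the largest group (28 of 59 observations) receives a little under half of the circle.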

SYNTAX 1

PIE CHART <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data only.

SYNTAX 2

PIE CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies;
<x> is the variable of group identifiers;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have pre-computed frequencies at each data value.

EXAMPLES

PIE CHART X
PIE CHART TEMP SUBSET TEMP > 0
PIE CHART F X SUBSET X > 2
PIE CHART COUNTS STATE

NOTE 1

Each wedge is drawn with a common set of attributes. The attributes of the wedge borders are set with the LINE, LINE COLOR, and LINE THICKNESS commands (typically they are all set the same). The attributes of the interior are set with the various REGION commands. Any labels for the wedges must be set with the LEGEND or TEXT commands. The CROSS HAIR command can help in positioning labels. The program example below shows how to set the attributes. DATAPLOT does not support features such as 3d pie charts or exploding slices that are common in many business graphics programs.

NOTE 2

Although pie charts are popular in business graphics, they are generally a poor graphics technique. See the book listed in the REFERENCE section below for more information.

NOTE 3

For the one variable form of the command, DATAPLOT uses a class width of 0.3 times the standard deviation of the variable. Use the CLASS WIDTH command to override this default. DATAPLOT also tends to generate a large number of zero frequency classes at the lower and upper tails. The CLASS LOWER and CLASS UPPER commands can be used to set lower and upper limits for the classes.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

HISTOGRAM = Generates a histogram.
FREQUENCY PLOT = Generates a frequency plot.
PERCENT POINT PLOT = Generates a percent point plot.
PLOT = Generates a plot (including bar plots).
CLASS LOWER = Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER = Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH = Sets the class width for histograms, frequency plots, and pie charts.
LINE = Sets the types for plot lines.
LINE COLOR = Sets the colors for plot lines.
LINE THICKNESS = Sets the thicknesses for plot lines.
REGION FILL = Sets the on/off switches for region fills.

REFERENCE

``The Elements of Graphing Data,'' William Cleveland, Wadsworth, 1985 (p. 264).

APPLICATIONS

Business Graphics

IMPLEMENTATION DATE

The ability to set the attributes of the pie wedges was implemented 93/11.

PROGRAM

LET X = DATA 81 82 83 84 85
LET Y = DATA 2 5 9 15 28
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
X1LABEL SALES IN MILLIONS OF DOLLARS
.
LINE THICKNESS .3 ALL; TITLE PIE CHART WITH THICKER LINES
PIE CHART Y X
.
REGION FILL ON ALL; REGION PATTERN COLOR G10 G30 G50 G70 G90
REGION FILL COLOR G10 G30 G50 G70 G90
TITLE PIE CHART WITH SOLID FILL SLICES
PIE CHART Y X
.
TITLE PIE CHART WITH LABELS
LET N = SIZE X
LEGEND SIZE 3
LOOP FOR K = 1 1 N
LET A = X(K)
LEGEND ^K 19^A
END OF LOOP
LEGEND 1 COORDINATES 8 58; LEGEND 2 COORDINATES 10 71; LEGEND 3 COORDINATES 28 92
LEGEND 4 COORDINATES 68 77; LEGEND 5 COORDINATES 67 30
PIE CHART Y X
.
REGION PATTERN COLOR BLACK ALL; REGION PATTERN D1 D2 D1D2 VERT HORI
REGION PATTERN SPACING 1.0 1.0 3.0 4.0 5.0; REGION PATTERN LINE SOLID SOLID SOLID DASH DOT
TITLE PIE CHART WITH HATCH PATTERN FILLS
PIE CHART Y X
MULTIPLOT OFF

PLOT

PURPOSE

Generates a plot.

DESCRIPTION

The PLOT command allows the analyst to generate single or multi-trace plots of data, functions, or both. It is DATAPLOT's most powerful, most important, and most heavily used graphics command. There are 7 general plot syntaxes:

1. 1-variable form
2. 2-variable form
3. 3-variable multi-trace form
4. VERSUS form
5. multi-VERSUS form
6. function form
7. AND form
DATAPLOT uses the concept of traces. A trace is a connected set of points. Points in the same trace are plotted with the same attributes. In most cases, a single variable is one trace. However, a single variable can be split into multiple traces if desired (see SYNTAX 3).

SYNTAX 1 (1-variable form)

PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This form for the PLOT command is used for plotting <y> versus its dummy index. The resulting plot will have <y> on the vertical axis and the dummy index 1, 2, 3, ..., n (where n = the number of elements in <y>) on the horizontal axis. Some examples are:

PLOT Y
PLOT TEMP SUBSET TAG > 4

SYNTAX 2 (2-variable form)

PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the dependent (i.e., the vertical axis) variable;
<x> is the independent (i.e., the horizontal axis) variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the 2-argument form for the PLOT command. It is used for plotting <y> versus <x>. The resulting plot will have <y> on the vertical axis and <x> on the horizontal axis. Some examples are:

PLOT Y X
PLOT RES X SUBSET X > -9999

SYNTAX 3 (the 3-variable multi-trace form)

PLOT <y> <x> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is the dependent (i.e., the vertical axis) variable;
<x> is the independent (i.e., the horizontal axis) variable;
<tag> is a variable that identifies groups in <y> and <x> that are plotted with common attributes;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the 3-argument form for the PLOT command. It is used for multi-trace plotting of <y> versus <x>. The resulting plot will have <y> on the vertical axis, <x> on the horizontal axis, and will have one trace for each distinct value in the <tag> variable. Some examples are:

PLOT Y X LAB
PLOT PRES TEMP DAY
PLOT PRES TEMP DAY SUBSET DAY <> 4
If the <x> variable and the <tag> variable are identical, all points with a common <x> value are treated as a common trace (i.e., they are plotted with common attributes).

Although DATAPLOT supports a large number of built-in plot formats, there will be cases where you may want a specialized chart format that is not available. This syntax for the PLOT command can often be used for this purpose by defining the <tag> variable in the right way. Points with a common <tag> value are treated as a trace, and attributes can be set for each individual trace.
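The way a tag variable partitions the data into traces can be sketched in Python. The function name is hypothetical, and returning the traces in order of increasing tag value is an assumption made for this illustration.

```python
def split_into_traces(y, x, tag):
    """Group (x, y) points by tag value; each group is one trace.

    Traces are returned in order of increasing tag value (an assumption),
    and points keep their original order within a trace.
    """
    traces = {}
    for yi, xi, ti in zip(y, x, tag):
        traces.setdefault(ti, []).append((xi, yi))
    return [traces[key] for key in sorted(traces)]


pres = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
temp = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
day = [1, 1, 1, 2, 2, 2]
traces = split_into_traces(pres, temp, day)
```

Each element of traces corresponds to one set of line, character, spike, and bar attributes, which is what makes the tag variable useful for building custom chart formats.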

SYNTAX 4 (VERSUS form)

PLOT <y1> <y2> <y3> ... <yk> VERSUS <x> <SUBSET/EXCEPT/FOR qualification>
where <y1>, <y2>, <y3>, ..., <yk> are dependent (i.e., vertical axis) variables;
<x> is an independent (i.e., the horizontal axis) variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the single-VERSUS argument form for the PLOT command. It is used for multi-trace plotting where the dependent variables are plotted against a common <x> variable. The resulting plot will have one trace for each <yi> variable:

<y1> (vertically) versus <x> (horizontally)
<y2> (vertically) versus <x> (horizontally)
<y3> (vertically) versus <x> (horizontally)
...
<yk> (vertically) versus <x> (horizontally)
Some examples are:

PLOT Y1 Y2 Y3 VERSUS X
PLOT Y PRED VERSUS X
PLOT Y PRED VERSUS X SUBSET X = 10.6 TO 19.7

SYNTAX 5 (multi-VERSUS form)

PLOT <syntax 4> <syntax 4> ... <syntax 4>

This is the multi-VERSUS argument form for the PLOT command. It is used for multi-trace plotting where the dependent variables are plotted against different <x> variables. Some examples are:

PLOT Y1 Y2 Y3 VERSUS X1 Y4 Y5 VERSUS X2
PLOT P1 VERSUS T1 P2 VERSUS T2 P3 VERSUS T3

SYNTAX 6 (function form)

PLOT <f> FOR <x> = <start> <increment> <stop>

where <f> is a function (either pre-defined via the LET FUNCTION command, or explicitly defined herein);
<x> is the dummy variable in the function;
<start> is the desired minimum value for <x> at which the function is to be evaluated;
<increment> is the desired increment value for <x> at which the function is to be evaluated;
and <stop> is the desired maximum value for <x> at which the function is to be evaluated.
This is the function form for the PLOT command. It is used for plotting a function of one variable. Some examples are:

PLOT SIN(X)*EXP(-X) FOR X = 0 .1 5
LET FUNCTION F = EXP(-X*SIN(X**2))
PLOT F FOR X = 0 .1 3
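The <start> <increment> <stop> specification amounts to evaluating the function on an evenly spaced grid. The Python sketch below is an illustration under the assumption that the stop value is included when it falls on the grid. The function name is hypothetical.

```python
import math


def function_trace(f, start, increment, stop):
    """Evaluate f at start, start+increment, ... up to and including stop.

    Points are generated from an integer index to avoid accumulating
    floating-point error in the x values.
    """
    count = int(math.floor((stop - start) / increment + 1e-9)) + 1
    xs = [start + i * increment for i in range(count)]
    return xs, [f(v) for v in xs]


# Counterpart of: PLOT SIN(X)*EXP(-X) FOR X = 0 .1 5
xs, ys = function_trace(lambda v: math.sin(v) * math.exp(-v), 0.0, 0.1, 5.0)
```

A smaller increment gives a smoother curve at the cost of more evaluated points.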

SYNTAX 7 (AND form)

<any valid syntax 1 to 6> AND

<any valid syntax 1 to 6> AND

<any valid syntax 1 to 6> AND

...

<any valid syntax 1 to 6> AND

<any valid syntax 1 to 6>

This is the most general syntax for PLOT. It is used for generating multi-trace plots of variables, of functions, or of mixtures of both. Some examples are:

PLOT Y X AND
PLOT A+B*X FOR X = 1 1 10
PLOT Y1 Y2 VS X AND
PLOT Y X AND
PLOT A*SIN(B*X) FOR X = 1 .1 3 AND
PLOT Y3 X3 LAB

NOTE 1

Plot points can be plotted as characters, connected lines, spikes, or bars. These are set independently of each other. The default is to plot each trace as a connected line with no symbol, no bar, and no spike. The LINE, CHARACTER, SPIKE, and BAR commands are used to set the switches for plotting a given trace as a connected line, a character, a spike, or a bar respectively.

There are attribute setting commands for lines, characters, spikes, and bars. See the documentation for LINE, CHARACTER, SPIKE, and BAR for a complete list of these commands. Attributes are set by giving a list of values. The first trace uses the first setting, the second trace uses the second setting, and so on. For example, CHARACTER SIZE 2.0 3.0 1.5 sets the character size for trace 1 to 2.0, the character size for trace 2 to 3.0, and the character size for trace 3 to 1.5. Attributes can be set for up to 100 traces.

As a more complex example, suppose you want to plot a variable Y as a connected line and every fifth point as a filled circle. You can do something like the following:

LET N = SIZE Y
LET X = SEQUENCE 1 1 N
LET TAG = PATTERN 1 2 2 2 2 FOR I = 1 1 N
CHARACTER CIRCLE BLANK
CHARACTER FILL ON OFF
CHARACTER SIZE 1.5
PLOT Y X TAG
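The tag variable built by the PATTERN line can be mimicked in Python to see which points land in which trace. This sketch assumes that the pattern is repeated cyclically until N values have been generated. The function name is hypothetical.

```python
def repeat_pattern(pattern, n):
    """Repeat `pattern` cyclically to produce a list of length n."""
    return [pattern[i % len(pattern)] for i in range(n)]


tag = repeat_pattern([1, 2, 2, 2, 2], 12)
# 1-based positions of the points assigned to trace 1 (the circle character).
marked = [i + 1 for i, t in enumerate(tag) if t == 1]
```

Under that assumption, points 1, 6, 11, ... form trace 1 and receive the filled circle, while the remaining points form trace 2 and receive a blank character.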

NOTE 2

DATAPLOT provides a large range of plot control features for the plot. This includes titles, axis labels, legends, and so on. DATAPLOT sets these with separate commands (as opposed to arguments on the PLOT command itself). Each of these features typically has its own attribute setting commands as well. DATAPLOT simply uses whatever the current setting is for each of these attributes when it generates a plot. For example, a TITLE command is entered to define the plot title (nothing is actually generated until the next PLOT is performed). This title remains in effect for all subsequent plots until it is changed (another TITLE command) or deleted (TITLE with no arguments).

Most of the commonly used plot features are listed below in the RELATED COMMANDS section. The attribute setting commands are not listed (e.g., TITLE is listed, but TITLE COLOR and TITLE SIZE are not). See the documentation for the plot feature command for its attribute setting commands. These attribute setting commands are documented in the Plot Control chapter.

DEFAULT

None

SYNONYMS

VS and VS. are synonyms for VERSUS.

RELATED COMMANDS

CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
TITLE = Sets the plot title.
LABEL = Sets the plot axis labels.
LEGEND = Sets the plot legends.
BOX COORDINATES = Sets the locations for plot boxes.
ARROW COORDINATES = Sets the locations for plot arrows.
SEGMENT COORDINATES = Sets the locations for plot segments.
FRAME = Sets the on/off switch for the plot frame.
FRAME COORDINATES = Sets the location for the plot frame.
GRID = Sets the on/off switch for the plot grid.
LOG = Sets the on/off switch for log scale.
TIC = Sets the on/off switch for the plot tics.
TIC LABEL = Sets the on/off switch for the plot tic labels.
MARGIN COLOR = Sets the color for the plot margin.
BACKGROUND COLOR = Sets the color for the plot background.
PRE-ERASE = Sets the automatic pre-erase switch for plots.
SEQUENCE = Sets the automatic sequence switch for plots.
MULTIPLOT = Generate multiple plots per page.

APPLICATIONS

Data Analysis, Presentation Graphics

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

. THIS SAMPLE PROGRAM READS THE FILE BOXJE142.DAT IN THE DATAPLOT
. REFERENCE DIRECTORY. THESE DATA ARE YIELD FROM AN INDUSTRIAL PROCESS.
.
SKIP 25
READ BOXJE142.DAT YIELD
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE TIME SERIES PLOT
Y1LABEL YIELD
X1LABEL SEQUENCE NUMBER
XLIMITS 0 70
XTIC OFFSET 2 2
PLOT YIELD
.
LINE BLANK
CHARACTER CIRCLE; CHARACTER SIZE 1.2
PLOT YIELD
.
LET N = SIZE YIELD; LET X = DATA 1 N
LET A = MEAN YIELD; LET Y = DATA A A
CHARACTER OFF
SPIKE ON; SPIKE BASE A
PLOT YIELD AND
PLOT Y X
.
SPIKE OFF; BAR ON
BAR BASE A; BAR WIDTH 0.6; BAR FILL ON
PLOT YIELD AND
PLOT Y X
END OF MULTIPLOT

PROGRAM 2

. POLLUTION SOURCE ANALYSIS, LLOYD CURRIE, DATE--1990
. SUBSET OF CURRIE.DAT REFERENCE FILE
LET ID2 = DATA 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2
SERIAL READ LEAD
164 426 59 98 312 263 607 497 213 54 160 262 547 325 419 94 70
END OF DATA
SERIAL READ POT
106 175 61 79 94 121 424 328 107 218 140 179 246 231 245 339 99
END OF DATA
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE SCATTER PLOT; X1LABEL LEAD; Y1LABEL POTASSIUM
LINE BLANK ALL; CHARACTER CIRCLE; CHARACTER FILL ON
PLOT POT LEAD
CHARACTER CIRCLE SQUARE; CHARACTER FILL OFF ALL
TITLE SCATTER PLOT WITH GROUPS
LEGEND 1 CIRC() - GROUP 1; LEGEND 2 SQUA() - GROUP 2
LEGEND FILL ON; LEGEND FONT DUPLEX
PLOT POT LEAD ID2
CHARACTER CIRCLE CIRCLE SQUARE SQUARE; CHARACTER FILL OFF ON OFF ON
LET X = SEQUENCE 1 1 17; LEGEND 1 CIRC() - POTASSIUM; LEGEND 2 SQUA() - LEAD
X1LABEL SEQUENCE; Y1LABEL; TITLE CHARACTER FILL REPRESENTS GROUP ID
PLOT POT X ID2 AND
PLOT LEAD X ID2
CHARACTER BLANK ALL; LINE SOLID DASH; SEGMENT 2 PATTERN DASH
SEGMENT 1 COORDINATES 16 85 19 85; SEGMENT 2 COORDINATES 16 81 19 81
TITLE MULTIPLE TRACES AS LINES
PLOT POT LEAD VS X
END OF MULTIPLOT

PROGRAM 3

MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
LET FUNCTION F = (1/SQRT(2*PI))*EXP(-0.5*X**2)
LET FUNCTION D1 = DERIVATIVE F WRT X
LINES SOLID DOT DOT
TITLE PLOT A FUNCTION AND THE DERIVATIVE
PLOT F FOR X = -3 .1 3 AND
PLOT D1 FOR X = -3 .1 3
.
PRE-SORT OFF; FRAME OFF; DEGREES
LET THETA = SEQUENCE 0 10 1000; LET R = 2*THETA
LET Y = R*SIN(THETA); LET X = R*COS(THETA)
TITLE A POLAR COORDINATE FUNCTION
PLOT Y X
PRE-SORT ON; FRAME ON; DELETE Y X
.
SKIP 25
READ UGIANSKY.DAT Y1 Y2 LAB
LEGEND 1 INTERLAB ANALYSIS; TITLE YOUDEN PLOT
LINES BLANK ALL; CHARACTERS 1 2 3 4 5 6 7 8 9; CHARACTER SIZE 4 ALL
LIMITS 0 5.5
PLOT Y1 Y2 LAB
LEGEND 1; LIMITS
.
SKIP 25
READ CHWIRUT1.DAT Y X LAB
CHARACTER X ALL; CHARACTER SIZE 1.5 ALL; LINE SOLID ALL
TITLE SHOW SPREAD DUE TO REPLICATION
PLOT Y X X
END OF MULTIPLOT

... PPCC PLOT

PURPOSE

Generates a ppcc plot (that is, a probability plot correlation coefficient plot).

DESCRIPTION

A ppcc plot is a graphical data analysis technique for determining that member of the specified distributional family which provides a ``best'' distributional fit to the data. The distributional fit will be ``best'' in the sense that it will have the most linear probability plot of all the selected members of the family. For each of the selected members of the distributional family, the probability plot is computed, and the linearity of the probability plot is summarized via the correlation coefficient. The resulting PPCC plot thus consists of:

Vertical axis = probability plot correlation coefficient value;
Horizontal axis = distributional family parameter value.
The value of the distributional parameter (on the horizontal axis) which corresponds to the maximum of the PPCC plot curve (on the vertical axis) is, of course, of interest since it indicates the best-fit member of the family. PPCC plots are available for the following distributional families (with the distributional parameter in parentheses):

1. Tukey lambda (lambda)
2. t (nu)
3. chi-squared (nu)
4. gamma (gamma)
5. extreme value type 2 (gamma)
6. Pareto (gamma)
7. geometric (p)
8. Poisson (lambda)
9. Wald (gamma)
10. inverse Gaussian (gamma)
11. reciprocal inverse gaussian (gamma)
12. fatigue life (gamma)
13. Weibull (gamma)
14. Generalized Pareto (gamma)
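The construction of a PPCC curve can be sketched in Python for the Weibull family. The sketch assumes the minimum-order-statistic form of the standard Weibull percent point function and Filliben's approximation to the uniform order statistic medians. The grid of gamma values and all function names are illustrative choices, not DATAPLOT's.

```python
import math


def filliben_medians(n):
    """Approximate uniform order statistic medians (Filliben, 1975)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def weibull_ppf(p, gamma):
    """Percent point function of the standard (minimum) Weibull distribution."""
    return (-math.log(1.0 - p)) ** (1.0 / gamma)


def weibull_ppcc_curve(data, gammas):
    """Return the probability plot correlation coefficient for each gamma."""
    ordered = sorted(data)
    medians = filliben_medians(len(ordered))
    return [correlation(ordered, [weibull_ppf(p, g) for p in medians])
            for g in gammas]


# Synthetic data lying exactly on a gamma = 2 probability plot.
data = [weibull_ppf(p, 2.0) for p in filliben_medians(50)]
gammas = [0.5 + 0.25 * k for k in range(19)]       # 0.5, 0.75, ..., 5.0
ppcc = weibull_ppcc_curve(data, gammas)
best = max(range(len(gammas)), key=lambda k: ppcc[k])
max_ppcc, shape = ppcc[best], gammas[best]
```

Plotting ppcc against gammas gives the PPCC plot. The pair (max_ppcc, shape) plays the role of the maximum correlation and the corresponding parameter estimate, and with real data the maximum is below 1 and the curve is usually flatter near its peak.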

SYNTAX 1

<family> PPCC PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values under analysis;
<family> is one of the following:
TUKEY LAMBDA
T
CHI-SQUARED
GAMMA
EXTREME VALUE TYPE 2
PARETO
GEOMETRIC
POISSON
WALD
INVERSE GAUSSIAN
RIG
FL
WEIBULL
GENERALIZED PARETO
EXTREME VALUE
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for raw data.

SYNTAX 2

<family> PPCC PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies;
<x> is the variable of distinct values for the variable under analysis;
<family> is as above;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for the case with pre-computed frequencies.

EXAMPLES

LAMBDA PPCC PLOT X
T PPCC PLOT X
EXTREME VALUE TYPE 2 PPCC PLOT X
POISSON PPCC PLOT X
LAMBDA PPCC PLOT F X
T PPCC PLOT F X
EXTREME VALUE TYPE 2 PPCC PLOT F X
POISSON PPCC PLOT F X

NOTE 1

The Weibull, extreme value type II, and generalized Pareto distributions can be based on either the minimum or maximum order statistic. The command SET MINMAX <1/2> is required before the PPCC PLOT command for these distributions. A value of 1 specifies the minimum order statistic and a value of 2 specifies the maximum order statistic. Currently, the generalized Pareto distribution is only supported for the maximum order statistic (i.e., enter SET MINMAX 2).

NOTE 2

The EXTREME VALUE option listed above generates a WEIBULL PPCC PLOT and an EV2 PPCC PLOT overlaid on the same plot.

NOTE 3

The range of parameters is determined automatically. However, if you wish to restrict the range, you can specify the lower and upper limits by appending a 1 or 2 to the parameter name and assigning a value. For example, to restrict a Weibull ppcc plot to values 0.5 and 20, do the following:

LET GAMMA1 = 0.5
LET GAMMA2 = 20
WEIBULL PPCC PLOT Y

NOTE 4

The PPCC PLOT automatically saves several parameters. The MAXPPCC parameter contains the maximum correlation that was computed and the SHAPE parameter contains the value of the estimated distributional parameter (e.g., GAMMA for the Weibull distribution) that corresponds to MAXPPCC.

NOTE 5

The PROBABILITY PLOT command can be used to generate a probability plot for over 35 distributions. For distributions that are actually a family of distributions, the PPCC PLOT is typically employed first to get the best member of the family.

NOTE 6

The PPCC PLOT command writes conclusions based on the ppcc plot to the file DPCONF.TEX (the name may vary depending on the operating system) for some of the distribution families. This file is automatically opened when a DATAPLOT session starts, although only a few commands actually write anything to it.

DEFAULT

None

SYNONYMS

FRECHET and EV2 are synonyms for EXTREME VALUE TYPE 2.

LAMBDA PPCC PLOT and TUKEY PPCC PLOT are synonyms for TUKEY LAMBDA PPCC PLOT.

STUDENT T PPCC PLOT is a synonym for T PPCC PLOT.

The CHISQUARE term can be specified as CHISQUARE or CHI SQUARE.

FL PPCC PLOT, BRIN SAUNDERS PPCC PLOT, and SAUNDERS BRIN PPCC PLOT are synonyms for FATIGUE LIFE PPCC PLOT.

IG PPCC PLOT is a synonym for INVERSE GAUSSIAN PPCC PLOT.

RIG PPCC PLOT is a synonym for RECIPROCAL INVERSE GAUSSIAN PPCC PLOT.

RELATED COMMANDS

FREQUENCY PLOT = Generates a frequency plot.
HISTOGRAM = Generates a histogram.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PROBABILITY PLOT = Generates a probability plot.
PLOT = Generates a data or function plot.

REFERENCE

``Continuous Univariate Distributions,'' 2nd. ed., Johnson, Kotz, and Balakrishnan, John Wiley and Sons, 1994.

``The Probability Plot Correlation Coefficient Test for Normality,'' James J. Filliben. Technometrics, Vol. 17, No. 1, February 1975.

APPLICATIONS

Distributional Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

MULTIPLOT 5 3; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC
TIC LABEL SIZE 5; LABEL SIZE 5
Y1LABEL PPCC VALUE; X1LABEL PARAMETER VALUE
X1LABEL DISPLACEMENT 8
HEIGHT 5
JUSTIFICATION CENTER
SET MINMAX 1
.
LET LAMBDA = 1.5
LET Y = TUKEY LAMBDA RANDOM NUMBERS FOR I = 1 1 100
TUKEY LAMBDA PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET LAMBDA = 25
LET Y = POISSON RANDOM NUMBERS FOR I = 1 1 100
POISSON PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET NU = 25
LET Y = T RANDOM NUMBERS FOR I = 1 1 100
T PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET NU = 5
LET Y = CHI-SQUARE RANDOM NUMBERS FOR I = 1 1 100
CHI-SQUARE PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET P = 0.5
LET Y = GEOMETRIC RANDOM NUMBERS FOR I = 1 1 100
GEOMETRIC PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET GAMMA = 0.5
LET Y = GAMMA RANDOM NUMBERS FOR I = 1 1 100
GAMMA PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = EV2 RANDOM NUMBERS FOR I = 1 1 100
EV2 PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = PARETO RANDOM NUMBERS FOR I = 1 1 100
PARETO PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = WALD RANDOM NUMBERS FOR I = 1 1 100
WALD PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = INVERSE GAUSSIAN RANDOM NUMBERS FOR I = 1 1 100
INVERSE GAUSSIAN PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = RECIPROCAL INVERSE GAUSSIAN RANDOM NUMBERS FOR I = 1 1 100
RECIPROCAL INVERSE GAUSSIAN PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = FATIGUE LIFE RANDOM NUMBERS FOR I = 1 1 100
FATIGUE LIFE PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
LET Y = WEIBULL RANDOM NUMBERS FOR I = 1 1 100
WEIBULL PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
EXTREME VALUE PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
SET MINMAX 2
LET Y = GENERALIZED PARETO RANDOM NUMBERS FOR I = 1 1 100
GENERALIZED PARETO PPCC PLOT Y
MOVE 50 2; TEXT SHAPE = ^SHAPE
END OF MULTIPLOT

... PROBABILITY PLOT

PURPOSE

Generates a probability plot for one of 38 distributions.

DESCRIPTION

A probability plot is a graphical data analysis technique for determining how well the specified distribution fits the data set. Linearity in the probability plot is indicative of a good distributional fit. The probability plot consists of:

Vertical axis = ordered observations;
Horizontal axis = order statistic medians.
DATAPLOT has extensive probability plot capabilities--38 distributions/distributional families are available. When distributional families are specified, then the LET command is used before the PROBABILITY PLOT command to specify fully which member of the distributional family is desired. For example,

LET GAMMA = 5.3
WEIBULL PROBABILITY PLOT Y
The name of the distributional parameter for families is given in the list below.

SYNTAX 1

<dist> PROBABILITY PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values under analysis;
<dist> is one of the following distributions:
UNIFORM
SEMI-CIRCULAR
TRIANGULAR (C, defaults to 0)
NORMAL
LOGISTIC
DOUBLE EXPONENTIAL
CAUCHY
TUKEY LAMBDA (LAMBDA)
LOGNORMAL
HALFNORMAL
T (NU)
CHI-SQUARED (NU)
F (NU1, NU2)
EXPONENTIAL
GAMMA (GAMMA)
BETA (ALPHA, BETA)
WEIBULL (GAMMA)
EXTREME VALUE TYPE 1
EXTREME VALUE TYPE 2 (GAMMA)
PARETO (GAMMA)
BINOMIAL (N, P)
GEOMETRIC (P)
POISSON (LAMBDA)
NEGATIVE BINOMIAL (N, K, P)
WALD (GAMMA)
INVERSE GAUSSIAN (GAMMA)
RIG (GAMMA)
FL (GAMMA)
GENERALIZED PARETO (GAMMA)
DISCRETE UNIFORM (N)
NON-CENTRAL T (NU, LAMBDA)
NON-CENTRAL F (NU1, NU2, LAMBDA)
NON-CENTRAL CHI-SQUARE (NU, LAMBDA)
NON-CENTRAL BETA (ALPHA, BETA, LAMBDA)
DOUBLY NON-CENTRAL F (NU1, NU2, LAMBDA1, LAMBDA2)
DOUBLY NON-CENTRAL T (NU, LAMBDA1, LAMBDA2)
HYPERGEOMETRIC (K, N, M)
VON MISES (B)
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for raw data.

SYNTAX 2

<dist> PROBABILITY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies;
<x> is the variable of distinct values for the variable under analysis;
<dist> is as above;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for pre-computed frequencies.

EXAMPLES

NORMAL PROBABILITY PLOT X

CAUCHY PROBABILITY PLOT X

TUKEY LAMBDA PROBABILITY PLOT X

LOGNORMAL PROBABILITY PLOT X

WEIBULL PROBABILITY PLOT X

EXTREME VALUE TYPE 1 PROBABILITY PLOT X

POISSON PROBABILITY PLOT X

NORMAL PROBABILITY PLOT F X

CAUCHY PROBABILITY PLOT F X

TUKEY LAMBDA PROBABILITY PLOT F X

LOGNORMAL PROBABILITY PLOT F X

WEIBULL PROBABILITY PLOT F X

EXTREME VALUE TYPE 1 PROBABILITY PLOT F X

POISSON PROBABILITY PLOT F X

NOTE 1

For distributions that have a family of parameters, the PPCC PLOT can be used to find the optimal value of the parameter to use for generating the probability plot.

NOTE 2

The PROBABILITY PLOT command fits a least squares line to the resulting probability plot and automatically saves the following internal parameters:

PPCC = the correlation coefficient between the vertical and horizontal axis variables
PPA0 = the intercept of the fitted line
PPA1 = the slope of the fitted line
SDPPA0 = standard deviation of PPA0
SDPPA1 = standard deviation of PPA1
PPRESSD = residual standard deviation from fitted line
PPRESDF = residual degrees of freedom from fitted line
These parameters can be printed or used in subsequent computations if desired.
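
The quantities described in this note can be illustrated outside DATAPLOT. The following Python sketch (not DATAPLOT code) builds the coordinates of a normal probability plot and fits a least squares line to them. It assumes that the order statistic medians are obtained from Filliben's approximation to the uniform order statistic medians; the source text does not state which formula DATAPLOT uses, and the function names are illustrative only. The returned intercept, slope, and correlation play the roles of PPA0, PPA1, and PPCC.

```python
from statistics import NormalDist, mean

def uniform_order_statistic_medians(n):
    """Filliben's approximation (an assumption, not taken from the manual)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

def normal_probability_plot(data):
    """Return (x, y, intercept, slope, ppcc) for a normal probability plot.

    y = ordered observations (vertical axis)
    x = normal order statistic medians (horizontal axis)
    """
    y = sorted(data)
    x = [NormalDist().inv_cdf(p) for p in uniform_order_statistic_medians(len(y))]
    xbar, ybar = mean(x), mean(y)
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx                  # analogue of PPA1
    intercept = ybar - slope * xbar    # analogue of PPA0
    ppcc = sxy / (sxx * syy) ** 0.5    # analogue of PPCC
    return x, y, intercept, slope, ppcc

# Hypothetical data values, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.5, 4.9, 5.0, 4.4]
_, _, a0, a1, r = normal_probability_plot(sample)
print(a0, a1, r)
```

A correlation close to 1 corresponds to a nearly linear probability plot, which the DESCRIPTION above identifies as the sign of a good distributional fit.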

NOTE 3

The Weibull, extreme value type II, and generalized Pareto distributions can be based on either the minimum or maximum order statistic. The command SET MINMAX <1/2> is required before the PROBABILITY PLOT command for these distributions. A value of 1 specifies the minimum order statistic and a value of 2 specifies the maximum order statistic. Currently, the generalized Pareto distribution is only supported for the maximum order statistic (i.e., enter SET MINMAX 2).

DEFAULT

None

SYNONYMS

EV2 and FRECHET are synonyms for EXTREME VALUE TYPE 2.

EV1 and GUMBEL are synonyms for EXTREME VALUE TYPE 1.

FATIGUE LIFE is a synonym for FL.

RECIPROCAL INVERSE GAUSSIAN is a synonym for RIG.

IG is a synonym for INVERSE GAUSSIAN.

LAPLACE is a synonym for DOUBLE EXPONENTIAL.

RELATED COMMANDS

FREQUENCY PLOT = Generates a frequency plot.
HISTOGRAM = Generates a histogram.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Distributional Analysis

IMPLEMENTATION DATE

Pre-1987 (the saving of the various internal parameters was implemented 93/12; many distributions were added after 1987)

PROGRAM 1

MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC; X1LABEL THEORETICAL VALUE; Y1LABEL DATA VALUE
.
LET Y = NORMAL RANDOM NUMBERS FOR I = 1 1 100
NORMAL PROBABILITY PLOT Y
.
LET NU = 5
LET Y = CHI-SQUARE RANDOM NUMBERS FOR I = 1 1 100
CHI-SQUARE PROBABILITY PLOT Y
.
LET Y = EXPONENTIAL RANDOM NUMBERS FOR I = 1 1 100
EXPONENTIAL PROBABILITY PLOT Y
.
LET Y = CAUCHY RANDOM NUMBERS FOR I = 1 1 1000
LEGEND 1 CAUCHY RANDOM NUMBERS
NORMAL PROBABILITY PLOT Y
END OF MULTIPLOT

PROGRAM 2

. ALASKA PIPELINE RADIOGRAPHIC DEFECT BIAS CURVE
. PERFORM A LINEAR REGRESSION
SKIP 25
READ BERGER1.DAT TRUE MEAS
CAPTURE FIT_1_OUT.DAT
FIT MEAS TRUE
END OF CAPTURE
.
MULTIPLOT 2 2 ; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE ORIGINAL DATA
X1LABEL TRUE DEPTH (IN .001 INCH)
Y1LABEL MEASURED DEPTH
CHARACTERS X
LINES BLANK
PLOT MEAS TRUE
TITLE PREDICTED VALUES
PLOT MEAS PRED VS TRUE
TITLE RESIDUALS
Y1LABEL
PLOT RES VS TRUE
X1LABEL
TITLE NORMAL PROBABILITY PLOT
NORMAL PROBABILITY PLOT RES
END OF MULTIPLOT

PRODUCT PLOT

PURPOSE

Generates a subsample product versus subsample index plot.

DESCRIPTION

The subsample product is the product of the data in the subsample. The product plot is used to answer the question: ``Does the subsample product change over different subsamples?'' The plot consists of:

Vertical axis = subsample product;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample product. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

PRODUCT PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

PRODUCT PLOT Y X
PRODUCT PLOT Y X1 SUBSET X1 < 22

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
PRODUCT = Compute the product of the elements in a variable.
CUMULATIVE PRODUCT = Compute the cumulative product of the elements in a variable.
SUM PLOT = Generates a sum plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Rare Usage

IMPLEMENTATION DATE

88/9

PROGRAM

LET Y = DATA 2 4 5 10 4 10 8 2 3
LET X = DATA 1 1 1 2 2 3 3 3 3
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0.5 3.5
YLIMITS 0 500
Y1LABEL PRODUCT
X1LABEL GROUP ID
TITLE AUTOMATIC
PRODUCT PLOT Y X
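
As a cross-check on the program above, the following Python sketch (not DATAPLOT code; the function name is illustrative) computes the quantities that a product plot displays: one product per distinct value of X, plus the product of the full sample that defines the horizontal reference line.

```python
from math import prod

def subsample_products(y, x):
    """Return ({group id: product of y in that group}, product of all y)."""
    groups = {}
    for value, group in zip(y, x):
        groups.setdefault(group, []).append(value)
    by_group = {group: prod(values) for group, values in sorted(groups.items())}
    return by_group, prod(y)

# Data taken from the PROGRAM above.
y = [2, 4, 5, 10, 4, 10, 8, 2, 3]
x = [1, 1, 1, 2, 2, 3, 3, 3, 3]
by_group, full = subsample_products(y, x)
print(by_group)  # {1: 40, 2: 40, 3: 480}
print(full)      # 768000
```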

PROFILE PLOT

PURPOSE

Generates a profile plot.

DESCRIPTION

A profile plot is a graphical data analysis technique for examining the relative behavior of all variables in a multivariate data set. The profile plot consists of a sequence of equi-spaced vertical spikes with each spike representing a different variable in the multivariate data set. An individual profile plot examines the behavior of all such variables but only for a specified subset of the data (e.g., looking at all the attributes of car performance, but only for a particular car, such as Chevrolet). The total length of a given spike is uniformly set to unity for sake of reference. The ``data length'' of a given spike is proportional to the magnitude of the variable for the subset relative to the maximum magnitude of the variable across all subsets. (Thus we are looking at the ratio of the ``local'' value of the variable to the ``global'' maximum of the variable.) An interconnecting line cutting across each spike at the ``data length'' gives the profile plot its unique appearance and name.
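
The ``data length'' ratio described above can be sketched in Python (not DATAPLOT code). The sketch assumes that the subset selects a single observation and that ``magnitude'' means absolute value; the data values and the function name are hypothetical.

```python
def profile_lengths(rows, subset_index):
    """Ratio of each variable's local magnitude to its global maximum.

    rows         -- list of observations, each a list of k variable values
    subset_index -- index of the observation being profiled
    """
    k = len(rows[0])
    global_max = [max(abs(row[j]) for row in rows) for j in range(k)]
    return [
        abs(rows[subset_index][j]) / global_max[j] if global_max[j] else 0.0
        for j in range(k)
    ]

# Hypothetical data: 3 cars, 4 attributes each.
cars = [
    [20.0, 4.0, 100.0, 2500.0],
    [30.0, 6.0, 150.0, 3000.0],
    [15.0, 8.0, 200.0, 4000.0],
]
lengths = profile_lengths(cars, 0)
print(lengths)  # approximately [0.667, 0.5, 0.5, 0.625]
```

Each ratio lies between 0 and 1, which is consistent with the spikes having a total length of unity.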

SYNTAX

PROFILE PLOT <y1> <y2> ... <yk> <SUBSET/EXCEPT/FOR qualification>
where <y1> ... <yk> are the response variables;
and where the <SUBSET/EXCEPT/FOR qualification> must be given. It is not optional for this command as it is for most other DATAPLOT commands.

EXAMPLES

PROFILE PLOT Y1 Y2 Y3 Y4 Y5 SUBSET AUTO 4
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET STATE 25

NOTE 1

A few variations of the profile plot exist, all of which DATAPLOT can easily generate by judicious use of the components in the LINE, CHARACTER, SPIKE, and BAR commands (and their attribute setting commands). For example, suppose there are k variables in the profile plot (and so k spikes), then

2. element 1 of SPIKE and SPIKE LINE controls the existence and appearance of the spikes from the base out to the interconnecting line (some analysts prefer this to be SOLID, others prefer this to be DOTTED).
When using the SPIKE and SPIKE LINES commands in this context, note that both must be used if you want spikes to appear. The SPIKE (ON/OFF) command sets whether spikes will appear or not while the SPIKE LINES command sets the desired line type (SOLID, DOTTED, DASHED, etc.). Thus SPIKE ON and SPIKE LINES SOLID would for a profile plot turn the spikes on and set them to solid, respectively. Bars can be generated as an alternative to spikes. Some of the alternatives are demonstrated in the first program example below.

NOTE 2

The generation of multiple profile plots per page is typical (one profile plot for each subset of interest). This is easily done in DATAPLOT by using the PROFILE PLOT command in conjunction with the MULTIPLOT and LOOP commands. The following example generates 50 profile plots on the same page with each profile plot consisting of 6 variables:

MULTIPLOT 5 10
LOOP FOR K = 1 1 50
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 SUBSET STATE K
END OF LOOP

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

STAR PLOT = Generates a (multivariate) star plot.
ANDREWS PLOT = Generates an Andrews (multivariate) plot.
LINES = Sets the type for plot lines.
SPIKES = Sets the type for plot spikes.
MULTIPLOT = Allows multiple plots per page.
LOOP = Starts a loop (iteration).
^ = Allows string and value substitution.

REFERENCE

``Graphical Methods of Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983 (pp. 162-163).

APPLICATIONS

Multivariate Analysis

IMPLEMENTATION DATE

88/3

PROGRAM 1

DIMENSION 100 COLUMNS
SKIP 25
COLUMN LIMITS 20 132
READ AUTO79.DAT Y1 TO Y9
LET N = SIZE Y1
LET CAR = SEQUENCE 1 1 N
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE PROFILE PLOT
XLIMITS 1 9; XTIC OFFSET 0.5 0.5
MAJOR XTIC MARK NUMBER 9
YLIMITS 0 1
.
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR 1
.
SPIKE ON
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR 1
.
LINE BLANK; SPIKE BASE 0.5
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR 1
LINE SOLID; MOVEDATA 0.5 0.5; DRAWDATA 9.5 0.5
.
LINE BLANK; SPIKE OFF
BAR ON; BAR BASE 0.5
BAR FILL ON
BAR WIDTH 0.2
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR 1
LINE SOLID; MOVEDATA 0.5 0.5; DRAWDATA 9.5 0.5
END OF MULTIPLOT

PROGRAM 2

DIMENSION 100 COLUMNS
SKIP 25
COLUMN LIMITS 20 132
READ AUTO79.DAT Y1 TO Y9
LET N = SIZE Y1; LET CAR = SEQUENCE 1 1 N
.
COLUMN LIMITS 1 19; SKIP 0
LOOP FOR K = 1 1 25
LET K1 = 25+K
ROW LIMITS K1 K1
READ STRING AUTO79.DAT S^K
END OF LOOP
.
XLIMITS 1 9; XTIC OFFSET 0.5 0.5; MAJOR XTIC MARK NUMBER 9
YLIMITS 0 1; MAJOR YTIC MARK NUMBER 6; MINOR TIC MARK NUMBER 1
TIC LABEL SIZE 4
.
MULTIPLOT 5 5; MULTIPLOT CORNER COORDINATES 0 0 100 100
XFRAME OFF; Y2FRAME OFF; FRAME CORNER COORDINATES 15 10 95 95
.
LINE BLANK; BAR ON; BAR BASE 0.5; BAR FILL ON; BAR WIDTH 0.5
LEGEND 1 COORDINATES 55 5; LEGEND JUSTIFICATION CENTER; LEGEND SIZE 4
LOOP FOR K = 1 1 25
LEGEND 1 ^S^K
PROFILE PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR K
END OF LOOP
END OF MULTIPLOT

Q ... CONTROL CHART

PURPOSE

Generates a Quesenberry style mean, standard deviation, range, C, U, P, or NP control chart.

DESCRIPTION

The standard control chart is a data analysis technique for determining if a measurement process has gone out of statistical control. There are 7 types of control charts available:

1. mean control chart;
2. standard deviation control chart;
3. range control chart;
4. C control chart;
5. U control chart;
6. P control chart;
7. NP control chart.
For the mean, range, and standard deviation control charts, the plot consists of:

Vertical axis = the mean, range, or standard deviation for each sub-group;
Horizontal axis = sub-group designation.
For the C, U, P, and NP control charts, the plot consists of:

Vertical axis = either the number of defectives or the proportion of defectives for each sub-group;
Horizontal axis = sub-group designation.
In addition, horizontal lines are drawn at the mean (i.e., the mean of the means, ranges, or standard deviations) and at the upper and lower control limits.

Quesenberry control charts use modified formulas for standardizing the data for the mean, standard deviation, and range control charts and normalizing transformations for the P, NP, C, and U control charts. The 2 papers listed in the REFERENCE section provide the details. Quesenberry particularly recommends these charts for short production runs and for early detection of problems.

SYNTAX

Q <keyword> CONTROL CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <keyword> is one of MEAN, RANGE, S, C, U, NP, or P for a mean control chart, a range control chart, a standard deviation control chart, a C control chart, a U control chart, an NP control chart, or a P control chart respectively;
<y> is the response (= dependent) variable containing the raw data values;
<x> is an independent variable containing the sub-group identifications;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

Q MEAN CHART Y X
Q RANGE CONTROL CHART Y X
Q C CONTROL CHART Y X SUBSET X > 2

NOTE 1

For the mean, range, and standard deviation control charts, the distribution of the response variable is assumed to be normal. For the C and U control charts, the distribution of the response is assumed to be Poisson. For the P and NP control charts, the distribution of the response variable is assumed to be binomial.

NOTE 2

The attributes of the 4 traces that make up the control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BAR commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits. Some analysts prefer to draw the response variable as a character or spike rather than a connected line.

DEFAULT

None

SYNONYMS

The word CONTROL is optional in all of these commands.

Q XBAR CONTROL CHART and Q AVERAGE CONTROL CHART are synonyms for Q MEAN CONTROL CHART.

Q STANDARD DEVIATION CONTROL CHART and Q SD CONTROL CHART are synonyms for Q S CONTROL CHART.

Q RANGE CONTROL CHART is a synonym for Q R CONTROL CHART.

RELATED COMMANDS

CONTROL CHART = Generates mean, standard deviation, range, C, U, P, or NP control charts.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
PLOT = Generates a data or function plot.

REFERENCE

``SPC Q Charts for Start-Up Processes and Short or Long Runs,'' Quesenberry, Journal of Quality Technology, Vol. 23, No. 3, July 1991.

``SPC Q Charts for a Binomial Parameter p: Short or Long Runs,'' Quesenberry, Journal of Quality Technology, Vol. 23, No. 3, July 1991.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

93/12

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE AUTOMATIC
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER
LINE SOLID SOLID DOT DOT
TITLE Q MEAN CONTROL CHART
Q MEAN CHART DIAMETER BATCH
TITLE Q RANGE CONTROL CHART
Q RANGE CHART DIAMETER BATCH
TITLE Q SD CONTROL CHART
Q S CHART DIAMETER BATCH
.
END OF MULTIPLOT

QUANTILE-QUANTILE PLOT

PURPOSE

Generates a quantile-quantile plot.

DESCRIPTION

A quantile-quantile plot (or q-q plot) is a graphical data analysis technique for comparing the distributions of 2 data sets. The quantile-quantile plot is a graphical alternative for the various classical 2-sample tests (e.g., t for location, F for dispersion). The plot consists of the following:

Vertical axis = estimated quantiles from data set 1;
Horizontal axis = estimated quantiles from data set 2.
The ``quantiles'' of a distribution are the distribution's ``percent points'' (e.g., the .5 quantile = the 50% point = the median). The advantage of the quantile-quantile plot is 2-fold:

1. the sample sizes do not need to be equal;
2. many distributional aspects can be simultaneously tested. For example, shifts in location, shifts in dispersion, changes in symmetry/skewness, outliers, etc.
The quantile-quantile plot has 2 components:

1. the quantile points themselves;
2. a 45 degree reference line.
The appearance of these 2 components is controlled by the first 2 settings of the CHARACTERS and LINES commands. It is typical for the quantile points to be represented as, say, X's with no connecting line, and the reference line to have no plot characters but to be solid. This is demonstrated in the sample program below.

SYNTAX

QUANTILE-QUANTILE PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

QUANTILE-QUANTILE PLOT Y1 Y2
QUANTILE-QUANTILE PLOT RUN1 RUN2
QUANTILE-QUANTILE PLOT Y1 Y2 SUBSET STATE 25

NOTE 1

One of the distributions can be a theoretical distribution. For example, the following program generates a quantile-quantile plot of a data set against a normal distribution (this is called a normal quantile plot).

LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 100
LET X = SEQUENCE .01 .01 .99
LET Y2 = NORPPF(X)
QUANTILE-QUANTILE PLOT Y1 Y2
This same technique can be used for other distributions (use the proper PPF function). This is essentially what a probability plot does (DATAPLOT has a PROBABILITY PLOT command for 38 distributions).

NOTE 2

The Tukey mean-difference plot (or m-d plot) can be generated after the quantile-quantile plot. It takes the coordinates of the quantile-quantile plot (saved in the DATAPLOT internal variables YPLOT and XPLOT) and plots their difference (YPLOT - XPLOT) against their average ((YPLOT+XPLOT)/2). The advantage of this plot is that it converts the interpretation of a quantile-quantile plot to differences from a horizontal (rather than a diagonal) zero line. The program example below generates a quantile-quantile plot and then the corresponding m-d plot. The m-d plot should only be used if the two variables are on a common scale.
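
The arithmetic behind the m-d plot can be sketched in Python (not DATAPLOT code). The sketch assumes two samples of equal size, so that the quantile-quantile coordinates are simply the two sorted samples; how DATAPLOT estimates quantiles for unequal sample sizes is not covered here. The data values and the function name are hypothetical.

```python
def mean_difference(yplot, xplot):
    """Convert q-q coordinates to m-d coordinates (means, differences)."""
    means = [(y + x) / 2 for y, x in zip(yplot, xplot)]
    diffs = [y - x for y, x in zip(yplot, xplot)]
    return means, diffs

# Hypothetical samples of equal size on a common scale.
run1 = [12.0, 15.0, 11.0, 14.0]
run2 = [16.0, 13.0, 18.0, 15.0]

# With equal sample sizes, the q-q points pair the sorted values (assumption).
yplot, xplot = sorted(run1), sorted(run2)
means, diffs = mean_difference(yplot, xplot)
print(means)  # [12.0, 13.5, 15.0, 16.5]
print(diffs)  # [-2.0, -3.0, -2.0, -3.0]
```

Here every difference is negative, which would appear on the m-d plot as points lying entirely below the horizontal zero line.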

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.
HISTOGRAM = Generates a histogram.
PROBABILITY PLOT = Generates a probability plot.
T-TEST = Carries out a 2-sample t test.
ANOVA = Carries out an ANOVA.
MULTIPLOT = Allows multiple plots per page.
LOOP = Starts a loop (iteration).

REFERENCE

``Graphical Methods of Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983 (pp. 48-57).

``Visualizing Data,'' William Cleveland, Hobart Press, 1993.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

SKIP 25
READ AUTO83B.DAT Y1 Y2
DELETE Y2 SUBSET Y2 < 0
.
LINE BLANK SOLID
CHARACTER CIRCLE BLANK
CHARACTER SIZE 1.0
TITLE AUTOMATIC
QUANTILE-QUANTILE PLOT Y1 Y2
.
LET YMEAN = (YPLOT+XPLOT)/2
LET YDIFF = YPLOT - XPLOT
LET AMIN = MINIMUM YMEAN
LET AMAX = MAXIMUM YMEAN
LET XZERO = DATA AMIN AMAX
LET YZERO = DATA 0 0
.
TITLE TUKEY M-D PLOT
X1LABEL MEAN
Y1LABEL DIFFERENCE
YLIMITS -15 0
YTIC OFFSET 1 1
PLOT YDIFF YMEAN AND
PLOT YZERO XZERO

QUARTILE PLOT

PURPOSE

Generates a subsample quartile versus subsample index plot.

DESCRIPTION

The upper quartile is the 75th percentile (i.e., the point where 75% of the data values are below that point) and the lower quartile is the 25th percentile (i.e., the point where 25% of the data values are below that point). The subsample quartile is the quartile of the data in the subsample. The quartile plot is used to answer the question: ``Does the subsample spread change over different subsamples?'' It consists of:

Vertical axis = subsample quartile;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample quartile. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

<UPPER/LOWER> QUARTILE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
LOWER specifies a lower quartile plot and UPPER specifies an upper quartile plot;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

UPPER QUARTILE PLOT Y X
LOWER QUARTILE PLOT Y X1 SUBSET X1 < 12

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar (mean) control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

. PURPOSE--GENERATE A QUARTILE PLOT OF POINT BARROW FREON-11 DATA
SKIP 50
SET READ FORMAT 3F4.0,F5.0,F6.0,F3.0,2F9.0
READ PBF11.DAT YEAR DAY BOT SD F11 FLAG WV CO2
.
RETAIN YEAR DAY BOT SD F11 WV CO2 FLAG SUBSET FLAG 0
LET MONTH=INT(DAY/30.25)+1
.
TITLE QUARTILE PLOT
XLIMITS 0 15
YLIMITS 1 1.01
CHARACTER U U
LINE BLANK SOLID
UPPER QUARTILE PLOT WV MONTH
PRE-ERASE OFF
CHARACTER L L
LOWER QUARTILE PLOT WV MONTH

R CHART

PURPOSE

Generates a range control chart.

DESCRIPTION

A range control chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The R chart is sensitive to changes in variation in the measurement process. It consists of:

Vertical axis = the range for each sub-group;
Horizontal axis = sub-group designation.
In addition, horizontal lines are drawn at the mean range value and at the upper and lower control limits.

SYNTAX

R CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);
<x> is an independent variable (containing the sub-group identifications);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

R CHART Y X

NOTE 1

The distribution of the response variable is assumed to be normal. This assumption is the basis for calculating the upper and lower control limits.
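
As an illustration of how normal-theory limits for a range chart are commonly obtained, the Python sketch below (not DATAPLOT code) uses the conventional Shewhart factors D3 and D4 for equal subgroup sizes from 2 to 10. This is an assumption for illustration; the source text does not state the exact formulas DATAPLOT applies. The data values and the function name are hypothetical.

```python
# Conventional Shewhart range-chart factors, indexed by subgroup size.
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0,
      7: 0.076, 8: 0.136, 9: 0.184, 10: 0.223}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114, 6: 2.004,
      7: 1.924, 8: 1.864, 9: 1.816, 10: 1.777}

def r_chart(y, x):
    """Return (subgroup ranges, mean range, lower limit, upper limit)."""
    groups = {}
    for value, group in zip(y, x):
        groups.setdefault(group, []).append(value)
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal subgroup sizes")
    n = sizes.pop()
    ranges = {g: max(v) - min(v) for g, v in sorted(groups.items())}
    rbar = sum(ranges.values()) / len(ranges)
    return ranges, rbar, D3[n] * rbar, D4[n] * rbar

# Hypothetical data: 3 subgroups of size 3.
y = [50, 52, 49, 51, 56, 50, 48, 50, 51]
x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
ranges, rbar, lcl, ucl = r_chart(y, x)
print(ranges)  # {1: 3, 2: 6, 3: 3}
print(rbar)    # 4.0
print(lcl, ucl)
```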

NOTE 2

The attributes of the 4 traces can be controlled by the standard LINES, CHARACTERS, BARS, and SPIKES commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the control limits. Some analysts prefer to draw the response variable as a spike or character rather than a connected line.

DEFAULT

None

SYNONYMS

RANGE CHART for R CHART

R CONTROL CHART for R CHART

RANGE CONTROL CHART for R CHART

RELATED COMMANDS

XBAR CHART = Generates a mean control chart.
S CHART = Generates a standard deviation control chart.
P CHART = Generates a P control chart.
NP CHART = Generates an NP control chart.
U CHART = Generates a U control chart.
C CHART = Generates a C control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
4-PLOT = Generates 4-plot univariate analysis.
RANGE PLOT = Generates a range (vs subset) plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
LINE SOLID SOLID DOT DOT
TITLE AUTOMATIC
X1LABEL GROUP-ID
Y1LABEL RANGE
RANGE CONTROL CHART DIAMETER BATCH

RANGE PLOT

PURPOSE

Generates a subsample range versus subsample index plot.

DESCRIPTION

The subsample range is the difference between the maximum and minimum of the data in the subsample. The range plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' It consists of:

Vertical axis = subsample range;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample range. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

RANGE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

RANGE PLOT Y X
RANGE PLOT Y X SUBSET X = 2 TO 10

DEFAULT

None

SYNONYMS

R PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
RANGE CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL RANGE
X1LABEL SAMPLE ID
TITLE AUTOMATIC
RANGE PLOT Y X

RELATIVE STANDARD DEVIATION PLOT

PURPOSE

Generates a subsample relative standard deviation versus subsample index plot.

DESCRIPTION

The relative standard deviation is the standard deviation divided by the absolute value of the mean, expressed as a percentage (i.e., multiplied by 100). The subsample relative standard deviation is the relative standard deviation of the data in the subsample. The relative standard deviation plot is used to answer the question: ``Does the subsample spread change over different subsamples?'' It consists of:

Vertical axis = subsample relative standard deviation;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample relative standard deviation. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
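
The definition above can be sketched in Python (not DATAPLOT code). The sketch assumes the sample standard deviation with an n-1 divisor; the data values and the function names are hypothetical.

```python
from statistics import mean, stdev

def relative_sd(values):
    """100 * (sample standard deviation) / |mean|."""
    return 100.0 * stdev(values) / abs(mean(values))

def subsample_relative_sd(y, x):
    """Return ({group id: relative SD}, relative SD of the full sample)."""
    groups = {}
    for value, group in zip(y, x):
        groups.setdefault(group, []).append(value)
    by_group = {g: relative_sd(v) for g, v in sorted(groups.items())}
    return by_group, relative_sd(y)

# Hypothetical data: 2 subsamples of size 3.
y = [9, 10, 11, 18, 20, 22]
x = [1, 1, 1, 2, 2, 2]
by_group, full = subsample_relative_sd(y, x)
print(by_group)  # both subsamples have a relative SD of 10 percent
print(full)      # the full sample value is larger (about 37.7)
```

In this example the two subsamples have different standard deviations (1 and 2) but the same relative standard deviation, because the spread grows in proportion to the mean.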

SYNTAX

RELATIVE STANDARD DEVIATION PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

RELATIVE STANDARD DEVIATION PLOT Y X
RELATIVE STANDARD DEVIATION PLOT Y TAG SUBSET TAG > 2

DEFAULT

None

SYNONYMS

RELSD PLOT, RELATIVE SD PLOT, RELS PLOT, RS PLOT, and RSD PLOT are synonyms for RELATIVE STANDARD DEVIATION PLOT.

RELATED COMMANDS

RELATIVE STANDARD DEVIATION = Compute the relative standard deviation of a variable.
RELATIVE VARIANCE PLOT = Generate a relative variance plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates a mean control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL RELATIVE STANDARD DEVIATION
X1LABEL SAMPLE BATCH
TITLE AUTOMATIC
RELSD PLOT DIAMETER BATCH

RELATIVE VARIANCE PLOT

PURPOSE

Generates a subsample relative variance versus subsample index plot.

DESCRIPTION

The relative variance is the variance divided by the mean, multiplied by 100. The subsample relative variance is the relative variance of the data in the subsample. The relative variance plot is used to answer the question: ``Does the subsample spread change over different subsamples?'' The plot consists of:

Vertical axis = subsample relative variance;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample relative variance. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

RELATIVE VARIANCE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

RELATIVE VARIANCE PLOT Y X
RELATIVE VARIANCE PLOT Y TAG SUBSET TAG > 2

DEFAULT

None

SYNONYMS

RELATIVE VAR PLOT, RV PLOT, RVAR PLOT, RELV PLOT, RELVAR PLOT, COEFFICIENT VARIATION PLOT, and COEFFICIENT OF VARIATION PLOT are synonyms for RELATIVE VARIANCE PLOT.

RELATED COMMANDS

RELATIVE VARIANCE = Compute the relative variance of a variable.
RELSD PLOT = Generates a relative standard deviation plot.
CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates a mean control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL RELATIVE VARIANCE
X1LABEL SAMPLE BATCH
TITLE RELATIVE VARIANCE PLOT
RELATIVE VARIANCE PLOT DIAMETER BATCH

... ROOTOGRAM

PURPOSE

Generates a rootogram.

DESCRIPTION

A rootogram is a graphical data analysis technique for summarizing the distributional information of a variable. It consists of:

Vertical axis = square root of frequencies or relative frequencies;
Horizontal axis = response variable.
There are 4 types of rootograms:

1. rootogram (absolute counts);
2. relative rootogram (converts counts to proportions);
3. cumulative rootogram;
4. cumulative relative rootogram.
The rootogram is a modified version of a histogram. It plots the square roots of the frequencies rather than the raw frequencies. Many univariate data sets can be normalized with a square root transformation (particularly counts or measurement data that have a lower bound and tend to be skewed at the upper tail).

SYNTAX 1

ROOTOGRAM <x> <SUBSET/EXCEPT/FOR qualification>
RELATIVE ROOTOGRAM <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE ROOTOGRAM <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE ROOTOGRAM <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values which will appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data only.

SYNTAX 2

ROOTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
RELATIVE ROOTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE ROOTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
CUMULATIVE RELATIVE ROOTOGRAM <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies to appear on the vertical axis;
<x> is the variable of raw data values which will appear on the horizontal axis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have pre-computed frequencies at each horizontal axis value.

EXAMPLES

ROOTOGRAM TEMP
RELATIVE ROOTOGRAM TEMP
CUMULATIVE ROOTOGRAM TEMP
CUMULATIVE RELATIVE ROOTOGRAM TEMP
ROOTOGRAM COUNTS STATE
RELATIVE ROOTOGRAM COUNTS STATE
CUMULATIVE ROOTOGRAM COUNTS STATE
CUMULATIVE RELATIVE ROOTOGRAM COUNTS STATE

NOTE 1

The appearance of the bars on the rootogram (i.e., whether they are filled or not, the line width of the bar border, etc.) is controlled by the various bar attribute commands. A few are listed in the RELATED COMMANDS section below. See the documentation for the BAR command for a complete list of the bar attribute commands.

NOTE 2

By default, DATAPLOT uses a class width of 0.3 times the standard deviation of the variable. Use the CLASS WIDTH command to override this default. DATAPLOT also tends to generate a large number of zero frequency classes at the lower and upper tails. This tends to compress the rootogram on the horizontal axis. Use the XLIMITS command or the CLASS LOWER and CLASS UPPER commands to avoid plotting these zero frequency classes.
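
The default class width and the square root transformation can be sketched in Python (not DATAPLOT code). The sketch assumes the sample standard deviation with an n-1 divisor and classes that start at a chosen lower bound (the data minimum by default); how DATAPLOT actually positions its class boundaries is not specified here. The data values and the function name are hypothetical.

```python
from math import sqrt
from statistics import stdev

def rootogram(data, class_width=None, lower=None):
    """Return (class width, square roots of the class frequencies)."""
    if class_width is None:
        class_width = 0.3 * stdev(data)   # default width described above
    if lower is None:
        lower = min(data)                 # assumed start of the first class
    nbins = int((max(data) - lower) / class_width) + 1
    counts = [0] * nbins
    for value in data:
        if value >= lower:                # values below the lower bound are skipped
            counts[int((value - lower) / class_width)] += 1
    return class_width, [sqrt(c) for c in counts]

# Hypothetical data with an explicit class width and lower bound.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]
width, roots = rootogram(data, class_width=1.0, lower=0.5)
print(roots)  # square roots of the counts 1, 2, 3, 2, 1
```

Passing class_width and lower explicitly plays a role similar to that of the CLASS WIDTH and CLASS LOWER commands.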

NOTE 3

Although DATAPLOT does not have a FREQUENCY TABLE command, one can be generated with the following commands:

HISTOGRAM Y
LET YFREQ = YPLOT
LET XVAL = XPLOT
Then the variables YFREQ and XVAL essentially contain a frequency table. There is a LET subcommand called FREQUENCY. However, it does not generate a frequency table in the sense that a histogram or a frequency plot does. The frequency table can also be generated by replacing HISTOGRAM with ROOTOGRAM in the above sequence. However, be aware that this generates the square roots of the frequencies, not the raw frequencies.

DEFAULT

None

SYNONYMS

A synonym for CUMULATIVE RELATIVE ROOTOGRAM is RELATIVE CUMULATIVE ROOTOGRAM.

RELATED COMMANDS

HISTOGRAM = Generates a histogram.
FREQUENCY PLOT = Generates a frequency plot.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PROBABILITY PLOT = Generates a probability plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
CLASS LOWER = Sets the lower class minimum for histograms, frequency plots, and pie charts.
CLASS UPPER = Sets the upper class maximum for histograms, frequency plots, and pie charts.
CLASS WIDTH = Sets the class width for histograms, frequency plots, and pie charts.
MINIMUM = Sets the frame minima for all plots.
MAXIMUM = Sets the frame maxima for all plots.
LIMITS = Sets the frame limits for all plots.
PLOT = Generates a data or function plot.
BARS = Sets the on/off switches for plot bars.
BAR WIDTH = Sets the widths for plot bars.
BAR FILL = Sets the on/off switches for plot bar fills.
BAR PATTERN = Sets the types for bar fill patterns.
BAR BORDER LINE = Sets the types for bar border lines.

REFERENCE

Most introductory statistics books discuss frequency polygons and histograms. The rootogram is described in ``Exploratory Data Analysis,'' John Tukey, Addison-Wesley, 1977.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SET READ FORMAT F10.1
SKIP 25
READ SUNSPOT.DAT Y
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC
XLIMITS 0 200; XTIC OFFSET 10 40
MAJOR XTIC MARK NUMBER 6; MINOR XTIC MARK NUMBER 3
ROOTOGRAM Y
BAR FILL ON
RELATIVE ROOTOGRAM Y
BAR FILL OFF
BAR BORDER THICKNESS 0.3
CUMULATIVE ROOTOGRAM Y
BAR FILL ON
BAR PATTERN D1
BAR PATTERN SPACING 3
CUMULATIVE RELATIVE ROOTOGRAM Y
END OF MULTIPLOT

RUN SEQUENCE PLOT

PURPOSE

Generates a run sequence plot.

DESCRIPTION

A run sequence plot is a graphical data analysis technique for preliminary scanning of the data. It consists of:

Vertical axis = i-th observation;
Horizontal axis = dummy index i.
The run sequence plot is thus a plot of the raw data in the same order in which it resides in the variable. This is a useful first step in the analysis of any data (not just time series data) in that it provides information about trends, patterns in variation, and outliers. It also gives the analyst an excellent ``feel'' for the data.

SYNTAX

RUN SEQUENCE PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values under analysis;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

RUN SEQUENCE PLOT Y
RUN SEQUENCE PLOT Y2

NOTE

Plot points can be plotted as characters, connected lines, spikes, or bars. These are set independently of each other. The default is to plot each trace as a connected line with no symbol, no bar, and no spike. The LINE, CHARACTER, SPIKE, and BAR commands are used to set the switches for plotting a given trace as a connected line, a character, a spike, or a bar respectively.

There are attribute setting commands for lines, characters, spikes, and bars. See the documentation for LINE, CHARACTER, SPIKE, and BAR for a complete list of these commands. Attributes are set giving a list of values. The first trace uses the first setting, the second trace uses the second setting, and so on. For example, CHARACTER SIZE 2.0 3.0 1.5 sets the character size for trace 1 to 2.0, the character size for trace 2 to 3.0, and the character size for trace 3 to 1.5. Attributes can be set for up to 100 traces.
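
For instance, a run sequence plot drawn with symbols instead of a connected line could be requested as sketched below. The choice of X as the plot character is only an example.

. DRAW TRACE 1 AS CHARACTERS ONLY (NO CONNECTING LINE)
LINE BLANK
CHARACTER X
RUN SEQUENCE PLOT Y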

DEFAULT

None

SYNONYMS

PLOT Y is equivalent to RUN SEQUENCE PLOT Y.

RELATED COMMANDS

CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
TITLE = Sets the plot title.
LABEL = Sets the plot axis labels.
LEGEND = Sets the plot legends.
MULTIPLOT = Generates multiple plots per page.
PLOT = Generates a data or function plot.

APPLICATIONS

Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SKIP 25
READ BOXJE142.DAT YIELD
.
TITLE AUTOMATIC
Y1LABEL YIELD
X1LABEL SEQUENCE NUMBER
XLIMITS 0 70
XTIC OFFSET 2 2
PLOT YIELD

S CHART

PURPOSE

Generates a standard deviation control chart.

DESCRIPTION

A standard deviation control chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The S chart is sensitive to changes in variation in the measurement process. It consists of:

Vertical axis = the standard deviation for each sub-group.
Horizontal axis = sub-group designation.
In addition, horizontal lines are drawn at the mean standard deviation value and at the upper and lower control limits.

SYNTAX

S CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);
<x> is an independent variable (containing the sub-group identifications);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

S CHART Y X

NOTE 1

The distribution of the response variable is assumed to be normal. This assumption is the basis for calculating the upper and lower control limits.

NOTE 2

The attributes of the 4 traces can be controlled by the standard LINES, CHARACTERS, BARS, and SPIKES commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the control limits. Some analysts prefer to draw the response variable as a spike or character rather than a connected line.
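
A sketch of the spike-and-character style mentioned above follows. Each attribute command lists one setting per trace, in the order response, mean line, lower and upper control limits; the particular line styles and the X character are illustrative choices.

. TRACE 1 (RESPONSE) AS SPIKES WITH AN X SYMBOL, TRACE 2 (MEAN LINE) SOLID,
. TRACES 3 AND 4 (CONTROL LIMITS) DOTTED
LINE BLANK SOLID DOT DOT
CHARACTER X BLANK BLANK BLANK
SPIKE ON OFF OFF OFF
S CHART Y X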

NOTE 3

Versions prior to December 1993 have a bug in that the S CHART command conflicts with the SAVE command. Use the SD CHART or the SD CONTROL CHART syntax (see SYNONYMS section below) for these versions.

DEFAULT

None

SYNONYMS

SD CHART for S CHART.
S CONTROL CHART for S CHART.
SD CONTROL CHART for S CHART.

RELATED COMMANDS

R CHART = Generates a range control chart.
X CHART = Generates a mean control chart.
P CHART = Generates a p control chart.
NP CHART = Generates an Np control chart.
U CHART = Generates a U control chart.
C CHART = Generates a C control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
4-PLOT = Generates 4-plot univariate analysis.
STANDARD DEVIATION PLOT = Generates a standard deviation (vs subset) plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
LINE SOLID SOLID DOT DOT
TITLE AUTOMATIC
X1LABEL GROUP-ID
Y1LABEL STANDARD DEVIATION
SD CHART DIAMETER BATCH

SINE AMPLITUDE PLOT

PURPOSE

Generates a subsample sine amplitude versus subsample index plot.

DESCRIPTION

The subsample sine amplitude is the approximate least squares estimate of the amplitude in a single-frequency sinusoidal model based on data from that subsample only. The sine amplitude plot is used to answer the question: ``Does the subsample amplitude change over different subsamples?'' It consists of:

Vertical axis = subsample sine amplitude;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample sine amplitude. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
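
The single-frequency sinusoidal model referred to above is commonly written in the following form. The symbols C (constant), \alpha (amplitude), \omega (frequency), and \phi (phase) are notation assumed here for illustration; the text does not spell the model out.

\[ y_t = C + \alpha \sin(2\pi \omega t + \phi) + \varepsilon_t \]

Under this reading, the sine amplitude plot tracks the estimate of \alpha obtained from each subsample.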

SYNTAX

SINE AMPLITUDE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

SINE AMPLITUDE PLOT Y X
SINE AMPLITUDE PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

SA PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
SINE FREQUENCY PLOT = Generates a sine frequency plot.
COMPLEX DEMOD AMPL PLOT = Generates a complex demodulation amplitude plot.
COMPLEX DEMOD PHASE PLOT = Generates a complex demodulation phase plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
AUTOCORRELATION PLOT = Generates an autocorrelation plot.
SPECTRAL PLOT = Generates a spectral plot.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Time Series Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ SUNSPOT.DAT Y MONTH
CHARACTER CIRCLE BLANK
LINE BLANK SOLID
XLIMITS 1 12
XTIC OFFSET 0.5 0.5
X1TIC MARK LABEL FORMAT ALPHA
X1TIC MARK LABEL CONTENTS JAN FEB MARCH APRIL MAY JUNE JULY AUG SEP ...
OCT NOV DEC
MINOR XTIC MARK NUMBER 0
Y1LABEL SINE AMPLITUDE
TITLE AUTOMATIC
SINE AMPLITUDE PLOT Y MONTH

SINE FREQUENCY PLOT

PURPOSE

Generates a subsample sine frequency versus subsample index plot.

DESCRIPTION

The subsample sine frequency is the approximate least squares estimate of the frequency in a single-frequency sinusoidal model based on data from that subsample only. The sine frequency plot is used to answer the question: ``Does the subsample frequency change over different subsamples?'' It consists of:

Vertical axis = subsample sine frequency;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample sine frequency. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

SINE FREQUENCY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

SINE FREQUENCY PLOT Y X
SINE FREQUENCY PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

SF PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
SINE AMPLITUDE PLOT = Generates a sine amplitude plot.
COMPLEX DEMOD AMPL PLOT = Generates a complex demodulation amplitude plot.
COMPLEX DEMOD PHASE PLOT = Generates a complex demodulation phase plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
AUTOCORRELATION PLOT = Generates an autocorrelation plot.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Time Series Analysis

IMPLEMENTATION DATE

88/2

PROGRAM

TITLE AUTOMATIC
SKIP 25
READ ELNINO.DAT Y YEAR MONTH
.
MULTIPLOT 2 1; MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT Y
XLIMITS 1 12; XTIC OFFSET 1 1
MAJOR XTIC MARK NUMBER 12; MINOR XTIC MARK NUMBER 0
X1TIC MARK LABEL FORMAT ALPHA
X1TIC MARK LABEL CONTENT JAN FEB MARCH APRIL MAY JUNE JULY AUGUST SEP OCT ...
NOV DEC
Y1LABEL SINE FREQUENCY
YTIC OFFSET 0.05 0.05
LINE BLANK SOLID
CHARACTER CIRCLE BLANK
SINE FREQUENCY PLOT Y MONTH
END OF MULTIPLOT

SKEWNESS PLOT

PURPOSE

Generates a subsample skewness versus subsample index plot.

DESCRIPTION

The subsample skewness is the standardized third central moment of the data in the subsample. The skewness plot is used to answer the question: ``Does the subsample skewness change over different subsamples?'' It consists of:

Vertical axis = subsample skewness;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample skewness. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
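
The standardized third central moment referred to above can be written as follows for subsample i, where n_i is the subsample size, \bar{y}_i the subsample mean, and s_i the subsample standard deviation. The divisors shown (n_i in the numerator average) are an assumption and may differ from the convention DATAPLOT uses.

\[ \frac{\frac{1}{n_i}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^3}{s_i^3} \]

A value near zero indicates a roughly symmetric subsample; positive values indicate a longer upper tail and negative values a longer lower tail.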

SYNTAX

SKEWNESS PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

SKEWNESS PLOT Y X
SKEWNESS PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
KURTOSIS PLOT = Generates a kurtosis plot.
VARIANCE PLOT = Generates a variance plot.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL SKEWNESS
X1LABEL SAMPLE ID
TITLE AUTOMATIC
SKEWNESS PLOT Y X

... SPECTRAL PLOT

PURPOSE

Generates one of the following types of spectral plots:

1. autospectral plot
2. cross-spectral plot
3. co-spectral plot
4. quadrature spectral plot
5. coherency spectral plot
6. amplitude spectral plot
7. phase spectral plot
8. gain spectral plot
9. argand spectral plot

DESCRIPTION

A spectral plot is a graphical data analysis technique for examining frequency-domain models for a single equi-spaced time series or for two equi-spaced time series. It is used to assess autocorrelation and cyclic structure. The spectral power function is a smoothed Fourier transform of the autocovariance function. An equi-spaced time series is one in which the distance between adjacent points is constant. The spectral estimate (also called power) at a given frequency for a single discrete time series x_t is:
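
In the Jenkins and Watts formulation that DATAPLOT follows (see NOTE 2), the smoothed estimate is commonly written as shown here. This form is reconstructed from the symbols defined in the next paragraph, so the leading normalization constant should be treated as an assumption.

\[ C_{xx}(f) = 2D \left[ c_{xx}(0) + 2 \sum_{k=1}^{L-1} c_{xx}(k)\, w(k) \cos(2\pi f k D) \right], \qquad 0 \le f \le \frac{1}{2D} \]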

In the above equation, c_xx is the autocovariance function evaluated at the lag k, D is the distance between adjacent values (DATAPLOT always uses a value of 1), L defines how many lags of the autocovariance function to use, and f is the frequency. The w(k) is the window function (which specifies the type of smoothing). The window function, the number of lags to use in the autocovariance function, and the formulas for the other variations of the spectral plot are discussed further in NOTE sections below. The autospectral plot consists of:

Vertical axis = power (= variable contribution);
Horizontal axis = frequency (cycles per observation).
The frequency is measured in cycles per unit time where unit time is defined to be the distance between adjacent points. A frequency of 0 corresponds to an infinite cycle while a frequency of 0.5 corresponds to a cycle of 2 data points. Equi-spaced time series are inherently limited to detecting frequencies between 0 and 0.5.

From a data analysis point of view, the type of structure in the autocorrelation plot indicates the location of peaks in the spectral plot. The peaks in the spectral plot also indicate the dominant frequency for underlying cyclic models. Once the dominant peaks have been identified, the next step is typically to use other time series analysis techniques (such as the complex demodulation phase plots) to determine if this frequency is constant over the entire domain of the data, or to carry out a nonlinear fit with an underlying cyclic model (see the documentation for the COMPLEX DEMODULATION PLOT command).

SYNTAX 1

SPECTRAL PLOT <y1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for a single time series.

SYNTAX 2

<keyword> PLOT <y1> <y2> <SUBSET/EXCEPT/FOR qualification>
where <keyword> is one of the following:
CROSS-SPECTRAL
COSPECTRAL
QUADRATURE SPECTRAL
COHERENCY SPECTRAL
AMPLITUDE SPECTRAL
PHASE SPECTRAL
GAIN SPECTRAL
ARGAND SPECTRAL;
<y1> is the first response variable;
<y2> is the second response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for two time series.

EXAMPLES

SPECTRAL PLOT Y
SPECTRAL PLOT Y2 SUBSET Y2 > 2
CROSS-SPECTRAL PLOT Y1 Y2
COSPECTRAL PLOT Y1 Y2
QUADRATURE SPECTRAL PLOT Y1 Y2
COHERENCY SPECTRAL PLOT Y1 Y2
AMPLITUDE SPECTRAL PLOT Y1 Y2
PHASE SPECTRAL PLOT Y1 Y2
ARGAND SPECTRAL PLOT Y1 Y2

NOTE 1

Missing values are not allowed. It is also common to remove trends by differencing (x_t - x_{t-1}) or to apply some other type of filter before generating the spectral plot.
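
A sketch of differencing before the spectral plot is given below. It assumes that the LET subcommand SEQUENTIAL DIFFERENCE is available in your version of DATAPLOT; YDIFF is a hypothetical variable name.

. FIRST DIFFERENCES REMOVE A LINEAR TREND BEFORE THE SPECTRAL PLOT
. (SEQUENTIAL DIFFERENCE IS ASSUMED TO BE AVAILABLE)
LET YDIFF = SEQUENTIAL DIFFERENCE Y
SPECTRAL PLOT YDIFF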

NOTE 2

Different time series texts present the equations for the various spectral plots in slightly different forms. However, these forms should be mathematically equivalent. DATAPLOT uses the definitions from the Jenkins and Watts book (see REFERENCE below). The Bloomfield text (see REFERENCE below) uses slightly different, although mathematically equivalent, equations.

NOTE 3

The spectral plot is essentially a ``smoothed'' periodogram where the smoothing is done in the frequency domain. The periodogram is the Fourier transform of the autocovariance function while the autospectral plot is the smoothed Fourier transform of the autocovariance function. The periodogram can be normalized to generate a spectral density by using the autocorrelation function rather than the autocovariance function in the computational formulas (see the documentation for the PERIODOGRAM command). In a similar manner, the various spectral plots can be normalized by using the autocorrelation function rather than the autocovariance function. However, DATAPLOT does not currently provide a mechanism for specifying that the normalized estimates be used.

NOTE 4

The smoothing is determined by the choice of the window function. The 4 most common choices are the rectangular window, the Bartlett window, the Tukey window, and the Parzen window. These are all described in table 6.5 of the Jenkins and Watts text (see REFERENCE below). DATAPLOT uses the Tukey window. At this time, it does not allow the specification of an alternate window type. The formula for the Tukey window is:
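
In its usual textbook form (as tabulated in Jenkins and Watts; given here as the standard definition rather than taken from the DATAPLOT source), the window is:

\[ w(u) = \frac{1}{2}\left(1 + \cos\frac{\pi u}{M}\right), \qquad |u| \le M \]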

where M is the number of lags to compute the autocorrelation for and w(u) is zero if u is outside the given region.

NOTE 5

The number of lags is determined automatically. By default, the number of lags is (N is the number of points in the time series):

(N/4) - 1    if N > 32
(N/2) - 1    if 16 < N <= 32
N - 1        if N <= 16
If you want to override the default value, enter one of the following commands:

LET LAGS = <value>
LET LAG = <value>
LET NUMLAG = <value>
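
As a worked example of the default rule, a series with N = 200 points falls in the first case and gets (200/4) - 1 = 49 lags, while a series with N = 24 points falls in the second case and gets (24/2) - 1 = 11 lags. A sketch of overriding the default follows; the value 30 is arbitrary.

. USE 30 LAGS INSTEAD OF THE DEFAULT (ARBITRARY ILLUSTRATIVE VALUE)
LET LAGS = 30
SPECTRAL PLOT Y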

NOTE 6

The formula for the spectral plot above was for the autospectral plot. This section gives the formula for the other spectral plots.
The sample autocovariance functions are computed with the standard formulas (call these c11 and c22 for the first and second series respectively). The autospectral estimates are computed for the first and second series using the formula given above (call these C11 and C22 respectively).

In the formulas below, L is the number of lags of the autocovariance function and F is the maximum frequency (this will always be 0.5 since DATAPLOT assumes the distance between 2 points is one unit of time). The w(f) is the Tukey window discussed above.

The next step is to calculate the cross covariance functions:

From these, the even and odd cross covariance functions are calculated:

The co-spectral estimate is:

The quadrature spectral estimate is (it is always zero for frequencies 0 and F):

The cross-spectral plot draws the co-spectral estimate and the quadrature spectral estimate on the same graph.

The cross amplitude spectral estimate is:

The phase spectral estimate is:

The coherency spectral estimate is:

The gain spectral estimate is (this is the cross amplitude divided by the autospectrum of the first series):

The argand spectral plot consists of:

Vertical axis = the co-spectral estimate divided by the autospectral estimate of the first series;
Horizontal axis = the quadrature spectral estimate divided by the autospectral estimate of the second series.
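
For reference, the quantities named in this note are commonly written as follows in the Jenkins and Watts formulation, taking the spacing between points as 1. These are standard textbook forms offered as a sketch. The symbols \ell_{12} and q_{12} (even and odd cross covariances), P_{12} (co-spectrum), Q_{12} (quadrature spectrum), A_{12}, F_{12}, K_{12}, and G_{12} are notation assumed here, and DATAPLOT's exact normalization (for example, whether the coherency or the squared coherency is plotted) is not confirmed by the text above.

\[
\begin{aligned}
c_{12}(k) &= \frac{1}{N}\sum_{t=1}^{N-k} (x_{1,t} - \bar{x}_1)(x_{2,t+k} - \bar{x}_2), &
c_{12}(-k) &= \frac{1}{N}\sum_{t=1}^{N-k} (x_{1,t+k} - \bar{x}_1)(x_{2,t} - \bar{x}_2) \\
\ell_{12}(k) &= \tfrac{1}{2}\left[c_{12}(k) + c_{12}(-k)\right], &
q_{12}(k) &= \tfrac{1}{2}\left[c_{12}(k) - c_{12}(-k)\right] \\
P_{12}(f) &= 2\left[\ell_{12}(0) + 2\sum_{k=1}^{L-1} \ell_{12}(k)\, w(k) \cos(2\pi f k)\right], &
Q_{12}(f) &= 4\sum_{k=1}^{L-1} q_{12}(k)\, w(k) \sin(2\pi f k) \\
A_{12}(f) &= \sqrt{P_{12}(f)^2 + Q_{12}(f)^2}, &
F_{12}(f) &= \arctan\left(-\frac{Q_{12}(f)}{P_{12}(f)}\right) \\
K_{12}^2(f) &= \frac{A_{12}(f)^2}{C_{11}(f)\, C_{22}(f)}, &
G_{12}(f) &= \frac{A_{12}(f)}{C_{11}(f)}
\end{aligned}
\]

The sine terms vanish at f = 0 and f = 0.5, which is consistent with the quadrature estimate always being zero at those two frequencies.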

NOTE 7

The frequency is computed at a discrete number of points. DATAPLOT computes it at (N/2) equally spaced points where N is the number of points in the time series. However, it uses 120 frequencies as a lower bound and 1000 frequencies as an upper bound.
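
Read literally, this rule amounts to clamping N/2 to the interval from 120 to 1000. Writing n_f for the number of frequencies (a symbol introduced here for illustration):

\[ n_f = \min\bigl(\max(N/2,\ 120),\ 1000\bigr) \]

For example, N = 100 gives N/2 = 50, which is raised to 120; N = 600 gives 300; and N = 5000 gives 2500, which is capped at 1000. How N/2 is rounded for odd N is not stated.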

NOTE 8

DATAPLOT uses the algorithms on pages 310-312 and 418-420 of the Jenkins and Watts book (see REFERENCE). Although spectral estimates can be computed in terms of the fast Fourier transform, DATAPLOT does not use that method.

NOTE 9

Spectral plots are often drawn with log scales to provide better resolution (enter YLOG ON). Some analysts prefer to draw the plot as a solid connected line while others prefer to draw it with spikes. Either way is straightforward to generate with the proper settings for the LINES and SPIKES commands.
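
The two styles can be sketched as follows, using the same settings that appear in PROGRAM 1 below. The variable name Y is a placeholder.

. SPIKES ON A LINEAR SCALE
LINE BLANK
SPIKE ON
SPECTRAL PLOT Y
. CONNECTED LINE ON A LOG SCALE
YLOG ON
LINE SOLID
SPIKE OFF
SPECTRAL PLOT Y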

DEFAULT

None

SYNONYMS

SPECTRUM is a synonym for SPECTRAL.

The word PLOT is optional in the various SPECTRAL PLOT commands (e.g., SPECTRUM Y, CROSS-SPECTRUM Y1 Y2).

AUTO SPECTRAL PLOT is a synonym for SPECTRAL PLOT.

RELATED COMMANDS

PERIODOGRAM = Generates a periodogram plot.
CORRELATION PLOT = Generates a correlation plot.
COMPLEX DEMOD PLOT = Generates a complex demodulation plot.
LAG PLOT = Generates a lag plot.
PLOT = Generates a data or function plot.
4-PLOT = Generates 4-plot univariate analysis.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
LOG = Sets the log switches on/off.
SPIKES = Sets the on/off switches for plot spikes.
SUMMARY = Generates a table of summary statistics.
LET = Generates sin/cos transformations (plus much more).
FIT = Carries out a least squares fit.

REFERENCE

``Spectral Analysis and Its Applications,'' Jenkins and Watts, Holden-Day, 1968 (chapters 6-10).

``Spectral Analysis of Economic Time Series,'' Granger and Hatanaka, Princeton University Press, 1964 (pp. 77-79).

``Fourier Analysis of Time Series,'' Peter Bloomfield, John Wiley and Sons, 1976.

APPLICATIONS

Frequency Domain Time Series Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

. THIS SAMPLE PROGRAM READS THE FILE LEW.DAT IN THE DATAPLOT
. REFERENCE DIRECTORY. THESE DATA ARE BEAM DEFLECTION DATA.
.
SKIP 25
READ LEW.DAT DEFLECT
.
TITLE AUTOMATIC
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
LET A = MEAN DEFLECT
LINE BLANK; SPIKE ON; SPIKE BASE A
PLOT DEFLECT
.
SPIKE BASE 0
LINE BLANK SOLID DOT DOT
Y1LABEL AUTOCORRELATION
X1LABEL LAG
AUTOCORRELATION PLOT DEFLECT
.
Y1LABEL POWER
X1LABEL FREQUENCY
SPECTRAL PLOT DEFLECT
.
YLOG ON
LINE SOLID
SPIKE OFF
SPECTRAL PLOT DEFLECT
.
END OF MULTIPLOT

PROGRAM 2

. READ THE FILE DUTTON.DAT IN THE DATAPLOT REFERENCE DIRECTORY.
SKIP 25
READ DUTTON.DAT X Y1 Y2 Y3
.
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
TITLE AUTOMATIC; LINES SOLID DASH
PLOT Y1 Y3 VS X
Y1LABEL POWER; X1LABEL FREQUENCY
SPECTRAL PLOT Y1
SPECTRAL PLOT Y3
LINE BLANK SOLID DOT DOT DOT DOT; SPIKE ON; SPIKE BASE 0
Y1LABEL CORRELATION; X1LABEL LAG
CROSS CORRELATION PLOT Y1 Y3
. PAGE 2
MULTIPLOT 4 2
SPIKE OFF; LINES SOLID DASH
YMINIMUM 0; YTIC OFFSET 1 0; Y1LABEL POWER; X1LABEL FREQUENCY
CROSS-SPECTRAL PLOT Y1 Y3
COSPECTRAL PLOT Y1 Y3
AMPLITUDE SPECTRAL PLOT Y1 Y3
YMINIMUM; YTIC OFFSET 0 0
QUADRATURE SPECTRAL PLOT Y1 Y3
COHERENCY SPECTRAL PLOT Y1 Y3
PHASE SPECTRAL PLOT Y1 Y3
GAIN SPECTRAL PLOT Y1 Y3
Y1LABEL CO-SPECTRUM/SPECTRUM 1
X1LABEL QUADRATURE SPECTRUM/SPECTRUM 2
ARGAND SPECTRAL PLOT Y1 Y3
END OF MULTIPLOT

STANDARD DEVIATION PLOT

PURPOSE

Generates a subsample standard deviation versus subsample index plot.

DESCRIPTION

The subsample standard deviation is the standard deviation (with divisor n_i - 1) of the data in the subsample. The standard deviation plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' It consists of:

Vertical axis = subsample standard deviation;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample standard deviation. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
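
Written out for subsample i, with n_i observations y_{ij} and subsample mean \bar{y}_i (symbols introduced here for illustration), the plotted statistic is:

\[ s_i = \sqrt{\frac{1}{n_i - 1} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2} \]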

SYNTAX

STANDARD DEVIATION PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

STANDARD DEVIATION PLOT Y X
STANDARD DEVIATION PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

SD PLOT
S PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
VARIANCE PLOT = Generates a variance plot.
STAND DEVI OF THE MEAN PLOT = Generates a standard deviation of the mean plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL STANDARD DEVIATION
X1LABEL BATCH
TITLE STANDARD DEVIATION PLOT
STANDARD DEVIATION PLOT DIAMETER BATCH

STANDARD DEVIATION OF MEAN PLOT

PURPOSE

Generates a subsample standard deviation of the mean versus subsample index plot.

DESCRIPTION

The subsample standard deviation of the mean is the subsample standard deviation divided by the square root of the subsample size. The standard deviation of the mean plot is used to answer the question: ``Does the subsample variation of the mean change over different subsamples?'' It consists of:

Vertical axis = subsample standard deviation of the mean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample standard deviation of the mean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
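
In symbols, with s_i the standard deviation and n_i the size of subsample i (notation introduced here for illustration), the plotted statistic is:

\[ s_{\bar{y}_i} = \frac{s_i}{\sqrt{n_i}} \]

For example, a subsample with s_i = 2.0 and n_i = 16 has a standard deviation of the mean of 2.0 / 4 = 0.5.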

SYNTAX

STANDARD DEVIATION OF THE MEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

STANDARD DEVIATION OF THE MEAN PLOT Y X
STANDARD DEVIATION OF THE MEAN PLOT Y X1 SUBSET X1 > 2

DEFAULT

None

SYNONYMS

STANDARD DEVIATION OF MEAN PLOT

SDM PLOT

STANDARD DEVIATION MEAN PLOT

SD OF MEAN PLOT

SD MEAN PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
VARIANCE OF MEAN PLOT = Generates a variance of the mean plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL STANDARD DEVIATION OF THE MEAN
X1LABEL BATCH
TITLE STANDARD DEVIATION OF THE MEAN PLOT
STANDARD DEVIATION OF THE MEAN PLOT DIAMETER BATCH

STAR PLOT

PURPOSE

Generates a star plot.

DESCRIPTION

A star plot is a graphical data analysis technique for examining the relative behavior of all variables in a multivariate data set. The star plot consists of a sequence of equi-angular spokes (radii). Each spoke represents a different variable in the multivariate data set. An individual star plot examines the behavior of all such variables but only for a specified subset of the data (e.g., looking at all the attributes of car performance, but only for a particular car, such as Chevrolet). The total length of a given spoke is uniformly set to unity for sake of reference. The ``data length'' of a given spoke is proportional to the magnitude of the variable for the subset relative to the maximum magnitude of the variable across all subsets. Thus we are looking at the ratio of the ``local'' value of the variable to the ``global'' maximum of the variable. An interconnecting line cutting across each spoke at the ``data length'' gives the star plot its unique appearance and name.
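
One way to write the ``data length'' described above is shown below, where y_{ij} is the value of variable j for subset i and the maximum runs over all subsets m. The use of absolute values is an assumed reading of ``magnitude''; the text does not give the formula explicitly.

\[ r_{ij} = \frac{|y_{ij}|}{\max_{m} |y_{mj}|} \]

For example, if a variable takes the value 30 for the selected subset and its largest value over all subsets is 40, the interconnecting line crosses that spoke at 30/40 = 0.75 of its unit length.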

SYNTAX

STAR PLOT <y1> <y2> ... <yk> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<yk> is the last response variable;
and where the <SUBSET/EXCEPT/FOR qualification> must be given. It is not optional for this command as it is for most other DATAPLOT commands.

EXAMPLES

STAR PLOT Y1 Y2 Y3 Y4 Y5 SUBSET AUTO 4
STAR PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET STATE 25

NOTE 1

A few variations of the star plot exist, all of which DATAPLOT can easily handle by judicious use of the components in the LINES command. For example, suppose there are k variables in the star plot (and so k spokes), then

1. the first element of LINES controls the appearance of the interconnecting line;
2. the next k elements of LINES control the appearance of the spokes from the origin out to the interconnecting line (some analysts prefer this to be SOLID, others prefer this to be DOTTED);
3. the following k elements of LINES control the appearance of the spokes from the interconnecting line out to the uniform end of the spoke (some analysts prefer this to be SOLID, others prefer this to be DOTTED, others prefer this to be BLANKed out).
When using the LINES command in this context, note the convenient abbreviations SO for SOLID, DO for DOTTED, DA for DASHED, BL for BLANK, etc., as in LINES SO DO DO DO DO DO BL BL BL BL BL which for a 5-variable star plot, would set the interconnecting line to SOLID, the inner part of the spokes to DOTTED, and BLANK out the outer part of the spokes.
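
Combined with a STAR PLOT command, the setting above would look as follows. The variable names and the subset are taken from the EXAMPLES section and are placeholders.

. 5-VARIABLE STAR PLOT: SOLID INTERCONNECTING LINE, DOTTED INNER SPOKES,
. BLANK OUTER SPOKES
LINES SO DO DO DO DO DO BL BL BL BL BL
STAR PLOT Y1 Y2 Y3 Y4 Y5 SUBSET AUTO 4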

NOTE 2

The generation of multiple star plots per page is typical (one star plot for each subset of interest). This is easily done in DATAPLOT by using the STAR PLOT command in conjunction with the MULTIPLOT and LOOP commands. This is demonstrated in the program example below.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

PROFILE PLOT = Generates a (multivariate) profile plot.
ANDREWS PLOT = Generates an Andrews plot.
SYMBOL PLOT = Generates a symbol plot.
LINES = Sets the type for plot lines.
MULTIPLOT = Allows multiple plots per page.
LOOP = Starts a loop (iteration).
^ = Allows string and value substitution.

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983 (pp. 160-161).

APPLICATIONS

Multivariate Analysis

IMPLEMENTATION DATE

88/3

PROGRAM

DIMENSION 100 COLUMNS
SKIP 25; COLUMN LIMITS 20 132
READ AUTO79.DAT Y1 TO Y9
LET N = SIZE Y1; LET CAR = SEQUENCE 1 1 N
COLUMN LIMITS 1 19; SKIP 0
LOOP FOR K = 1 1 25
LET K1 = 25+K
ROW LIMITS K1 K1
READ STRING AUTO79.DAT S^K
END OF LOOP
.
MULTIPLOT 5 5; MULTIPLOT CORNER COORDINATES 0 0 100 100
FRAME CORNER COORDINATES 15 10 95 95
LEGEND 1 COORDINATES 55 5; LEGEND JUSTIFICATION CENTER; LEGEND SIZE 4
LOOP FOR K = 1 1 25
LEGEND 1 ^S^K
STAR PLOT Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 SUBSET CAR K
END OF LOOP
END OF MULTIPLOT

... STATISTIC PLOT

PURPOSE

Generates a statistic versus index plot for a given statistic.

DESCRIPTION

A statistic plot consists of subsample statistic versus subsample index. The subsample statistic is the value of some statistic for the data in the subsample. The statistic plot is used to answer the question: ``Does the subsample statistic change over different subsamples?'' The plot consists of:

Vertical axis = subsample statistic;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample statistic. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX 1

<stat> STATISTIC PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <stat> is one of the following statistics:

MEAN, MIDMEAN, MEDIAN, TRIMMED MEAN, WINSORIZED MEAN,
SUM, PRODUCT, SIZE (or NUMBER),
STANDARD DEVIATION, STANDARD DEVIATION OF MEAN,
VARIANCE, VARIANCE OF THE MEAN,
RELATIVE STANDARD DEVIATION, RELATIVE VARIANCE,
RANGE, MIDRANGE, MAXIMUM, MINIMUM, EXTREME,
LOWER HINGE, UPPER HINGE, LOWER QUARTILE, UPPER QUARTILE,
<FIRST/SECOND/THIRD/FOURTH/FIFTH/SIXTH/SEVENTH/EIGHTH/NINTH/TENTH> DECILE,
SKEWNESS, KURTOSIS, NORMAL PPCC,
AUTOCORRELATION, AUTOCOVARIANCE,
SINE FREQUENCY, SINE AMPLITUDE,
CP, CPK, EXPECTED LOSS, PERCENT DEFECTIVE,
TAGUCHI SN0 (or SN), TAGUCHI SN+ (or SNL),
TAGUCHI SN- (or SNS), TAGUCHI SN00 (or SN2);
<y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require a single variable to compute.

SYNTAX 2

<stat> STATISTIC PLOT <y1> <y2> <x> <SUBSET/EXCEPT/FOR qualification>
where <stat> is one of the following statistics:
LINEAR INTERCEPT, LINEAR SLOPE, LINEAR RESSD, LINEAR CORRELATION;
<y1> is the first response (= dependent) variable;
<y2> is the second response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used for statistics that require two variables to compute. If a linear fit is performed, the first variable is the dependent variable and the second variable is the independent variable.

EXAMPLES

MEAN PLOT Y X
STANDARD DEVIATION PLOT Y X1

NOTE 1

The subcommands (e.g., MEAN PLOT) are documented individually.

NOTE 2

Although DATAPLOT supports this command for a large number of statistics, there may be cases where you want the plot for an unsupported statistic. The following example shows how to generate it for the rank correlation (assume Y1 and Y2 are the response variables and TAG is the group identifier).

LET TAGDIST = DISTINCT TAG
LET NGROUP = SIZE TAGDIST
LOOP FOR K = 1 1 NGROUP
LET IGROUP = TAGDIST(K)
LET A = RANK CORRELATION Y1 Y2 SUBSET TAG = IGROUP
LET YNEW(K) = A
LET XNEW(K) = K
END OF LOOP
LET A = RANK CORRELATION Y1 Y2
LET YNEW2 = DATA A A
LET XNEW2 = DATA 1 NGROUP
PLOT YNEW XNEW AND
PLOT YNEW2 XNEW2
This basic idea can easily be adapted to other statistics (even ones that are not built into DATAPLOT). It can also be adapted to statistics that require an arbitrary number of variables to compute.
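
For readers who want to check the loop above against an independent calculation, the same group-by-group logic can be sketched in Python. This is a minimal sketch, not DATAPLOT's algorithm: the helper names are hypothetical, ties receive average ranks, the rank correlation is taken as the Pearson correlation of the ranks, and groups are visited in sorted order (the order returned by DISTINCT may differ).

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5


def rank_correlation(a, b):
    return pearson(ranks(a), ranks(b))


def rank_correlation_by_group(y1, y2, tag):
    """Return ((k, statistic) per group, full-sample statistic)."""
    points = []
    for k, g in enumerate(sorted(set(tag)), start=1):
        a = [p for p, t in zip(y1, tag) if t == g]
        b = [q for q, t in zip(y2, tag) if t == g]
        points.append((k, rank_correlation(a, b)))
    return points, rank_correlation(y1, y2)


y1 = [1, 2, 3, 3, 2, 1]
y2 = [10, 20, 30, 10, 20, 30]
tag = [1, 1, 1, 2, 2, 2]
print(rank_correlation_by_group(y1, y2, tag))
```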

DEFAULT

None

SYNONYMS

On most of the commands, the word STATISTIC is optional and is usually omitted (e.g., the mean plot is documented under MEAN PLOT rather than MEAN STATISTIC PLOT). The one exception is for the AUTOCORRELATION STATISTIC PLOT where the word STATISTIC is required.

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
BOX PLOT = Generates a box plot.
CONTROL CHART = Generates a control chart.
PLOT = Generates a data or function plot.
SUMMARY = Computes various statistics for a variable.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/2 (a few of the statistics have been added at various times since then, see the date on the individual command)

PROGRAM 1

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
TITLE AUTOMATIC
MULTIPLOT 3 3 ; MULTIPLOT CORNER COORDINATES 0 0 100 100
CHARACTER X ALL; LINE BLANK ALL
XTIC OFFSET 1 1
X1LABEL BATCH; Y1LABEL DIAMETER
PLOT DIAMETER BATCH BATCH
CHARACTERS BOX PLOT; LINES BOX PLOT; FENCES ON
BOX PLOT DIAMETER BATCH
.
LINE BLANK SOLID
CHARACTER X BLANK
Y1LABEL MEAN; TITLE MEAN PLOT
MEAN PLOT DIAMETER BATCH
Y1LABEL RANGE; TITLE RANGE PLOT
RANGE PLOT DIAMETER BATCH
Y1LABEL STANDARD DEVIATION; TITLE SD PLOT
STANDARD DEVIATION PLOT DIAMETER BATCH
Y1LABEL RELATIVE STANDARD DEVIATION; TITLE RELSD PLOT
RELSD PLOT DIAMETER BATCH
Y1LABEL SKEWNESS; TITLE SKEWNESS PLOT
SKEWNESS PLOT DIAMETER BATCH
Y1LABEL AUTOCORRELATION; TITLE AUTOCORRELATION PLOT
AUTOCORRELATION STATISTIC PLOT DIAMETER BATCH
Y1LABEL S/N; TITLE TAGUCHI SN PLOT
TAGUCHI SN PLOT DIAMETER BATCH
END OF MULTIPLOT

PROGRAM 2

SKIP 25
READ BERGER1.DAT Y X BATCH
.
TITLE AUTOMATIC
XTIC OFFSET 1 1
MULTIPLOT 2 2 ; MULTIPLOT CORNER COORDINATES 0 0 100 100
LINE BLANK SOLID
CHARACTER X BLANK
Y1LABEL SLOPE
LINEAR SLOPE PLOT Y X BATCH
Y1LABEL INTERCEPT
LINEAR INTERCEPT PLOT Y X BATCH
Y1LABEL CORRELATION
LINEAR CORRELATION PLOT Y X BATCH
Y1LABEL RESSD
LINEAR RESSD PLOT Y X BATCH
END OF MULTIPLOT

STEM AND LEAF PLOT

PURPOSE

Generates a stem and leaf plot.

DESCRIPTION

A stem and leaf plot is a graphical data analysis technique for summarizing the distributional information of a variable. It is similar to a histogram, but it preserves the original numeric values in the data. As such, it is an effective alternative to the histogram for small to moderate size data sets. It is not recommended for large data sets.
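
The way a stem and leaf display preserves the data values can be illustrated with a short Python sketch. It is a simplified illustration rather than DATAPLOT's algorithm: the function name is hypothetical, the data are assumed to be non-negative integers, and the stem is fixed as the tens digit with the leaf as the units digit. DATAPLOT chooses its own stem width and may split a stem over two lines.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a simple stem-and-leaf display.

    Assumes non-negative integers; stem = value // 10, leaf = value % 10.
    """
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>2} : {leaves}")
    return lines


for line in stem_and_leaf([12, 15, 21, 23, 23, 38, 40]):
    print(line)
```

Each original value can be read back by joining a stem with one of its leaves (stem 2 with leaf 3 is 23), which is the property that distinguishes this display from a histogram.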

SYNTAX

STEM AND LEAF PLOT <x> <SUBSET/EXCEPT/FOR qualification>
where <x> is the variable of raw data values;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

STEM AND LEAF PLOT TEMP

NOTE

Although the stem and leaf plot is a graphics command, the plot is generated as alphanumeric output, not as graphics output. This means that if device 2 is on, the stem and leaf plot is not generated in the plot file DPPL1F.DAT. The CAPTURE command can be used to direct the stem and leaf output to a text file.

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

FREQUENCY PLOT = Generates a frequency plot.
HISTOGRAM = Generates a histogram.
PIE CHART = Generates a pie chart.
PERCENT POINT PLOT = Generates a percent point plot.
PROBABILITY PLOT = Generates a probability plot.
PPCC PLOT = Generates a probability plot correlation coefficient plot.
CAPTURE = Redirects alphanumeric output to a file.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER
STEM AND LEAF PLOT DIAMETER
The following output is generated:

97 :
98 : 01244
98 : 788
99 : 00111123344444
99 : 5555566666666666666677777788888888888889999
00 : 000000000122222222224444
00 : 56666699
01 : 03
01 : 8

SUM PLOT

PURPOSE

Generates a subsample sum versus subsample index plot.

DESCRIPTION

The subsample sum is the sum of the data in the subsample. The sum plot is used to answer the question: ``Does the subsample sum-of-counts change over different subsamples?'' It consists of:

Vertical axis = subsample sum;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample sum. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
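
As a worked check, the following Python sketch computes the values that the sum plot displays for the small data set used in the PROGRAM section of this command. The function name subsample_sums is hypothetical, and ordering the subsamples by sorted identifier is an assumption.

```python
def subsample_sums(y, x):
    """Return ((identifier, subsample sum) pairs, full-sample sum)."""
    sums = {}
    for yi, xi in zip(y, x):
        sums[xi] = sums.get(xi, 0) + yi
    return sorted(sums.items()), sum(y)


y = [2, 4, 5, 10, 10, 20, 25, 30, 35]
x = [1, 1, 1, 2, 2, 3, 3, 3, 3]
print(subsample_sums(y, x))
```

The three subsample sums are 11, 20, and 110, and the reference line falls at the full sample sum of 141.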

SYNTAX

SUM PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

SUM PLOT Y X
SUM PLOT Y X1 SUBSET X1 > 2

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
FREQUENCY (LET) = Computes subsample frequencies.
HISTOGRAM = Generates a histogram.
PRODUCT PLOT = Generates a product plot.
MEAN PLOT = Generates a mean plot.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/9

PROGRAM

LET Y = DATA 2 4 5 10 10 20 25 30 35
LET X = DATA 1 1 1 2 2 3 3 3 3
LINE BLANK DASH
CHARACTER X BLANK
XLIMITS 0.5 3.5
Y1LABEL SUM
X1LABEL GROUP ID
TITLE AUTOMATIC
SUM PLOT Y X

SYMBOL PLOT

PURPOSE

Generates a symbol plot.

DESCRIPTION

A symbol plot is a scatter plot for which the attributes of the plot symbols are controlled by the values of other variables. This plot allows the size, type, color, and fill attribute to be controlled.

SYNTAX 1

SYMBOL PLOT <y> <x> <size> <SUBSET/EXCEPT/FOR qualification>
where <y> is the vertical axis variable;
<x> is the horizontal axis variable;
<size> is a variable that controls the size of the plot symbol;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

SYMBOL PLOT <y> <x> <size> <symbol> <SUBSET/EXCEPT/FOR qualification>
where <y> is the vertical axis variable;
<x> is the horizontal axis variable;
<size> is a variable that controls the size of the plot symbol;
<symbol> is a variable that controls the type of symbol plotted;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 3

SYMBOL PLOT <y> <x> <size> <symbol> <color> <SUBSET/EXCEPT/FOR qualification>
where <y> is the vertical axis variable;
<x> is the horizontal axis variable;
<size> is a variable that controls the size of the plot symbol;
<symbol> is a variable that controls the type of symbol plotted;
<color> is a variable that controls the color of the symbol;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 4

SYMBOL PLOT <y> <x> <size> <symbol> <color> <fill> <SUBSET/EXCEPT/FOR qualification>
where <y> is the vertical axis variable;
<x> is the horizontal axis variable;
<size> is a variable that controls the size of the plot symbol;
<symbol> is a variable that controls the type of symbol plotted;
<color> is a variable that controls the color of the symbol;
<fill> is an optional variable that controls whether the symbol is filled or not;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

SYMBOL PLOT Y X PRESSURE
SYMBOL PLOT Y X PRESSURE TYPE
SYMBOL PLOT Y X PRESSURE TYPE ICOL
SYMBOL PLOT Y X PRESSURE TYPE ICOL IFILL

NOTE 1

SYMBOL PLOT Y and SYMBOL PLOT Y X are equivalent to PLOT Y and PLOT Y X respectively. The only difference is that the line pattern is automatically set to blank.

NOTE 2

The symbol size for the i-th point is scaled by S(i)/MAXIMUM(ABS(S)), where S denotes the <size> variable. The point with the maximum value of S is drawn at the size given by the CHARACTER SIZE (or CHARACTER HW) command, while the other points multiply this size by their scale factor. Scale factors below 5% are raised to 5%.
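
The scaling rule in this note can be sketched in Python as follows. This is an illustration only: the function and parameter names are hypothetical, the values being scaled are assumed to be those of the size variable, base_size stands in for the CHARACTER SIZE setting, and treating the 5% limit as a floor on the scale factor is an interpretation of the note.

```python
def symbol_sizes(size_var, base_size, floor=0.05):
    """Scale each value by the largest absolute value; floor the factor at 5%."""
    biggest = max(abs(v) for v in size_var)  # assumed to be non-zero
    sizes = []
    for v in size_var:
        factor = max(v / biggest, floor)
        sizes.append(base_size * factor)
    return sizes


print(symbol_sizes([10, 5, 0.1], base_size=4.0))
```

In this example the largest value is drawn at the full size of 4.0, the middle value at half that size, and the smallest value at the 5% floor rather than at 1% of the full size.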

NOTE 3

The <size>, <symbol>, <color>, and <fill> variables are optional, but must be entered in that order. To skip one, set all the elements of the skipped variable to 1. For example,

LET N = SIZE Y
LET SYMBOL = 1 FOR I = 1 1 N
SYMBOL PLOT Y X SIZE SYMBOL COLOR

NOTE 4

The <symbol> and <color> variables define indices to the CHARACTER and CHARACTER COLOR commands respectively. That is, they contain values between 1 and 100 (e.g., a value of 3 would go to the third setting of these commands).

NOTE 5

Zero values for the <fill> variable indicate non-filled characters while non-zero values indicate filled characters. In addition, to generate filled characters a fillable symbol must be used (e.g., CIRCLE or SQUARE).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTER = Sets the type for plot characters.
CHARACTER COLOR = Sets the color for plot characters.
CHARACTER FILL = Sets the fill for plot characters.
ANDREWS PLOT = Generates an Andrews plot.
PROFILE PLOT = Generates a profile plot.
STAR PLOT = Generates a star plot.
PLOT = Generates a data or function plot.

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983.

APPLICATIONS

Multivariate Analysis

IMPLEMENTATION DATE

92/12

PROGRAM

DIMENSION 20 COLUMNS
MULTIPLOT 2 2; MULTIPLOT CORNER COORDINATES 0 0 100 100
SKIP 25
READ CURRIE.DAT ID1 ID2 FRWC CTOTAL POT LEAD IRON
LET TYPE = ID1; LET TYPE = 1 SUBSET ID1 < 10
LET TYPE = 2 SUBSET ID1 = 11 TO 19; LET TYPE = 3 SUBSET ID1 > 19
LET N = SIZE ID1; LET DUMMY = 1 FOR I = 1 1 N
TITLE AUTOMATIC; Y1LABEL POTASSIUM; X1LABEL LEAD
CHARACTER HW 5.0 3.75 ALL
.
CHARACTER CIRCLE SQUARE DIAMOND TRIANGLE
LEGEND 1 IRON DETERMINES SYMBOL SIZE
SYMBOL PLOT POT LEAD IRON
.
CHARACTER HW 2.0 1.5 ALL
LEGEND 1 ID1 DETERMINES SYMBOL TYPE
SYMBOL PLOT POT LEAD DUMMY TYPE
.
CHARACTER HW 5.0 3.75 ALL
LEGEND 1 IRON DETERMINES SYMBOL SIZE
LEGEND 2 ID1 DETERMINES SYMBOL TYPE
SYMBOL PLOT POT LEAD IRON TYPE
.
LET AFILL = 0 FOR I = 1 1 N
LET AFILL = 1 SUBSET ID2 = 1
LEGEND 3 ID2 DETERMINES FILL
SYMBOL PLOT POT LEAD IRON TYPE DUMMY AFILL
END OF MULTIPLOT

SYMMETRY PLOT

PURPOSE

Generates a symmetry plot.

DESCRIPTION

A symmetry plot is a graphical data analysis technique for assessing whether a data set is symmetric about its median. It consists of:

Vertical axis = Y(n-i+1) - median;
Horizontal axis = median - Y(i);
where median is the sample median, Y(i) is the i-th smallest value of the sample variable, n is the sample size, and i goes from 1 to the index of the median point. This plot graphs the distance from the median of points above the median against the distance of the corresponding points below the median. The interpretation of this plot is that the closer these points lie to the 45 degree line, the more symmetric the data are. The symmetry plot can be generated either for raw data or for pre-computed frequencies (i.e., grouped data).
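
The plot coordinates for raw data can be sketched in Python as shown below. The function name is hypothetical, and stopping at n // 2 pairs (so that the median point itself, which would plot at the origin for odd n, is left out) is an assumption rather than documented DATAPLOT behavior.

```python
from statistics import median


def symmetry_plot_points(data):
    """Return (distance below median, distance above median) pairs."""
    y = sorted(data)
    n = len(y)
    med = median(y)
    points = []
    for i in range(n // 2):
        below = med - y[i]          # horizontal axis: i-th smallest value
        above = y[n - 1 - i] - med  # vertical axis: i-th largest value
        points.append((below, above))
    return points


print(symmetry_plot_points([1, 2, 3, 4, 5]))   # symmetric sample
print(symmetry_plot_points([1, 2, 3, 4, 10]))  # long upper tail
```

For the symmetric sample every pair has equal coordinates and so lies on the 45 degree line. For the second sample the pair built from the extremes lies well above that line, which signals a long upper tail.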

SYNTAX 1

SYMMETRY PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of raw data values;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have raw data.

SYNTAX 2

SYMMETRY PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the variable of pre-computed frequencies;
<x> is the variable of group identifiers;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This syntax is used when you have pre-computed frequencies.

EXAMPLES

SYMMETRY PLOT Y1
SYMMETRY PLOT Y1 GROUP
SYMMETRY PLOT Y1 SUBSET Y1 > -3.0

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTERS = Sets the type for plot characters.
PLOT = Generates a data or function plot.
PROBABILITY PLOT = Generates a probability plot.
PERCENT POINT PLOT = Generates a percent point plot.

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

88/9

PROGRAM

SKIP 25
READ MARSHAK.DAT Y
CHARACTER CIRCLE
CHARACTER HW 2.0 1.5
LINE BLANK
Y1LABEL POINTS ABOVE MEDIAN
X1LABEL POINTS BELOW MEDIAN
TITLE AUTOMATIC
SYMMETRY PLOT Y

TAGUCHI SN PLOT

PURPOSE

Generates a Taguchi signal-to-noise plot for the ``target is better'' (= ``nominal is better'') case with a ``variance is dependent on the mean'' subcase.

DESCRIPTION

This primary Taguchi SN plot answers the question: ``What level of the independent variable yields the ``best'' value of the response as measured by the largest value of the signal-to-noise (S/N) ratio?'' For this ``target is better'' case, the S/N ratio is defined as:

S/N = 10*LOG10(xbar**2/s**2)
where xbar is the subsample mean and s is the subsample standard deviation. The Taguchi SN plot consists of the following:

Vertical axis = the Taguchi S/N value for each sub-group;
Horizontal axis = sub-group designation.
A reference line is drawn at the full sample S/N ratio.
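
The statistic plotted for each sub-group can be checked with a short Python sketch. It assumes the conventional nominal-is-best form, 10*log10(xbar**2/s**2), and a sample standard deviation with divisor n - 1. The function name taguchi_sn is hypothetical.

```python
import math
from statistics import mean, stdev


def taguchi_sn(values):
    """Nominal-is-best S/N ratio (assumed form): 10*log10(xbar^2 / s^2)."""
    xbar = mean(values)
    s = stdev(values)  # sample standard deviation, divisor n - 1
    return 10 * math.log10(xbar ** 2 / s ** 2)


print(taguchi_sn([9, 10, 11]))
```

A sub-group with mean 10 and standard deviation 1 gives a ratio of 20. Less scatter around the same mean gives a larger ratio.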

SYNTAX

TAGUCHI SN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable that contains the raw data values;
<x> is an independent variable that contains the sub-group identifications;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TAGUCHI SN PLOT YIELD CATALYST
TAGUCHI SN PLOT Y X SUBSET MATERIAL 4

DEFAULT

None

SYNONYMS

The word TAGUCHI is optional (i.e., SN PLOT is a synonym for TAGUCHI SN PLOT).

S/N, SN0, S/N0, SNT, and S/NT are synonyms for SN.

RELATED COMMANDS

TAGUCHI SN (LET) = Computes the Taguchi SN statistic for a variable.
TAGUCHI SN+ PLOT = Generates a (larger is better) signal-to-noise plot.
TAGUCHI SN- PLOT = Generates a (smaller is better) signal-to-noise plot.
TAGUCHI SN00 PLOT = Generates a (target is better; variance is independent of the mean) signal-to-noise plot.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
CONTROL CHART = Generates a mean, range, standard deviation, P, C, U, or NP control chart.
Q CONTROL CHART = Generates a Quesenberry style control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switch for plot spikes.

REFERENCE

``Statistical Methods and Applications,'' Jack Elliot, Allied Signal, 1987 (pp. 4-3, 4-4).

APPLICATIONS

Experiment Design and Quality Control

IMPLEMENTATION DATE

88/8

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
CHARACTERS X ALL
CHARACTER SIZE 3 ALL
TITLE AUTOMATIC
X1LABEL BATCH
Y1LABEL DIAMETER
XTIC OFFSET 0.5 0.5
TIC LABEL SIZE 3
MULTIPLOT 2 1; MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER BATCH BATCH
CHARACTER X BLANK
LINE BLANK SOLID
Y1LABEL SN RATIO
TAGUCHI SN PLOT DIAMETER BATCH
END OF MULTIPLOT

TAGUCHI SN00 PLOT

PURPOSE

Generates a Taguchi signal-to-noise plot for the ``target is better'' (= ``nominal is better'') case with a ``variance is independent of the mean'' subcase.

DESCRIPTION

This Taguchi SN00 plot answers the question: ``What level of the independent variable yields the ``best'' value of the response as measured by the largest value of the signal-to-noise (S/N) ratio?'' For this ``target is better'' case, the S/N ratio is defined as:

S/N = -10*LOG10(s**2)
where s is the subsample standard deviation. The Taguchi SN00 plot consists of the following:

Vertical axis = the Taguchi S/N value for each sub-group;
Horizontal axis = sub-group designation.
A reference line is drawn at the full sample S/N ratio.
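
A minimal Python sketch of this statistic follows. It assumes the conventional form for the variance-independent subcase, -10*log10(s**2), with the sample variance computed using divisor n - 1. The function name taguchi_sn00 is hypothetical.

```python
import math
from statistics import variance


def taguchi_sn00(values):
    """Nominal-is-best S/N ratio, variance-independent subcase (assumed form)."""
    return -10 * math.log10(variance(values))


print(taguchi_sn00([0, 10, 20]))
```

Because only the variance enters the ratio, shifting every observation by a constant leaves the value unchanged, whereas reducing the scatter increases it.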

SYNTAX

TAGUCHI SN00 PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable that contains the raw data values;
<x> is an independent variable that contains the sub-group identifications;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TAGUCHI SN00 PLOT YIELD CATALYST
TAGUCHI SN00 PLOT Y X SUBSET MATERIAL 4

DEFAULT

None

SYNONYMS

The word TAGUCHI is optional (i.e., SN00 PLOT is a synonym for TAGUCHI SN00 PLOT).

SNT2, S/N2, and SN2 are synonyms for SN00.

RELATED COMMANDS

TAGUCHI SN00 (LET) = Computes the Taguchi SN00 statistic for a variable.
TAGUCHI SN+ PLOT = Generates a (larger is better) signal-to-noise plot.
TAGUCHI SN- PLOT = Generates a (smaller is better) signal-to-noise plot.
TAGUCHI SN PLOT = Generates a (target is better; variance is dependent on the mean) signal-to-noise plot.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
CONTROL CHART = Generates a mean, range, standard deviation, P, NP, C, or U control chart.
Q CONTROL CHART = Generates a Quesenberry control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switch for plot spikes.

REFERENCE

``Statistical Methods and Applications,'' Jack Elliot, Allied Signal, 1987 (pp. 4-3, 4-4).

APPLICATIONS

Experiment Design and Quality Control

IMPLEMENTATION DATE

88/8

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
CHARACTERS X ALL
CHARACTER SIZE 3 ALL
TITLE AUTOMATIC
X1LABEL BATCH
Y1LABEL DIAMETER
XTIC OFFSET 0.5 0.5
TIC LABEL SIZE 3
MULTIPLOT 2 1; MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER BATCH BATCH
CHARACTER X BLANK
LINE BLANK SOLID
Y1LABEL SN RATIO
TAGUCHI SN00 PLOT DIAMETER BATCH
END OF MULTIPLOT

TAGUCHI SN+ PLOT

PURPOSE

Generates a Taguchi signal-to-noise plot for the ``larger is better'' case.

DESCRIPTION

The Taguchi SN+ plot answers the question: ``What level of the independent variable yields the ``best'' value of the response (as measured by the largest value of the signal-to-noise ratio)?'' The ``+'' in SN+ stands for ``larger is better.'' For this ``larger is better'' case, the signal-to-noise ratio is defined as:

S/N = -10*LOG10((1/N)*SUM(1/y(i)**2))
where N is the number of observations in the subsample and y(i) are the observations in the subsample. The Taguchi SN+ plot consists of the following:

Vertical axis = the Taguchi S/N value for each sub-group;
Horizontal axis = sub-group designation.
A reference line is drawn for the full sample S/N ratio.
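
The larger-is-better ratio can be sketched in Python as shown below. The sketch assumes the conventional form, -10*log10 of the mean of 1/y**2, and requires non-zero observations. The function name taguchi_sn_plus is hypothetical.

```python
import math


def taguchi_sn_plus(values):
    """Larger-is-better S/N ratio (assumed form): -10*log10(mean(1/y^2))."""
    n = len(values)
    return -10 * math.log10(sum(1 / v ** 2 for v in values) / n)


print(taguchi_sn_plus([10, 10, 10]))
print(taguchi_sn_plus([100, 100, 100]))
```

Multiplying every response by 10 raises the ratio by 20, so sub-groups with uniformly larger responses plot higher.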

SYNTAX

TAGUCHI SN+ PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable that contains the raw data values;
<x> is an independent variable that contains the sub-group identifications;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TAGUCHI SN+ PLOT YIELD CATALYST
TAGUCHI SN+ PLOT Y X SUBSET MATERIAL 4

DEFAULT

None

SYNONYMS

The word TAGUCHI is optional (i.e., SN+ PLOT is a synonym for TAGUCHI SN+ PLOT).

S/N+ and SNL are synonyms for SN+.

RELATED COMMANDS

TAGUCHI SN+ (LET) = Computes the Taguchi SN+ statistic for a variable.
TAGUCHI SN PLOT = Generates a (target is better; variance is dependent on the mean) signal-to-noise plot.
TAGUCHI SN- PLOT = Generates a (smaller is better) signal-to-noise plot.
TAGUCHI SN00 PLOT = Generates a (target is better; variance is independent of the mean) signal-to-noise plot.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
CONTROL CHART = Generates a mean, range, standard deviation, P, NP, C, or U control chart.
Q CONTROL CHART = Generates a Quesenberry control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switch for plot spikes.

REFERENCE

``Statistical Methods and Applications,'' Jack Elliot, Allied Signal, 1987 (pp. 4-3, 4-4).

APPLICATIONS

Experiment Design Analysis and Quality Control

IMPLEMENTATION DATE

88/8

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
CHARACTERS X ALL
CHARACTER SIZE 3 ALL
TITLE AUTOMATIC
X1LABEL BATCH
Y1LABEL DIAMETER
XTIC OFFSET 0.5 0.5
TIC LABEL SIZE 3
MULTIPLOT 2 1; MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER BATCH BATCH
CHARACTER X BLANK
LINE BLANK SOLID
Y1LABEL SN RATIO
TAGUCHI SN+ PLOT DIAMETER BATCH
END OF MULTIPLOT

TAGUCHI SN- PLOT

PURPOSE

Generates a Taguchi signal-to-noise plot for the ``smaller is better'' case.

DESCRIPTION

The Taguchi SN- plot answers the question: ``What level of the independent variable yields the ``best'' value of the response (as measured by the largest value of the signal-to-noise ratio)?'' The ``-'' in SN- stands for ``smaller is better.'' For this ``smaller is better'' case, the signal-to-noise ratio is defined as:

S/N = -10*LOG10((1/N)*SUM(y(i)**2))
where N is the number of observations in the subsample and y(i) are the observations in the subsample. The Taguchi SN- plot consists of the following:

Vertical axis = the Taguchi S/N value for each sub-group;
Horizontal axis = sub-group designation.
A reference line is drawn for the full sample S/N ratio.
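
The smaller-is-better ratio can be sketched in Python in the same way. The sketch assumes the conventional form, -10*log10 of the mean of y**2, and requires at least one non-zero observation. The function name taguchi_sn_minus is hypothetical.

```python
import math


def taguchi_sn_minus(values):
    """Smaller-is-better S/N ratio (assumed form): -10*log10(mean(y^2))."""
    n = len(values)
    return -10 * math.log10(sum(v ** 2 for v in values) / n)


print(taguchi_sn_minus([10, 10, 10]))
print(taguchi_sn_minus([1, 1, 1]))
```

The sub-group with the smaller responses has the larger ratio, so the ``best'' level is still the one that plots highest.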

SYNTAX

TAGUCHI SN- PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable that contains the raw data values;
<x> is an independent variable that contains the sub-group identifications;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TAGUCHI SN- PLOT YIELD CATALYST
TAGUCHI SN- PLOT Y X SUBSET MATERIAL 4

DEFAULT

None

SYNONYMS

The word TAGUCHI is optional (i.e., SN- PLOT is a synonym for TAGUCHI SN- PLOT).

S/N- and SNS are synonyms for SN-.

RELATED COMMANDS

TAGUCHI SN PLOT = Generates a (target is better; variance is dependent on the mean) signal-to-noise plot.
TAGUCHI SN+ PLOT = Generates a (larger is better) signal-to-noise plot.
TAGUCHI SN00 PLOT = Generates a (target is better; variance is independent of the mean) signal-to-noise plot.
MEAN PLOT = Generates a mean plot.
SD PLOT = Generates a standard deviation plot.
CONTROL CHART = Generates a mean, range, standard deviation, P, NP, C, or U control chart.
Q CONTROL CHART = Generates a Quesenberry control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switch for plot spikes.

REFERENCE

``Statistical Methods and Applications,'' Jack Elliot, Allied Signal, 1987 (pp. 4-3, 4-4).

APPLICATIONS

Experiment Design Analysis and Quality Control

IMPLEMENTATION DATE

88/8

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
CHARACTERS X ALL
CHARACTER SIZE 3 ALL
TITLE AUTOMATIC
X1LABEL BATCH
Y1LABEL DIAMETER
XTIC OFFSET 0.5 0.5
TIC LABEL SIZE 3
MULTIPLOT 2 1; MULTIPLOT CORNER COORDINATES 0 0 100 100
PLOT DIAMETER BATCH BATCH
CHARACTER X BLANK
LINE BLANK SOLID
Y1LABEL SN RATIO
TAGUCHI SN- PLOT DIAMETER BATCH
END OF MULTIPLOT

TAIL AREA PLOT

PURPOSE

Generates a tail area plot (the empirical survival distribution function against the sorted failure times).

DESCRIPTION

In reliability analysis, many data sets consist of a set of failure times where these failure times are typically truncated at some upper limit value. The cumulative distribution function (or CDF) of these failure times is defined as:

F(t) = prob(T < t)
where T is the lifetime of a randomly selected unit. That is, the cumulative distribution is the probability that an item fails before time t. The survival distribution function (SDF) is defined as:

S(t) = prob(T > t)
= 1 - F(t)
That is, the survival function is the probability that an item is still surviving at time t.

A tail area plot is a plot of the empirical SDF versus the sorted failure times. The empirical SDF of t is the number of observations greater than or equal to t divided by the number of observations. At each failure time, DATAPLOT calculates the following two points and plots them on the vertical axis:

y1 = (N - I + 1)/(N + 1)
y2 = (N - I)/(N + 1)

where N is the number of data points and I is the rank of the failure time. Only one of these points is calculated for the last failure time.

When all of the points are connected, a staircase type plot results. The vertical step is constant for the failure times. The lengths of the horizontal steps are determined by the distances between the sorted failure times. Ties show up as longer vertical strips. The beginning of a horizontal strip represents an estimate of the survival probability at time t(i) while the end of the horizontal strip represents an estimate of the survival probability at time t(i+1). This estimate is essentially the number of items still surviving divided by the total number of items (plus 1).
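
The staircase can be reproduced with a short Python sketch that generates the two points per failure time. The function name is hypothetical, tied failure times are given consecutive ranks, and keeping y1 rather than y2 for the last failure time is an assumption (the text does not say which of the two points is retained).

```python
def tail_area_points(failure_times):
    """Return the (time, survival estimate) vertices of the staircase."""
    t = sorted(failure_times)
    n = len(t)
    points = []
    for rank, ti in enumerate(t, start=1):
        points.append((ti, (n - rank + 1) / (n + 1)))  # y1: top of the step
        if rank < n:
            points.append((ti, (n - rank) / (n + 1)))  # y2: bottom of the step
    return points


print(tail_area_points([30, 10, 20]))
```

With three failure times the estimate drops by 1/(N + 1) = 0.25 at each failure. Each vertical drop is followed by a horizontal run to the next failure time.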

SYNTAX

TAIL AREA PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable containing failure times;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TAIL AREA PLOT Y1
TAIL AREA PLOT Y1 SUBSET TAG > 1

NOTE 1

Some analysts prefer to plot the log of the empirical SDF function. Enter the command YLOG ON to do this. In addition, the empirical cumulative hazard function is -log(empirical SDF). This can be plotted with the following commands:
TAIL AREA PLOT Y
LET YNEW = -LOG(YPLOT)
LET XNEW = XPLOT
PLOT YNEW XNEW

NOTE 2

If the data are censored, a more complicated estimate is required for the empirical SDF. This estimate is called the Kaplan-Meier or product-limit estimate. DATAPLOT does not generate explicit Kaplan-Meier estimates at this time.

NOTE 3

An estimate of F(t) can be obtained by entering the command LET CDF = 1 - YPLOT after generating the tail area plot.

DEFAULT

None

SYNONYMS

SURVIVAL PLOT

RELATED COMMANDS

LINES = Sets the type for plot lines.
CHARACTERS = Sets the type for plot characters.
CME PLOT = Generates a conditional mean exceedance plot.
WEIBULL PLOT = Generates a Weibull plot.
PROBABILITY PLOT = Generates a probability plot (over 25 distributions).
PLOT = Generates a data or function plot.

REFERENCE

``Statistical Models and Methods for Lifetime Data,'' Lawless, Wiley, 1982.

APPLICATIONS

Reliability

IMPLEMENTATION DATE

88/9

PROGRAM

SKIP 25
READ HAHN.DAT MILES TAG
LEGEND 1 CUT-OFF IS 135,000 MILES; LEGEND 1 COORDINATES 20 25
TITLE AUTOMATIC
XLABEL MILES WHEN FAILED; Y1LABEL CUMULATIVE PROPORTION SURVIVING
XLIMITS 0 150000
TAIL AREA PLOT MILES

TRIMMED MEAN PLOT

PURPOSE

Generates a subsample trimmed mean versus subsample index plot.

DESCRIPTION

The subsample trimmed mean is the mean of the data with 100*p1% of the data deleted from the bottom and 100*p2% of the data deleted from the top of the ascending-ordered data set. The trimmed mean plot is used to answer the question: ``Does the subsample location change over different subsamples?'' It consists of:

Vertical axis = subsample trimmed mean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample trimmed mean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

TRIMMED MEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

TRIMMED MEAN PLOT Y X
TRIMMED MEAN PLOT Y X SUBSET X > 2

NOTE

The analyst usually precedes the TRIMMED MEAN PLOT command with 2 LET commands of the following type:

LET P1 = .05
LET P2 = .10
where P1 and P2 are both between 0 and 1 so as to indicate the amount of data to be trimmed from the left (bottom) and right (top) respectively. In the example here, P1 = .05 and P2 = .10 indicates that 5% of the smallest data and 10% of the largest data should be trimmed off before forming the mean. Values greater than 1 are interpreted as percents. That is, P1 and P2 could be specified as 5 and 10 respectively.
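
The effect of P1 and P2 can be sketched in Python as follows. This is an illustration under stated assumptions: the function name is hypothetical, and rounding the number of trimmed points down to a whole number is an assumption, since the rounding rule is not stated here.

```python
import math


def trimmed_mean(values, p1, p2):
    """Mean after trimming a fraction p1 from the bottom and p2 from the top."""
    # Values above 1 are read as percents, as described in the note.
    if p1 > 1:
        p1 /= 100
    if p2 > 1:
        p2 /= 100
    y = sorted(values)
    n = len(y)
    # Number of points dropped at each end (rounded down; an assumption).
    low = math.floor(n * p1 + 1e-9)
    high = math.floor(n * p2 + 1e-9)
    kept = y[low:n - high]
    return sum(kept) / len(kept)


data = list(range(1, 21))
print(trimmed_mean(data, 0.05, 0.10))
print(trimmed_mean(data, 5, 10))
```

For 20 observations, P1 = .05 drops the smallest value and P2 = .10 drops the two largest, so both calls average the values 2 through 18.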

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
MIDMEAN PLOT = Generates a midmean plot.
MIDRANGE PLOT = Generates a midrange plot.
WINDSORIZED MEAN PLOT = Generates a Windsorized mean plot.
SD PLOT = Generates a standard deviation plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LET P1 = 10
LET P2 = 10
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL TRIMMED MEAN
X1LABEL SAMPLE ID
TITLE AUTOMATIC
TRIMMED MEAN PLOT Y X

U CONTROL CHART

PURPOSE

Generates a (Poisson) proportion control chart.

DESCRIPTION

A U chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The U chart is sensitive to changes in the normalized number of defective items in the measurement process. Normalized means that the number of defectives is divided by the unit area. You can also normalize to compensate for unequal sample sizes (see NOTE 1 below). The ``U'' in U chart stands for ``units'' as in defectives per lot. The U control chart consists of:

Vertical axis = the normalized number of defectives (number of defectives/area for sub-group = u) for each sub-group;
Horizontal axis = sub-group designation.
A sub-group is frequently a time sequence (e.g., the number of defectives in a daily production run where each day is considered a sub-group). If the times are equally spaced, the horizontal axis variable can be generated as a sequence (e.g., LET X = SEQUENCE 1 1 N where N is the number of sub-groups).

In addition, horizontal lines are drawn at the mean number of defectives and at the upper and lower control limits. The distribution of the number of defective items is assumed to be Poisson. This assumption is the basis for calculating the upper and lower control limits. The control limits are calculated as:

UCL = u + 3*SQRT(u/A)
LCL = u - 3*SQRT(u/A)
where u is the total number of defects divided by the total area (i.e., the sum of the areas variable) and A is the area corresponding to a given sub-group. This means that the control limits can vary with the sub-group. Also, zero serves as a lower bound on the LCL value.
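
The center line and the sub-group limits can be checked with a short Python sketch. It assumes the standard three-sigma Poisson limits, u plus or minus 3*sqrt(u/A), with the lower limit bounded at zero. The function name u_chart is hypothetical.

```python
import math


def u_chart(defects, areas):
    """Return (center line, [(u_i, LCL_i, UCL_i) for each sub-group])."""
    u_bar = sum(defects) / sum(areas)  # total defects / total area
    rows = []
    for d, a in zip(defects, areas):
        half_width = 3 * math.sqrt(u_bar / a)
        lcl = max(0.0, u_bar - half_width)  # zero is a lower bound on the LCL
        ucl = u_bar + half_width
        rows.append((d / a, lcl, ucl))
    return u_bar, rows


print(u_chart([60, 68], [4, 4]))
```

With 128 defects over a total area of 8 the center line is 16, and each sub-group of area 4 has limits 16 - 6 = 10 and 16 + 6 = 22. A sub-group with a smaller area would have wider limits.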

SYNTAX

U CHART <y1> <area> <x> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the number of defective items in each sub-group;
<area> is a variable containing the sample size or area adjustment;
<x> is a variable containing the sub-group identifier (usually 1, 2, 3, ...);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

U CHART Y SIZE X
U CHART Y AREA X SUBSET X > 5

NOTE 1

The U CONTROL CHART is similar to the C CONTROL CHART. The distinction is that the C CONTROL CHART is used when the material being inspected is constant in area and the sub-groups have an equal sample size. The U CONTROL CHART is used when either of these assumptions is not valid.
If the area is constant but the sample size is unequal, simply use the sample size as the <area> variable. If the sample sizes are equal, but the areas vary, use the area without modification as the <area> variable. If both the areas and the sample size vary, then use sample size times area for the <area> variable.

NOTE 2

The attributes of the 4 traces that make up the U control chart are controlled by the standard LINES, CHARACTERS, SPIKES, and BARS commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the upper and lower control limits.
Some analysts prefer to draw the response variable as a character or a spike rather than a connected line. The example program demonstrates setting the line attributes (the control lines are drawn as dotted lines).

DEFAULT

None

SYNONYMS

U CHART for U CONTROL CHART

RELATED COMMANDS

C CHART = Generates a C control chart.
P CHART = Generates a P control chart.
NP CHART = Generates an Np control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.

REFERENCE

``Guide to Quality Control,'' Kaoru Ishikawa, Asian Productivity Organization, 1982 (Chapter 8).

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ CCU.DAT X NUMDEF SIZE
XLIMITS 0 20; XTIC OFFSET 0 1; YTIC OFFSET 1
LINES SOLID SOLID DOT DOT; TITLE AUTOMATIC
Y1LABEL NORMALIZED NUMBER OF DEFECTIVES; XLABEL SAMPLE ID
U CHART NUMDEF SIZE X

VARIANCE PLOT

PURPOSE

Generates a subsample variance versus subsample index plot.

DESCRIPTION

The subsample variance is the variance (with divisor n(i)-1, where n(i) is the size of subsample i) of the data in the subsample. The variance plot is used to answer the question: ``Does the subsample variation change over different subsamples?'' It consists of:

Vertical axis = subsample variance;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample variance. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
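As a rough illustration of the two traces, the following Python sketch computes the subsample variances (divisor n-1) and the full sample variance. The function name and data are hypothetical; this is not DATAPLOT's implementation.

```python
from collections import defaultdict
from statistics import variance

def subsample_variances(y, x):
    """Map each subsample identifier in x to the variance (divisor n-1) of its y values."""
    groups = defaultdict(list)
    for value, label in zip(y, x):
        groups[label].append(value)
    return {label: variance(values) for label, values in sorted(groups.items())}

# Hypothetical response values and subsample identifiers
y = [1.0, 2.0, 3.0, 10.0, 10.0, 10.0, 4.0, 8.0]
x = [1, 1, 1, 2, 2, 2, 3, 3]

per_group = subsample_variances(y, x)   # trace 1: one value per subsample
full = variance(y)                      # trace 2: horizontal reference line
```

Each subsample needs at least two observations, since the divisor n-1 is zero otherwise.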

SYNTAX

VARIANCE PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

VARIANCE PLOT Y X
VARIANCE PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

VAR PLOT
V PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
STANDARD DEVI OF MEAN PLOT = Generates a standard deviation of the mean plot.
VARIANCE OF MEAN PLOT = Generates variance of the mean plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL VARIANCE
X1LABEL BATCH
TITLE VARIANCE PLOT
VARIANCE PLOT DIAMETER BATCH

VARIANCE OF THE MEAN PLOT

PURPOSE

Generates a subsample variance of the mean versus subsample index plot.

DESCRIPTION

The subsample variance of the mean is the subsample variance divided by the subsample size. The variance of the mean plot is used to answer the question: ``Does the subsample variation of the mean change over different subsamples?'' It consists of:

Vertical axis = subsample variance of the mean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample variance of the mean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.
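A minimal Python sketch of the statistic, assuming only the definition given above (subsample variance divided by subsample size). The names and data are hypothetical.

```python
from collections import defaultdict
from statistics import variance

def variance_of_mean(values):
    """Variance (divisor n-1) divided by the number of observations."""
    return variance(values) / len(values)

def subsample_variances_of_mean(y, x):
    """Map each subsample identifier in x to the variance of the mean of its y values."""
    groups = defaultdict(list)
    for value, label in zip(y, x):
        groups[label].append(value)
    return {label: variance_of_mean(values) for label, values in sorted(groups.items())}

# Hypothetical response values and subsample identifiers
y = [1.0, 2.0, 3.0, 10.0, 10.0, 10.0, 4.0, 8.0]
x = [1, 1, 1, 2, 2, 2, 3, 3]

per_group = subsample_variances_of_mean(y, x)   # trace 1
full = variance_of_mean(y)                      # trace 2: reference line
```

Since the full sample is larger than any subsample, the reference line typically sits below most of the subsample values.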

SYNTAX

VARIANCE OF THE MEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

VARIANCE OF THE MEAN PLOT Y X
VARIANCE OF THE MEAN PLOT Y X SUBSET X > 2

DEFAULT

None

SYNONYMS

VARIANCE OF MEAN PLOT
VARM PLOT
VM PLOT

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
STANDARD DEVIATION PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
STANDARD DEVI OF MEAN PLOT = Generates standard deviation of mean plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
BOX PLOT = Generates a box plot.
S CHART = Generates a standard deviation control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL VARIANCE OF THE MEAN
X1LABEL BATCH
TITLE VARIANCE OF THE MEAN PLOT
VARIANCE OF THE MEAN PLOT DIAMETER BATCH

VECTOR PLOT

PURPOSE

Generates a vector plot.

DESCRIPTION

A vector plot is a plot in which pairs of points are drawn as vectors. That is, an arrow is drawn from the first point to the second point. Several formats for storing the vectors are supported.

SYNTAX 1

VECTOR PLOT <y1> <x1> <y2> <x2> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the y coordinates for the first point;
<x1> is a variable containing the x coordinates for the first point;
<y2> is a variable containing the y coordinates for the second point;
<x2> is a variable containing the x coordinates for the second point;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

VECTOR PLOT <y1> <x1> <ydelta> <xdelta> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the y coordinates for the first point;
<x1> is a variable containing the x coordinates for the first point;
<ydelta> is a variable containing the change in the vertical direction from the first point to the second point;
<xdelta> is a variable containing the change in the horizontal direction from the first point to the second point;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 3

VECTOR PLOT <y1> <x1> <angle> <length> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a variable containing the y coordinates for the first point;
<x1> is a variable containing the x coordinates for the first point;
<angle> is a variable containing the angle between the two points;
<length> is a variable containing the length of the vector between the two points;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

VECTOR PLOT Y1 X1 Y2 X2
VECTOR FORMAT DELTA
VECTOR PLOT Y1 X1 YDEL XDEL
VECTOR FORMAT ANGLE
VECTOR PLOT Y1 X1 ANGLE LENGTH

NOTE 1

The VECTOR FORMAT command specifies which syntax to use. The default is syntax 3 (i.e., specify the coordinates of the first point, the angle, and a length). VECTOR FORMAT DELTA specifies syntax 2 and VECTOR FORMAT POINTS specifies syntax 1.
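The three storage formats describe the same arrow in different ways. The Python sketch below converts each format to the end point of the arrow. It is an illustration only: the function name is hypothetical, and the angle is assumed to be in radians, measured counterclockwise from the positive horizontal axis (the text above does not state the angle convention).

```python
import math

def vector_endpoint(y1, x1, a, b, fmt="ANGLE"):
    """Return (y2, x2), the end point of the arrow that starts at (y1, x1).

    fmt = "POINTS": a, b are the end point coordinates y2, x2 (syntax 1)
    fmt = "DELTA":  a, b are ydelta, xdelta (syntax 2)
    fmt = "ANGLE":  a, b are angle, length (syntax 3); the angle is ASSUMED
                    to be in radians, counterclockwise from the +x axis
    """
    if fmt == "POINTS":
        return a, b
    if fmt == "DELTA":
        return y1 + a, x1 + b
    if fmt == "ANGLE":
        return y1 + b * math.sin(a), x1 + b * math.cos(a)
    raise ValueError("unknown vector format: " + fmt)

# The same arrow from (y=0, x=0) to (y=2, x=0) in each format
p_points = vector_endpoint(0.0, 0.0, 2.0, 0.0, "POINTS")
p_delta = vector_endpoint(0.0, 0.0, 2.0, 0.0, "DELTA")
p_angle = vector_endpoint(0.0, 0.0, math.pi / 2.0, 2.0, "ANGLE")
```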

NOTE 2

The line style, thickness, and color of the line segment portion of the arrow are set by the LINE, LINE THICKNESS, and LINE COLOR commands, respectively. The size of the arrow head is set by the CHARACTER HW or CHARACTER SIZE command. The CHARACTER FILL command specifies whether the arrow head is filled or hollow.

NOTE 3

The following two commands affect the appearance of the arrow head.

VECTOR ARROW <FIXED or VARIABLE>
VECTOR ARROW <OPEN or CLOSED>
Entering FIXED means all the arrow heads are drawn the same size. Entering VARIABLE means that the size of the arrow head is scaled to the length of the longest vector. The arrow head is drawn as a triangle. Entering OPEN specifies that the base of the triangle is not drawn while entering CLOSED specifies that it is. The defaults are FIXED and CLOSED.

NOTE 4

The VECTOR PLOT command currently supports only 2D plots. A 3D vector plot can be generated with a little more effort. You can do something like the following:

READ X Y Z TAG
x1 y1 z1 1
x2 y2 z2 1
x3 y3 z3 2
x4 y4 z4 2
...
END OF DATA
CHARACTER VECTOR ALL
3D-PLOT Z Y X TAG
The key is that the TAG variable identifies pairs of points (i.e., the starting point and the ending point). Setting the character type to VECTOR specifies that an arrow is drawn between pairs of points with the same value for TAG.

NOTE 5

Two bugs were fixed 94/2.

2. For vector plots, the pre-sort switch should be set to OFF. This is now done automatically. Earlier versions can work around this by entering the command PRE-SORT OFF (and then PRE-SORT ON after the VECTOR PLOT command).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

VECTOR FORMAT = Specify the data format for vector plots.
VECTOR ARROW = Specify the attributes of the arrow for vector plots.
LINES = Sets the type for plot lines.
LINE COLOR = Sets the color for plot lines.
LINE THICKNESS = Sets the thickness for plot lines.
CHARACTER = Sets the type for plot characters.
CHARACTER HW = Sets the height and width for plot characters.
CHARACTER SIZE = Sets the size for plot characters.
CHARACTER FILL = Sets the fill switch for plot characters.
CHARACTER COLOR = Sets the color for plot characters.
PLOT = Generates a data or function plot.

APPLICATIONS

Data Analysis

IMPLEMENTATION DATE

92/10

PROGRAM

READ Y1 X1 Y2 X2
0.1000000E+02 -0.1524000E+02 0.1051449E+02 -0.1734317E+02
0.1000000E+02 -0.1397000E+02 0.1094621E+02 -0.1644858E+02
0.1000000E+02 -0.1143000E+02 0.1378484E+02 -0.1588376E+02
0.1000000E+02 -0.1016000E+02 0.1672586E+02 -0.1649936E+02
0.1000000E+02 -0.8890000E+01 0.1808503E+02 -0.1553822E+02
0.1000000E+02 -0.7620000E+01 0.1995356E+02 -0.1463254E+02
0.1000000E+02 -0.6350000E+01 0.2206867E+02 -0.1353830E+02
0.1000000E+02 -0.5080000E+01 0.2215228E+02 -0.1128796E+02
0.1000000E+02 -0.3810000E+01 0.1978718E+02 -0.8024862E+01
0.1000000E+02 -0.2540000E+01 0.1635728E+02 -0.4799302E+01
0.1000000E+02 -0.1270000E+01 0.1361335E+02 -0.2298972E+01
0.1000000E+02 0.0000000E+00 0.1239326E+02 -0.5136330E+00
0.1000000E+02 0.1270000E+01 0.1173628E+02 0.1501220E+01
0.1000000E+02 0.2540000E+01 0.1153493E+02 0.2589486E+01
0.1000000E+02 0.3810000E+01 0.1203320E+02 0.3974670E+01
0.1000000E+02 0.5080000E+01 0.1404934E+02 0.5650798E+01
0.1000000E+02 0.6350000E+01 0.1606889E+02 0.7883220E+01
0.1000000E+02 0.7620000E+01 0.1955596E+02 0.1140826E+02
0.1000000E+02 0.8890000E+01 0.2272479E+02 0.1506554E+02
0.1000000E+02 0.1016000E+02 0.2338944E+02 0.1776808E+02
0.1000000E+02 0.1143000E+02 0.2053801E+02 0.1859783E+02
0.1000000E+02 0.1270000E+02 0.1738284E+02 0.1887298E+02
0.1000000E+02 0.1397000E+02 0.1535987E+02 0.1946041E+02
0.1000000E+02 0.1524000E+02 0.1346489E+02 0.1979359E+02
END OF DATA
CHARACTER VECTOR ALL; CHARACTER HW 0.3 0.6 ALL
VECTOR ARROW OPEN; VECTOR FORMAT POINT; TITLE AUTOMATIC
VECTOR PLOT Y1 X1 Y2 X2

WEIBULL PLOT

PURPOSE

Generates a Weibull plot.

DESCRIPTION

A Weibull plot is a graphical data analysis technique for determining if a 2-parameter (location t0 = 0) Weibull distribution provides a good distributional model for the data. A good distributional fit is indicated by linearity in the Weibull plot. It consists of:

Vertical axis = cumulative percent occurred (in a loge(loge(1/(1-p))) scale, where p = (i-.3)/(n+.4));
Horizontal axis = failure time (in a log10 scale).
For the general (3-parameter) Weibull distribution, the cumulative distribution function F(t), the density function f(t), and the percent point function G(p) are (respectively):

F(t) = p = 1 - exp(-(z**beta))
f(t) = (beta/nu) * z**(beta-1) * exp(-(z**beta))
G(p) = t = t0 + nu * (loge(1/(1-p)))**(1/beta)
with

z = (t-t0)/nu
Namewise,

t0 = location parameter (smallest allowable t);
nu = scale parameter (= ``characteristic life'', note that t0+nu falls at the 63.2% point irrespective of beta);
beta = shape parameter (specifies the member of the Weibull family).
For the 2-parameter (t0 = 0) Weibull distribution, simplifications occur and so the percent point function G(p) reduces to:

G(p) = t = nu * (loge(1/(1-p)))**(1/beta)
which by rearrangement becomes

loge(loge(1/(1-p))) = beta * loge(G(p)/nu)
loge(loge(1/(1-p))) = beta * loge(t/nu)
loge(loge(1/(1-p))) = -beta*loge(nu) + beta*loge(t)
After a minor adjustment to take loge(t) to log10(t), this last expression defines the resulting Weibull plot. That is, the left side appears vertically on the Weibull plot and the right side appears horizontally.

If the data follow a 2-parameter Weibull distribution, then the plot will be near-linear and the slope of the plot will be beta (the shape parameter). For the Weibull plot, both beta and the scale parameter nu (reported as ETA) are estimated (behind the scenes) by least squares.
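The derivation above can be checked numerically. The Python sketch below builds the plot coordinates from the plotting positions p = (i-.3)/(n+.4), fits a straight line by ordinary least squares, and converts the slope and intercept back to the shape and scale estimates. It is a sketch under stated assumptions, not DATAPLOT's algorithm: the vertical coordinate is regressed on log10(t), all items are treated as failures (no TAG variable), and the function name is hypothetical.

```python
import math

def weibull_plot_fit(times):
    """Estimate (beta, eta) from a least squares line on Weibull plot coordinates."""
    t = sorted(times)
    n = len(t)
    xs, ys = [], []
    for i, ti in enumerate(t, start=1):
        p = (i - 0.3) / (n + 0.4)                       # plotting position
        xs.append(math.log10(ti))                       # horizontal coordinate
        ys.append(math.log(math.log(1.0 / (1.0 - p))))  # vertical coordinate
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # y = -beta*loge(eta) + beta*loge(10)*log10(t)
    beta = slope / math.log(10.0)
    eta = math.exp(-intercept / beta)
    return beta, eta

# Synthetic check: exact Weibull quantiles with beta = 2 and eta = 100
true_beta, true_eta, n = 2.0, 100.0, 10
demo_times = [
    true_eta * math.log(1.0 / (1.0 - (i - 0.3) / (n + 0.4))) ** (1.0 / true_beta)
    for i in range(1, n + 1)
]
beta_hat, eta_hat = weibull_plot_fit(demo_times)
```

Note the factor loge(10) in the slope: this is the "minor adjustment" needed because the horizontal axis uses log10(t) rather than loge(t).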

In addition to the raw data, three other lines are drawn on the plot:

1. The fitted (least squares) line;
2. A horizontal line at the 63.2% point;
3. A vertical line at the intersection point of the fitted line and the 63.2% line.

SYNTAX 1

WEIBULL PLOT <y> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable (e.g., days to failure);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

SYNTAX 2

WEIBULL PLOT <y> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response variable (e.g., days to failure);
<tag> is a 0 or 1 indicator variable where 1 indicates that the item failed by the failure mode of interest and 0 indicates that the item failed but by some other failure mode which is not of interest;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

WEIBULL PLOT Y
WEIBULL PLOT Y TAG
WEIBULL PLOT Y SUBSET MATERIAL 4
WEIBULL PLOT Y TAG SUBSET MATERIAL 5 SUBSET PROCESS 3

NOTE 1

The value of beta indicates the current status of the failures:

beta    hazard function    failure type
----    ---------------    ----------------
< 1     decreasing         infant mortality
= 1     constant           exponential
> 1     increasing         old-age wear out

NOTE 2

The following parameters are automatically produced by DATAPLOT when using the WEIBULL PLOT command:

ETA = estimated ``characteristic life''
BETA = estimated shape parameter
SDETA = estimated standard deviation of eta
SDBETA = estimated standard deviation of beta
BPT1 = estimated 0.1% point of failure times
BPT5 = estimated 0.5% point of failure times
B1 = estimated 1% point of failure times
B5 = estimated 5% point of failure times
B10 = estimated 10% point of failure times
B20 = estimated 20% point of failure times
B50 = estimated 50% point of failure times
B80 = estimated 80% point of failure times
B90 = estimated 90% point of failure times
B95 = estimated 95% point of failure times
B99 = estimated 99% point of failure times
B995 = estimated 99.5% point of failure times
B999 = estimated 99.9% point of failure times
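The percent point parameters above follow from the percent point function G(p) given in the DESCRIPTION section. A minimal Python sketch, assuming the 2-parameter form with the estimated ETA and BETA substituted for nu and beta (the function name and the numeric values are hypothetical):

```python
import math

def weibull_percent_point(p, eta, beta, t0=0.0):
    """G(p) = t0 + eta * (loge(1/(1-p)))**(1/beta)."""
    return t0 + eta * math.log(1.0 / (1.0 - p)) ** (1.0 / beta)

# Hypothetical estimates
eta, beta = 50.0, 3.0

# A few of the saved percent points, keyed by DATAPLOT parameter name
levels = {"BPT1": 0.001, "B1": 0.01, "B10": 0.10, "B50": 0.50, "B90": 0.90, "B999": 0.999}
percent_points = {name: weibull_percent_point(p, eta, beta) for name, p in levels.items()}

# The characteristic life falls at the 63.2% point regardless of beta
t_632 = weibull_percent_point(1.0 - math.exp(-1.0), eta, beta)
```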

NOTE 3

The attributes of the 4 lines on the Weibull plot can be specified via the LINES and CHARACTER commands (along with their attribute setting commands).

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
WEIBULL PPCC PLOT = Generates a Weibull probability plot correlation coefficient plot.
WEIBULL PROBABILITY PLOT = Generates a (fixed GAMMA) Weibull probability plot.
NORMAL PLOT = Generates a Normal plot.
HISTOGRAM = Generates a histogram.
BOX PLOT = Generates a box plot.
PLOT = Generates a data or function plot.

APPLICATIONS

Reliability and Life Testing

IMPLEMENTATION DATE

88/2

PROGRAM 1

SKIP 25
READ HAHN.DAT MILES TAG
LINE SOLID DASH DOT DOT
TITLE AUTOMATIC
WEIBULL PLOT MILES TAG

WINDSORIZED MEAN PLOT

PURPOSE

Generates a subsample Windsorized mean versus subsample index plot.

DESCRIPTION

The subsample Windsorized mean is the mean of the data with 100p1% of the bottom data replaced by the next larger value, and with 100p2% of the top data replaced by the next smaller value, in the ascending ordered data set. The Windsorized mean plot is used to answer the question: ``Does the subsample location change over different subsamples?'' It consists of:

Vertical axis = subsample Windsorized mean;
Horizontal axis = subsample index.
In addition, a horizontal line is drawn representing the full sample Windsorized mean. The appearance of the 2 traces is controlled by the first 2 settings of the LINES, CHARACTERS, SPIKES, BARS, and similar attributes.

SYNTAX

WINDSORIZED MEAN PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable;
<x> is the subsample identifier variable (this variable appears on the horizontal axis);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

WINDSORIZED MEAN PLOT Y X
WINDSORIZED MEAN PLOT Y X1 SUBSET X1 > 2

NOTE

The analyst usually precedes the WINDSORIZED MEAN PLOT command with 2 LET commands of the following type:

LET P1 = .05
LET P2 = .10
where P1 and P2 are both between 0 and 1 and indicate the amount of data to be Windsorized from the left and right, respectively. In the example here, P1 = .05 and P2 = .10 indicates that 5% of the smallest data are replaced by the next larger value and 10% of the largest data are replaced by the next smaller value before forming the mean. Values greater than 1 are interpreted as percents. That is, P1 and P2 could be entered as 5 and 10 respectively.
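The statistic itself can be sketched in Python as follows. This is an illustration under assumptions, not DATAPLOT's code: the number of values replaced in each tail is taken as the integer part of p*n, and the function name is hypothetical.

```python
import math

def windsorized_mean(data, p1, p2):
    """Mean after replacing the lowest 100*p1% and highest 100*p2% of the sorted data.

    Values of p1 or p2 greater than 1 are treated as percents.
    Assumption: the count replaced in each tail is the integer part of p*n.
    """
    if p1 > 1:
        p1 = p1 / 100.0
    if p2 > 1:
        p2 = p2 / 100.0
    s = sorted(data)
    n = len(s)
    k1 = math.floor(p1 * n + 1e-9)   # number replaced at the bottom
    k2 = math.floor(p2 * n + 1e-9)   # number replaced at the top
    if k1 + k2 >= n:
        raise ValueError("P1 and P2 remove all of the data")
    low = s[k1]              # next larger value above the bottom tail
    high = s[n - k2 - 1]     # next smaller value below the top tail
    adjusted = [low] * k1 + s[k1:n - k2] + [high] * k2
    return sum(adjusted) / n

# Hypothetical data: the integers 1 to 20 with P1 = .05 and P2 = .10
data = list(range(1, 21))
result_fraction = windsorized_mean(data, 0.05, 0.10)
result_percent = windsorized_mean(data, 5, 10)
```

Here one value (1) is replaced by 2 and two values (19 and 20) are replaced by 18 before the mean is formed.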

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
SD PLOT = Generates a standard deviation plot.
VARIANCE PLOT = Generates a variance plot.
RANGE PLOT = Generates a range plot.
MEAN PLOT = Generates a mean plot.
MEDIAN PLOT = Generates a median plot.
MIDMEAN PLOT = Generates a midmean plot.
MIDRANGE PLOT = Generates a midrange plot.
TRIMMED MEAN PLOT = Generates a trimmed mean plot.
BOX PLOT = Generates a box plot.
XBAR CHART = Generates an xbar control chart.
PLOT = Generates a data or function plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT Y X
LET P1 = 10
LET P2 = 10
LINE BLANK DASH
CHARACTER X BLANK
XTIC OFFSET 0.2 0.2
Y1LABEL WINDSORIZED MEAN
X1LABEL SAMPLE ID
TITLE AUTOMATIC
WINDSORIZED MEAN PLOT Y X

XBAR CHART

PURPOSE

Generates a mean control chart.

DESCRIPTION

An xbar (or mean) control chart is a data analysis technique for determining if a measurement process has gone out of statistical control. The xbar chart is sensitive to shifts in location in the measurement process. It consists of:

Vertical axis = the mean for each sub-group.
Horizontal axis = sub-group designation.
In addition, horizontal lines are drawn at the overall mean and at the upper and lower control limits. The distribution of the response variable is assumed to be normal. This assumption is the basis for calculating the upper and lower control limits.
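The text above does not give the formula for the control limits, so the Python sketch below should be read as one common choice rather than as DATAPLOT's method. It assumes 3-sigma limits around the overall mean, with sigma estimated by the pooled within-sub-group standard deviation and divided by the square root of the sub-group size. All names and data are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, variance

def xbar_chart(y, x):
    """Return (labels, means, center, ucl, lcl) for a mean control chart.

    ASSUMPTION: limits are center +/- 3*s/sqrt(n_i), where s is the pooled
    within-sub-group standard deviation. Other estimators of sigma are in use.
    """
    groups = defaultdict(list)
    for value, label in zip(y, x):
        groups[label].append(value)
    labels = sorted(groups)
    means = [mean(groups[g]) for g in labels]
    center = mean(y)                                   # overall mean
    num = sum((len(groups[g]) - 1) * variance(groups[g]) for g in labels)
    den = sum(len(groups[g]) - 1 for g in labels)
    s = math.sqrt(num / den)                           # pooled standard deviation
    ucl = [center + 3.0 * s / math.sqrt(len(groups[g])) for g in labels]
    lcl = [center - 3.0 * s / math.sqrt(len(groups[g])) for g in labels]
    return labels, means, center, ucl, lcl

# Hypothetical raw data: 3 sub-groups of 2 measurements each
y = [9, 11, 10, 12, 8, 10]
x = [1, 1, 2, 2, 3, 3]
labels, means, center, ucl, lcl = xbar_chart(y, x)
```

Every sub-group needs at least two observations for the pooled estimate to be defined.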

SYNTAX

XBAR CHART <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the response (= dependent) variable (containing the raw data values);
<x> is an independent variable (containing the sub-group identifications);
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

XBAR CHART Y X
XBAR CHART Y X SUBSET X > 2

NOTE

The attributes of the 4 traces can be controlled by the standard LINES, CHARACTERS, BARS, and SPIKES commands. Trace 1 is the response variable, trace 2 is the mean line, and traces 3 and 4 are the control limits. Some analysts prefer to draw the response variable as a spike or character rather than a connected line.

DEFAULT

None

SYNONYMS

XBAR CONTROL CHART, MEAN CONTROL CHART, MEAN CHART, X CHART, AVERAGE CONTROL CHART, and AVERAGE CHART are synonyms for XBAR CHART.

RELATED COMMANDS

R CHART = Generates a range control chart.
S CHART = Generates a standard deviation control chart.
P CHART = Generates a p control chart.
NP CHART = Generates an Np control chart.
U CHART = Generates a U control chart.
C CHART = Generates a C control chart.
Q CONTROL CHART = Generates a Quesenberry style control chart.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
4-PLOT = Generates 4-plot univariate analysis.
MEAN PLOT = Generates a mean versus subset plot.

APPLICATIONS

Quality Control

IMPLEMENTATION DATE

88/2

PROGRAM

SKIP 25
READ GEAR.DAT DIAMETER BATCH
.
LINE SOLID SOLID DOT DOT
TITLE AUTOMATIC
X1LABEL GROUP-ID
Y1LABEL MEAN
X CHART DIAMETER BATCH

YOUDEN PLOT

PURPOSE

Generates a Youden plot.

DESCRIPTION

A Youden plot is a graphical data analysis technique for carrying out an interlab comparison where each lab has made 2 runs on the same product or 1 run on 2 different products. The Youden plot answers the question: ``Are the labs in the study all behaving as if from the same population?'' It consists of:

Vertical axis = data from run 1;
Horizontal axis = data from run 2.
The various labs in the study are encoded by the plot character within the plot. In the ideal case (all labs from same population), the Youden plot will have a structureless ``random shotgun pattern.'' Any structured deviation from this ``shotgun pattern'' suggests one or another lab is ``different from the rest.'' The advantage of the Youden plot is 2-fold:

2. Within-lab differences are easy to detect (displacement drawn with a fixed size and with the base).
Typically the Youden plot has no connecting lines between the data points and it has the lab identification imbedded in the plot characters. For example, if there are 8 labs, enter the following commands:

LINES BLANK ALL
CHARACTERS 1 2 3 4 5 6 7 8
or

LINES BLANK ALL
CHARACTERS B R S A T G B X

SYNTAX

YOUDEN PLOT <y1> <y2> <lab> <SUBSET/EXCEPT/FOR qualification>
where <y1> is the first response variable;
<y2> is the second response variable;
<lab> is the coded laboratory variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

YOUDEN PLOT Y1 Y2 LAB
YOUDEN PLOT Y1 Y2 LAB SUBSET MONTH 4
YOUDEN PLOT Y1 Y2 LAB SUBSET MATERIAL 1 TO 5

DEFAULT

None

SYNONYMS

PLOT Y1 Y2 TAG is a synonym for YOUDEN PLOT Y1 Y2 TAG.

RELATED COMMANDS

CHARACTERS = Sets the type for plot characters.
LINES = Sets the type for plot lines.
BIHISTOGRAM = Generates a bihistogram.
QUANTILE-QUANTILE PLOT = Generates a quantile-quantile plot.
PLOT = Generates a data or function plot.
MULTIPLOT = Allows multiple plots per page.
T-TEST = Carries out a 2 sample t test.
ANOVA = Carries out an ANOVA.

REFERENCE

``Graphical Methods for Data Analysis,'' Chambers, Cleveland, Kleiner, and Tukey, Wadsworth, 1983.

APPLICATIONS

Interlaboratory Analysis

IMPLEMENTATION DATE

88/9

PROGRAM

SKIP 25
READ UGIANSKY.DAT Y1 Y2 LAB
.
CHARACTERS 1 2 3 4 5 6 7 8 9
CHARACTER SIZE 4 ALL
LINES BLANK ALL
LIMITS 0 5.5
TITLE AUTOMATIC
Y1LABEL SAMPLE 1 DAYS TO FAILURE
X1LABEL SAMPLE 2 DAYS TO FAILURE
LEGEND 1 CHAR = LAB ID
YOUDEN PLOT Y1 Y2 LAB

3D PLOT

PURPOSE

Generates a 3-dimensional plot.

DESCRIPTION

The 3D-PLOT command allows the analyst to generate single or multi-surface 3d-plots of data, functions, or both.

There are 6 general 3D-PLOT syntaxes:

1. 3-variable form
2. 4-variable multi-trace form
3. VERSUS form
4. multi-VERSUS form
5. function form
6. AND form

SYNTAX 1

3D-PLOT <y> <x1> <x2> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<x1> is a response variable;
<x2> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the 3-argument form for the 3D-PLOT command. It is used for plotting <y> versus <x1> and <x2>. The resulting plot will have <y> on the vertical axis, <x1> on one horizontal axis, and <x2> on the other horizontal axis. Some examples are:

3D-PLOT Y X1 X2
3D-PLOT RES P1 P2

SYNTAX 2

3D-PLOT <y> <x1> <x2> <tag> <SUBSET/EXCEPT/FOR qualification>
where <y> is a response variable;
<x1> is a response variable;
<x2> is a response variable;
<tag> is a coded variable for identifying traces;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the 4-argument form for the 3D-PLOT command. It is used for multi-trace plotting of <y> versus <x1> and <x2>. The resulting plot will have <y> on the vertical axis, <x1> on one horizontal axis, <x2> on the other horizontal axis, and will have multiple traces-- one trace for each distinct value in the <tag> variable. Some examples are:

3D-PLOT Y X1 X2 LAB
3D-PLOT PRES TEMP1 TEMP2 DAY

SYNTAX 3

3D-PLOT <y1> <y2> <y3> ... <yk> VERSUS <x1> <x2> <SUBSET/EXCEPT/FOR qualification>
where <y1>, <y2>, <y3>, ..., <yk> are response variables to be plotted on the vertical axis;
<x1> and <x2> are response variables to be plotted on the horizontal axes;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.
This is the single-VERSUS argument form for the 3D-PLOT command. It is used for multi-trace 3-dimensional plotting. The resulting 3-d plot will have one trace for each <yi> variable:

<y1> (vertically) versus <x1> and <x2> (horizontally)
<y2> (vertically) versus <x1> and <x2> (horizontally)
<y3> (vertically) versus <x1> and <x2> (horizontally)
...
<yk> (vertically) versus <x1> and <x2> (horizontally)
Some examples are:

3D-PLOT Y1 Y2 Y3 VERSUS X1 X2
3D-PLOT Y PRED VERSUS X1 X2

SYNTAX 4

3D-PLOT <syntax 3> <syntax 3> ... <syntax 3>
This is the multi-VERSUS argument form for the 3D-PLOT command. It is used for multi-trace 3-dimensional plotting. The resulting 3-d plot will have one trace for each <yi> variable:

<y1> (vertically) versus <x1> and <x2> (horizontally)
<y2> (vertically) versus <x3> and <x4> (horizontally)
<y3> (vertically) versus <x5> and <x6> (horizontally)
...
<yk> (vertically) versus <x(2k-1)> and <x(2k)> (horizontally)
Some examples are:

3D-PLOT Y1 Y2 Y3 VERSUS X1 X2 Y4 Y5 VERSUS X3 X4
3D-PLOT P1 VERSUS T1 T2 P2 VERSUS T3 T4 P3 VERSUS T5 T6

SYNTAX 5

3D-PLOT <f> FOR <x1> = <start 1> <increment 1> <stop 1> FOR <x2> = <start 2> <increment 2> <stop 2>
where <f> is a function (either pre-defined via the LET FUNCTION command, or explicitly defined herein);
<x1> is one dummy variable in the function;
<start 1> is the desired minimum value for <x1> at which the function is to be evaluated;
<increment 1> is the desired increment value for <x1> at which the function is to be evaluated;
<stop 1> is the desired maximum value for <x1> at which the function is to be evaluated;
<x2> is the other dummy variable in the function;
<start 2> is the desired minimum value for <x2> at which the function is to be evaluated;
<increment 2> is the desired increment value for <x2> at which the function is to be evaluated;
and <stop 2> is the desired maximum value for <x2> at which the function is to be evaluated.
This is the function form for the 3D-PLOT command. It is used for plotting the surface of a function. Some examples are:

3D-PLOT SIN(X)*EXP(-X-Y) FOR X = 0 .1 5 FOR Y = 1 .1 2
LET FUNCTION F = EXP(-X*SIN(X**2+Y**2))
3D-PLOT F FOR X = 0 .1 3 FOR Y = 1 .2 2
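To make the <start> <increment> <stop> convention concrete, the Python sketch below evaluates a function at every grid point implied by the two FOR clauses of the first example. It only illustrates how the grid is defined. The helper names are hypothetical, and the order in which DATAPLOT visits the grid points is not specified here.

```python
import math

def grid_values(start, increment, stop):
    """Values start, start+increment, ... up to and including stop."""
    count = int(math.floor((stop - start) / increment + 1e-9)) + 1
    return [start + i * increment for i in range(count)]

def evaluate_surface(f, x_spec, y_spec):
    """Return (x, y, f(x, y)) for every point on the grid."""
    return [(x, y, f(x, y))
            for x in grid_values(*x_spec)
            for y in grid_values(*y_spec)]

# Equivalent of: 3D-PLOT SIN(X)*EXP(-X-Y) FOR X = 0 .1 5 FOR Y = 1 .1 2
surface = evaluate_surface(
    lambda x, y: math.sin(x) * math.exp(-x - y),
    (0, 0.1, 5),
    (1, 0.1, 2),
)
```

With these settings X takes 51 values and Y takes 11, so the surface is evaluated at 561 points. Smaller increments give a smoother surface at the cost of more evaluations.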

SYNTAX 6

<any valid syntax 1 to 5> AND

<any valid syntax 1 to 5> AND

<any valid syntax 1 to 5> AND

...

<any valid syntax 1 to 5> AND

<any valid syntax 1 to 5>

This is the most general syntax for 3D-PLOT. It is used for generating multi-trace plots of variables, of functions, or of mixtures of both. Some examples are:

3D-PLOT Y X1 X2 AND
3D-PLOT A+B*X*Y FOR X = 1 1 10 FOR Y = 0 .2 1
3D-PLOT Y X1 X2 AND
3D-PLOT A*X+Y**2 FOR X = 1 .1 3 FOR Y = 2 .1 3 AND
3D-PLOT Y3 X3 X4 LAB

NOTE 1

The view for the plot is determined by the eye coordinates. The default eye coordinates for all 3 dimensions are:

See the documentation for EYE COORDINATES for details.

NOTE 2

If the 3D plot is compressed in one or more directions, the most likely problem is that the X, Y, and Z scales have different ranges (e.g., X and Y go from 0 to 1000 while Z goes from 0 to 1). One solution to this problem is to scale the data to the same range via the LET command (e.g., divide each of them by the appropriate power of 10 so that they all go from -1 to +1).

NOTE 3

Increasing the magnitude of the eye coordinates will shrink the size of the 3D plot. That is, the further away the eye is from the plot, the smaller the plot appears.

NOTE 4

The eye coordinates can be negative. This can be useful for looking at the plot from a different perspective. The following algorithm can be a useful starting point:

TITLE AUTOMATIC
MULTIPLOT 2 2
MULTIPLOT CORNER COORDINATES 0 0 100 100
LET XMAX = MAXIMUM X; LET XMIN = MINIMUM X; LET XEYE = XMAX + 3*(XMAX-XMIN)
LET YMAX = MAXIMUM Y; LET YMIN = MINIMUM Y; LET YEYE = YMAX + 3*(YMAX-YMIN)
LET ZMAX = MAXIMUM Z; LET ZMIN = MINIMUM Z; LET ZEYE = ZMAX + 3*(ZMAX-ZMIN)
. All positive
EYE COORDINATES XEYE YEYE ZEYE
3D-PLOT ...
. X view negative
LET XTEMP = -XEYE
EYE COORDINATES XTEMP YEYE ZEYE
3D-PLOT ...
. Y view negative
LET YTEMP = -YEYE
EYE COORDINATES XEYE YTEMP ZEYE
3D-PLOT ...
. Both X and Y views negative
EYE COORDINATES XTEMP YTEMP ZEYE
3D-PLOT ...
Most reasonable views generate plots that are only marginally different from one of these 4 views. Changing the magnitude of the eye coordinates can make the plot slightly larger or smaller, but will not change the basic appearance. Making the Z eye coordinate negative is generally not helpful.
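The arithmetic behind the four views is simple enough to sketch in Python. The function names are hypothetical, and the factor of 3 is taken directly from the LET commands above.

```python
def eye_coordinate(values, factor=3.0):
    """Maximum plus `factor` times the range, as in XEYE = XMAX + 3*(XMAX-XMIN)."""
    low, high = min(values), max(values)
    return high + factor * (high - low)

def four_views(x, y, z):
    """Eye coordinates for the four views: all positive, X negative, Y negative, both negative."""
    xeye, yeye, zeye = eye_coordinate(x), eye_coordinate(y), eye_coordinate(z)
    return [
        (xeye, yeye, zeye),
        (-xeye, yeye, zeye),
        (xeye, -yeye, zeye),
        (-xeye, -yeye, zeye),
    ]

# Hypothetical data ranges
views = four_views([0, 1], [0, 2], [0, 3])
```

The Z eye coordinate is kept positive in all four views, in line with the remark that a negative Z eye coordinate is generally not helpful.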

The ROTATE EYE command can be used to automatically rotate the eye coordinates. This command can be used with the LOOP and MULTIPLOT commands to automatically display various rotations of the 3d plot. This is demonstrated in the second program example.

NOTE 5

Some limitations of the 3D-PLOT command are:

2. No axes or axis labels are drawn. Text labels can be added with either the LEGEND or TEXT command (although the analyst will need to do the proper positioning). The 3D FRAME command can be used to draw a frame around the plot. Although this essentially draws the axes, it has no capability for putting tic marks or tic mark labels on the frame lines.
3. Shaded 3d-plots and solid 3d objects are currently not supported.
4. Dynamic 3d-plots (e.g., spinning the 3d-plot under user control) are currently not supported. The MULTIPLOT and LOOP commands can be used in conjunction with the ROTATE EYE command to emulate this capability somewhat.
5. Specialized 3d charts (such as 3d frequency polygons or 3d histograms) are not available.
6. Alternate projection methods are not available.
Future implementations should address some of these limitations.

NOTE 6

The plot traces can be drawn as lines, characters, spikes, or bars. The LINES, CHARACTER, SPIKES, and BAR commands can be used to set these (along with their various attribute setting commands).

DEFAULT

None

SYNONYMS

3DPLOT is a synonym for 3D-PLOT.

VS and VS. can be used as synonyms for VERSUS.

RELATED COMMANDS

EYE COORDINATES = Specify the eye coordinates for 3d plots.
ROTATE EYE = Rotate the current eye coordinates.
3DFRAME = Specify the type of frame (if any) to be drawn on 3d plots.
PLOT = Generates a 2d data or function plot.
CONTOUR PLOT = Generates a contour plot.
CHARACTERS = Sets the types for plot characters.
LINES = Sets the types for plot lines.
SPIKES = Sets the on/off switches for plot spikes.
BARS = Sets the on/off switches for plot bars.
MULTIPLOT = Generate multiple plots per page.

APPLICATIONS

Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM 1

LET FUNCTION E = -0.5*((X**2)+(Y**2))
LET FUNCTION F = (1/(2*PI))*EXP(E)
TITLE AUTOMATIC
3D-PLOT F FOR X = -2 .2 2 FOR Y = -2 .1 2

PROGRAM 2

LET FUNCTION F = SIN(X+COS(Y))
3DFRAME 3PLANE
FEEDBACK OFF
TITLE AUTOMATIC
.
MULTIPLOT 4 4; MULTIPLOT CORNER COORDINATES 0 0 100 100
LOOP FOR K = 1 1 16
ROTATE
3DPLOT F FOR X = -2 .2 2 FOR Y = -2 .2 2
END OF LOOP
END OF MULTIPLOT

PROGRAM 3

. THIS IS THE DATAPLOT PROGRAM FILE BOXYIELD.DP
. PURPOSE--GENERATE A 3D PLOT OF A NON-LINEAR FUNCTION
.
LET M1 = 70
LET M2 = 155
LET SIG1 = 10
LET SIG2 = 5
LET RHO = 1.6
.
LET FUNCTION X1 = (TIME-M1)/SIG1
LET FUNCTION X2 = (TEMP-M2)/SIG2
LET FUNCTION F1 = EXP(-X1**2)
LET FUNCTION F2 = EXP(RHO*X1*X2)
LET FUNCTION F3 = EXP(-X2**2)
LET FUNCTION F = 10*((1000*F1*F2*F3)**.25)
.
TITLE AUTOMATIC
3DPLOT F FOR TIME = 50 2 90 FOR TEMP = 130 2 180

PROGRAM 4

. THIS IS DATAPLOT PROGRAM DEXSURF.DP
. PURPOSE--GENERATE VARIOUS SURFACES
. UNDER LINEAR + INTERACTION MODEL
. DATE--JULY 1989
.
EYE COORDINATES 10 20 30
XLABEL SIZE 5
X3LABEL SIZE 5
LET X1 = SEQUENCE -1 .2 1 FOR I = 1 1 121
LET X2 = SEQUENCE -1 11 .2 1
TITLE AUTOMATIC
.
LOOP FOR B12 = -3 3 3
MULTIPLOT 3 3; MULTIPLOT CORNER COORDINATES 0 0 100 100
LOOP FOR B1 = -3 3 3
LOOP FOR B2 = -3 3 3
XLABEL B1 = ^B1 B2 = ^B2
X3LABEL B12 = ^B12
LET Y = B1*X1+B2*X2+B12*X1*X2
3DPLOT Y X1 X2 X1 AND
3DPLOT Y X1 X2 X2
END OF LOOP
END OF LOOP
END OF MULTIPLOT
END OF LOOP

4-PLOT

PURPOSE

Generates a run sequence plot, a lag plot, a histogram, and a normal probability plot on the same page.

DESCRIPTION

This plot is a useful summary plot when taking an initial look at a variable.

SYNTAX

4-PLOT <y1> <SUBSET/EXCEPT/FOR qualification>
where <y1> is a response variable;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

4-PLOT Y1
4-PLOT RUN1
4-PLOT Y1 SUBSET TAG = 1

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

HISTOGRAM = Generates a histogram.
NORMAL PROBABILITY PLOT = Generates a normal probability plot.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
MULTIPLOT = Allows multiple plots per page.
SUMMARY = Computes various summary statistics.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

Pre-1987

PROGRAM

LET Y1 = NORMAL RANDOM NUMBERS FOR I = 1 1 100
TITLE AUTOMATIC
4-PLOT Y1

6-PLOT

PURPOSE

Generates 6 diagnostic plots on the same page for assessing a fit.

DESCRIPTION

This command is used after a fit to generate the most common diagnostic plots in a single step. The plots are:

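1. the dependent variable and the predicted values versus the independent variable (uses whatever the current settings are for the LINES and CHARACTERS commands);
2. the residuals versus the independent variable (uses whatever the current settings are for the LINES and CHARACTERS commands);
3. the residuals versus the predicted values (uses whatever the current settings are for the LINES and CHARACTERS commands);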
4. a lag plot of the residuals (uses an X as the plot symbol);
5. a histogram of the residuals;
6. and a normal probability plot of the residuals.

The 6-PLOT command does not perform a fit. It assumes that a fit was done in a prior command and that the PRED and RES variables hold the results of that fit. The dependent variable and the independent variable used in the fit are given as the 2 arguments to the 6-PLOT command. If a multi-variable fit was performed, specify the independent variable you want on the horizontal axis of the first 2 plots (plots against the remaining independent variables have to be generated with additional PLOT commands).
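
The multi-variable case can be sketched as follows. The variable names and the simulated data are assumptions made for illustration, and the sketch assumes the fit is a linear fit on two independent variables. 6-PLOT uses X1 on the horizontal axis, and a separate PLOT command covers the residuals against X2.

```
. Simulated data for illustration only
LET X1 = SEQUENCE 1 1 50
LET X2 = UNIFORM RANDOM NUMBERS FOR I = 1 1 50
LET E = NORMAL RANDOM NUMBERS FOR I = 1 1 50
LET Y = 2 + 3*X1 - 5*X2 + E
.
. Fit first, then request the diagnostic plots against X1
FIT Y X1 X2
6-PLOT Y X1
.
. Residuals against the remaining independent variable
CHARACTER X
LINE BLANK
PLOT RES X2
```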

SYNTAX

6-PLOT <y> <x> <SUBSET/EXCEPT/FOR qualification>
where <y> is the dependent variable that was used in the most recent fit;
<x> is an independent variable used in the most recent fit;
and where the <SUBSET/EXCEPT/FOR qualification> is optional.

EXAMPLES

6-PLOT Y X
6-PLOT RUN1 X1
6-PLOT Y1 X1 SUBSET TAG = 1

NOTE 1

If no fit has been performed yet, DATAPLOT does not perform one automatically, so the PRED and RES variables still contain all zeros and the resulting plots are not meaningful.

NOTE 2

Remember that several commands update the RES and PRED variables. In particular, LOWESS, SMOOTH, SPLINE FIT, and a number of others update them automatically. Be sure that the fit you want to plot was the most recent command to update RES and PRED before entering the 6-PLOT command.
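
One way to guard against this is to enter 6-PLOT immediately after the fit and to copy RES and PRED into variables of your own if they are needed later. The sketch below uses simulated data; the names RESFIT and PREDFIT are hypothetical and are not created by DATAPLOT itself.

```
. Simulated data for illustration only
LET X = SEQUENCE 1 1 50
LET E = NORMAL RANDOM NUMBERS FOR I = 1 1 50
LET Y = 1 + 2*X + E
.
FIT Y X
. Plot at once, while RES and PRED still hold the FIT results
6-PLOT Y X
.
. Keep copies in case a later command overwrites RES and PRED
LET RESFIT = RES
LET PREDFIT = PRED
```

The copies can then be used in ordinary PLOT commands after other fitting or smoothing commands have been run.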

DEFAULT

None

SYNONYMS

None

RELATED COMMANDS

FIT = Performs a least squares linear or non-linear fit.
LOWESS = Performs a locally weighted least squares regression.
SMOOTH = Smooths a time series.
SPLINE FIT = Performs a spline fit.
ANOVA = Performs an analysis of variance.
MEDIAN POLISH = Performs a median polish.
HISTOGRAM = Generates a histogram.
NORMAL PROB PLOT = Generates a normal probability plot.
PLOT = Generates a data or function plot.
LAG PLOT = Generates a lag plot.
MULTIPLOT = Allows multiple plots per page.

APPLICATIONS

Exploratory Data Analysis

IMPLEMENTATION DATE

93/12

PROGRAM

. ALASKA PIPELINE RADIOGRAPHIC DEFECT BIAS CURVE
. PERFORM A LINEAR REGRESSION
SKIP 25
READ BERGER1.DAT MEAS TRUE
FIT MEAS TRUE
TITLE AUTOMATIC
CHARACTER CIRCLE
CHARACTER SIZE 1.2
CHARACTER FILL ON
LINE BLANK SOLID
6-PLOT MEAS TRUE


Last Modified: 11:21am EST, February 24, 1997