Expression syntax

NASIS Online Help System

Expression syntax

Syntax:

The following syntax rules define all the types of expressions that may be created.

images/DEFINE_GRP11.jpg

images/DEFINE_GRP21.jpg

images/DEFINE_GRP31.jpg

Explanation of Expression Syntax

An expression can produce either a numeric or a character string value, depending on its contents. Numeric and character data can be mixed in expressions, with type conversions similar to those done by Informix. Evaluation of expressions follows C conventions for operator precedence. For example, the arithmetic expression: A + B * C is evaluated as A + (B * C) because multiplication has higher precedence than addition.

Most of the expressions involving arithmetic, boolean and comparison operators require little explanation. They work as would be expected, and produce numeric results. The operator ** denotes exponentiation; the expression A ** B is equivalent to the function POW(A, B). Comparisons and boolean expressions produce a 1 for True and a 0 for false.

If a null value is used in an expression, the result is normally null. However, in comparisons, a null value is treated as less than any non-null value and two nulls are considered equal to each other. In boolean expressions, a null is considered False. Invalid computations, such as division by zero, produce a null result. Special cases with null values are noted individually.

String Expressions

String expressions allow for substring extraction, string concatenation, and case changes. They expect to operate on character type input, and will convert the input to character if necessary. Note that when a number is converted to a string it is expressed with 6 decimal places. To produce different formats for numbers, use the SPRINTF function. The results of the following string expressions are always character strings:

expression [ n1:n2 ]

Returns a substring of the string expression, starting at position n1 for a length of n2 characters. The first character of the string is position 0. Note, this differs from the way substrings are defined in Informix queries.

Example: if variable A has the value "Sample", the expression A[1:3] returns the value "amp".

expression || expression

Concatenates two strings.

Example: the expression "ABC" || "DEF" produces the string "ABCDEF". If one expression in a concatenation is null it is treated as the string "", so the result is not a null value unless both of the expressions are null.

CLIP (expression)

Removes trailing blanks from a string. This is not normally necessary because NASIS removes trailing blanks when reading data from the database.

Example: the expression CLIP("ABC ") produces the string "ABC".

UPCASE (expression)

Converts a string to upper case.

Example: the expression UPCASE("ABc12") produces the string "ABC12".

LOCASE (expression)

Converts a string to lower case.

Example: the expression LOCASE("ABc12") produces the string "abc12".

NMCASE (expression)

Converts a string to "name" case: first letter of each word upper case and the remainder lower case.

Example: the expression NMCASE("now is the time") produces the string "Now Is The Time".

SECASE (expression)

Converts a string to "sentence" case: first letter of the string upper case and the remainder lower case.

Example: the expression SECASE("now is the time") produces the string "Now is the time".

TEXTURENAME (expression)

Converts a set of texture codes to a special string format used in reports. The expression operated on by TEXTURENAME can have zero or more values, each of which is a string as used in the NASIS data element "texture". This element can contain a mixture of codes for texture classes, modifiers, and terms used in lieu of texture. The codes are expanded and concatenated together, with commas as necessary, to produce a texture description as used in manuscript reports.

Example: if the variable T has two values, one of which is "SL", and the other is "SR- CL GR-SIL", the expression TEXTURENAME(T) produces a result with two values, the string "sandy loam", and the string "stratified clay loam to gravelly silt loam".

GEOMORDESC (expression, expression, expression)

Converts data from the component geomorphic description to a standard landform description string for use in reports. The three expressions used as input can be arrays, but all must have the same number of values. The first parameter is the feature name or names for a component, the second has the feature Id for each feature, and the third has the Exists-On reference for each feature. Where an Exists-On reference matches a feature ID, the two names are combined with the word "on". If two features have the same feature ID the Exists-On reference is attached to both and they are output as separate strings. Other features that do not have an Exists-On relationship are output as separate strings. The number of values in the result can be more or less than the number of values in the input expressions.

Example: Data for this operation would be obtained by joining the component geomorphic description table and the geomorphic feature table, such as:

EXEC SQL

SELECT geomorph_feat_name, geomorphic_feat_id, exists_on_feature

FROM component, component_geomorph_desc, real geomorph_feature

WHERE JOIN component TO component_geomorph_desc

AND JOIN component_geomorph_desc TO geomorph_feature;

AGGREGATE COLUMN geomorph_feat_name NONE, geomorphic_feat_id NONE, exists_on_feature NONE.

Assume this query produces the data shown in the following table:

geomorph_feat_name	geomorphic_feat_id	exists_on_feat
alluvial fan
till plain	1
pothole	2	1

The expression GEOMORDESC(geomorph_feat_name, geomorphic_feat_id, exists_on_feat) would produce a result with two values, "alluvial fan" and "pothole on till plain".

STRUCTPARTS (expression, expression, expression)

Converts data from the Pedon Horizon Soil Structure table to a standard structure description string for use in reports The parameters are used in the same manner as the GEOMORDESC function above. The first parameter would be the type of structure, usually a string concatenated from structure_grade, structure_size, and structure_type. The second parameter is the row identifier, structure_id, and the third parameter is the reference column, structure_parts_to. The only difference between GEOMORDESC and STRUCTPARTS is that the latter uses the words "parting to" to separate linked structures, instead of "on".

ARRAYCAT (expression, delimiter)

Concatenates the values in a multiple valued variable or expression, to produce a single valued result. The first argument is a multiple valued expression, and the second argument is a string to be used as a delimiter between the values. An empty string may be specified as the delimiter. If any values of the first argument are null, they and their associated delimiters are skipped. The result has dimension 0 if the first argument has dimension 0, otherwise it has dimension 1.

Example: If the variable A has four values, "A1", "A2", Null, and "A4", the expression ARRAYCAT (A, "-") would produce a single string: "A1-A2-A4".

Function Expressions

The following function expressions can use either character or numeric values, and produce results in the same type as the input, unless specified otherwise.

NEW (expression)

Returns True if the value of the expression is different from the value it had in the previous iteration of the script, or False if the value is the same as last time.

Example: the expression NEW (mapunit_symbol) would be True each time the mapunit symbol changed.

CODENAME (expression [ , name ] )

Returns the code name for the code value given by expression, using the data dictionary domain of the element name. The name must be a data element name or its alias from an EXEC SQL statement. The value of the expression must be a number representing the internal identifier for a code. This is the value normally returned by a query. If expression is the same as name you do not have to specify it twice.

Example: if the variable compkind were returned from a query, the expression CODENAME(compkind) would produce a string normally displayed in NASIS for that data element, such as "series". Code names are generally in lower case. The expression CODENAME(val, compkind), where val is a variable from a DEFINE statement, would produce the code name for a compkind whose value is in the variable val.

CODELABEL (expression [ , name ] )

Returns the code label for the code value given by expression, using the data dictionary domain of the element name. This operates just like CODENAME. The code label is typically the same as the code name but is capitalized properly for use in reports.

Example: in the above example, the expression CODELABEL(compkind) would produce "Series".

ARRAYCOUNT (expression)

Counts the number of non-null values in a multiple valued expression. It can operate on either a character or numeric argument, and will return a single numeric value of zero or more.

Example: if the variable A has three values, 1, 2, and NULL, the expression ARRAYCOUNT(A) would produce the result 2.

ARRAYMIN (expression)

Computes the minimum of the values in a multiple valued expression. It can operate on either a character or numeric argument, and will return a single value of the same type as its argument. In this case, a null value is not considered to be smaller than a non-null value. The result is null only if all values of the array are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYMIN(A) would produce the result 1.

ARRAYMAX (expression)

Computes the maximum of the values in a multiple valued expression. It can operate on either a character or numeric argument, and will return a single value of the same type as its argument. The result is null only if all values of the array are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, "X", "Y", and "Z", the expression ARRAYMAX(A) would produce the result "Z".

ARRAYMEDIAN (expression)

Locates the median value in a multiple valued expression, by sorting the non-null values and selecting the middle one. It can operate on either a character or numeric argument, but there is a slight difference in operation between the two. When there is an even number of values there is not a single middle value, so with numeric data the median is the average of the two middle values, and with character data the median is the larger of the two. The result is null only if all values of the array are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, "X", "Y", and "Z", the expression ARRAYMEDIAN(A) would produce the result "Y".

ARRAYMODE (expression)

Finds the modal value in a multiple valued expression by counting the occurrences of each distinct value and returning the value that occurs most often. In case of a tie, the smallest value is returned. It can operate on either a character or numeric argument, and will return a single value of the same type as its argument. The result is null only if all values of the array are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has four values, 2, 3, 5, and 3, the expression ARRAYMODE(A) would produce the result 3.

ARRAYSHIFT (expression, expression)

Shifts the values in the first argument, which is a multiple valued variable, by the number of positions specified in the second argument, which has a single value. If the second argument (call it "n") is positive, the values are shifted "up", so that the value that was in position 1 moves to position n+1, and so on until the last n values are discarded. The first n array positions are assigned a null value. If the second argument is negative, the values are shifted in the opposite direction. The result has the same data type and number of values as the first argument.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYSHIFT(A, -1) would produce a result with three values, 2, 3, and Null.

ARRAYROT (expression, expression)

Operates like ARRAYSHIFT but performs a rotation of the values in the first argument. Values shifted off one end of the array are moved onto the other end. If the number of positions shifted is greater than the number of values, the effect is to perform more than one rotation, or a rotation modulo the dimension.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYROT(A, -4) would produce a result with three values, 2, 3, and 1.

LOOKUP (expression, expression, expression)

Selects one value from an array based on an index. The first expression is the key, which must be a single value, and the second expression is the index array. If the key value is found in the index array, the value from the corresponding array position in the third expression is returned, otherwise the result is null. If there is more than one match, the result has the values from all matching rows, so it is possible for the result to have more than one value. The second and third expressions must be arrays of equal dimension. A common error is to mismatch the dimensions of these two expressions, due to differences in the way they are aggregated. The first and second expressions must have the same type, and the result will have the type of the third expression.

Example: The variable max_thickness has a single number, the variable horizon_thickness has 6 numbers, and the variable ph_r has 6 numbers. The expression LOOKUP (max_thickness, horizon_thickness, ph_r) would return the value of ph_r from the horizon whose horizon_thickness value matches the value of max_thickness.

COUNT (expression)

Maintains a running count of the occurrences of the expression. On each iteration of the script the value of the expression is tested for a null, and if its not null the counters value is increased by one.

Example: a variable defined with the value COUNT(musym) could be printed at the end of a report to show the number of mapunits read (because musym cant be null).

MIN (expression)

Finds the smallest value of the expression. On each iteration of the script, the value of the expression is compared to an internal counter, and replaces the counters value if the expression is smaller. If a null value for the expression is encountered, the result of MIN becomes and remains null.

Internal counters for the MIN function cannot be reset.

Example: a variable defined with the value MIN(elevation) could be printed at the end of a report to show the minimum of elevation.

MAX (expression)

Finds the largest value of the expression. On each iteration of the script, the value of the expression is compared to an internal counter, and replaces the counters value if the expression is greater. Null values are smaller than any non-null value, so the result is only null if all input values are null.

Internal counters for the MAX function cannot be reset.

Example: a variable defined with the value MAX(elevation) could be printed at the end of a report to show the maximum of elevation.

SPRINTF ("format", expression [ , expression ] ... )

Formats one or more expression values into a character string using the C function sprintf (same as the Prelude sprintf). The first argument is a format specification, which must have a single value, and the remaining arguments are expressions whose values are to be formatted. If any of the expressions are multiple valued, the result is also multiple valued, and its dimension is that of the expression with the largest dimension.

It is the users responsibility to see that the number and type of the expressions correspond to the format, as there is no checking performed. Character data should use the %s formatting code, and numeric data should use the %f or %g formatting code.

Null values in the expressions produce an unusual result. The formatted value plus all characters of the format string up to the next % sign are skipped.

Example: The variable name has one character value, "Bob". The variable position has two numeric values, 10 and 12. The expression
SPRINTF ("%s:%.f", name, position) will produce a result containing two character values, "Bob:10" and "Bob:12".

USER

The user name from the data dictionary.

Example: if the person running NASIS has the login name "rose", the expression USER will return a single character value, "rose".

TODAY

The current date in mm/dd/yyyy format.

Example: the result of the expression TODAY might be "07/20/1998".

`unix command`

Returns the standard output from a UNIX command. The command line may contain report variables or data elements preceded with $.

Example: if the variable file contains a character string which is a file name, the expression `cat $file` would result in a single character string containing the complete contents of the file.

Numeric Functions

The following function expressions operate on numeric values, and produce numeric results. If the input values are character strings they are first converted to numbers.

ARRAYSUM (expression)

Computes the sum of the values in a multiple valued expression. It expects a numeric argument, and will try to convert character values to numbers. It returns a single numeric value. If individual values of the array are null they are treated as zeroes. The result is null only if the array has no values. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYSUM(A) would produce the result 6.

ARRAYAVG (expression)

Computes the average of the values in a multiple valued expression. It expects a numeric argument, and will try to convert character values to numbers. It returns a single numeric value. If individual values of the array are null they are not counted in the average. The result is null if all values are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYAVG(A) would produce the result 2.

ARRAYSTDEV (expression)

Computes the standard deviation of the values in a multiple valued expression. It expects a numeric argument, and will try to convert character values to numbers. It returns a single numeric value. If individual values of the array are null they are not included in the computation. The result is null if all values are null. The result has dimension 0 if the original expression has dimension 0, otherwise it has dimension 1.

Example: if the variable A has three values, 1, 2, and 3, the expression ARRAYSTDEV(A) would produce the result 1.

WTAVG (expression, expression)

Computes the sum of the first expressions values after multiplying each by a weighting factor, taken from the corresponding value of the second expression, then divides the result by the sum of the weights. The two expressions must be arrays of the same dimension. Individual null values are ignored in computing the average. The result is null if all the individual values are null. The result has dimension 0 if the original expressions have dimension 0, otherwise it has dimension 1.

Example: The variable comppct_r has 3 values (40, 30, 20) and the variable elev_r has three values (1000, 1200, 900). The expression WTAVG (elevation, comppct_r) would produce the value 1044.44, which is the average of the elevation values, weighted by the comp_pct values, or (1000*40 + 1200*30 + 900*20) / (40 + 30 + 20).

SUM (expression)

Computes a running total of the value of the expression. On each iteration of the script, the value of the expression is added to an internal counter. The result of the function is the value of that counter at each iteration. If a null value for the expression is encountered, the result of SUM becomes and remains null.

Internal counters for the SUM function cannot be reset. If you want to compute subtotals, use the ASSIGN statement to add the value of the expression to a defined variable rather than an internal counter. Then a conditional expression can be used to reset the variables value to 0 at the correct time.

Example: a variable defined with the value SUM(acres) could be printed at the end of a report to show the total of acres.

AVERAGE (expression)

Computes a running average of the value of the expression. On each iteration of the script, the value of the expression is added to an internal counter, and the result is divided by the number of values processed. If a null value for the expression is encountered, the result of AVERAGE becomes and remains null..

Internal counters for the AVERAGE function cannot be reset.

Example: a variable defined with the value AVERAGE(elev_r) could be printed at the end of a report to show the average of elevation.

LOGN (expression)

Computes the natural logarithm of the expression.

Example: the expression LOGN(10) produces the value 2.302585.

LOG10 (expression)

Computes the base 10 logarithm of the expression.

Example: the expression LOG10(10) produces the value 1.

EXP (expression)

Computes the exponential (e^x) of the expression.

Example: the expression EXP(1) produces the value of e, 2.718282.

COS (expression)

Computes the cosine of the expression interpreted as an angle in radians.

Example: the expression COS(0) produces the value 1.

SIN (expression)

Computes the sine of the expression interpreted as an angle in radians.

Example: the expression SIN(0) produces the value 0.

TAN (expression)

Computes the tangent of the expression interpreted as an angle in radians.

Example: the expression TAN(0) produces the value 0.

ACOS (expression)

Computes the arccosine of the expression, returning an angle in radians.

Example: the expression ACOS(0) produces the value of /2, 1.570796.

ASIN (expression)

Computes the arcsine of the expression, returning an angle in radians.

Example: the expression ASIN(1) produces the value of /2, 1.570796.

ATAN (expression)

Computes the arctangent of the expression, returning an angle in radians.

Example: the expression ATAN(1) produces the value of /4, 0.785398.

ATAN2 (expression, expression)

Computes the angular component of the polar coordinates (r, ) that are equivalent to the rectangular coordinates (x, y) given by the two expressions. This is the same as ATAN(y / x).

Example: the expression ATAN2(5, 5) produces the value of /2, 1.570796.

SQRT (expression)

Computes the square root of the expression. Returns a null value if the expression is negative.

Example: the expression SQRT(2) produces the value 1.414214.

ABS (expression)

Computes the absolute value of the expression.

Example: the expression ABS(-10) produces the value 10.

POW (expression, expression)

Computes the value of the first expression raised to the power of the second expression.

Example: the expression POW(2, 5) produces the value 32.

MOD (expression, expression)

Computes the remainder after dividing the first expression by the second expression.

Example: the expression MOD(5, 2) produces the value 1.

ROUND (expression [, expression] )

Rounds off the value of the first expression to the number of decimal places specified by the second expression. If the second expression is not used, it is assumed to be zero, which means round off to the nearest whole number. When the second expression is a positive number, it specifies the number of places to the right of the decimal point to be preserved. If negative, it means round to the specified number of places to the left of the decimal point, as illustrated in the examples.

Examples:      ROUND (15.751, 1) produces 15.8
     ROUND (15.751) produces 16
     ROUND (15.751, -1) produces 20

REGROUP Expression

The REGROUP expression is used to perform secondary aggregation of data. It operates a little like the AGGREGATE option in a query and can be used to perform a second level of aggregation when dealing with a complex data structure. It uses two expressions, which must be arrays of the same dimension. In the expression "REGROUP array BY array ..." the second array (the "BY" array) is used as a key for grouping the values from the first array (the data array). The result is a new array whose dimension is the number of unique values in the "BY" array. The values in the result are aggregates derived from each group of rows in the data array that have the same key value.

The aggregation function determines how these aggregates are produced. The types of aggregation are the same as the query AGGREGATE option, except that NONE and UNIQUE are not applicable in REGROUP, because there can be only one value in each position of the result array. The valid aggregations types are:

SUM Computes the sum of the values in each group.

AVERAGE Computes the average of the values in each group.

FIRST Select the value from the first row of the group.

LAST Select the value from the last row of the group.

MIN Selects the smallest of the values in each group.

MAX Selects the largest of the values in each group.

LIST Concatenates the values (converted to character strings if numeric) into a single string with a delimiter between each value. If a quoted string is specified after the word LIST, that string is the delimiter, otherwise a comma and space are placed between each value.

Some additional rules on the REGROUP expression are:

The "BY" array does not have to be sorted. REGROUP will always collect together all data values for each unique key value. However, the choice of value for FIRST or LAST will be affected by the order of values in the data array.

Nulls in the data array are ignored during aggregation except for FIRST and LAST, which preserve a null if it is the first or last value found. If all data values for some key value are null the corresponding result value will be null.

A null in the "BY" array is a valid key value and will produce a corresponding value in the result, aggregating all null key values together.

Example: These examples use the arrays A and B as inputs:

A		B
George		4
Abe		4
Sue		5
Sam		8
Mary		8
William		8

The arrays C and D are produced by the statements:

DEFINE C REGROUP A BY B AGGREGATE FIRST.

DEFINE D REGROUP A BY B AGGREGATE LIST "-".

C		D
George		George-Abe
Sue		Sue
Sam		Sam-Mary-William