
Improved Exchange Format for WHP Data

version of 12 September 2001

James H. Swift & Stephen C. Diggs
(with help from David Newton, Sarilee Anderson, John Osborne, and Jeremy Ward)
WOCE Hydrographic Program Office
UCSD Scripps Institution of Oceanography
whpo@ucsd.edu

Summary

Exchange formats for WOCE Hydrographic Program (WHP) CTD and bottle data are described. The WHP-exchange formats provide simplified exchange and improved readability of WHP data. WHP-exchange data files draw the essential information from the present WHP .sum, .sea/.hyd, and .ctd data files and present them in rigorously-described comma-delimited (csv) ASCII formats which should ease data exchange and simplify data import.

Overview of WHP-exchange file formats

The new WHP-exchange bottle and CTD data formats include these features:
  • ASCII, spreadsheet-like
  • comma-delimited values (csv)
  • no special meaning to blank/empty spaces
  • station information in every line in the file (bottle) or in the top lines in each file (CTD)
  • only one missing data value defined for all parameters
  • missing data value format defined in the format for each parameter
  • WHP quality flag, when provided, associated directly with its parameter
  • positions in decimal degrees
  • dates in YYYYMMDD format
Data written in WHP-exchange format by the WHP Office from existing .sum, .sea/.hyd, and .ctd data files are "data products" because they do not include all information from the original files. The original .sum, .sea/.hyd, and .ctd files will form the official WOCE Archive version of WHP data. The WHP-exchange format files will also be included in the WOCE Archive.

Format description for WHP-exchange bottle data (_hy1.csv)

[Note: To better understand this section please refer to document "example_hy1.csv". It is recommended that the reader examine "example_hy1.csv" in a text editor application in order to see all characters and also in a spreadsheet application in order to view overall layout.]

The overall layout of a _hy1.csv bottle data file is described in Table 1.
  • The first line of a WHP-exchange format file is a single word which describes the file type, in this case "BOTTLE", followed by a comma and a date/time stamp.
  • The format next provides for 0-N optional information lines, each beginning with a "#" character, near the beginning of a _hy1.csv file. The WHP Office intends to use the "#" lines to hold file history information.
  • A description of the station information columns of a _hy1.csv file is in Table 2.
  • A description of the remaining data columns and preferred parameter names in a _hy1.csv file is in Table 3.
  • A line with "END_DATA" signals the end of the data lines.
  • After that line, a bottle data file may hold other file-specific documentation. The primary documentation for WHP data will, however, remain in the ".doc" file (or zipped directory).
General rules for WHP-exchange _hy1.csv data files:
  1. Each line must end with a carriage return or end-of-line.
  2. With the exception of the file type line, lines starting with a "#" character, and the "END_DATA" line and any lines following it, each line in a _hy1.csv file must have exactly the same number of commas as every other line in that file.
  3. The number and names of the parameters in a _hy1.csv file are not specifically addressed, except that for WHP data certain parameters are noted as REQUIRED. It is not, for example, necessary that a bottle data file contain columns for CFC measurements when there are no CFC data.
  4. The order of the header and bottle data parameters in a _hy1.csv file is preferred to be as shown in "example_hy1.csv" but not strictly required. Although the _hy1.csv files prepared by the WHP Office shall be as consistent as feasible in this regard, data users are urged to use "read" statements that are sensitive to parameter names rather than position of the parameter in the data files.
  5. All parameters defined as alphanumeric (e.g. "A14") and integer (e.g. "I4") will be shown in the full defined width and will be right-justified, meaning that entries shorter than the defined width (for example, EXPOCODEs are almost always shorter than the defined maximum of 14 alphanumeric characters) will be padded with meaningless spaces to the left of the first character.
  6. The bottle data parameter names should follow those listed in Table 3 when feasible. Data providers are urged to use caution, however, and list their actual parameter name rather than a WHP parameter name whenever there is any question on this matter.
  7. Each data parameter listed in Table 3 - except for all flags, which are "I1" - will be listed in "F9.x" floating point format, where "x" indicates the number of decimal places. For each parameter, the WHPO will pad with meaningless zeros data received with fewer decimal places and round data received with extra decimal places to the number of decimal places specified in Table 3.
  8. When a quality byte is available for a parameter, that quality byte shall be placed in the column immediately to the right of the parameter. (Also see "WOCE Flag 1 vs. WOCE Flag 2, and IGOSS Quality Codes", below.)
  9. The name of a quality flag always begins with the name of the parameter with which it is associated, followed by an underscore character, followed by "FLAG", followed by an underscore, and then followed by an alphanumeric character indicating the flag type. (Also see "WOCE Flag 1 vs. WOCE Flag 2, and IGOSS Quality Codes", below.)
  10. The "missing value" for a data value is always defined as -999, but written in the decimal place format of the parameter in question. For example, a missing salinity would be written -999.0000 or a missing phosphate -999.00. The value -999 was chosen because it is out of range for all WHP parameters.
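The rules above can be illustrated with a minimal reading sketch in Python. It is not a WHP Office tool: the function name read_hy1, the sample text, and the EXPOCODE "EXAMPLE_01" are all hypothetical, and the sketch assumes "#" lines may simply be skipped wherever they occur before END_DATA. It checks the comma count (rule 2) and looks values up by parameter name rather than by column position (rule 4).

```python
import csv
import io


def read_hy1(text):
    """Parse WHP-exchange bottle text into (stamp, units by name, data records)."""
    lines = text.splitlines()
    file_type, _, stamp = lines[0].partition(",")
    if file_type.strip() != "BOTTLE":
        raise ValueError("not a BOTTLE file: %r" % lines[0])

    # Skip optional "#" lines; stop at END_DATA and ignore whatever follows it.
    body = []
    for line in lines[1:]:
        if line.startswith("#"):
            continue
        if line.strip() == "END_DATA":
            break
        body.append(line)
    else:
        raise ValueError("missing END_DATA line")
    if len(body) < 2:
        raise ValueError("missing heading or units line")

    # Rule 2: heading, units, and data lines all carry the same number of commas.
    if len({line.count(",") for line in body}) > 1:
        raise ValueError("inconsistent comma count")

    rows = list(csv.reader(io.StringIO("\n".join(body))))
    names = [name.strip() for name in rows[0]]
    units = [unit.strip() for unit in rows[1]]
    # Rule 4: key each value by parameter name, not by column position.
    records = [dict(zip(names, (v.strip() for v in row))) for row in rows[2:]]
    return stamp.strip(), dict(zip(names, units)), records


# Hypothetical sample; padding follows the right-justified widths of rule 5.
SAMPLE = """BOTTLE,20000711WHPSIOSCD
# hypothetical history line
EXPOCODE,STNNBR,CASTNO,CTDPRS,SALNTY,SALNTY_FLAG_W
,,,decibars,,
    EXAMPLE_01,     1,  1,     10.0,  34.5000,2
    EXAMPLE_01,     1,  1,    100.0,-999.0000,9
END_DATA
free-form notes, with extra, commas
"""

stamp, units, records = read_hy1(SAMPLE)
```

Because the padding spaces carry no meaning, the sketch strips them from every field; the text after END_DATA is never inspected, so its commas do not matter.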

Table 1.       General description of _hy1.csv file layout.

1st line File type, here BOTTLE, followed by a comma and a DATE_TIME stamp
YYYYMMDDdivINSwho
 
YYYY    4 digit year 
MM      2 digit month 
DD      2 digit day 
div     division of Institution 
INS     Institution name 
who     initials of responsible person 

example:   20000711WHPSIOSCD
#lines A file may include 0-N optional lines, typically at the start of a data file, but after the file type line, each beginning with a "#" character and each ending with carriage return or end-of-line. Information relevant to file change/update history of the file itself may be included here, for example.
2nd line Column headings. A list of column headings approved and used by the WHP Office is found in Table 2. A list of parameter headings approved and used by the WHP Office is found in Table 3. Data originators are urged, however, to be careful to supply their correct column headings rather than to simply copy 'approved' column headings into their files.
3rd line Units. A list of parameter units used by the WHP Office is found in Tables 2 and 3. Data originators are urged, however, to be careful to supply their correct units rather than to simply copy the units used by the WHP.
data lines As many data lines may be included in a single file as is convenient for the user, with the proviso that the number of parameters, parameter order, headings, units, and commas remain absolutely consistent throughout a single file. Thus a single data file may contain data lines for as few as one bottle from one cruise or as many as many bottles from many cruises.
note Within a _hy1.csv file it is very strongly preferred that data from each station be contiguous; it is recommended that data from each cast at a station be contiguous; and it is preferred that the data from each cast be sorted from lowest pressure to highest pressure.
END_DATA   The line after the last data line must read END_DATA, and be followed by a carriage return or end of line.
other lines Users may include any information they wish in 0-N optional lines at the end of a data file, after the END_DATA line.
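As an illustration of the first line described in Table 1, the following Python sketch splits the file type from the DATE_TIME stamp and validates the leading YYYYMMDD date. The function name parse_first_line is hypothetical. Table 1 does not state the widths of the div, INS, and who parts, so the sketch returns them as one unsplit string rather than guessing.

```python
from datetime import date


def parse_first_line(line):
    """Return (file type, stamp date, remaining stamp text) from the first line."""
    file_type, sep, stamp = line.strip().partition(",")
    if not sep:
        raise ValueError("first line needs a comma between file type and stamp")
    stamp = stamp.strip()
    digits = stamp[:8]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError("stamp must start with YYYYMMDD")
    # date() raises ValueError for an impossible month or day.
    stamped = date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))
    # div/INS/who widths are not given in Table 1, so the tail is kept whole.
    return file_type.strip(), stamped, stamp[8:]


kind, stamped, tail = parse_first_line("BOTTLE,20000711WHPSIOSCD")
```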

Table 2.       _hy1.csv header columns

Parameter Format   Description notes
EXPOCODE A14 The expedition code, assigned by the WHP Office for WHP data or generated by the user or a data facility for non-WHP data. A single alphanumeric word, without spaces, commas, or "/" characters (but "_" underscore characters are OK) which is a unique cruise identifier code. REQUIRED.
SECT A6 For WOCE data the WHP section identifier. Optional.
STNNBR A6 The originator's station number. This column is used for a single alphanumeric word, without spaces, commas, or "/" characters (but "_" underscore characters are OK) which is a unique station identifier. Numeric-only station identifiers are preferred by many data users, but provision for alphanumeric identifiers is retained to maintain compatibility with WOCE records. REQUIRED.
CASTNO I3 The originator's cast number. This column is used for a single integer cast number. Where cast number is unknown a default value of 1 is used or written in by the WHP Office. REQUIRED.
note No "cast type" designator is used.
SAMPNO A7 The sample number as described in WHP Office Report 90-1, WOCE Report No. 67/91, "Requirements for WHP Data Reporting". It is very strongly recommended that at least one, preferably both, of the parameters SAMPNO and BTLNBR be reported for bottle data files. Where neither SAMPNO nor BTLNBR is available to the WHP Office, the WHP Office may add a SAMPNO column containing consecutive integers for each station/cast.
BTLNBR A7 The bottle identification number as described in WHP Office Report 90-1, WOCE Report No. 67/91, "Requirements for WHP Data Reporting". It is very strongly suggested that at least one, preferably both, of the parameters SAMPNO and BTLNBR be reported for bottle data files. It is preferred that one of these, preferably BTLNBR, include a quality flag in the column immediately to its right. This is the primary index to a water sample. Pressure - or depth - is a measured parameter. The pressure value can change during processing, and so pressure (or sample depth) should never be used to index water sample data.
BTLNBR_FLAG_W I1 The parameter name of a data quality flag should be identical to the actual parameter name, followed by "FLAG" and then by a character indicating the type of quality flag, with underscores between each word. W = WHP quality flag; I = IGOSS quality flag; U = quality flag from user-defined table.
DATE I8 Cast date in YYYYMMDD integer format. REQUIRED
TIME I4 Cast time (UT) as HHMM. Optional. Must have all four digits.
LATITUDE F8.4 Latitude as SDD.dddd where "S" is sign (blank or missing is positive), DD are degrees, and dddd are decimal degrees. Sign is positive in northern hemisphere, negative in southern hemisphere. Spaces to left of leftmost digit are ignored. Data with positions not reliable to ten-thousandths of a degree should be padded with meaningless zeros. To convert from WHP .sum files, the "BO" or "bottom" position (ship position when cast is at deepest level) will be used if available, with "BE" (ship position at cast start) or "EN" (ship position at cast end) used in that priority order when "BO" position is not available. REQUIRED
LONGITUDE    F9.4 Longitude as SDDD.dddd where "S" is sign (blank or missing is positive), DDD are degrees, and dddd are decimal degrees. Sign is positive for "east" longitude, negative for "west" longitude. Spaces to left of leftmost digit are ignored. Data with positions not reliable to ten-thousandths of a degree should be padded with meaningless zeros. To convert from WHP .sum files, the "BO" or "bottom" position (ship position when cast is at deepest level) will be used if available, with "BE" (ship position at cast start) or "EN" (ship position at cast end) used in that priority order when "BO" position is not available. REQUIRED
DEPTH I5 Reported depth to bottom. Preferred units are "meters" and should be specified in the units line (the 3rd line in Table 1). In general, corrected depths are preferred to uncorrected depths. Documentation accompanying data should include notes on methodology of correction. The WHP Office will, however, initially provide whatever depth was received from the data originator in _hy1.csv and _ct1.csv files due to ongoing confusion about which received depths were corrected and which were uncorrected. When no depth-to-bottom is supplied by the data originator for one or more rows of data in a _hy1.csv file which contains a "DEPTH" column, -999 will be written in by the WHPO. Optional.
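The fixed widths in Table 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names right_justify, station_fields, and pick_position are hypothetical, only the REQUIRED columns are written, and the positions argument is assumed to be a dictionary keyed by the .sum position codes "BO", "BE", and "EN".

```python
def right_justify(value, width):
    """Render an alphanumeric (A) or integer (I) field right-justified in full width."""
    text = str(value)
    if len(text) > width:
        raise ValueError("%r exceeds width %d" % (text, width))
    return text.rjust(width)


def pick_position(positions):
    """Choose a .sum position in the BO, then BE, then EN priority order."""
    for code in ("BO", "BE", "EN"):
        if code in positions:
            return positions[code]
    raise KeyError("no BO, BE, or EN position available")


def station_fields(expocode, stnnbr, castno, date, latitude, longitude):
    """Format the REQUIRED Table 2 columns as one comma-delimited string."""
    return ",".join([
        right_justify(expocode, 14),  # EXPOCODE  A14
        right_justify(stnnbr, 6),     # STNNBR    A6
        right_justify(castno, 3),     # CASTNO    I3
        right_justify(date, 8),       # DATE      I8, YYYYMMDD
        "%8.4f" % latitude,           # LATITUDE  F8.4, south is negative
        "%9.4f" % longitude,          # LONGITUDE F9.4, west is negative
    ])


lat, lon = pick_position({"BE": (-12.5, -150.25), "EN": (-12.6, -150.3)})
line = station_fields("EXAMPLE_01", 1, 1, 20000711, lat, lon)
```

Here no "BO" position is supplied, so the "BE" position is used, and the short EXPOCODE is padded on the left to the full 14 characters.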

Table 3.     WHP parameter names, units, and comments.

Parameter  Format   Suggested Units   Comments
CTDPRS F9.1 decibars corrected CTD pressure (in a _hy1.csv file the value accompanying closure of the rosette bottle)
CTDTMP F9.4 degrees C (specify ITS-90 or IPTS-68 if known) corrected CTD temperature (in a _hy1.csv file the value accompanying closure of the rosette bottle)
CTDSAL F9.4 corrected CTD salinity (in a _hy1.csv file the value accompanying closure of the rosette bottle)
CTDSAL_FLAG_a I1 W = WHP quality flag; I = IGOSS quality flag; U = quality flag from user-defined table (table to be supplied in comment lines) The parameter name of a data flag should be identical to the actual parameter name, followed by "FLAG" and then by a character indicating the type of quality flag, with underscores between each word. [A FLAG value can follow any data value. FLAG is shown here only for CTDSAL for simplicity. Typically a WHP data file will have FLAG_W values following most parameters in this table except for CTDPRS and CTDTMP.]
SALNTY F9.4 bottle salinity
CTDOXY F9.1 µmol/kg corrected CTD oxygen (in a _hy1.csv file the value accompanying closure of the rosette bottle; often not available in _hy1.csv files)
OXYGEN F9.1 µmol/kg bottle oxygen (must specify actual units, not simply copy the suggested units)
SILCAT F9.2 µmol/kg silicate (must specify actual units, not simply copy the suggested units)
NITRAT F9.2 µmol/kg nitrate (must specify actual units, not simply copy the suggested units)
NO2+NO3 (shown only if separate NITRAT and NITRIT are not available) F9.2 µmol/kg nitrate plus nitrite (must specify actual units, not simply copy the suggested units) [Most modern techniques for determining dissolved nitrate return a value of nitrate (NO3) plus nitrite (NO2). A separate determination is then done for nitrite and the result subtracted by the data originator to obtain nitrate. If no separate nitrite determination was carried out - or in rare cases the nitrite number was not subtracted - data providers should list the result as NO2+NO3. Because nitrite values are in most regions small compared to nitrate, most data users will not adversely affect their results by relabeling NO2+NO3 as NITRAT.]
NITRIT F9.2 µmol/kg nitrite (see NO2+NO3) (must specify actual units, not simply copy the suggested units)
PHSPHT F9.2 µmol/kg phosphate (must specify actual units, not simply copy the suggested units)
CFC-11 F9.3 pmol/kg (must specify actual units, not simply copy the suggested units)
CFC-12 F9.3 pmol/kg (must specify actual units, not simply copy the suggested units)
CFC113 F9.3 pmol/kg (must specify actual units, not simply copy the suggested units)
CCL4 F9.3 pmol/kg
TRITUM F9.3 TU (must specify actual units)
HELIUM F9.4 nmol/kg
DELHE3 F9.2 %
DELC14 F9.1 0/00
DELC13 F9.1 0/00
O18O16 F9.2 per mille
TCARBN F9.1 µmol/kg total carbon
ALKALI F9.1 µmol/kg alkalinity
PCO2 F9.1 µatm partial pressure of CO2
PH F9.2 pH
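The "F9.x" format and the -999 missing value can be sketched in Python as follows. The function names format_value, parse_value, and flag_name are hypothetical, and using None to stand for a missing value is a choice made for this sketch, not part of the format.

```python
MISSING = -999.0


def format_value(value, decimals):
    """Write a data value in F9.x form; None stands for a missing value."""
    if value is None:
        value = MISSING
    text = "%9.*f" % (decimals, value)  # width 9, "decimals" decimal places
    if len(text) > 9:
        raise ValueError("%r does not fit in F9.%d" % (value, decimals))
    return text


def parse_value(text):
    """Read a data field back, mapping the -999 missing value to None."""
    number = float(text)
    return None if number == MISSING else number


def flag_name(parameter, flag_type="W"):
    """Build a flag column name: parameter, then FLAG, then the flag type."""
    return "%s_FLAG_%s" % (parameter, flag_type)


salinity = format_value(34.5, 4)         # padded with zeros to 4 decimal places
missing_salinity = format_value(None, 4)
missing_phosphate = format_value(None, 2)
```

Every formatted field is exactly nine characters wide, so a missing phosphate is written with two leading spaces while a missing salinity fills the field.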

Format description for WHP-exchange CTD data

[Note: To better understand this section please refer to document "example_ct1.csv". It is recommended that the reader examine "example_ct1.csv" in a text editor application in order to see all characters and also in a spreadsheet application in order to view overall layout.]

The overall layout of a _ct1.csv CTD data file is described in Table 4.
  • The first line of a WHP-exchange format file is a single word which describes the file type, in this case "CTD", followed by a comma and a date/time stamp.
  • The format next provides for 0-N optional information lines, each beginning with a "#" character, near the beginning of a _ct1.csv file. The WHP Office intends to use the "#" lines to hold file history information.
  • Next is a line indicating the number of header lines (counting the present line and those following), usually 10 in WHP CTD data in WHP-exchange format.
  • Next are the remaining 9 lines (usually) of header information. These mostly match the description of the similar information in a _hy1.csv file.
  • A line with "END_DATA" signals the end of the data lines.
  • After that line, a CTD data file may hold other file-specific documentation. The primary documentation for WHP data will, however, remain in the ".doc" file (or zipped directory).
General rules for WHP-exchange _ct1.csv data files:
  1. Each line must end with a carriage return or end-of-line.
  2. With the exception of the file type line, lines starting with a "#" character, the 10 header lines, and the "END_DATA" line and any lines following it, each line in a _ct1.csv file must have exactly the same number of commas as every other line in that file.
  3. The order of the parameters in the header lines in a _ct1.csv file should follow the order listed (and in "example_ct1.csv") to make it simplest for users to import files. All _ct1.csv files prepared by the WHP Office will adhere to the header parameter line order shown in "example_ct1.csv". Still, CTD data users are urged to use "read" statements that are sensitive to parameter names rather than position of the parameter in the data files.
  4. It is not necessary that a CTD data file contain a column for CTD oxygen probe measurements (CTDOXY) when there are no CTD oxygen probe data.
  5. When a quality byte is available for a CTD parameter, that quality byte shall be placed in the column immediately to the right of the parameter. (See "WOCE Flag 1 vs. WOCE Flag 2, and IGOSS Quality Codes", below.)
  6. The name of a quality flag always begins with the name of the parameter with which it is associated, followed by an underscore character, followed by "FLAG", followed by an underscore, and then followed by an alphanumeric character indicating the flag type. (See "WOCE Flag 1 vs. WOCE Flag 2, and IGOSS Quality Codes", below.)
  7. The "missing value" for a data value is always defined as -999, but written in the decimal place format of the parameter in question. For example, a missing salinity would be written -999.0000. The value -999 was chosen because it is out of range for all WHP parameters.
  8. Each data parameter listed in Table 5 - except for all flags, which are "I1" - will be listed in "F9.x" floating point format, where "x" indicates the number of decimal places. For each parameter, the WHPO will pad with meaningless zeros data received with fewer decimal places and round data received with extra decimal places to the number of decimal places specified in Table 5.
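A CTD file differs from a bottle file mainly in its block of "NAME = value" header lines. The following Python sketch reads such a file under stated assumptions: the function name read_ct1 and the sample text are hypothetical, "#" lines are skipped wherever they occur, and the header count is taken from the NUMBER_HEADERS line rather than assumed to be 10.

```python
def read_ct1(text):
    """Parse WHP-exchange CTD text into (header dict, units by name, data rows)."""
    lines = [ln for ln in text.splitlines() if not ln.startswith("#")]
    if lines[0].split(",")[0].strip() != "CTD":
        raise ValueError("not a CTD file")

    # NUMBER_HEADERS counts its own line plus the header lines that follow it.
    key, _, count = lines[1].partition("=")
    if key.strip() != "NUMBER_HEADERS":
        raise ValueError("expected NUMBER_HEADERS line")
    n = int(count)

    # The remaining n - 1 header lines are read by name, not by position.
    header = {}
    for line in lines[2:1 + n]:
        name, _, value = line.partition("=")
        header[name.strip()] = value.strip()

    names = [c.strip() for c in lines[1 + n].split(",")]
    units = [c.strip() for c in lines[2 + n].split(",")]
    rows = []
    for line in lines[3 + n:]:
        if line.strip() == "END_DATA":
            break
        rows.append(dict(zip(names, (c.strip() for c in line.split(",")))))
    return header, dict(zip(names, units)), rows


# Hypothetical sample following the Table 4 layout.
SAMPLE_CTD = """CTD,20000711WHPSIOSCD
# hypothetical history line
NUMBER_HEADERS = 10
EXPOCODE = EXAMPLE_01
SECT = X01
STNNBR = 1
CASTNO = 1
DATE = 20000711
TIME = 0130
LATITUDE = -12.5000
LONGITUDE = -150.2500
DEPTH = 4000
CTDPRS,CTDPRS_FLAG_W,CTDTMP,CTDTMP_FLAG_W
decibars,,degrees C,
      2.0,2,  25.1234,2
      4.0,2,  25.1000,2
END_DATA
"""

header, units, rows = read_ct1(SAMPLE_CTD)
```

Header values are kept as text so that entries such as a TIME of 0130 retain all four digits.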

Table 4.       General description of _ct1.csv file layout.

1st line File type, here CTD, followed by a comma and a DATE_TIME stamp

YYYYMMDDdivINSwho
 
   
YYYY    4 digit year 
MM      2 digit month 
DD      2 digit day 
div     division of Institution 
INS     Institution name 
who     initials of responsible person

example:   20000711WHPSIOSCD 
#lines A file may include 0-N optional lines at the start of a data file, each beginning with a "#" character and each ending with carriage return or end-of-line. Information relevant to file change/update history may be included here, for example.
2nd line NUMBER_HEADERS = n (n = 10 in this table and the example_ct1.csv file.)
3rd line EXPOCODE = [expocode] (see Table 2 for definition)
4th line SECT = [section] (see Table 2 for definition)
5th line STNNBR = [station] (see Table 2 for definition)
6th line CASTNO = [cast] (see Table 2 for definition)
7th line DATE = [date] (see Table 2 for definition)
8th line TIME = [time] (see Table 2 for definition)
9th line LATITUDE = [latitude] (see Table 2 for definition)
10th line LONGITUDE = [longitude] (see Table 2 for definition)
11th line DEPTH = [bottom] (see Table 2 for definition)
next lines Parameter headings. A list of CTD parameter headings approved and used by the WHP Office is found in Table 5. Data originators are urged, however, to be careful to supply their correct column headings rather than to simply copy 'approved' column headings into their files.
next lines Units. A list of parameter units used by the WHP Office is found in Table 5. Data originators are urged, however, to be careful to supply their correct units rather than to simply copy the units used by the WHP.
data lines A single _ct1.csv CTD data file will normally contain data lines for one CTD cast. Generally these will be what is called a "2 decibar" file, i.e. there will be a 2-decibar interval between data lines, and each line will lie at either even or odd whole decibars.
END_DATA  The line after the last data line must read END_DATA, and be followed by a carriage return or end of line.
other lines Users may include any information they wish in 0-N optional lines at the end of a data file, after the END_DATA line.

Table 5.     _ct1.csv WHP parameter names, units, and comments.

Parameter  Format  Suggested Units  Comments
CTDPRS F9.1 decibars corrected CTD pressure
PARAMETER_NAME_FLAG_a I1 W = WHP quality flag. I = IGOSS quality flag. [U = quality flag from user-defined table] The parameter name of a data flag should be identical to the actual parameter name, followed by "FLAG" and then by a character indicating the type of quality flag, with underscores between each word. [A FLAG value can follow any data value in this table. FLAG is shown here only for CTDPRS for simplicity. Typically a WHP data file will have FLAG_W values following every parameter in this table.]
CTDTMP F9.4 degrees C (specify ITS-90 or IPTS-68 if known) corrected CTD temperature
CTDSAL F9.4 corrected CTD salinity
CTDOXY F9.1 µmol/kg corrected CTD oxygen (must specify actual units, not simply copy the suggested units)

WOCE Flag 1 vs. WOCE Flag 2, and IGOSS Quality Codes

The data files in WHP-exchange format contain original WOCE quality flags, where these have been provided.
  1. Where both WOCE Flag 1 (QUALT1) and WOCE Flag 2 (QUALT2) exist in an original WOCE .sea/.hyd or .ctd data file, the WOCE Flag 2 value will be used in the WHP-Exchange format files, but only when that flag results from DQE activity. WOCE Flag 2 values should be ignored and not used to replace WOCE Flag 1 values when it is evident that they were set to some default pre-examination value other than the value of WOCE Flag 1.
  2. The determination of when to replace WOCE Flag 1 with WOCE Flag 2 in a WHP-Exchange format file is done on a parameter by parameter basis.
  3. Where both WOCE Flag 1 and WOCE Flag 2 quality bytes are present in the original WHP-format data file, these will be preserved in the original files in the WOCE Archive.
In order that WHP data be used with data from other WOCE programs, and with non-WOCE data, it may be advantageous for some users to translate the WOCE-specific quality codes into the more widely recognized IGOSS quality codes.

The WHP quality codes for the water bottle itself are:
1 Bottle information unavailable.
2 No problems noted.
3 Leaking.
4 Did not trip correctly.
5 Not reported.
6 Significant discrepancy in measured values between Gerard and Niskin bottles.
7 Unknown problem.
8 Pair did not trip correctly. Note that the Niskin bottle can trip at an unplanned depth while the Gerard trips correctly and vice versa.
9 Samples not drawn from this bottle.

Flags 6, 7, and 8 apply primarily to large volume samplers.

The WHP bottle parameter data quality codes are:
1 Sample for this measurement was drawn from water bottle but analysis not received. Note that if water is drawn for any measurement from a water bottle, the quality flag for that parameter must be set equal to 1 initially to ensure that all water samples are accounted for.
2 Acceptable measurement.
3 Questionable measurement.
4 Bad measurement.
5 Not reported.
6 Mean of replicate measurements (Number of replicates should be specified in the -.DOC file and replicate data tabulated).
7 Manual chromatographic peak measurement.
8 Irregular digital chromatographic peak integration.
9 Sample not drawn for this measurement from this bottle.

The WHP CTD data quality codes are:
1 Not calibrated.
2 Acceptable measurement.
3 Questionable measurement.
4 Bad measurement.
5 Not reported.
6 Interpolated over >2 dbar interval.
7 Despiked.
8 Not assigned for CTD data.
9 Not sampled.

The WMO IGOSS observation quality codes are:
0 No quality control yet assigned to this element
1 The element appears to be correct
2 The element is probably good
3 The element is probably bad
4 The element appears erroneous
5 The element has been changed
6 to 8   Reserved for future use
9 The element is missing

A perfect translation is probably not feasible, but we suggest the following WHP-to-IGOSS (not IGOSS-to-WHP) translation rules as reasonable:

WOCE       IGOSS
bottle
1                 0
2                 1
3                 3 (see note #1)
4                 4
5                 0
6                 4
7                 4
8                 4
9                 9

water sample
1                 0
2                 1
3                 2 (see note #2)
4                 4
5                 0
6                 2
7                 2
8                 2
9                 9

ctd
1                 0
2                 1
3                 2 (see note #2)
4                 4
5                 0
6                 2
7                 2
9                 9
 
 
Note #1: The WHP Office, in the interest of being conservative, has chosen to translate the WOCE bottle quality code 3 into IGOSS quality code 3. A leaking water sample bottle typically results in a discrepancy or error in gas samples, such as oxygen and CFCs, but less often results in data discrepancies for salinity and nutrients. It is suggested that data users who wish to import only "good" data not import any water sample data from bottles with a WOCE code 3 or IGOSS code 3. A data user who is willing to entertain slightly greater risk might choose to import non-gas sample data (e.g., salinity and nutrients) from a WOCE code 3 or IGOSS code 3 water sample bottle, and allow import of gas sample data (e.g. oxygens and CFCs) for bottles with IGOSS Code 2. (The WHP Office is not, however, currently assigning IGOSS code 2 to water sample bottles; but future data originators or data centers may wish to use code 2.)
Note #2: The WHP Office has noted that in general, data originators tend to be conservative and so in DQE reports many WHP code 3 ("questionable") water sample parameter data are deemed WHP code 2 ("good") by the examiners. The IGOSS code 2 ("probably good") seems to be a reasonable interpretation. The WHP Office is not currently assigning IGOSS code 3 ("probably bad") to WHP water sample data values.
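The suggested WHP-to-IGOSS translation can be written as a lookup table. The Python sketch below copies the three columns of the table above; the names WHP_TO_IGOSS and whp_to_igoss and the keys "bottle", "water_sample", and "ctd" are hypothetical. WHP CTD code 8 has no entry because it is not assigned for CTD data, and the translation runs in one direction only.

```python
# One-way WHP-to-IGOSS translation, copied from the suggested rules.
WHP_TO_IGOSS = {
    "bottle": {1: 0, 2: 1, 3: 3, 4: 4, 5: 0, 6: 4, 7: 4, 8: 4, 9: 9},
    "water_sample": {1: 0, 2: 1, 3: 2, 4: 4, 5: 0, 6: 2, 7: 2, 8: 2, 9: 9},
    "ctd": {1: 0, 2: 1, 3: 2, 4: 4, 5: 0, 6: 2, 7: 2, 9: 9},
}


def whp_to_igoss(kind, code):
    """Translate one WHP quality code of the given kind to an IGOSS code."""
    try:
        return WHP_TO_IGOSS[kind][int(code)]
    except KeyError:
        raise ValueError("no translation for %s code %r" % (kind, code)) from None


leaking_bottle = whp_to_igoss("bottle", 3)              # note #1
questionable_sample = whp_to_igoss("water_sample", 3)   # note #2
```

A user following note #1 could, for example, skip gas sample data from any bottle whose translated bottle code is 3.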