
ICARTT Data Format

Eric Williams; AL/NOAA

Jim Crawford; Ali Aknan; LaRC/NASA

Hans Schlager; DLR

(Last Modified: June 2008)


Data File Formats

1. Requirements for data files

A. Time information

B. Location information

C. Measurements

i. Uncertainties

ii. Missing data

2. Filenames

3. Recommended File Format Specification for ICARTT Time-series Data Files

A. Structure

B. File header information

C. Examples

EXAMPLE 1

EXAMPLE 2

EXAMPLE 3

4. Recommended File Format Specification for ICARTT Multi-dimensional Data Files

A. Structure

Example_2110

Example_2310

5. File Formats for Other Data

6. Data Exchange/Protocol During Field Study

7. File Scanning Software


Data File Formats

In large studies such as ICARTT 2004 and MILAGRO 2006, many different types of data are collected. Many data sets are simple time series in which one or more parameters are measured sequentially (and simultaneously) in time. However, some data sets are truly multi-dimensional: an instrument takes a sample at a single point in time and a number of parameters are measured on that sample simultaneously. An example is wind profiler data, in which a 30-minute averaged sample taken at some time period is binned by height, and each height has a wind speed, wind direction, and temperature. Another, more extreme, example is output from 3-dimensional models. Data such as these clearly cannot be represented as a single time series. Sections 1-2 below outline the ICARTT format for all types of data, with an emphasis on standard time-series data. Section 3 is specific to standard time-series data, and Section 4 covers multi-dimensional data. Section 5 offers guidance for data that fit neither format.

Though adapted from the NASA Ames data format, the ICARTT data format will have no restriction on the number of characters per line or on the number of characters per record. The file name will be limited to 127 characters in length.

1. Requirements for data files

A. Time information

The philosophy here is that the data in the files must possess at least the minimum amount of accompanying information to uniquely identify each data point - this generally means time and location information. Moreover, the format must be able to handle all forms of timing configurations, including data that are irregularly spaced in time. For example, there are instruments that integrate a measurement over time until a certain signal-to-noise threshold has been reached. The integration period varies according to atmospheric conditions so that the resulting data have both variable integration times and are irregularly spaced in time. There is absolutely no way to represent these data with a single time point. The most efficient way of representing these data is with two time points: starting time and stopping time. This is the first requirement for the data file structure.

In those cases when many data sets are used or merged, a convenient single time reference point is the mid-point of the sampling period(s). Generally, this is the average of the start time and the stop time, but this is not always the case. As an example, there are measurements that integrate over a certain time but, because of sample airflow changes (e.g., changing altitude during aircraft sampling), the sampling volume mid-point does not correspond to the sampling time mid-point. In this case, the actual time mid-point must be specified by the investigator. In other cases, the “median” or “weighted average” might be a more appropriate representation of the mid-point (e.g., when binning data where more samples are concentrated over a certain segment of the “sampling period”). Thus, in order to encompass all of the possible diversity in sampling, three times need to be specified for each data point: start time, mid-point time, and stop time.

There are different views on what format should be used to represent time. In current measurement practice it is typical to find 1 second sampling intervals regardless of the platform (e.g., aircraft). Measurements at 1 Hz generally capture most of the important variability in air quality data, and, while longer intervals are commonly reported, shorter intervals are not. The Ames format shows time as seconds from the start of the day defined in the file header and in the file name (see below). The ICARTT file format will adopt this structure. However, recognizing the need in some cases for >1 Hz sampling, the ICARTT format will allow data in fractional seconds, though the default will be integer seconds. This does not mean that data MUST be shown in 1 second increments; whether the increment is 1 minute or some other value is left to the principal investigator. In all cases, though, all times are explicitly accounted for in the period (day) specified by the header and file name. If no data are available for any time period, then that is represented by the missing data identifier. The one exception to this is when no sampling takes place from the start of a day to some point during the day. This might occur, for example, because of the time of an aircraft take-off. All times are in UTC.
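As an illustration of the time convention described above, the following minimal Python sketch converts a time value (seconds from 0000 UTC on the date given in the header and file name) into an absolute UTC timestamp. The function name to_utc_datetime is hypothetical and is not part of the ICARTT specification; fractional seconds are accepted, as the text allows.

```python
from datetime import datetime, timedelta, timezone

def to_utc_datetime(year, month, day, seconds):
    """Convert seconds from 0000 UTC on the given date to a UTC datetime."""
    start_of_day = datetime(year, month, day, tzinfo=timezone.utc)
    return start_of_day + timedelta(seconds=seconds)

# 43200 seconds after 0000 UTC on 30 August 2004 is 12:00:00 UTC.
print(to_utc_datetime(2004, 8, 30, 43200).isoformat())
# prints 2004-08-30T12:00:00+00:00
```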

B. Location information

The specification of this information is straightforward. All data points in the files need to have latitude (lat), longitude (lon), and either altitude (for aircraft, lidars, sondes) or elevation (for surface data). The lat/lon system used here will be strictly numeric: decimal degrees (to five decimal places) with south latitudes and west longitudes represented as negative numbers (i.e., no N, E, W, S identifiers). Elevations will be in integral meters. Altitudes must be explicitly defined since many types of altitude measurements are in use (pressure alt; GPS alt; geopotential alt; etc.).
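For data providers whose positions are logged with hemisphere letters, a conversion to the strictly numeric convention above might look like the following sketch. The helper name signed_degrees is hypothetical; the five-decimal formatting and the negative sign for south and west follow the paragraph above.

```python
def signed_degrees(value, hemisphere):
    """Format a latitude or longitude as signed decimal degrees (5 places).

    South latitudes and west longitudes become negative numbers, so no
    N, E, W, S identifier is needed in the data file.
    """
    hemi = hemisphere.upper()
    if hemi not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere}")
    sign = -1.0 if hemi in ("S", "W") else 1.0
    return f"{sign * abs(value):.5f}"

print(signed_degrees(71.01234, "W"))  # prints -71.01234
```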

Because this information is required to uniquely identify any given data point, ideally it is included in the file with those data. However, it is sometimes advantageous to have location information consolidated and uniquely identified in a separate file (e.g., an aircraft parameter file). If this is done, then information about that parameter file must be included in the data file header information. This will be specified below.

C. Measurements

In general, each file contains data of one parameter or species separated by a space. Multiple variables per file are allowed only if all were measured on exactly the same time base, as, for example, by the same instrument (e.g., GC/MS; PILS/IC). The numeric representation of a variable will be defined by the units in which it was measured. The ICARTT format contains the NASA Ames provision for a data scaling factor. However, we recommend that all scale factors be 1.0 unless it is grossly inconvenient to do so. If very large or very small numbers are required, then they can be represented with exponential notation, as in 1.01e9 or 5.23e-6.

i. Uncertainties

Every data point should have a corresponding total uncertainty (or error) which has the same units as the measurement. This uncertainty in the measurement is indicated as a TOTAL uncertainty to include all systematic and random effects. Ideally, these uncertainties are tabulated as the next (and separate) column after the data column in the file. However, this requirement can be relaxed if the uncertainty data can be reproduced by information in the header of the file. For example, if all uncertainties can be calculated by a function that has any given data point as input, then the formula can be included as header information.
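As a sketch of how a user might reproduce uncertainties from header information, the function below assumes the form used in the UNCERTAINTY line of Example 2 in Section 3, namely a percentage of the measured value plus a fixed offset in the same units. The function and parameter names are hypothetical; other investigators may document a different formula.

```python
def total_uncertainty(value, fraction, offset):
    """Total uncertainty of the form +/-(fraction * value + offset).

    `fraction` is the relative part (0.05 for 5%) and `offset` is the
    absolute part, in the same units as `value`.
    """
    return fraction * abs(value) + offset

# NO: +/-(5% + 0.005 ppbv) applied to a value of 10.333 ppbv
print(round(total_uncertainty(10.333, 0.05, 0.005), 3))  # prints 0.522
```

Flagged values (missing data or limit-of-detection flags) should be screened out before a formula like this is applied.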

ii. Missing data

Missing data are just that - missing. It makes no difference what the reason, whether it be a calibration period, a system crash, instrument maintenance, etc. Missing data are represented by negative numbers large enough to never be construed as actual data. For the ICARTT file format the value is -9999 (or -99999, etc.). Note that this is different from the Ames data exchange format in that Ames requires missing data flags to be numbers larger than any “good” data value. This somewhat arbitrary standard breaks down for measurements in urban areas where “good” data values can exceed reasonable expectation. For example, it is not uncommon in these areas for NO, NO2, or CO data to be in the parts per million range which are very large numbers for the standard units of measure (ppbv) for these species. On the other hand, there is no conceivable situation in which large negative numbers (e.g., -9999) can be construed as “good” data. Therefore, we specify for the ICARTT format that the primary missing data flag be -9999.

Data below (or above) the limit of detection (LOD), by contrast, are not actually “missing” but do convey some information. While some investigators choose to tabulate all of their quantifiable data, including negative values, others choose not to show these data points, but rather indicate that the value is less than (or greater than) some quantifiable limit. These conditions will be indicated by two additional missing data flags that are substituted for the missing data values. The flag for data values GREATER THAN some UPPER LOD (ULOD) will be -7777 (or -77777, etc.), and the flag for data values LESS THAN some LOWER LOD (LLOD) will be -8888 (or -88888, etc.). These flags (if used) and the values of the upper and lower LOD are documented at specific locations in the file header (see below).
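The three flags can be screened before any calculation is made with the data. The following is a minimal sketch that assumes the primary flag values named above (-9999, -7777, -8888); a real file may declare longer variants such as -99999, which would be passed in as arguments. The function names classify and mask_flags are hypothetical.

```python
import math

def classify(value, missing=-9999, ulod=-7777, llod=-8888):
    """Label one tabulated value according to the flag conventions."""
    if value == missing:
        return "missing"
    if value == ulod:
        return "above_ulod"
    if value == llod:
        return "below_llod"
    return "valid"

def mask_flags(values, **flags):
    """Replace every flagged value with NaN so it cannot enter an average."""
    return [v if classify(v, **flags) == "valid" else math.nan for v in values]

cleaned = mask_flags([0.555, -9999, -8888, 10.333])
```

Replacing the LOD flags with NaN discards the information that a value was below or above a limit; a user who needs that information would keep the labels returned by classify instead.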

2. Filenames

Features of different file naming conventions (including Ames) have been adapted here. File names for the ICARTT data format, limited to 127 characters or less, are defined as follows:

dataID_locationID_YYYYMMDD[hh[mm[ss]]]_R#[_L#][_V#][_comments].extension,

where the only allowed characters are: a-zA-Z0-9_.- (that is, upper case and lower case alphanumeric, underscore, period, and hyphen). All fields not in square brackets are required and are described as follows:

dataID: short description of measured parameter/species, instrument, or model (e.g., DIAL-O3; NCAR-CH2O; VOC; PTRMS )
locationID: short description of site; station; platform; laboratory or institute
YYYY: four-digit year
MM: two-digit month
DD: two-digit day
hh: optional two-digit hour
mm: optional two-digit minute
ss: optional two-digit second
R: revision number of data
L: optional launch number
V: optional volume number
comments: optional additional information
extension: file type descriptor

The underscore is used ONLY to separate the different fields of the file name; it has special significance for file-checking software. To separate characters within a field for readability, use lower and upper case letters. The use of the hyphen, though allowed, is discouraged since this character in file names may cause problems with some older operating systems and network software. The square brackets “[ ]” enclose optional parameters but are not shown in the file name. Dates and times in file names are always UTC. The date and time in the file name give the date/time at which the data within the file begin (data files), or date/time at which the image applies (image files). For aircraft and sonde data files, the date always refers to the UT date of launch.

The dataID is a short string of characters used to identify the parameters in the file. For files that contain one or two variables those variable names can be used in the file name. For files in which many variables are represented, it may be best to indicate in the file name a class of compounds (e.g., VOC; PhotolysisRates) or an abbreviation of the instrument used to make the measurements (e.g., PTRMS). The dataID can contain PI institution and/or instrument name, separated by dashes, to uniquely identify the measurements (e.g., LARC-DIAL-O3).

The locationID is used to identify the measurement platform, site, station, or source (laboratory or institute) of the information within a data file. Some examples could be: DC8, BAE146, RHBrown, GOME (satellite), IoS (Appledore Island site), ChebPt (Chebogue Point site), and others. It may be useful to have a standardized set of abbreviations used for the ICARTT study. These abbreviations should be decided upon by the Data Management team. The following list has been used during recent campaigns: DC8, P3B, J31, BE200, C130, NOAAP3, PROTEUS, CESSNA, DUCHESS, MODEL, SATELLITE, O3SONDES, GROUND, AIRMAP, MOZAIC, LIGHTNING, TRAJECTORY.

The R parameter is required. We must specify a data revision code that will track changes in data and document why those changes occurred. For this we specify a revision number counter “_R#” where the underscore is a required element to separate the fields (this is needed for certain file checking software). The revision number "#" must match the "last" revision number specified in the Normal Comments section of the file header (see below).

The optional parameters “_L#” and “_V#” may be needed in some special cases. If the contents of the file pertain to a second or third aircraft launch on the indicated date, then a launch counter "_L#" (i.e. L2, L3, etc.) must appear after the "R" identifier but before a volume counter, if present (see below). Launch number one is implied when "_L#" is omitted from the file name. If a data file is one volume of a multi-volume dataset, then a volume counter "_V#" (i.e. V1, V2, V3, etc.), must appear after the "R" parameter (and the “L” parameter, if present) separated by an underscore from the rest of the identifier. The volume number (the "#" in "V#") must match the volume number in the file header. When "_V#" is missing from the file name a one-volume dataset is implied.

The optional comments parameter is for additional information required by the PI (or Data Manager) to identify the file contents but that does not fit into the other fields of the file name. This should be used sparingly.

The file extension is a 2-4 character parameter that identifies the file type. The principal file type for the ICARTT study will be “.ict” and describes the time series data in a file formatted to ICARTT standards. Other file types may include:

".txt", ".htm", ".html", ".pdf", ".doc" -- text (or document) files; not ICARTT formatted data files
".jpg", ".jpeg", ".gif", ".png", ".bmp" -- image files
".cdf" -- NetCDF file

These allowable file extensions will need to be defined by the team of Data Managers.
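The naming convention in this section can be checked mechanically. The sketch below is one possible regular expression for it, not the pattern used by any official file-checking software. It assumes that fields contain only letters, digits, and hyphens (the period is reserved for the extension) and that the revision is numeric, so the letter revisions used for field data (Section 6) are not accepted. The names FILENAME_RE and parse_filename are hypothetical.

```python
import re

# dataID_locationID_YYYYMMDD[hh[mm[ss]]]_R#[_L#][_V#][_comments].extension
FILENAME_RE = re.compile(
    r"^(?P<dataID>[A-Za-z0-9-]+)"
    r"_(?P<locationID>[A-Za-z0-9-]+)"
    r"_(?P<date>\d{8})(?P<time>(?:\d{2}){0,3})"   # optional hh, hhmm, or hhmmss
    r"_R(?P<revision>\d+)"
    r"(?:_L(?P<launch>\d+))?"
    r"(?:_V(?P<volume>\d+))?"
    r"(?:_(?P<comments>[A-Za-z0-9-]+))?"
    r"\.(?P<extension>[A-Za-z0-9]{2,4})$"
)

def parse_filename(name):
    """Split a file name into its fields; raise ValueError if it does not fit."""
    if len(name) > 127:
        raise ValueError("file name exceeds 127 characters")
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an ICARTT-style file name: {name}")
    return match.groupdict()

info = parse_filename("NOX_RHBrown_20040830_R0.ict")
print(info["dataID"], info["locationID"], info["date"], info["revision"])
# prints NOX RHBrown 20040830 0
```

Optional fields that are absent come back as None (or as an empty string for the time), so a missing launch counter can be treated as launch number one, as described above.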

3. Recommended File Format Specification for ICARTT Time-series Data Files

A. Structure

We recommend that, whenever possible, ICARTT time series data files conform to the following Ames file format:

FFI = 1001; one real, unbounded independent variable; primary variables are real; no auxiliary variables; independent and primary variables are recorded in the same record.

What this means in English is that there is one time (independent) variable and that all other data depend on that variable. Any number of other variables can be defined, but they all depend on the one. In the typical case the fundamental variable is the start time of the measurement and others can be defined as in the following example, where the variable names refer to columns in the data file:

start time
stop time
mid-point time
latitude
longitude
altitude/elevation
data variable1
variable1 uncertainty
data variable2
variable2 uncertainty
<etc.>

This format accounts for most time series data measured anytime, over any arbitrary integration period, and at any place on or above the planet (within reason for air quality data). Obviously, the format can be condensed. For example, if measurements are reported as 1 second intervals, then stop time and mid-point time need not be included as data columns provided all time intervals in the measurement period are accounted for by inclusion of the missing data flag(s). Similarly, if the measurements are made at a fixed location then latitude, longitude, and elevation are fixed and these data would be included in the header information (see below). As pointed out above, if the location data (latitude, etc.) are included in a separate file, then these columns can be excluded provided the location data file name is included in the header information for the data file. Similarly, if uncertainty is defined as some function that is the same for all data points then that function can be included in the header information and the user can then calculate uncertainties. Variations in the way the format is used, based on the needs of the data provider, are accounted for in the file header information. As an example, some PIs may wish to report the END time of the measurement period as the independent variable. The ICARTT format allows this provided that the time variable is clearly labeled as such (e.g., End_UTC) and that additional information describing this (non-standard) situation be provided in the Normal Comments section of the file header. If the data periods are not of a constant duration, then the start time and mid-point time of each period must be included as an additional column and the Data Interval value set to 0 (see below). The header specifications are described below.

B. File header information

The basic structure of the ICARTT file header is similar to the Ames exchange format. For the ICARTT study we recommend some additional information that will be included in the comments sections. The most general header is shown below as an example; more specialized headers will be described as modifications to the general form. Different items of information in the same record (same line) are shown below as separated by a semicolon – in the actual file they are separated by a single space.

The formula for the total number of lines in the header for FFI=1001 files:
14 + ( # dependent variables, given in line 10) + (# special comments) + (# normal comments)
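The formula can be checked against the first line of each FFI=1001 example in Section C. The short sketch below does so; the function name header_line_count is hypothetical, and the counts passed in are read from the example headers.

```python
def header_line_count(n_dependent, n_special, n_normal):
    """Total number of header lines in an FFI=1001 file."""
    return 14 + n_dependent + n_special + n_normal

# Example 1: 9 dependent variables, 0 special comments, 18 normal comments
print(header_line_count(9, 0, 18))   # prints 41
# Example 2: 2 dependent variables, 1 special comment, 19 normal comments
print(header_line_count(2, 1, 19))   # prints 36
```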

C. Examples

Below are three examples of (similar) time series data using different forms of header information. Be aware that the automatic word-wrap feature in word processing programs gives the appearance that there are more lines of text than are really there. In these examples any continuation of lines from directly above has been indented for clarity.


EXAMPLE 1. All required data columns are shown explicitly.
File name: NOX_RHBrown_20040830_R0.ict

41 1001
Williams, Eric
Aeronomy Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from R/V Ronald H. Brown
ICARTT_NEAQS
1 1
2004 08 30 2004 12 25
0
Start_UTC, (number of seconds from 0000 UTC)
9
1 1 1 1 1 1 1 1 1
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
Stop_UTC, seconds
Mid_UTC, seconds
DLat, degrees
DLon, degrees
Elev, m
NO, ppbv
NO, 1sig
NO2, ppbv
NO2,1sig
0
18
PI_CONTACT_INFO: Address: 325 Broadway, Boulder, CO 80305; email: eric@al.noaa.gov; 303-497-3226
PLATFORM: NOAA research vessel Ronald H. Brown
LOCATION: Latitude, longitude and elevation data is included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence
DATA_INFO: All data with the exception of the location data is in ppbv. All one-minute averages contain at least 35 seconds of data, otherwise missing.
UNCERTAINTY: included in the data records as variables with a _1sig suffix
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: N/A; N/A; N/A; N/A; N/A; 0.005; N/A; 0.025; N/A
DM_CONTACT_INFO: N/A
PROJECT_INFO: ICARTT study; 1 July-15 August 2004; Gulf of Maine and North Atlantic Ocean
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R0
R0: No comments for this revision.
Start_UTC Stop_UTC Mid_UTC DLat DLon Elev NO_ppbv NO_1sig NO2_ppbv NO2_1sig
43200 43259 43229 41.00000 -71.00000 15 0.555 0.033 2.220 0.291
43260 43319 43289 41.01234 -71.01234 15 10.333 0.522 31.000 0.375


EXAMPLE 2. This example is similar to Example 1. Differences include the elimination of the variables stop time, mid time, lat, lon, elev, and uncertainties; the inclusion of a special comment; the inclusion of DM info; and a second revision comment.
File name: NOX_RHBrown_20040830_R1.ict


36 1001
Williams, Eric
Aeronomy Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from R/V Ronald H. Brown
ICARTT_NEAQS
1 1
2004 08 30 2004 12 25
60
Start_UTC, (number of seconds from 0000 UTC)
2
1 1
-9999 -9999
NO, ppbv
NO2, ppbv
1
Lightning struck the ship at ~ 14:00:23 UTC, or at 50423 seconds after midnight UTC. The 13 minute section of missing data from 14:00 to 14:43 (50400 through 52780 of Start_UTC) reflects the period when the instrument was checked out and the computer rebooted.
19
PI_CONTACT_INFO: Address: 325 Broadway, Boulder, CO 80305; email: eric@al.noaa.gov; 303-497-3226
PLATFORM: NOAA research vessel Ronald H. Brown; sampling through high-flow manifold (res. time ~ 1 s) at 15 m above waterline
LOCATION: Ship location data in file ShipData_RHBrown_20040830_R0.ict
ASSOCIATED_DATA: ShipData_RHBrown_20040830_R0.RHB
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence, See Williams et al., BigScience, 42, p. 50-51, 2001
DATA_INFO: Units are ppbv. All one-minute averages contain at least 35 seconds of data, otherwise missing. Midpoint time is 29 seconds after the minute. One second data are available, contact the PI.
UNCERTAINTY: NO: +/-(5%+0.005 ppbv); NO2: +/-(12%+0.025 ppbv)
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: 0.005; 0.025
DM_CONTACT_INFO: Donna Sueper; NOAA/AL; dsueper@al.noaa.gov. Data manager for data within ShipData_RHBrown_20040830_R0.ict is Jim Johnson with PMEL, James.Q.Johnson@noaa.gov
PROJECT_INFO: ICARTT study; 1 July-15 August 2004; Gulf of Maine and North Atlantic Ocean
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R1; R0
R1: NO2 data have been increased by 13% based on calibration standard recheck.
R0: No comments for this revision.
Start_UTC NO_ppbv NO2_ppbv
43200 0.555 2.509
43260 10.333 35.030


EXAMPLE 3. This example is similar to examples 1 and 2. Here the platform is a ground site with a locationID of ChebPt.
File name: NOX_ChebPt_20040830_R2.ict


36 1001
Williams, Eric
Aeronomy Laboratory/NOAA
Nitric oxide and nitrogen dioxide mixing ratios from Chebogue Point, Nova Scotia
ICARTT_NEAQS
1 1
2004 08 30 2004 12 25
60
Start_UTC, (number of seconds from 0000 UTC)
2
1 1
-9999 -9999
NO, ppbv
NO2, ppbv
0
20
PI_CONTACT_INFO: Address: 325 Broadway, Boulder, CO 80305; email: eric@al.noaa.gov; 303-497-3226
PLATFORM: 10 m tower at the Chebogue Point ICARTT research site.
LOCATION: Chebogue Point, Nova Scotia, Canada; lat: 43.45678; lon: -66.00000; elev: 30 m.
ASSOCIATED_DATA: Met_ChebPt_20040830_R2.ict
INSTRUMENT_INFO: NO: chemiluminescence; NO2: narrow-band photolysis/chemiluminescence.
DATA_INFO: All data is in units of ppbv.
UNCERTAINTY: NO: +/-(5%+0.005 ppbv); NO2: +/-(12%+0.025 ppbv)
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: 0.005; 0.025
DM_CONTACT_INFO: Donna Sueper; NOAA/AL; dsueper@al.noaa.gov
PROJECT_INFO: ICARTT study; 1 July-15 August 2004
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R2; R1; R0
R2: NO data have been decreased by 13% based on operator ineptitude.
R1: NO2 data have been increased by 13% based on calibration standard recheck.
R0: No comments for this revision.
Start_UTC NO_ppbv NO2_ppbv
43200 0.483 2.509
43260 0.899 35.030
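The examples above suggest a simple reading strategy: the first line gives the number of header lines, the last header line holds the column names, and everything after it is data. The following sketch reads such a file from a string. It is not an official reader: the name read_ffi1001 is hypothetical, the positions of the scale factors and missing-data flags (lines 11 and 12) are taken from the examples, and the sample input is an abbreviated header (normal comments cut down to the column-name line, second data value changed to a missing flag) that would not pass a format check.

```python
def read_ffi1001(text):
    """Parse the data section of an FFI 1001 file held in a string."""
    lines = text.splitlines()
    first = lines[0].split()
    n_header, ffi = int(first[0]), first[1]
    if ffi != "1001":
        raise ValueError(f"expected FFI 1001, found {ffi}")
    n_dependent = int(lines[9])                       # header line 10
    scales = [float(x) for x in lines[10].split()]    # header line 11
    missing = [float(x) for x in lines[11].split()]   # header line 12
    if not (len(scales) == len(missing) == n_dependent):
        raise ValueError("scale/missing counts do not match line 10")
    columns = lines[n_header - 1].split()             # last header line
    rows = []
    for line in lines[n_header:]:
        if not line.strip():
            continue
        raw = [float(x) for x in line.split()]
        row = {columns[0]: raw[0]}                    # independent variable
        for name, value, scale, flag in zip(columns[1:], raw[1:], scales, missing):
            row[name] = None if value == flag else value * scale
        rows.append(row)
    return columns, rows

# Abbreviated, illustrative input only (17 header lines = 14 + 2 + 0 + 1).
sample = "\n".join([
    "17 1001", "Williams, Eric", "Aeronomy Laboratory/NOAA",
    "Nitric oxide and nitrogen dioxide mixing ratios from Chebogue Point, Nova Scotia",
    "ICARTT_NEAQS", "1 1", "2004 08 30 2004 12 25", "60",
    "Start_UTC, (number of seconds from 0000 UTC)", "2", "1 1", "-9999 -9999",
    "NO, ppbv", "NO2, ppbv", "0", "1", "Start_UTC NO_ppbv NO2_ppbv",
    "43200 0.483 2.509",
    "43260 -9999 35.030",
])
columns, rows = read_ffi1001(sample)
```

The sketch maps only the primary missing-data flag to None; the ULOD and LLOD flags declared in the normal comments would need the same treatment in real use.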


4. Recommended File Format Specification for ICARTT Multi-dimensional Data Files

See also the "Amended FFI 2110" and "Amended FFI 2310" documents for more details on these file types.

A. Structure

We recommend the standard Ames file formats FFI=2110 and FFI=2310 for exchange of most multidimensional data files. The FFI descriptors are:

FFI 2110; two real independent variables, one unbounded and one bounded with its values recorded in the data records, primary variables are real; the first auxiliary variable is NX(m,1) (or, primary variables' ArrayDimension), all other auxiliary variables are real.

FFI 2310; two real independent variables, one unbounded and one bounded with its number of constant increment values, base value, and increment defined in the auxiliary variable list; primary variables are real; auxiliary variables are real.

For a complete description of these file types, please see the Ames file format document. The following are examples of the FFI 2110 and FFI 2310 formats. The text in braces indicates comments that are not in the file but are added here for clarity. The normal comments section mimics the FFI 1001 format described above.


Example_2110
File name: AR_DC8_20050203_R0.ict

54 2110
PI LastName, First Name
Code 916, Goddard Space Flight Center, Greenbelt, MD 20771
AROTAL
PAVE Mission
1 1
2005 02 03 2006 01 18
60
Altitude, meters
UT of data; XX.XXXX hours from 0 hours on flight date
7 {Number of PRIMARY variables}
0.1 0.0001 0.1 0.01 0.0001 0.1 .0001
-9999 -999999 -999999 -999999 -999999 -99999 -999999
Temperature, K
Log base 10 of Number density, part/cc
Temperature error, K
Aerosol, Klet
Log base 10 of Ozone number density, part/cc
Ozone mixing ratio, ppb
Log base 10 of Ozone number density error, part/cc
11 {Number of AUXILIARY variables}
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
Number of altitudes reported, none
Year; UT
Month; UT
Day; UT
Averaging time of presented data; xxx.x minutes
Latitude; degrees
Longitude; degrees
pressure altitude; meters
GPS altitude; meters
Static air temperature; K
SZA; degrees
0
18
PI_CONTACT_INFO: Enter PI Address here
PLATFORM: NASA DC8
LOCATION: Lat, Lon, and Alt included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO:
DATA_INFO:
UNCERTAINTY:
ULOD_FLAG: -7777
ULOD_VALUE: N/A;
LLOD_FLAG: -8888
LLOD_VALUE: N/A;
DM_CONTACT_INFO: Enter Data Manager Info here
PROJECT_INFO: PAVE MISSION: Jan-Feb 2005
STIPULATIONS_ON_USE: Use of these data should be done in consultation with the PI
OTHER_COMMENTS:
REVISION: R0;
R0: Version 2005-0: AROTAL T & O3 Rayleigh Retrievals. Further revisions may be needed to fine-tune aerosol characterization.
UTC NumAlts Year Month Day AvgTime Latitude Longitude PAlt GpsAlt SAT_K SZA Altitude[] TempK[] Log10_NumDensity[] TempK_Err[] AerKlet[] Log10_O3NumDensity[] O3_MR[] Log10_O3NumDensity_Err[]
54000 9 2005 2 3 0 42.308 -70.582 6910 6979 242.5 65.5
     9154 -9999 -999999 -9999 -9999 113178 212 -999999
     9304 -9999 -999999 -9999 -9999 123353 2250 -999999
     9454 -9999 -999999 -9999 -9999 123008 2116 -999999
     9604 -9999 -999999 -9999 -9999 120933 1337 -999999
     9754 -9999 -999999 -9999 -9999 119675 1019 -999999
     9904 -9999 -999999 -9999 -9999 122655 2061 -999999
     10054 -9999 -999999 -9999 -9999 124384 3126 -999999
     10204 -9999 -999999 -9999 -9999 124632 3371 -999999
     10354 -9999 -999999 -9999 -9999 121341 1609 -999999
54060 8 2005 02 03 0 42.278 -70.613 6978 7043 241.7 65.5
     10118 -9999 -999999 -9999 -9999 124458 3205 -999999
     10268 -9999 -999999 -9999 -9999 123160 2421 -999999
     10418 -9999 -999999 -9999 -9999 121221 1582 -999999
     10568 -9999 -999999 -9999 -9999 120950 1523 -999999
     10718 -9999 -999999 -9999 -9999 117339 680 -999999
     10868 -9999 -999999 -9999 -9999 122751 2423 -999999
     11018 -9999 -999999 -9999 -9999 124230 3491 -999999
     11168 -9999 -999999 -9999 -9999 124039 3424 -999999

Note the use of scale factors in this example.
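The record layout of an FFI 2110 data section (one line holding the unbounded independent variable and the auxiliary variables, followed by as many lines as the first auxiliary variable specifies) can be read as sketched below. The function name parse_2110 is hypothetical, and the input is a small made-up record set with two primary variables and two auxiliary variables, not the AROTAL data above. Scale factors and missing flags are applied per primary variable.

```python
def parse_2110(lines, n_aux, scales, missing):
    """Group FFI 2110 data lines into records.

    lines   -- data lines following the header
    n_aux   -- number of auxiliary variables (the first is the level count)
    scales  -- scale factor for each primary variable
    missing -- missing-data flag for each primary variable
    """
    records = []
    i = 0
    while i < len(lines):
        head = [float(x) for x in lines[i].split()]
        if len(head) != 1 + n_aux:
            raise ValueError(f"unexpected record header: {lines[i]}")
        time, aux = head[0], head[1:]
        n_levels = int(aux[0])
        block = lines[i + 1 : i + 1 + n_levels]
        if len(block) != n_levels:
            raise ValueError("file ends in the middle of a record")
        levels = []
        for line in block:
            values = [float(x) for x in line.split()]
            bounded, raw = values[0], values[1:]
            primary = [None if r == m else r * s
                       for r, s, m in zip(raw, scales, missing)]
            levels.append((bounded, primary))
        records.append({"time": time, "aux": aux, "levels": levels})
        i += 1 + n_levels
    return records

# Hypothetical input: aux = (number of altitudes, latitude);
# primary = (temperature, ozone mixing ratio), both scaled by 0.1.
data = [
    "54000 2 42.308",
    "9154 2425 -9999",
    "9304 -9999 212",
    "54060 1 42.278",
    "10118 2417 205",
]
records = parse_2110(data, n_aux=2, scales=[0.1, 0.1], missing=[-9999.0, -9999.0])
```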


Example_2310
File name: LidarO3_WP3_20040830_R0.ict

46 2310
Williams, Eric
NOAA Aeronomy Laboratory
Ozone number density profile from WP3 aircraft lidar
ICARTT_ITCT
1 1
2004 08 30 2009 09 04
60.0
Geometric altitude of observation, meters
Elapsed time in UT, seconds from 0 hours on day given by date
1 {Number of PRIMARY variables}
1.0e9
-9999
O3 number density, #/cc
9 {Number of AUXILIARY variables}
1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
-9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
number of altitudes at current time mark, number
geometric altitude at which data begin, meters
altitude increment, meters
geometric altitude of aircraft, meters
UT, hours
UT, minutes
UT, seconds
aircraft longitude, degrees
aircraft latitude, degrees
0
18
PI_CONTACT_INFO: Address: 325 Broadway, Boulder, CO 80305; email: eric@al.noaa.gov; 303-497-3226
PLATFORM: NOAA WP3
LOCATION: Lat, Lon, and Alt included in the data records
ASSOCIATED_DATA: N/A
INSTRUMENT_INFO: Differential absorption lidar. See Williams et al., BigScience, 42, p. 50-51, 2001
DATA_INFO: The units are number density (#/cc). The vertical averaging interval is 975 m at 1-7 km above the aircraft and 2025 m > 7 km above the aircraft. Horizontal averaging interval: 60 km.
UNCERTAINTY: N/A
ULOD_FLAG: -7777
ULOD_VALUE: N/A
LLOD_FLAG: -8888
LLOD_VALUE: N/A
DM_CONTACT_INFO: N/A
PROJECT_INFO: ICARTT study; 1 July-15 August 2004
STIPULATIONS_ON_USE: Use of these data requires PRIOR OK from the PI
OTHER_COMMENTS: N/A
REVISION: R0
R0: No comments for this revision.
UT_TIME Num_altitudes geo_alt_begin alt_increment geo_alt_aircraft UT_hour UT_min UT_sec Lon_aircraft Lat_aircraft Array_O3_NumDensity[]
30300 26 12819 75 10389 8 25 35 -133.24 -9.45
     1340 1519 1660 1779 1868 1939 1973 1992 1989 1955 1934 1897 1817 1721 1619 1514 1434 1343 1258 1203 1140 1088 1037 956 892 878
30360 22 12819 75 10383 8 26 0 -133.22 -9.93
     1351 1523 1658 1774 1860 1930 1962 1974 1966 1932 1909 1877 1803 1706 1600 1493 1407 1310 -9999 -9999 1094 1045


Note that this file uses a scale factor (1e9) for the number density data since it would be very cumbersome to add the exponential notation to every value. Also, this example was adapted from the NASA document and did not have uncertainty or flag values associated with the data.
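In an FFI 2310 file the altitudes are not written out; they are rebuilt from the base value and increment carried in the auxiliary variables. The sketch below does this for the second record of the example above, applying the scale factor (1.0e9) and the missing-data flag (-9999) from its header. The function name expand_2310_record is hypothetical, and the position of each auxiliary variable is taken from this example's variable list.

```python
def expand_2310_record(aux_line, data_line, scale=1.0e9, missing=-9999.0):
    """Return (time, [(altitude, value), ...]) for one FFI 2310 record."""
    aux = [float(x) for x in aux_line.split()]
    time = aux[0]
    n_alt = int(aux[1])          # number of altitudes at current time mark
    alt_begin = aux[2]           # geometric altitude at which data begin
    alt_step = aux[3]            # altitude increment
    raw = [float(x) for x in data_line.split()]
    if len(raw) != n_alt:
        raise ValueError(f"expected {n_alt} values, found {len(raw)}")
    profile = []
    for k, value in enumerate(raw):
        altitude = alt_begin + k * alt_step
        density = None if value == missing else value * scale
        profile.append((altitude, density))
    return time, profile

time, profile = expand_2310_record(
    "30360 22 12819 75 10383 8 26 0 -133.22 -9.93",
    "1351 1523 1658 1774 1860 1930 1962 1974 1966 1932 1909 1877 "
    "1803 1706 1600 1493 1407 1310 -9999 -9999 1094 1045",
)
print(profile[0])    # prints (12819.0, 1351000000000.0)
print(profile[-1])   # prints (14394.0, 1045000000000.0)
```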


5. File Formats for Other Data

Data collected during studies such as ICARTT for which a standard time-series format does not apply can be formatted according to standards common to the user community and agreed to by the Data Management Working Group. For many modeling data sets the data files are generally stored in NetCDF format, which is a de facto standard for that community. However, the multi-dimensional data format defined above can accommodate these data sets and we leave this as an optional format. For some instruments (e.g., lidars), data are available as image files, usually in standard formats such as GIF or JPEG. Not all software for reading and writing these formats allows additional text information (e.g., as a header), so the file names for these files must be defined to include as much information as possible. If necessary, the Data Management team will work with these PIs to achieve a mutually acceptable solution.

Data acquired by sensors on satellites are not conveniently incorporated into the ICARTT format. The data protocol allows each data record to be identified with a single timestamp only if data are reported continuously with a constant time interval (e.g., 1 second). Otherwise, start and stop times must be reported, and a Data Interval of 0 is entered on line 8 of the file header. Satellite data are unique in that while they are recorded on a constant data interval, significant gaps in the data may exist. These gaps may be due to cloud interference, changes in viewing mode (e.g., nadir versus limb), or other considerations. Given the sheer volume of data and the file sizes associated with satellite observations, it is not sensible to populate these data gaps with missing data values. It is also unreasonable to report start and stop times, since data are typically collected on short (often sub-second) timescales such that integration time is not an issue. Instead, satellite data files will report a Data Interval of -1 on line 8 of the file header. This signifies that each data record is identified by a single timestamp, but the actual timeline is discontinuous. Trajectory and Ground data files could also be constructed using a discontinuous timeline.
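The three meanings of the Data Interval value on line 8 of the header can be summarized in code. This is a sketch of the rules as stated in this document; the function names timeline_kind and find_gaps and the returned labels are hypothetical.

```python
def timeline_kind(data_interval):
    """Interpret the Data Interval value from line 8 of the file header."""
    if data_interval > 0:
        return "constant"        # one timestamp per record, fixed spacing
    if data_interval == 0:
        return "variable"        # start and stop times must be reported
    if data_interval == -1:
        return "discontinuous"   # one timestamp per record, gaps allowed
    raise ValueError(f"unsupported Data Interval: {data_interval}")

def find_gaps(start_times, data_interval):
    """For a constant timeline, list consecutive time pairs that are not
    exactly one interval apart (these should have been filled with
    missing-data records)."""
    if timeline_kind(data_interval) != "constant":
        raise ValueError("gap check applies only to a positive Data Interval")
    return [(t0, t1) for t0, t1 in zip(start_times, start_times[1:])
            if abs((t1 - t0) - data_interval) > 1e-9]

print(find_gaps([43200, 43260, 43380], 60))  # prints [(43260, 43380)]
```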

In general, if problems or difficulties arise, the Data Management team will deal with them on a case-by-case basis. We want to ensure that all data that are collected during the ICARTT study are made available to all participants as quickly and as seamlessly as possible. Please feel free to send your inquiries or questions to Ali Aknan or Jim Crawford. We welcome your comments or suggestions.

 

6. Data Exchange/Protocol During Field Study

During the field study every attempt should be made to have data posted to the data repository no more than 24 hours after the measurements have been taken. For some data this will be an absolute requirement due to the needs of flight and ship track planning. These data should be identified well before the commencement of the field campaign.

During and immediately after the campaign, “field” data files will be available. Data exchanged during the field study are considered a special case since these data are typically “first look” and, due to time constraints, are not likely to have undergone the full scrutiny of the PI. In order to reflect this fact the file names will be modified slightly with respect to the convention stipulated above in that the data revision code (R#) will be a "letter" (e.g., RA, RB, etc.) instead of a numeric code. This will be the flag to indicate to the user that these are "field" data to be used only during the field study. These files should be deleted as soon as possible after the study and replaced with preliminary data files which will have some QA/QC performed.
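Software that sorts files can use the revision code to tell "field" data apart from later releases. The sketch below assumes a single upper-case letter for field data, as in the RA and RB examples above, and digits otherwise; the function name revision_kind and its return labels are hypothetical.

```python
import re

def revision_kind(code):
    """Classify a revision code taken from a file name (e.g. 'R0' or 'RA')."""
    if re.fullmatch(r"R\d+", code):
        return "numeric"   # preliminary or final data
    if re.fullmatch(r"R[A-Z]", code):
        return "field"     # first-look data, for use during the study only
    raise ValueError(f"unrecognized revision code: {code}")

print(revision_kind("RA"))  # prints field
print(revision_kind("R1"))  # prints numeric
```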

Normally, each campaign is accompanied by its own version of the “Data Exchange Protocol”, which is agreed on by the science team. The 2004 ICARTT Data Exchange Protocol was specific to that campaign only.

 

7. File Scanning Software

View the "Help FScan" document for more details.

Two versions of the software are available to scan ICARTT-formatted files:

1. Web-based: http://www-air.larc.nasa.gov/cgi-bin/fscan

2. Standalone Version (Windows Only)

 


Last updated: February 13, 2009