Date: Tue, 1 Nov 1994 10:11:30 -0500
From: Jason Mathews

Welcome to the CDF Mailing list!!

This initial message is primarily a test message to check the validity of
email addresses. You are receiving this message because your name is now
included on the list. To send a message to everyone on the list, send your
message to "cdf-users@nssdc.gsfc.nasa.gov".

For a starting topic, we could discuss potential uses of data compression
inside the CDF library. If you have any ideas or questions, just speak up
and we can get the ball rolling. We have 16 people signed up for the list
so far!

-------------------------------------------------------------------------------
Jason Mathews                    | National Space Science Data Center (NSSDC)
NASA/Goddard Space Flight Center | Code 633.2
                                 | (DECnet) NSSDCA::MATHEWS
Greenbelt, MD 20771-0001 USA     |
---------------------------------+ http://nssdc.gsfc.nasa.gov/cdf/cdf_home.html

Date: Sat, 5 Nov 94 17:17:24 EST
From: "Lloyd A. Treinish"
Subject: High-level classes of data in CDF

I would like to inquire whether anyone has developed unambiguous conventions
for handling any of the following types of data, which CDF, as a carrier of
multi-dimensional arrays, does not deal with directly. Admittedly, such
conventions require interpretation with appropriate semantics at an
application level. However, such data can be decomposed into
multi-dimensional arrays, which CDF handles very well. At the very least, I
would like to start a discussion on the subject.

1. Non-scalar variables
2. Aggregates other than series (e.g., composites, multizone grids)
3. Data dependency (e.g., node, cell center, edge)
4. Curvilinear meshes beyond those implied by a dimensional product of
   geographic independent variables at an application level
5. Invalidity beyond that implied by a fill value or valid minimum/maximum
   and interpretation at an application level
6. Irregular meshes beyond indexed grids as 1d variables, which are
   ambiguous (i.e., they could also be actual 1d data or scattered data
   that is indexed by location)
7. Unstructured meshes (e.g., triangles in 2d, tetrahedra in 3d)

Lloyd Treinish
IBM T. J. Watson Research Center

From: Jason Mathews
Subject: Announcing CDF/IDL WWW-based browsing/retrieval system
Date: Tue, 17 Jan 1995 12:45:20 -0500 (EST)

Dear CDF users:

Some users have shown an interest in WWW-based systems with access to CDFs.
One such system, called OMNIWeb, accesses the space physics OMNI data set,
which has been converted to CDF.

*****************************************************************
*                                                               *
*  THE NATIONAL SPACE SCIENCE DATA CENTER IS PROUD TO ANNOUNCE  *
*                                                               *
*        T H E   N S S D C   O M N I W E B   S Y S T E M        *
*                                                               *
*****************************************************************

NSSDC's OMNIWeb includes a graphical data browser and a data retrieval
tool. The graphical data browser lets users visualize the data as plots.
OMNIWeb uses IDL (a commercial scientific data analysis package that
supports CDF) to generate on-the-fly images of selected variables. The
browsing feature was designed to assist users in following trends and
isolating areas of interest.

The retrieval tool allows users to choose subsets of the available data and
instantly retrieve them to their computer in ASCII or binary formats (CDF
or host-encoded). NSSDC's OMNIWeb also includes a context-sensitive help
system that guides users' interactions with the system by providing helpful
hints, tips, and other relevant information.
Jason Mathews

Date: Wed, 19 Apr 1995 19:18:23 EDT
From: "Larry Bleau"
Subject: Loss of accuracy in EPOCH_breakdown?

Since the EPOCH time is a REAL*8 value, it can represent time values of
less than a millisecond. When I use EPOCH_breakdown, however, I only get
back the millisecond value, no microseconds. A routine of ours constructs
an EPOCH time to include microseconds, so using the above CDF EPOCH utility
routine would result in a loss of accuracy. Any suggestions, other than to
do it ourselves?

Larry Bleau
University of Maryland

Date: Thu, 20 Apr 95 08:35:49 EDT
From: "Lloyd A. Treinish"
Subject: Loss of accuracy in EPOCH_breakdown?

The millisecond value returned should be a float, so you should have at
least integral precision on microseconds (i.e., assuming 6 digits of
accuracy). Anything beyond that would be suspect, given that the entire
epoch value (msec since 0 AD) is stored in a double, uncompressed. The
other values would be ints.

Date: Thu, 20 Apr 1995 12:46:10 EDT
From: "Larry Bleau"
Subject: Re: Loss of accuracy in EPOCH_breakdown?

Beg to differ. According to the documentation, the milliseconds value
returned is an INTEGER*4 in the range 0-999. The underlying C code (in
epochu.c) operates on a double and produces only integers (long). Returning
a float for the msec component would resolve this issue, but that would be
a major change to the routine's interface.

Larry Bleau
University of Maryland

Date: Fri, 21 Apr 1995 14:04:00 EDT
From: "Larry Bleau"
Subject: Re: Epoch microsecond accuracy

I received the following from Jeff of CDFSUPPORT:

> Your EPOCH microseconds concern came my way. The IEEE double-precision
> floating-point encoding doesn't have enough precision to correctly store
> microseconds since 1-Jan-0000 00:00:00.000 (0 AD, 1 BC, or whatever).
> The VAX D_FLOAT encoding may have enough precision in some cases, but
> even it is right at the edge of not having enough. Choosing a more recent
> "epoch" would make storing microseconds since that "epoch" possible. I
> can't see adding another data type to CDF to support a new "epoch". In
> fact, we probably shouldn't have the one we do have.

I just ran a test using our home-grown routines, which accept microseconds
as an input when computing an Epoch time and produce a microseconds output
when converting back. We get precision down to the 40-microsecond level.
That is, for example, any microsecond component value in the range 39-78,
when converted to Epoch time and back to normal time, produces (on our
system, using our routines) the same microsecond component: 58. Our error
is thus +/- 20 microseconds. So, if anyone using Epoch time wants
microsecond accuracy, they are SOL. That makes my earlier question about
losing microsecond accuracy somewhat academic.

Larry Bleau
University of Maryland

Date: Mon, 24 Apr 1995 06:55:09 GMT
From: Kim Bisgaard
Subject: Re: Epoch microsecond accuracy

If we moved the epoch starting time closer to today, say January 1, 1990,
we could have sub-microsecond accuracy in the vicinity of that date, say
+/- 1000 years. At year 0 we would still have millisecond accuracy. Who can
pinpoint a microsecond that far away anyway?

Regards,
Kim Bisgaard
Danish Meteorological Institute, Denmark

Date: Wed, 26 Apr 95 16:28:50 EDT
From: "Lloyd A. Treinish"
Subject: Re: Loss of accuracy in EPOCH_breakdown?

Hmmm. I guess I am recalling either ancient history, where a float value
was returned in that old code, or code I used myself, evolved from the
original algorithms, which does return a float for msec.
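The precision figures in this thread can be checked without the CDF library
at all. Below is a minimal standalone C sketch, assuming only what the
thread states: an EPOCH value is milliseconds since 01-Jan-0000 held in an
IEEE double. It prints the spacing between adjacent representable doubles
near a mid-1995 epoch value, and near the same date counted from Kim
Bisgaard's suggested 1990 starting point (the year lengths are approximated
at 365.25 days, which is close enough for an order-of-magnitude check):

/* Minimal sketch, not part of the CDF library.  The spacing between
 * adjacent doubles (one "ulp") bounds the best achievable time
 * resolution at a given epoch value. */
#include <stdio.h>
#include <math.h>

static void show(const char *label, double ms)
{
    double ulp = nextafter(ms, HUGE_VAL) - ms;   /* gap to next double */
    printf("%s: %.0f ms, spacing ~%g ms (~%g microseconds)\n",
           label, ms, ulp, ulp * 1000.0);
}

int main(void)
{
    double ms_per_year = 365.25 * 86400.0 * 1000.0;
    show("mid-1995 since year 0    ", 1995.5 * ms_per_year);
    show("mid-1995 since 1-Jan-1990", 5.5 * ms_per_year);
    return 0;
}

Near 1995 the spacing works out to about 8 microseconds, so the +/- 20
microsecond round-trip error Larry measured (a few representable steps lost
in the conversions) is about what one would expect. Counted from 1990
instead, the spacing drops to roughly 30 nanoseconds, which is the
quantitative basis for Kim's suggestion.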
Date: Wed, 26 Apr 95 15:23:49 EDT
From: "Lloyd A. Treinish"
Subject: Re: Epoch microsecond accuracy

The usual approach since the beginning of CDF to address this problem has
been to have another 0d variable with record variance (and obviously no
dimensional variance) that is some sort of "running" time. With an
appropriate offset and data type, the desired accuracy was always possible.

Date: Wed, 26 Apr 95 16:31:19 EDT
From: "Lloyd A. Treinish"
Subject: Re: Epoch microsecond accuracy

OK. I was off by a little. I assumed tens of microseconds was possible,
which is obviously not so with IEEE doubles, but potentially so with other
types. I would assume O(100 microseconds) was reasonable. There's a lot of
history behind the EPOCH variable, which would take some time (pun
intended) to discuss, which is good and bad. I would view it as the least
of the problems in CDF when viewed with the perspective of 20-20 hindsight.
CDF does have the virtue of allowing you NOT to use the EPOCH variable and
to define your own way of storing time information and the prerequisite
conventions for driving applications.

From: Jason Mathews
Subject: Single CDFs vs handling multiple CDFs
Date: Tue, 16 May 1995 09:36:42 -0400 (EDT)

A question has come up about the size of CDFs. It is easy enough to keep
all relevant data in a single CDF vs. splitting it over a number of CDFs
(e.g., each CDF contains one year of data), but access over the entire time
period is still needed.

Data in a single CDF:

* The CDF is self-contained and independent of any other CDF.
* Generic tools work on a one-CDF-at-a-time basis.

Data in multiple CDFs:

* Access across CDF boundaries requires a level above CDF, which might
  define a database of which CDFs are logically connected, such that
  plotting a variable from time_1 to time_2 may cover many CDFs.
* Custom application software is needed to handle a set of CDFs.

The ISTP community is using a "virtual" CDF concept, where a collection of
CDFs is logically connected and operations may be done on any set of CDFs,
but the generic CDF tools (CDFedit, CDFlist, etc.) don't know about
multiple CDFs, and custom software must be created to handle this concept.

Are there any thoughts as to how data should be stored in a CDF and as to
the future of generic CDF tools/applications? Should there be a generic
level above CDF that knows about CDF relationships, with more metadata, so
that there would be variable relationships within a CDF and across CDFs?
At present, any such CDF relationships are at an application layer.

Jason Mathews

From mathews Tue May 16 13:23:32 1995
Subject: multiple CDFs
Date: Tue, 16 May 1995 13:23:32 -0400 (EDT)

In reply to the following conversation:

> Edwards AFB
> 16 May 1995
>
> Hi Jason,
>
> Has anyone in your group explored the idea of using commercial database
> software (e.g. ORACLE) to store data, and rigging the interface to the
> database to read and write CDFs?
>
> The intent would be to use the database to query information that could
> span multiple time periods, and have the database return it as a single
> file CDF. In this way, there'd be no need to develop custom applications
> to handle multiple CDFs.
>
> We'll be facing the same type of problem at the AFFTC in some future
> work, and we've thought of using this approach. We don't have a
> prototype for it as yet, though.
>
> Tom Sweeney

I don't think we want ORACLE to read/write CDFs, but it might be possible
to create a set of functions (e.g., OracleCDFlib?)
with the ORACLE API that is a superset of the CDFlib functions and deals
with CDF sets as opposed to individual CDFs. However, using ORACLE to store
metadata on CDFs is an approach used in the data access operations of the
CDAW CDFPLOT application, which uses an IDL database to relate CDFs.

Jason Mathews

Date: Tue, 16 May 95 13:32:26 EDT
From: "Lloyd A. Treinish"
Subject: multiple CDFs

FYI, one of the reasons we first created CDF a decade ago was because
RDBMSs, and specifically Oracle, had an inappropriate data model (i.e.,
relational) and an inefficient access/query mechanism (SQL) for
multi-dimensional arrays. In this context, Oracle was used to manage
warehousing, temporal, and characteristic metadata for off-line data in
random tape formats and some on-line data in CDF. An RDBMS was viewed as
being appropriate for this purpose. Based upon a user query, a data subset
of interest could be identified, which resulted in a CDF containing those
data. This appears to be close to what you suggest. While quaint in its
implementation by today's standards given its vintage, it did offer
functionality that has yet to be duplicated. Sadly, the system was turned
off over a year ago with no direct and equivalent replacement.

I would certainly recommend the approach of loosely coupling an RDBMS and
CDF, as we did long ago, where Oracle could manage pointers to relevant
CDFs as various types of metadata. If the metadata are simple or small,
this would be overkill compared to a "flat file" approach (i.e., one could
even use CDF to store the metadata) or using a "simple" RDBMS. As Jason
suggests, an overseeing application would still be required.

In the interim, Oracle seems to have realized that CDF-like functionality
has some merit, as embodied in their recently announced Oracle7
Multidimension product. I've only read about it; I have not tried it, nor
do I know how well it performs on real problems. From what I have read so
far, there seems to be quite a bit of old and familiar wheel reinvention.

Date: Tue, 16 May 95 13:46:09 EDT
From: "Lloyd A. Treinish"
Subject: Single CDFs vs handling multiple CDFs

Even the ancient VMS-based CDAW tools could take multiple CDFs as input
and create various visualizations. Clearly, if the data in the multiple
CDFs had no common basis, then such operations would be meaningless. I
don't see anything inherent in CDF to preclude multiple access. With the
current implementation it's at the application level. For example, this is
something I do routinely with Data Explorer when I have distinct data sets
in a common coordinate system, but they are in different CDFs because they
are on different grids.

With the idea of a very simple self-describing array-based data model, the
idea of creating higher-level constructs and, hence, a higher-level
interface with more complex semantics is doable. That was always part of
the idea behind CDF, although rarely implemented. MR-CDF would be one
direct example, for multi-resolution views of an array. The Data Explorer
(DX) data model, for example, supports aggregation and complex mesh
structures, among other things that are not directly supported by CDF.
However, the data must be stored as arrays somewhere. Hence, the DX data
model has the facilities to define a complex construct as a single entity,
but it can be decomposed into arrays with specific semantics (e.g., data,
nodes of a mesh, neighbors, etc.).
I cite DX as an example in this context because one could do exactly the
same thing with CDF as a carrier of multi-dimensional arrays, which has
been proposed, to support more complex structures as Jason implies.
Obviously, this does require additional conventions for attributes, etc.,
that today would be at the application level. Of course, one could create
an interface that raises the level of application building.

Date: Thu, 1 Jun 1995 11:24:34 GMT
From: Kim Bisgaard
Subject: Suggestion regarding Epoch

I would like to put forward a suggestion to enhance the usability of the
CDF Epoch data type. There seem to be at least two arguments against using
the existing Epoch data type: (1) it uses too much space, since it is an
8-byte float, and (2) it does not have sufficient precision.

Simple solution: Change the CDF library to work as follows. Upon finding a
gAttribute called EPOCH_START of type Epoch, its value is then used instead
of 01-Jan-0000 00:00:00.000 as the starting value. Further, add a new
function 'EPOCHbreakdown2' to the CDF library with the following interface:

void EPOCHbreakdown2(
     double epoch,   /* In -- The CDF_EPOCH value */
     long *year,     /* Out -- Year (AD, e.g., 1994) */
     ...
     float *msec);   /* Out -- Milliseconds with fractional part */

The benefits of this solution are that it gives enhanced precision, with
minor changes to the CDFlib code, and backwards compatibility. The
drawbacks are that it does not solve problem (1), that the EPOCH functions
suddenly become dependent on an open CDF file because of the gAttribute
EPOCH_START, and that it is possible to move all timestamps in a CDF
archive by changing a gAttribute.

Complex solution: Besides the simple solution, add a new data type
CDF_EPOCH4 as a 4-byte float. This would have sufficient precision to
cover the time span of most CDF archives, thus giving a solution to
problem (1) as well. One could then also work on the drawbacks of the
simple solution, like defining new EPOCH utility routines taking an extra
argument of a CDFid, thus making it possible to deal with multiple CDFs
with different EPOCH_START values. It is possible to forbid changing
EPOCH_START as soon as data have been entered into the CDF. It is possible
to allow for vAttributes, thus having different starting times in one CDF
archive. An even more condensed representation of an EPOCH time could be a
CDF_EPOCH2 as a 2-byte integer.

Regards,
Kim Bisgaard
Danish Meteorological Institute, Denmark

Date: Wed, 7 Jun 1995 12:38:29 +0400
From: Anatoly Petrukovich
Subject: pointer-class version of AVG_TYPE attribute

A. Petrukovich
Interball key parameter group
Space Research Institute, Moscow, Russia

I wonder why a pointer-class version of the AVG_TYPE attribute is absent.
Components of polar vectors usually have different averaging types, e.g.,
standard and angle_degrees. Or is that done automatically for polar
vectors?

From: Jason Mathews
Subject: NSSDC CDF News Update
Date: Wed, 7 Jun 1995 15:51:34 -0400 (EDT)

NSSDC Common Data Format (CDF) News Update
June 1995

General CDF News
----------------

On May 2, 1995 the Common Data Format (CDF) staff was awarded a "NASA
Group Achievement Award" in recognition of providing outstanding software
support for the management, archiving, analysis, and distribution of NASA
data.
World Wide Web CDF Access
-------------------------

The CDF data access tools used in the WWW-based OMNIWeb Data System have
been reused and extended to handle multiple CDFs for the NSSDC Coordinated
Heliospheric Observations (COHO) data in the COHOWeb Data System. OMNIWeb
and COHOWeb provide an interactive interface for data retrieval and
visualization that brings the scientific data to the researcher over the
Internet. IDL is used in both systems as a graphics engine to generate
plots of the selected CDFs. The OMNIWeb and COHOWeb WWW-based data systems
are available via the NSSDC Space Physics home page URL:

http://nssdc.gsfc.nasa.gov/space/space_physics_home.html

CDF Home Page Update
--------------------

The CDF Home Page has been updated with a "What's New" page and an
introduction to the CDF User's Mailing List. The CDF User's Guide has been
converted to HTML with the latex2html conversion tool, along with a
searchable WAIS index available from the World Wide Web as a
context-sensitive document. The CDF home page is available via the URL:

http://nssdc.gsfc.nasa.gov/cdf/cdf_home.html

New Contributions
-----------------

QCDF Browser is a client-server application for accessing and displaying
CDF files that was contributed by Tony Allen. It is available via
anonymous FTP from ncgl.gsfc.nasa.gov (128.183.106.87) in the directory
/pub/cdf/apps/graphics/unix.

Software Related
----------------

- The CDF distribution v2.5.12 was released on May 30, 1995.
- A new method of including the CDF constants used with the CDF/IDL
  interface was developed to ease the problem of too many local variables
  in an IDL function/procedure. The constants are contained in structures
  rather than as individual variables.
- Support for the IRIX 5.x operating system was added to the distribution.
- Match control parameters were added for the shareable CDF library on VMS
  systems. This will allow applications linked to the shareable CDF
  library to use new revisions of the library without having to be
  relinked (except in those cases where relinking is mandatory).
- Work began on CDFexport. CDFexport will allow CDFs to be exported in a
  variety of formats and will eventually replace CDFlist.
- Changes were necessary to CDF's IDL interface because of differences
  between IDL 3.x and IDL 4.0.
- The use of extended (virtual) memory on the IBM PC under Microsoft C
  7.00 was implemented. Initial testing was successful, with the exception
  of the Curses-based toolkit programs and the Fortran test programs.
  Those problems will be investigated.
- The CDF V2.5 distribution was modified for the POSIX-compliant C
  compiler (c89) on the HP.

Date: Wed, 07 Jun 1995 16:11:52 EDT
From: "Larry Bleau"
Subject: Suggestion: Another encode_EPOCH routine

I was looking for an easy way (from Fortran) to convert an Epoch time into
a year/day-of-year/time format. All the encode_EPOCH routines convert it
into year/month/day. I'd like to suggest another routine, call it
encode_EPOCH4, which would generate the character string

  yyyy ddd hh:mm:ss.ttt

where ddd is the day of the year yyyy. How much interest is there in a
routine like this?

Larry Bleau
University of Maryland

From: Ingo Heuten
Subject: question to z-variables
Date: Wed, 7 Jun 1995 15:17:15 -0700 (PDT)

Hi CDF-users,

I'm an exchange student from Germany at Southern Oregon State College in
Ashland, Oregon, where I have been working on the creation of a database
using IDL's interface to the Common Data Format. My problem is that I
can't store a value at a specific position of a z-variable.
I tried the following, but it doesn't work:

EXAMPLE:

PRO createCDffile
  id = CDF_CREATE('testfile', [5,3,2], /COL_MAJOR)
  var1 = CDF_VARCREATE(id, 'VAR1', ['VARY','VARY','VARY','VARY'], $
                       dim=[5,3,2,3], /CDF_FLOAT)
  CDF_CLOSE, id
END

PRO store_data
  id = CDF_OPEN('testfile')
  num1 = 7.5
  num2 = 8.3
  num3 = 21.6
  CDF_VARPUT, id, 'VAR1', num1, OFFSET=[2,1,0,0], COUNT=[1,1,1,1]
  CDF_VARPUT, id, 'VAR1', num2, OFFSET=[2,1,0,1], COUNT=[1,1,1,1]
  CDF_VARPUT, id, 'VAR1', num3, OFFSET=[2,1,0,2], COUNT=[1,1,1,1]
  ;;; the value is stored every time at position [2,1,0,0]
  CDF_CLOSE, id
END

I would be glad to get an answer with an idea of what I did wrong and how
I can store values in z-variables.

Thanks in advance,
Ingo Heuten

Date: Thu, 8 Jun 1995 10:39:31 -0400 (EDT)
From: CDFSUPPORT@nssdca.gsfc.nasa.gov
Subject: RE: question to z-variables

This appears to be a bug in IDL's CDF interface. You should contact them
directly at `support@rsi.com'. I did some investigating, however, and
think I see what's happening. It looks like the 4th dimension offset
(index) is being ignored (as you noted) because the number of rVariable
dimensions is 3 (as specified when the CDF was created). That shouldn't
have any effect on accessing zVariables, but it appears that it does (the
bug). By increasing the number of rVariable dimensions to 4, I was able to
make your procedure work correctly. Feel free to pass this along to RSI.

CDFsupport (Jeff)

From: Jason Mathews
Subject: Re: Suggestion: Another encode_EPOCH routine
Date: Sun, 25 Jun 1995 14:44:36 -0400 (EDT)

An epoch function, encodeEPOCHx, with a customizable format has been added
to the CDF library in Version 2.5.13 (released 14-June-95). The following
fields may be specified in the format string:

  Day of month (1-31)
  Day of year (001-366)
  Month ('Jan','Feb',...,'Dec')
  Month (1,2,...,12)
  Year (4-digit)
  Year (2-digit)
  Hour (00-23)
  Minute (00-59)
  Second (00-59)
  Fraction of second
  Fraction of day

For more information, please see the file EPOCHx.doc, available via the
location http://nssdc.gsfc.nasa.gov/cdf/misc/EPOCHx.doc and in the cdf25
distribution package.

Jason Mathews

Date: Tue, 20 Feb 1996 15:25:15 -0800
From: Dave Lauben
Subject: Matlab interface for CDF?

Hello, I've just joined cdf-users. Does anyone have a Matlab interface in
the works similar to the IDL (external) interface? I've seen the mex code
for netCDF, but of course that's not what we're interested in here. If no
one is working on this, we will probably undertake to do so... Anyone know
any reason why this wouldn't work?

Thanks,
-Dave.
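A closing note on the encodeEPOCHx announcement above: since the format
string supports a day-of-year field, the routine can produce exactly the
"yyyy ddd hh:mm:ss.ttt" layout Larry Bleau asked for, without a dedicated
encode_EPOCH4. Below is a minimal C sketch; the format-token spellings
(<year.4>, <doy.3>, and so on) are assumptions from memory, so EPOCHx.doc
remains the authority on the exact syntax:

/* Hedged sketch of encodeEPOCHx producing the requested
 * "yyyy ddd hh:mm:ss.ttt" layout.  The format tokens below are
 * assumptions; consult EPOCHx.doc for the exact spellings. */
#include <stdio.h>
#include "cdf.h"              /* declares computeEPOCH, encodeEPOCHx */

int main(void)
{
    char encoded[64];         /* ample room for the encoded string */
    /* 25-Jun-1995 14:44:36.000, built with the standard EPOCH routine. */
    double epoch = computeEPOCH(1995L, 6L, 25L, 14L, 44L, 36L, 0L);
    encodeEPOCHx(epoch,
                 "<year.4> <doy.3> <hour.2>:<min.2>:<sec.2>.<fos.3>",
                 encoded);
    printf("%s\n", encoded);  /* expected: 1995 176 14:44:36.000 */
    return 0;
}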