Student Abstracts: Computer Science at ORNL

AquaSentinel: Working for Homeland Security. DAVID FEAKER (Tennessee Technological University, Cookeville, TN 38505) DAVID E. HILL (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

The AquaSentinel project, designed by Dr. Elias Greenbaum and Charlene Sanders with assistance from Miguel Rodriguez and Dave Hill at the Oak Ridge National Laboratory, is a United Defense-funded program for homeland security. The present project involves designing a program, the AquaData Interpreter, for the United Defense AquaSentinel project. AquaSentinel is a newly patented bio-terrorism defense technology. The technique works by comparing the fluorescence given off by algae naturally present in a healthy drinking water supply with the fluorescence of drinking water that has been poisoned with a toxin such as cyanide. Through the use of this program, one can determine whether the water is safe to drink or poses a potential health risk. The AquaData Interpreter is a computer program written in C/C++ that takes the measured fluorescence data and compares a healthy sample with a suspect sample. By comparing the variances at each data point, indexed by time in seconds, a chart can be created that shows the significant differences between healthy algae and poisoned algae. Using the AquaData Interpreter, it was found that even a small amount of poison has a pronounced effect on the health of the algae. Thus, by providing the technology to detect harmful substances before the water enters a filtration system, disaster can be averted.
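
The following is a minimal sketch of the kind of comparison the AquaData Interpreter performs (the actual program is written in C/C++; this is not the project's code). Each sample is taken to be a set of fluorescence readings grouped by time point, and the variance of the suspect sample is compared with that of the healthy sample at each time. The data values, threshold, and function names are hypothetical.

    # Illustrative sketch of the healthy-vs-suspect variance comparison; not the
    # actual AquaData Interpreter, which is implemented in C/C++.
    from statistics import variance

    def variance_by_time(sample):
        """sample: dict mapping time (s) -> list of fluorescence readings."""
        return {t: variance(readings) for t, readings in sample.items()}

    def flag_differences(healthy, suspect, threshold=0.05):
        """Report time points where the suspect variance departs from the healthy one."""
        hv, sv = variance_by_time(healthy), variance_by_time(suspect)
        return {t: sv[t] - hv[t] for t in hv if abs(sv[t] - hv[t]) > threshold}

    # Hypothetical readings (arbitrary fluorescence units), two per time point.
    healthy = {0: [0.82, 0.80], 10: [0.81, 0.83]}
    suspect = {0: [0.80, 0.81], 10: [0.70, 0.30]}   # readings scatter after exposure
    print(flag_differences(healthy, suspect))        # -> roughly {10: 0.0798}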

Construction of a Three-Dimensional Bifurcation Model for Lung Airways. ERICA SHERRITZE (Tennessee Technological University, Cookeville, TN 38505) RICHARD C. WARD (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

A graphical, three-dimensional model of the lung airway is the first step toward the goal of studying the effects of environmental toxicants or therapeutic aerosols on the human body. In order to develop this geometric model, a numerical investigation of the mathematical equations used to construct a previous model* was conducted. Using these equations and the given symmetric parameters, a copy of the model was generated. This first step involved using symbolic computational software, Maple 8, to simplify the equations. Once simplified, the equations were incorporated into a Microsoft Visual Studio C++ program to generate sets of points, which were imported into the Rhinoceros 3D NURBS modeling program. Non-Uniform Rational B-Splines, or NURBS, are vector-valued piecewise rational polynomial functions used as a geometric tool for computer-aided design. After each equation was transferred into Rhino 3D, the process of constructing a three-dimensional model from the two-dimensional curves began. Curves located in the x-y plane were lofted with curves located in the x-z plane to create the circular parent segment of the bifurcation. Similar procedures were performed to create the remaining daughter branches and flow-divider portions of the bifurcation. The result was a symmetric three-dimensional model of a lung bifurcation. This example model will allow more realistic future designs, with asymmetric and multi-planed branching, to emerge through alteration of the initial equations and parameters. Multiple branches will then be scaled and added to the model to replicate the airways in the lungs. These designs will also be used to fit real CT data taken from the airways of pigs and humans. This work is a small portion of a much larger research project to develop a virtual model of the human body. * Lieber, B. B. and Zhao, Y., 1994, "Steady Inspiratory Flow in a Model Bifurcation," Transactions of the ASME, Vol. 116, pp. 490-491.
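
As a rough illustration of the point-generation step, the sketch below writes one circular cross-section of the parent branch to a plain-text point file that a NURBS modeler such as Rhino 3D could import. The actual program is a Visual Studio C++ code implementing the Lieber and Zhao equations, which are not reproduced here; the radius, point count, and file name below are assumptions.

    # Illustrative only: generates a point set for one circular cross-section of
    # the parent branch and writes it as "x y z" lines for import into a CAD/NURBS
    # modeler. The real bifurcation geometry uses the Lieber and Zhao (1994)
    # equations; the radius and spacing here are hypothetical.
    import math

    def circle_points(radius, z, n=64):
        """Return n (x, y, z) points on a circle of the given radius at height z."""
        return [(radius * math.cos(2 * math.pi * k / n),
                 radius * math.sin(2 * math.pi * k / n),
                 z) for k in range(n)]

    with open("parent_section.txt", "w") as f:
        for x, y, z in circle_points(radius=1.0, z=0.0):
            f.write(f"{x:.6f} {y:.6f} {z:.6f}\n")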

Displaying Oceanographic Data with The Live Access Server. TIMOTHY RACZ (Appalachian State University, Boone, NC 28608) ALEX KOZYR (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

Data visualization is helpful for finding interesting features in data and for formulating hypotheses. To make data visualization possible over the Internet, the Thermal Modeling and Analysis Project (TMAP) at the Pacific Marine Environmental Laboratory (PMEL) developed the Live Access Server (LAS). This configurable web server, along with a visualization application known as Ferret, provides access to geo-referenced scientific data and can produce various types of color-coded maps, graphs, and plots of the data. These features make LAS suitable for use with oceanographic data gathered by the Carbon Dioxide Information Analysis Center (CDIAC). While the LAS software is capable of handling gridded oceanographic data derived from the original bottle measurements, it has no means of handling the bottle measurements themselves. Effort has therefore been spent on extending LAS to reference this raw bottle data, which requires writing a program that converts the bottle data into a form LAS can handle. Another task is configuring LAS to depict sampling stations on the maps produced from the gridded data; this can be accomplished by writing custom scripts that are used by the Ferret data visualization software. While bottle data access has been successful, the display of sampling stations has not yet been accomplished. The LAS software is quite flexible and allows data providers to customize it in many ways. LAS allows web users to obtain the data they need in a variety of formats, making the data easier to interpret. Displaying oceanographic data with LAS provides a powerful tool for anyone who needs access to this kind of data.
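
A minimal sketch of what such a conversion step might look like is given below, assuming the bottle measurements arrive as whitespace-separated ASCII columns and are rewritten as a netCDF file (the format LAS and Ferret serve) using the netCDF4 Python package. The column layout, variable names, units, and file names are assumptions, not the actual CDIAC formats or the program written for this project.

    # Sketch of converting ASCII bottle data to netCDF for use with LAS/Ferret.
    # The input layout (station, lat, lon, depth, TCO2) and all names/units are
    # hypothetical placeholders.
    from netCDF4 import Dataset

    rows = []
    with open("bottle_data.txt") as f:           # hypothetical input file
        for line in f:
            _station, lat, lon, depth, tco2 = line.split()
            rows.append((float(lat), float(lon), float(depth), float(tco2)))

    nc = Dataset("bottle_data.nc", "w")
    nc.createDimension("obs", len(rows))
    for name, units, column in (("latitude", "degrees_north", 0),
                                ("longitude", "degrees_east", 1),
                                ("depth", "meters", 2),
                                ("tco2", "umol/kg", 3)):
        var = nc.createVariable(name, "f4", ("obs",))
        var.units = units
        var[:] = [r[column] for r in rows]
    nc.close()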

Information Analysis Techniques Using Upper Ontology Languages. TRAVIS BREAUX (University of Oregon, Eugene, OR 97403) THOMAS E. POTOK (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

With increasingly ubiquitous networks has come an unprecedented flow of digital information. The tools to analyze (i.e., manage, evaluate, classify) and visualize (i.e., search, retrieve, present) information from multiple, heterogeneous sources have largely relied on improvements in statistical methods. The results from statistical methods, however, largely overlook the semantic features present within natural language and text-based information. Emerging research and development in ontology languages (e.g., RDF, RDFS, SUO-KIF, and OWL) offers promising avenues for overcoming the limitations inherent in statistical methods by leveraging existing and future libraries of metadata and semantic markup. Using semantic features (e.g., hypernyms, meronyms, and synonyms) commonly represented in ontology languages, subsumption inference can be used to reason about document content at conceptually higher levels than statistical methods. Subsumption inference fundamentally provides the capability to traverse class hierarchies composed of predicates, or in this case, words from natural language. This paper begins with the background in contemporary statistical methods required to introduce an alternative classification and search algorithm that uses semantic features commonly found in ontology languages. Following is an overview of eight popular ontologies, both dictionary- and inference-based, with attention to features desirable for use in the specified algorithm. In addition, the results from the search algorithm and a companion software tool that uses Princeton University's WordNet as the ontology for searching text documents are presented. As the largest and most comprehensive of the dictionary-based ontologies reviewed, WordNet is attractive for evaluation purposes and prototyping. Finally, based on problems discovered in both dictionary- and inference-based ontologies, a set of guidelines is presented for evaluating and/or developing ontologies for use in the stated algorithm.
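
As an illustration of the subsumption idea (not the companion tool described above), the sketch below uses WordNet through the NLTK interface to test whether some sense of one word lies on the hypernym, or is-a, path of some sense of another; the word pair and the NLTK dependency are assumptions made for the example.

    # Illustrative subsumption check using WordNet via NLTK (requires the
    # "wordnet" corpus, e.g. nltk.download("wordnet")); not the project's tool.
    from nltk.corpus import wordnet as wn

    def subsumes(general, specific):
        """True if a sense of `general` appears on a hypernym path of a sense of `specific`."""
        general_senses = set(wn.synsets(general, pos=wn.NOUN))
        for sense in wn.synsets(specific, pos=wn.NOUN):
            for path in sense.hypernym_paths():      # root ... -> sense
                if general_senses & set(path):
                    return True
        return False

    print(subsumes("vehicle", "truck"))   # True: truck is-a motor vehicle is-a vehicle
    print(subsumes("truck", "vehicle"))   # False: subsumption is directional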

National Energy Assurance Analysis Center. KIMBERLY WINDOM (Robeson Community College, Lumberton, NC 28358) ANDREW S. LOEBL, PH.D. (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

Oak Ridge National Laboratory (ORNL) has proposed to the Department of Energy (DOE) and its Office of Energy Assurance (OEA) to establish an advanced, computationally based, non-classified collaborative environment called the National Energy Assurance Analysis Center (NEAAC). The goal of NEAAC is to generate, integrate, and provide enabling data, systems, and technologies that will assure the DOE and its collaborating organizations and agencies (federal, state, or local) of the continuity and viability of our nation's critical energy infrastructures, which, along with the other critical national infrastructures, provide the essential services that support the operation of our society. NEAAC's objective is to ensure a secure and reliable flow of energy to America's homes, industries, public service facilities, and transportation systems; accomplishing this requires close collaboration with the private sector. NEAAC is needed, in collaboration with the OEA, to identify critical infrastructure components and interdependencies, identify natural and malevolent threats to the infrastructure, recommend actions to correct or mitigate infrastructure vulnerabilities, and plan for and provide technical support during emergency response and other system disruptions. The collaboration offered in the proposal currently consists of two national laboratories and ten universities with specialties spanning the entire energy supply and distribution sector in the U.S., including Oak Ridge Associated Universities (ORAU); each institution is affiliated with regional and local energy suppliers and distributors. This collaboration is the basis upon which analytical methods, new knowledge, and computer-based tools will be developed under NEAAC in a hosted, collaborative research and training virtual environment. The student's summer assignment has focused upon the computational science and computer/network specifications by which open and effective research and development programs can be accomplished. Research and interviews are in progress to document the network bandwidth, software technologies, mass storage, and visualization capabilities available through ORNL's advanced computational resources, and thereby to better understand how to create the advanced collaborative environment envisioned through NEAAC for the benefit of DOE and its missions.

The Genetic Algorithm as a Means of Modeling Electrical Power Generation Levels. JENNIFER CAMP (Purdue University, West Lafayette, IN 47907) VICKIE E. LYNCH (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

Computer simulations to predict the cost and availability of electricity could be very helpful as the power industry is further deregulated. To most accurately reflect the electrical generation of utility companies, the simulation must employ a form of artificial intelligence in order to learn the most effective generation strategies. The genetic algorithm is used to simulate the learning techniques of the utility companies, which are represented as intelligent agent objects. Based on the total expected demand and a set of predetermined rules, each agent decides upon a bid of how much electricity to generate. The chromosomes of the genetic algorithm represent the rules the agent is currently using to determine its bid. After the bids are made, a decision is made on which bids to accept based on a realistic model of the electric grid. After the power is dispatched, the agents are informed of the amount of their electricity that was actually used and the price that was paid for it. Using these results, the chromosomes are then manipulated using the standard reproduction, crossover, and mutation procedures of the genetic algorithm. One would expect the agents' generation levels to organize to a point where each agent earns the maximum profit. The final version of the simulation showed evidence that the agents were learning and that the generation levels were organizing near the values expected from the economic models. Current research is being performed to merge this simulation with an existing simulation that models power transmission in a given network. In this code, power is dispatched using power flow equations with linear programming, and the agents represent the utility companies associated with the generator nodes of the network. The agents must adjust their levels of generation to changes in demand and to the amount of power that can be transmitted on the lines of the grid. Initial results of the coupled codes show good agreement with the economic models.
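
The following sketch illustrates the bid, dispatch, and evolve cycle in a deliberately simplified form (it is not the ORNL simulation): each chromosome is reduced to a single number, the fraction of expected demand an agent offers, rather than a set of bidding rules; dispatch is a simple fill of demand rather than a grid model; and the demand, price, and cost figures are hypothetical.

    # Simplified bid / dispatch / evolve loop (illustrative only; the real
    # simulation encodes bidding rules in the chromosomes and dispatches power
    # over a realistic grid model).
    import random

    DEMAND, PRICE, COST = 100.0, 30.0, 22.0      # MW of demand, $/MWh price and cost

    def dispatch(bids):
        """Accept bids in a random order until demand is met (stand-in for the grid)."""
        used, remaining = {}, DEMAND
        for i in random.sample(range(len(bids)), len(bids)):
            used[i] = min(bids[i], remaining)
            remaining -= used[i]
        return used

    def profit(offered, used):
        return used * PRICE - offered * COST     # unsold generation still costs money

    def evolve(pop, fitness, mutation=0.05):
        """Tournament selection, arithmetic crossover, and small random mutation."""
        def parent():
            a, b = random.sample(range(len(pop)), 2)
            return pop[a] if fitness[a] >= fitness[b] else pop[b]
        def child():
            gene = (parent() + parent()) / 2 + random.uniform(-mutation, mutation)
            return min(1.0, max(0.0, gene))
        return [child() for _ in pop]

    population = [random.random() for _ in range(10)]       # 10 agents, 1 gene each
    for generation in range(50):
        bids = [gene * DEMAND for gene in population]       # gene = fraction of demand
        used = dispatch(bids)
        fitness = [profit(bids[i], used[i]) for i in range(len(bids))]
        population = evolve(population, fitness)
    print([round(gene, 2) for gene in population])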

Three Dimensional Data Visualization Using the Visualization Toolkit. ERIC MUELLER (University of Tennessee, Knoxville, TN 37996) ROBERT HARRISON (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

VTK, the Visualization Toolkit, is a powerful and comprehensive open-source tool for the visualization of data. Unfortunately, VTK is a programming interface rather than an end-user program, and using it directly to visualize data, though not overly painful, requires more time than can be afforded. To address this problem, data-viewing applications are designed, usually for particular uses. One such use is the visualization of the higher-dimensional data sets that may be generated by approximating solutions to the electric potential equations in a molecule. Although the data is of course significant, a grasp of the processes used to generate it is not necessary for understanding such a visualization utility. Current programs designed for this type of visualization, such as ParaView and MayaVi, are complex and counter-intuitive, and they are very restrictive in the types of data that may be displayed and manipulated. New applications must be developed for the particular requirements of these data sets.
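
For a sense of the effort involved in using VTK directly, the following minimal pipeline, written against VTK's Python bindings (the toolkit is also usable from C++ and Tcl), shows the source, mapper, actor, renderer, window, and interactor plumbing that a purpose-built viewer would hide from the end user; the sphere source stands in for real scientific data and is an assumption of the example.

    # Minimal VTK pipeline: even displaying one simple object requires wiring a
    # source, mapper, actor, renderer, window, and interactor by hand.
    import vtk

    source = vtk.vtkSphereSource()               # stand-in for real scientific data
    source.SetThetaResolution(32)
    source.SetPhiResolution(32)

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputConnection(source.GetOutputPort())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)

    renderer = vtk.vtkRenderer()
    renderer.AddActor(actor)

    window = vtk.vtkRenderWindow()
    window.AddRenderer(renderer)

    interactor = vtk.vtkRenderWindowInteractor()
    interactor.SetRenderWindow(window)
    interactor.Initialize()
    window.Render()
    interactor.Start()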

Using Concepts of E-Commerce in Scientific Data Applications-Atmospheric Data. CORTEZ HARVEY (DePaul University, Chicago, IL 60604) DR. RICHARD WARD (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

The Atmospheric Radiation Measurement (ARM) archive contains ten years' worth of atmospheric data, including microwave, radar, lidar, and energy flux measurements, in netCDF format stored on the High Performance Storage System (HPSS) at ORNL. Links to the ARM data are maintained as metadata in a Sybase database. A user interface is being developed to allow users quick access to the ARM data via the metadata links. This interface essentially mimics e-commerce in that it gives the user opportunities to view metadata and order data in a manner similar to ordering products on the Internet. The user first selects a site, facility, instrument, month, and year for the desired data. Thumbnails (miniature plots) corresponding to the data matching the selection criteria are displayed for each day of the selected month. The user selects a day by clicking its thumbnail, and a set of plots for the primary measurements of the selected instrument is then displayed. This set of plots is referred to as "quicklook" plots because they give a quick view of the raw data in the ARM archive. The project uses JavaServer Pages (JSP) and Java servlets to implement the process of viewing information and ordering the data. Developed on a desktop PC, the application has been ported to the development server in the Environmental Sciences Division, where it will serve as the prototype for the ARM data "quicklook" browser.
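
As a rough illustration of the selection-to-thumbnail step, the sketch below maps a user's site, facility, instrument, year, and month selection to one thumbnail path per day of the chosen month. The actual browser is implemented with JSP and servlets against the Sybase metadata; the directory layout, file naming, and example identifiers used here are assumptions.

    # Illustrative mapping from a user's selection to per-day thumbnail paths.
    # The root directory, naming scheme, and example codes are hypothetical.
    import calendar
    import os.path

    THUMBNAIL_ROOT = "/data/arm/thumbnails"      # hypothetical location

    def thumbnail_paths(site, facility, instrument, year, month):
        """Return one thumbnail path per day of the selected month."""
        days = calendar.monthrange(year, month)[1]
        return [os.path.join(THUMBNAIL_ROOT, site, facility, instrument,
                             f"{year:04d}{month:02d}{day:02d}.png")
                for day in range(1, days + 1)]

    for path in thumbnail_paths("sgp", "C1", "mwr", 1999, 7)[:3]:
        print(path)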

Value Chain Analysis Application. JOHN KNOX (Roane State Community College, Oak Ridge, TN 37830) RICHARD WARD (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

The Value Chain Analysis (VCA) method has been applied to aid steel manufacturers in finding the most productive way of making their products while minimizing their costs. A computer application using the VCA method was developed in Microsoft Access to allow the user to input initial data, ranging from raw materials to variable costs, and to enable tracking of specific information as material is passed from one process to another. The Microsoft Access VCA application has been completely rewritten to improve the user interface and clean up the database design. With the new version, the user can view sensitivities (e.g., the rate of change in product produced per rate of change in energy consumed) and the effects of incorporating new technologies. At the moment, optimization for the technology analysis is done in the MATLAB environment, but these calculations will soon be integrated into the new VCA application. Plots and graphs, previously generated in Microsoft Excel, are now produced in Microsoft Access, which moves us toward the goal of having the entire program contained in Microsoft Access. We are designing the new VCA application to be flexible enough for use in other optimization problems; an example is the use of wireless technology in monitoring industrial processes. Science & Technology Highlights, No. 1, 2000, ORNL, Office of Energy Efficiency and Renewable Energy Program, p. 4. http://www.ornl.gov/ORNL/Energy_Eff/Sci_Tech_hilights/No_1_2000.pdf
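
The following is a minimal sketch of the sensitivity calculation mentioned above (not the Access or MATLAB implementation): the rate of change in product output per unit change in energy consumed, approximated by a central finite difference. The production function and operating point are made-up placeholders.

    # Illustrative finite-difference sensitivity estimate; the production model
    # and numbers are hypothetical placeholders, not the VCA application's data.
    def product_output(energy):
        """Hypothetical stand-in for the value-chain model: tons produced vs. MWh used."""
        return 50.0 * energy ** 0.8

    def sensitivity(model, energy, step=1.0):
        """d(product)/d(energy) by central difference around the operating point."""
        return (model(energy + step) - model(energy - step)) / (2.0 * step)

    print(sensitivity(product_output, energy=200.0))   # tons per MWh at 200 MWh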

Visualization of Scientific Data over Wide Area Networks. KAYA SHAH (MIT, Cambridge, MA 02139) S. V. NAGESWARA RAO (Oak Ridge National Laboratory, Oak Ridge, TN 37831).

In today's world of advanced computing, data modeling is taking on a new role of importance. Scientists generate large, multidimensional data sets through simulations, experiments, and computations, and these data sets are rapidly growing in size. When terabyte data sets need to be analyzed, terascale computing is necessary. Most scientists do not have local access to terascale computing power and thus access supercomputers over networks. This paper examines the process of visualizing three-dimensional scientific data over a network. To explore the visualization process itself, interpreters, visualization toolkits, and software programs were studied. Interpreted languages such as Tcl/Tk allowed programming with the Visualization Toolkit (VTK). Additionally, experiments were conducted with ParaView and MayaVi, visualization programs that run on top of VTK. To better understand the flexibility of ParaView, filters and modules were added to the program by modifying its source code. To explore visualization over networks, communication between two computers on a LAN was achieved through sockets. The next steps involved writing a program to compare data set size and transmission time and to decide whether to visualize a data set locally or remotely. Future extensions of this project include experimentation over a globally distributed test-bed for network services, such as PlanetLab.
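
A minimal sketch of that local-versus-remote decision is shown below, assuming a crude bandwidth probe over a TCP socket and a fixed transfer-time budget; the host name, port, probe size, and threshold are assumptions, not the project's values.

    # Illustrative local-vs-remote decision: fetch and render locally only if the
    # data set can be moved across the network quickly enough; otherwise render
    # remotely, where the data lives. All parameters are hypothetical.
    import socket
    import time

    def estimate_bandwidth(host, port, probe=b"x" * 1_000_000):
        """Send a 1 MB probe over a TCP socket and return the observed bytes/second."""
        start = time.time()
        with socket.create_connection((host, port)) as s:
            s.sendall(probe)
        return len(probe) / (time.time() - start)

    def visualize_locally(dataset_bytes, host="data-server.example.org", port=9999,
                          max_transfer_seconds=60.0):
        """True if transferring the data set fits within the time budget."""
        transfer_time = dataset_bytes / estimate_bandwidth(host, port)
        return transfer_time <= max_transfer_seconds

    # Example (requires a host listening on the given port):
    # print(visualize_locally(dataset_bytes=2_000_000_000))   # 2 GB data set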