Student Abstracts: Computer Science at PNNL

A Defense-in-Depth Model of Security in Collaborative Systems with the Goal of Robust Prevention, Detection and Response. AARON BROWN (Gonzaga University Spokane, WA 99253) DAVE MILLARD (Pacific Northwest National Laboratory, Richland, WA, 99352)

Despite increasing concern and interest regarding software security, most software developers continue to treat security as an afterthought. One aspect of security that is often overlooked is the need for a dynamic, adaptive view of secure computing that stresses security measures appropriate to the type of system being designed. This lack of adaptivity is often seen in the development of collaborative systems. Because of their extremely dynamic nature (in terms of user community, data, usage load, and usage method), collaborative systems have several special concerns that are not often addressed by software developers. This paper lays out a generalized defense-in-depth model for collaborative systems covering several methods of attack prevention, detection, and response. Examples are drawn from work on a system called WebOSB. External references and information are taken from diverse sources such as the Trusted Computer System Evaluation Criteria (DOD-5200.28-STD, often called "The Orange Book"), academic papers and conference presentations, and other published sources such as Secrets and Lies by Bruce Schneier. The paper stresses the importance not only of attack prevention but also of attack detection and response, and it recommends robust systems of auditing, monitoring, and logging together with a well-rounded security policy.
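
The auditing, monitoring, and response layers the paper recommends can be illustrated generically. The sketch below is a hypothetical illustration of the idea, not WebOSB's actual design; every name and threshold in it is invented.

```python
# Generic sketch of a detection-and-response layer: every sensitive action
# is written to an audit log, and repeated failures trigger a response.
# Hypothetical illustration only; not WebOSB's design.
import logging
from collections import Counter

logging.basicConfig(filename="audit.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")
failures = Counter()
LOCKOUT_THRESHOLD = 3  # invented threshold

def audited_login(user, password, check):
    ok = check(user, password)
    logging.info("login user=%s success=%s", user, ok)  # detection: audit trail
    if not ok:
        failures[user] += 1
        if failures[user] >= LOCKOUT_THRESHOLD:
            logging.warning("lockout user=%s", user)    # response: lock account
        return False
    failures[user] = 0
    return True

if __name__ == "__main__":
    deny_all = lambda user, pw: False  # hypothetical credential checker
    for _ in range(3):
        audited_login("alice", "wrong", deny_all)
```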

Asynchronous HTTP Requests Using AJAX and PHP. CASEY DAVIS (Big Bend Community College Moses Lake, WA 98837) DARREN CURTIS (Pacific Northwest National Laboratory, Richland, WA, 99352)

By its very nature, the Hypertext Transfer Protocol (HTTP) is stateless: once a request is made, the server returns the requested information and then ceases communication with the client. AJAX (Asynchronous JavaScript and XML) is a programming method that allows the client to simulate persistent communication with the server. By allowing the developer to use CGI (Common Gateway Interface) programming to define exactly what the client is looking for, asynchronous requests reduce packet size and reload time, while also making the application behave less like a web page and more like a desktop application. Using JavaScript on the client side, the developer may initiate an asynchronous HTTP request to the CGI preprocessor, in this case PHP (PHP: Hypertext Preprocessor). PHP may then make logical decisions about what data is returned to the JavaScript, which updates only the relevant section of the web browser window, making the browser react, seemingly in real time, without reloading the entire page of data. This new way of using existing technology is adding a whole new facet to the web industry: while none of the individual technologies behind AJAX is new, they are being used in many new ways. Google® is putting it to good use with programs such as Google Suggest®, Google Maps®, and most recently Google Earth®. Since AJAX was built to conform to XML (eXtensible Markup Language) standards and is fully DOM (Document Object Model) compliant, developers may use it for simple tasks such as form validation and list generation, as well as for more complex ones such as SVG (Scalable Vector Graphics), which allows images to be drawn on the fly.
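
The round trip described above can be sketched from the server side. The fragment below uses Python's standard library as a stand-in for the PHP endpoint; the word list and the "q" query parameter are hypothetical. The client-side JavaScript would issue an XMLHttpRequest against this endpoint and splice the returned JSON into the page without a reload.

```python
# Minimal sketch of the server half of an AJAX exchange, with Python's
# standard library standing in for the PHP endpoint described above.
# The suggestion list and "q" parameter are hypothetical examples.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

WORDS = ["ajax", "apache", "asynchronous", "browser", "javascript"]

class FragmentHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query).get("q", [""])[0]
        # Return only the small fragment the client asked for, not a full
        # page; the client-side JavaScript splices this JSON into the
        # relevant section of the DOM without reloading anything else.
        matches = [w for w in WORDS if w.startswith(query)]
        body = json.dumps(matches).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # A client would request e.g. http://localhost:8000/suggest?q=a
    HTTPServer(("localhost", 8000), FragmentHandler).serve_forever()
```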

Conceptual Similarity of Shapes for Feature Extraction/Comparison Software. LYONS JORGENSEN (Big Bend Community College Moses Lake, WA 98837) PATRICK PAULSON (Pacific Northwest National Laboratory, Richland, WA, 99352)

To determine the make and model of a car from images, the shapes of its windows need to be extracted and compared. The comparison must measure similarity in shape independent of scale, rotation, and translation. A new conceptual space is a possible solution for comparing qualitative data such as shape. We describe a conceptual space for polygons that allows their similarity to be computed using standard distance calculations. The similarity values appear to be consistent and intuitively reasonable; further analysis should compare the results with human perception.
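
As an illustration of the general idea, one simple polygon signature that standard distance calculations can compare, invariant to scale, rotation, and translation, is sketched below. This is a hypothetical stand-in for illustration, not the conceptual space the abstract describes.

```python
# Illustrative sketch: embed polygons in a feature space where Euclidean
# distance ignores scale, rotation, and translation. A stand-in for the
# abstract's conceptual space, not its actual method.
import math

def signature(vertices):
    """Normalized centroid-distance signature of a polygon's vertices."""
    cx = sum(x for x, _ in vertices) / len(vertices)  # translation: centroid
    cy = sum(y for _, y in vertices) / len(vertices)
    dists = [math.hypot(x - cx, y - cy) for x, y in vertices]
    mean = sum(dists) / len(dists)                    # scale: mean radius
    return [d / mean for d in dists]

def similarity_distance(a, b):
    """Min Euclidean distance over cyclic shifts (rotation/start point)."""
    sa, sb = signature(a), signature(b)
    assert len(sa) == len(sb), "compare polygons with equal vertex counts"
    best = float("inf")
    for shift in range(len(sb)):
        rotated = sb[shift:] + sb[:shift]
        d = math.sqrt(sum((p - q) ** 2 for p, q in zip(sa, rotated)))
        best = min(best, d)
    return best

square = [(0, 0), (2, 0), (2, 2), (0, 2)]
diamond = [(5, 1), (6, 2), (5, 3), (4, 2)]   # rotated, scaled, moved square
print(similarity_distance(square, diamond))  # 0.0: judged the same shape
```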

Hyperlink Topology: A Plausible Search Engine Technique? JESSE SCHUSS (Onondaga Community College Syracuse, NY 13215) SCOTT BUTNER (Pacific Northwest National Laboratory, Richland, WA, 99352)

Search engines are now a fact of life, used throughout the world to return relevant results to a user's query. As the web grows, however, existing search techniques become less effective, and new web analysis techniques are required to keep pace. Link topology is currently used by search engines to help rank their results, but it has not been used by itself to examine the web. Utilizing a crawler, SOLAME, we gathered data from the web and constructed a relatively small database in which hyperlinks were examined, collected, and categorized. We then constructed a matrix of source-destination hyperlinks and, using Euclidean distances between column vectors of the matrix, plotted the results with IN-SPIRE, which reduces a multidimensional space to a conceptual two- or three-dimensional visualization. Within the IN-SPIRE visualization, web sites were clustered together along the bounds of the planar domain. This hyperlink clustering groups sites with similar subject matter, indicating that the link topology of a web site reveals some information about its content. This result was expected, following the logic that sites link to sites with similar subject matter; whether a site argued for or against the other was not taken into account in this project. This exploratory step into topological analysis of web sites is one part of a growing effort across the web community to analyze the web. Though the initial results were based on a small database, they have yielded enough information to warrant further study. More topic-specific and argument-category result sets (i.e., sites for or against a topic) could likely be achieved through the use of a contextual engine to analyze the text surrounding each link.
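
The matrix-and-distance step described above can be sketched in a few lines. The Python below uses hypothetical site names and stands in for the SOLAME/IN-SPIRE pipeline; IN-SPIRE itself would project these pairwise distances into a 2-D or 3-D visualization.

```python
# Sketch of the source-destination hyperlink matrix and the Euclidean
# distances between its column vectors. Site names are hypothetical.
import math

links = {  # source page -> destination pages found by a crawler
    "epa.gov":   ["recycle.org", "cleanair.org"],
    "green.org": ["recycle.org", "cleanair.org"],
    "cars.com":  ["engines.net"],
}
sources = sorted(links)
destinations = sorted({d for ds in links.values() for d in ds})

# matrix[i][j] = 1 if source i links to destination j, else 0
matrix = [[1 if d in links[s] else 0 for d in destinations] for s in sources]

def column(j):
    return [matrix[i][j] for i in range(len(sources))]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Destinations linked from the same sources end up close together
# (distance 0.0 here for recycle.org and cleanair.org).
for j1 in range(len(destinations)):
    for j2 in range(j1 + 1, len(destinations)):
        print(destinations[j1], destinations[j2],
              round(euclidean(column(j1), column(j2)), 2))
```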

Lessons Learned Modifying a Web Application That Uses the Java Servlet Technology and MySQL Database. DANIELLE EVANS (Big Bend Community College Moses Lake, WA 98837) MITCH PELTON (Pacific Northwest National Laboratory, Richland, WA, 99352)

Java™ servlets provide software developers with a simple, consistent mechanism for extending the functionality of a web server. A Java™ servlet uses the Java™ programming language to generate dynamic web content, in this case by applying the ETL (Extract, Transform and Load) process to the MySQL database engine. MySQL is an RDBMS (Relational Database Management System) that uses SQL (Structured Query Language) for extracting, transforming, and loading data in a database. The Apache Tomcat web server provides an API (Application Programming Interface) library with which the Java™ servlets communicate to post state information. Web application interfaces are generated by a URL (Uniform Resource Locator) request that returns a document over HTTP (Hypertext Transfer Protocol). The FMS (FRAMES Module Server) application is one such application; it enables a client to extract data for a pre-defined schema as defined by a program called FRAMES (Framework for Risk Analysis Multimedia Environmental Systems). This schema is contained in a set of text files called dictionaries. Typically, a schema provides a framework for naming and storing the different elements of information about an entity. The schema provided by these dictionaries is mapped to the corresponding schema in the target database. The FMS application offers users a way to share information that will help make FRAMES a more productive software tool, and it allows disparate data to be more readily accessed and subjected to QA/QC (Quality Assurance/Quality Control). Testing, debugging, and making this application more user friendly revealed problems with the Java™ servlet technology and the JBuilder X development environment, as well as with finding, downloading, and using the required software. These experiences, the problems encountered, and their solutions, some simple and some more difficult, are discussed in this paper.
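
The ETL role the servlet plays can be sketched outside Java. The fragment below uses Python with sqlite3 standing in for the MySQL back end; the table and column names are hypothetical examples, not FMS's actual schema.

```python
# Minimal ETL sketch: extract rows, transform them, and load them into a
# destination table. Python with sqlite3 stands in for the servlet's
# MySQL back end; table and column names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_readings (site TEXT, value TEXT)")
conn.execute("CREATE TABLE readings (site TEXT, value REAL)")
conn.executemany("INSERT INTO raw_readings VALUES (?, ?)",
                 [("well-1", " 3.2 "), ("well-2", "4.7")])

# Extract from the source table, transform (trim whitespace, convert to
# numeric), and load into the destination table.
rows = conn.execute("SELECT site, value FROM raw_readings").fetchall()
clean = [(site, float(value.strip())) for site, value in rows]
conn.executemany("INSERT INTO readings VALUES (?, ?)", clean)

for row in conn.execute("SELECT * FROM readings"):
    print(row)
```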

MetaMind: A Meta-data Extraction Tool for Classifying Web Content Using Domain-Specific Context and Content Heuristics. STEVE SILVA (Washington State University Pullman, WA 99163) SCOTT BUTNER (Pacific Northwest National Laboratory, Richland, WA, 99352)

Developing software agents which can reliably classify web documents is an important prerequisite to widespread adoption of semantic web technologies. Effective agents must be able to process large numbers of web documents in a timely and cost effective manner, while providing classification accuracy comparable to that of a human reader. This research paper discusses the design and implementation of the MetaMind tool, along with preliminary results obtained from the web-based classification of environmental compliance assistance documents. MetaMind is a meta-data extraction tool designed to (1) extract domain-specific meta-data attributes from web documents; and (2) use these meta-data attributes (along with additional context clues) to aid in classifying these documents against a predefined, hierarchical subject taxonomy. MetaMind's classification and meta-data extraction rules are based on both document content (analysis of word frequency and word/phrase position) and context clues (including web domain and path structure, analysis of inbound and outbound hyperlinks, etc.). In order to facilitate integration with other applications, MetaMind is being implemented as a web service using a combination of Java Servlet technology and JESS, a Java-based Expert System Shell developed by Sandia National Laboratories. An extended form of the Rich Site Summary (RSS) standard is being used to present the resulting meta-data information to the user. MetaMind is being developed to support the US EPA's Compliance Assistance Clearinghouse, an online meta-data repository focusing on documents providing regulatory interpretation and assistance to the business community. While MetaMind is being developed for application to a specific and relatively narrow domain, we believe that many classification tasks are similarly constrained within narrow subject domains. In such applications, classification heuristics encoded as production rules can be used very effectively without suffering the problems of scaling and maintainability that often occur when such approaches are applied to broader subject areas.
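
The abstract's production rules are encoded in JESS; the fragment below illustrates the same idea, content and context heuristics scored against categories, in plain Python. The rules, categories, URLs, and scores are hypothetical examples, not MetaMind's actual rule base.

```python
# Illustrative sketch of combined content heuristics (word frequency) and
# context heuristics (domain, path) for document classification, written
# as plain Python rather than JESS production rules. All rules, categories,
# and scores here are hypothetical.
from urllib.parse import urlparse

RULES = [
    # (category, predicate over (words, url), score)
    ("air-compliance",   lambda w, u: w.count("emissions") >= 2,           2.0),
    ("air-compliance",   lambda w, u: "air" in urlparse(u).path.lower(),   1.0),
    ("water-compliance", lambda w, u: "discharge" in w,                    2.0),
    ("water-compliance", lambda w, u: urlparse(u).netloc.endswith(".gov"), 0.5),
]

def classify(text, url):
    words = text.lower().split()
    scores = {}
    for category, predicate, score in RULES:
        if predicate(words, url):
            scores[category] = scores.get(category, 0.0) + score
    return max(scores, key=scores.get) if scores else "unclassified"

doc = "Guidance on permit emissions limits and emissions monitoring."
print(classify(doc, "https://epa.gov/air/guidance.html"))  # air-compliance
```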

MetDataCNV, a Windows-Based Software Tool for Converting TD-6201 and SAMSON Meteorological Data for Use in the CALMET Meteorological Model. NOAH ZEMKE (Big Bend Community College Moses Lake, WA 98837) FREDERICK RUTZ (Pacific Northwest National Laboratory, Richland, WA, 99352)

The CALMET meteorological model requires surface and upper-air meteorological data as input for running a simulation. TD-6201 upper-air meteorological data and Solar and Meteorological Surface Observational Network (SAMSON) surface meteorological data can be used by CALMET, but not in their original formats. While the legacy FORTRAN executables READ62 and SMERGE can convert the TD-6201 and SAMSON files, they require that an input text file be hand-edited before execution. These edits are time consuming, and inaccurately edited input files can cause run-time errors in the CALMET processor. A software application named MetDataCNV has been developed to solve these problems: a user-friendly interface automatically creates the input files and then runs the READ62 and SMERGE executables at the click of a button. MetDataCNV integrates a graphical user interface, a process manager, and the two FORTRAN processors. The graphical user interface allows the user to select meteorological station data and start the process manager, which controls the flow of input files and the execution of the FORTRAN processors. The FORTRAN processors use the generated input files to convert the meteorological station data files to a format readable by the CALMET processor. MetDataCNV effectively converts the TD-6201 and SAMSON formats with limited involvement from the user, eliminating the need to make tedious changes to the input files by hand and to execute the two FORTRAN processors separately. MetDataCNV has the potential to be integrated with DUSTRAN. This paper describes the design and development of the MetDataCNV tool.
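
The process-manager pattern, generating the input file a legacy executable expects and then running it, can be sketched as follows. The control-file fields and command-line conventions shown are hypothetical, not READ62's actual input format.

```python
# Sketch of the process-manager pattern: write the control file the legacy
# FORTRAN executable expects, then invoke it. The control-file fields and
# command line below are hypothetical stand-ins for READ62's real format.
import subprocess
from pathlib import Path

def run_read62(station_id: str, workdir: Path) -> None:
    workdir.mkdir(exist_ok=True)
    # Step 1: create the input file the user previously had to hand-edit.
    control = workdir / "read62.inp"
    control.write_text(f"STATION={station_id}\nFORMAT=TD-6201\n")
    # Step 2: execute the legacy FORTRAN processor with that file;
    # check=True surfaces any run-time failure to the process manager.
    subprocess.run(["read62.exe", str(control)], cwd=workdir, check=True)

# run_read62("24157", Path("work"))  # hypothetical station ID and directory
```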

Real-time Harvesting of Distributed Environmental Data for Improved Management of Complex Distributed Water and Power Management Systems. ELVIRA MEZA (Columbia Basin College Pasco, WA 99301) LANCE VAIL (Pacific Northwest National Laboratory, Richland, WA, 99352)

The amount of environmental data being collected at ever-increasing rates exceeds the ability of most environmental decision makers to acquire and process it in time to support decision making. Demands for marginal improvements in system performance increasingly require decisions to be made in real time, which further increases the demand for faster and more reliable automated systems to harvest data distributed across numerous sources on the Internet. By combining a metadata database with a Java program, remote data sources can be queried and data acquired on either a 'just in time' or real-time basis. Java was selected for its platform independence and its ease of interaction through web browsers. In the data harvesting procedure, the Java program first acquires metadata describing the location of a source and the protocol for the web query, and then uses that information to acquire the actual data from the Internet source. Java's design permits the construction of a single application that can run across several platforms: Java is integrated into nearly all major operating systems and built into the most popular web browsers, which means it can run on practically every Internet-connected machine in the world. Java also builds multiple threads of execution into the language and combines properties of several programming languages into one, which can make Java programs easier to understand and faster to develop than those written in other languages. This work is a small portion of a much larger initiative within PNNL to develop the next generation of integrated environmental decision support software, giving water and power resource managers the ability to make optimal real-time decisions. Two examples of likely future applications of this approach are: 1) harvesting water-related information, including stream-flow measurements, snow-pack observations, and remote sensing data on snow cover extent, to help calibrate and continuously update the Lab's Distributed Hydrologic Soil and Vegetation Model to provide reliable stream-flow forecasts; and 2) harvesting water and fisheries information, including real-time reservoir levels and fish passage data, to improve the characterization and management of the hydropower system as part of the Lab's Integrated Energy Operation Center Initiative.
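
The metadata-driven harvesting step can be sketched in a few lines. The Python below stands in for the abstract's Java program; the source names, URL templates, and station number are hypothetical examples.

```python
# Sketch of metadata-driven harvesting, in Python rather than the
# abstract's Java. Each metadata record tells the harvester where a remote
# source lives and how to query it; all sources and URLs are hypothetical.
import urllib.request

SOURCE_METADATA = {
    "streamflow": {
        "url_template": "https://example.gov/flow?station={station}",
        "encoding": "utf-8",
    },
    "snowpack": {
        "url_template": "https://example.gov/snow?site={station}",
        "encoding": "utf-8",
    },
}

def harvest(source: str, station: str) -> str:
    """Look up a source's location and query protocol, then fetch it."""
    meta = SOURCE_METADATA[source]
    url = meta["url_template"].format(station=station)
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode(meta["encoding"])

# print(harvest("streamflow", "12510500"))  # hypothetical station number
```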

Removing Unused References from the FRAMES-1.x System. DUARD CRANDALL (Brigham Young University Idaho Rexburg, ID 83440) FREDERICK RUTZ (Pacific Northwest National Laboratory, Richland, WA, 99352)

The Framework for Risk Analysis Multimedia Environmental Systems (FRAMES) is a software platform developed by Pacific Northwest National Laboratory that can be used to create conceptual models of possible contaminant flow through various media [2]. These models have access to various parameters whose data must be attributed to the proper source. The references for these parameters are stored in a file that is accessed by the FRAMES User Interface (FUI). References contained in these files may or may not be accessed by parameters in a given scenario. Since a user may wish to remove unused references from the system, a method was developed that interrogates both the files containing the references and the parameters, and deletes unused references from the reference file [1]. This paper discusses the methods used to remove unused references from the FRAMES 1.x system.
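
The interrogate-and-delete method can be sketched generically. The file formats below (one "id|citation" reference per line, a parameter file listing the reference IDs it cites) are hypothetical, not FRAMES' actual dictionary formats.

```python
# Sketch of the unused-reference cleanup: collect every reference ID the
# parameters still cite, then rewrite the reference file keeping only
# those entries. File formats here are hypothetical, not FRAMES' own.
from pathlib import Path

def remove_unused_references(ref_file: Path, param_file: Path) -> None:
    # Interrogate the parameters for every reference ID still in use.
    used = set(param_file.read_text().split())
    # Keep only reference entries whose ID appears in some parameter.
    kept = [line for line in ref_file.read_text().splitlines()
            if line.split("|", 1)[0] in used]
    ref_file.write_text("\n".join(kept) + "\n")

refs = Path("references.txt")
params = Path("parameters.txt")
refs.write_text("R1|Smith 1998\nR2|Jones 2001\nR3|Lee 2004\n")
params.write_text("R1 R3\n")   # scenario parameters cite only R1 and R3
remove_unused_references(refs, params)
print(refs.read_text())        # R2, now unused, has been deleted
```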

Vulnerability Auditing at the Application Layer of the Open Systems Interconnection Model. TRACY KISSIRE (Columbia Basin College Pasco, WA 99301) VAHID S. HACKLER (Pacific Northwest National Laboratory, Richland, WA, 99352)

Research is being performed to assist in the development of efficient processes for identifying and mitigating risk associated with web-based applications. Web-based applications are highly effective in managing the vast amounts of data processed in the business environment. As businesses continue to improve computing and network security, increasingly complex attacks are being directed at web-based applications. Vulnerability auditing at the lower layers of the Open Systems Interconnection (OSI) model is a common procedure; auditing at the application layer is difficult because of the nearly limitless variety of exploitation methods. New software tools capable of thoroughly auditing for vulnerabilities at the application layer of the OSI model are needed.