This is the accessible text file for GAO report number GAO-03-833T entitled 'Gulf War Illnesses: Preliminary Assessment of DOD Plume Modeling for U.S. Troops' Exposure to Chemical Agents' which was released on June 02, 2003. This text file was formatted by the U.S. General Accounting Office (GAO) to be accessible to users with visual impairments, as part of a longer term project to improve GAO products' accessibility. Every attempt has been made to maintain the structural and data integrity of the original printed product. Accessibility features, such as text descriptions of tables, consecutively numbered footnotes placed at the end of the file, and the text of agency comment letters, are provided but may not exactly duplicate the presentation or format of the printed version. The portable document format (PDF) file is an exact electronic replica of the printed version. We welcome your feedback. Please E-mail your comments regarding the contents or accessibility features of this document to Webmaster@gao.gov. This is a work of the U.S. government and is not subject to copyright protection in the United States. It may be reproduced and distributed in its entirety without further permission from GAO. Because this work may contain copyrighted images or other material, permission from the copyright holder may be necessary if you wish to reproduce this material separately. Testimony: Before the House Subcommittee on National Security, Emerging Threats, and International Relations, Committee on Government Reform: United States General Accounting Office: GAO: For Release on Delivery Expected at 1:00 p.m. EDT: Monday, June 2, 2003: Gulf War Illnesses: Preliminary Assessment of DOD Plume Modeling for U.S. 
Troops' Exposure to Chemical Agents: Statement of Keith Rhodes, Chief Technologist, Center for Technology and Engineering, Applied Research and Methods: GAO-03-833T:

GAO Highlights: Highlights of GAO-03-833T, a testimony before the House Subcommittee on National Security, Emerging Threats, and International Relations, Committee on Government Reform

Why GAO Did This Study:

Of the approximately 700,000 veterans of the Persian Gulf War, many have undiagnosed illnesses. The Department of Defense (DOD) and the Central Intelligence Agency (CIA) have concluded, using computer plume modeling, that no U.S. troops were exposed to hazardous substances because plumes--clouds of chemical warfare agents--could not have reached the troops. GAO was asked to assess DOD and CIA plume modeling to determine whether DOD’s conclusions could be supported. GAO’s final assessment will be reported at a later date.

What GAO Found:

DOD’s conclusion as to the extent of U.S. troops’ exposure is highly questionable because DOD and CIA plume modeling results are not reliable. In general, modeling is never precise enough to draw definitive conclusions, and DOD did not have accurate information on source term (such as the quantity and purity--concentration--of the agent) and meteorological conditions (such as the wind and weather patterns), both of which are essential to valid modeling. In particular, the models DOD selected were not fully developed and validated for long-range environmental fallout; the source term assumptions were not accurate; the plume height was underestimated; the modeling only considered the effects on health of a single bombing; field-testing at Dugway Proving Ground did not realistically simulate the actual bombing conditions; and the results diverged among the models.

DOD’s conclusion, based on the findings of epidemiological studies--that there was no significant difference between rates of illness for exposed versus not exposed troops--is not valid.
In the epidemiological studies, the results of DOD’s flawed modeling served as a key criterion for determining the exposure classification--exposed versus not exposed to chemical agents--of the troops. Such misclassification is a serious problem that can have two types of effects: First, if misclassification affects both comparison groups equally (nondifferential misclassification--equally in the exposed and unexposed groups), it may water down the results so that important associations are missed. Second, if misclassification affects one group more than the other (differential misclassification), it may introduce bias that obscures important associations or creates false associations. Consequently, the misclassification in the studies resulted in confounding--that is, distorting--the results, making the conclusion invalid.

www.gao.gov/cgi-bin/getrpt?GAO-03-833T. To view the full product, including the scope and methodology, click on the link above. For more information, contact Keith Rhodes at (202) 512-6412 or rhodesk@gao.gov.

[End of section]

Mr. Chairman and Members of the Subcommittee:

We are pleased to be here today to present our preliminary assessment of the plume modeling conducted by DOD and CIA to determine the number of U.S. troops that might have been exposed to the release of chemical warfare agents during the Gulf War in 1990. We will report the final results of this study at a later date.

As you know, many of the approximately 700,000 veterans of the Persian Gulf War have had undiagnosed illnesses since the war's end in 1991. Some fear they are suffering from chronic disabling conditions because of wartime exposures to vaccines, as well as chemical warfare agents, pesticides, and other hazardous substances with known or suspected adverse health effects.
Available bomb damage assessments during the war showed that of the 21 sites bombed in Iraq--categorized by intelligence agencies as nuclear, biological, or chemical facilities-- 16 had been destroyed by bombing. Some of these sites were near the areas where U.S. troops were located. When the issue of the possible exposure of troops to low levels of chemical warfare agents was first raised, during the summer of 1993, the Department of Defense (DOD) and the Central Intelligence Agency (CIA) concluded that no U.S. troops were exposed because (1) there were no forward-deployed chemical warfare agent munitions and (2) plumes-- clouds of chemical warfare agents--from the bombing that destroyed the chemical facilities could not have reached the troops. This position was maintained until 1996, when it became known that U.S. troops destroyed a stockpile of chemical munitions after the Gulf War in 1991, at a forward-deployed site, Khamisiyah, in Iraq. Consequently, DOD and the CIA made several modeling efforts to estimate the number of troops that might have been potentially exposed to chemical warfare agents. But recognizing that actual data on the source term--such as the quantity and the purity (concentration) of the agent--and meteorological conditions--such as the wind and the weather patterns-- were not available,[Footnote 1] DOD and CIA conducted field-testing and modeling of bombing sites at Khamisiyah, in 1996 and 1997, to determine the size and path of the plume, as well as the number of U.S. troops exposed to the plume. During these initial modeling efforts, DOD asked the Department of Energy's Lawrence Livermore National Laboratories (LLNL) to also conduct modeling. In 1997, DOD and CIA also combined a number of their own individual modeling efforts into a composite and conducted additional plume modeling of the bombing sites at Al Muthanna, Muhammadiyat, and Ukhaydir. Subsequently, in 2000, DOD revised its modeling of Khamisiyah. 
In my testimony today, at your request, I will focus on our preliminary findings on DOD and CIA plume modeling of Gulf War chemical agent releases. Specifically, I will address the validity of the following DOD conclusions:

* based on DOD plume modeling efforts, that the extent to which U.S. troops were exposed was minimal, and
* based on findings of government-funded epidemiological studies, that there was no significant difference in the rate of illness between troops that were exposed to chemical warfare agents and those that were not.

Our work thus far has involved interviews with agency officials and experts in this area, reviews of relevant documents and literature, and a review of DOD's methodology and analyses of plume modeling. Our work has been performed in accordance with generally accepted government auditing standards.

Summary:

DOD's conclusion as to the extent of U.S. troops' exposure--based on DOD and CIA plume modeling--is highly questionable because the results of the modeling are unreliable. In general, modeling is never precise enough to draw definitive conclusions, and DOD did not have accurate information on source term and meteorological conditions. We have several reasons for this assessment:

First, DOD selected models that were not fully developed and validated for modeling long-range environmental fallout.

Second, some of the assumptions regarding the source term data used in the modeling were not accurate: they were based on incomplete information, data that were not validated, and testing that did not realistically simulate the actual conditions at Khamisiyah. For example, the CIA calculated the agent purity in 1991 to be 50 percent at Khamisiyah, but 18 percent at Al Muthanna and about 15 percent at Muhammadiyat. The CIA did not independently validate or establish agent purity levels based on empirically driven analyses, and relied on UNSCOM reporting for these rates.
This assessment of the agent purity rate at Al Muthanna was questioned by a DOD official. We plan to examine the validity of the methodology used to calculate the rate of degradation.

Third, the plume height was underestimated, which resulted in discounting the impact of certain meteorological conditions, such as high-speed winds at nighttime, when many of the bombings occurred. This would have a dramatic effect on the distance the chemical agent traveled. Moreover, according to an internal DOD memo, plume height in one case at Al Muthanna was arbitrarily determined by a DOD official to be 10 meters. At Muhammadiyat and Ukhaydir, plume heights were estimated to be the height of the munition or the munition stack. However, independent field-testing demonstrated that a single 1,000-pound bomb would create a plume height in excess of 400 meters above the ground.

Fourth, DOD, in its modeling, only considered the effect of a single bombing of the sites on the health of the U.S. troops. DOD did not take into account the cumulative effects of repeated bombings of the sites on troops' health.

Fifth, post-war field-testing done at Dugway Proving Ground, to estimate the source term data and plume height, did not realistically simulate the actual conditions of bombings at any of the sites. The simulation occurred under conditions that were not comparable to those that existed at Khamisiyah. For example, there were differing seasonal and meteorological conditions, differences in rocket construction, and lesser quantities of rockets. These differences result in multi-variable uncertainty that cannot be resolved.

Finally, there was a great divergence among the various models DOD selected with regard to the size and path of the plume and the extent to which troops were exposed. Combining the results of various models masked the highly divergent predictions among the individual models regarding the size and path of the plume.
The results of the LLNL model, which showed the largest area of coverage, were disregarded and not included in the composite model.

DOD's conclusion that there were no significant differences in the rate of illness between exposed and non-exposed troops is questionable. DOD based this conclusion on the findings of epidemiological studies that depended on DOD's flawed modeling. In those studies, the modeling results served as a key criterion for classifying troops that were ill and had been exposed compared with troops that were ill and determined not to have been exposed. However, the troops classified as non-exposed might have been exposed. Such misclassification is a serious problem that can have two types of effects. First, if misclassification affects both comparison groups equally (non-differential misclassification--equally in the exposed and unexposed groups), it may water down the results so that important associations are missed. Second, if misclassification affects one group more than the other (differential misclassification), it may introduce bias that obscures important associations or creates false associations. Consequently, the misclassification in the studies resulted in confounding--that is, distorting--the results.

Background:

In March 1991, after the conclusion of the Gulf War, U.S. Army demolition units destroyed munitions at the Khamisiyah storage site--which included a bunker and an open pit--in southeastern Iraq. Later, through inspections conducted by the United Nations Special Commission (UNSCOM) in Iraq, it was discovered that hundreds of 122-millimeter rockets destroyed at Khamisiyah contained the nerve agents sarin and cyclosarin. U.S. and coalition forces also bombed many other known or suspected Iraqi chemical warfare research, materiel, storage, and production sites.
According to DOD and the CIA, coalition air strikes resulted in damage to filled chemical munitions at only two facilities in central Iraq, Al Muthanna bunker 2 and Muhammadiyat, and at the Ukhaydir ammunition storage depot in southern Iraq. At Muhammadiyat, munitions containing an estimated 2.9 metric tons of sarin and cyclosarin and 15 metric tons of the chemical agent mustard were damaged during the air strikes. At Al Muthanna, munitions containing an estimated 17 metric tons of sarin and cyclosarin were damaged during the air strikes. According to DOD, the U.S. Government did not immediately make the connection between the chemical munitions found by UNSCOM at Khamisiyah and U.S. demolition bombings there. However, in 1996, concerns raised by the Presidential Advisory Committee on Gulf War Illnesses prompted the CIA to examine this issue.[Footnote 2] The CIA contracted with the Science Applications International Corporation (SAIC) to conduct the initial analysis and modeling of the bombing of chemical munitions in Khamisiyah bunker 73. The CIA's first report, published in August 1996, modeled the potential release of agents from bunker 73. The CIA and DOD jointly published a second report in September 1997. In this report, they combined the results of five different dispersions (for example, the size and path of the plume) and meteorological models to determine the extent of the plume from bombing of chemical munitions in Khamisiyah. In 2000, DOD published the results of a new modeling of the Khamisiyah site, using updated CIA source assessments and revising the hazard area. Information Needed for Modeling the Effects of Chemical Warfare Agents: In chemical plume modeling, simulations are produced that recreate or predict the size and path of the plume, including the potential hazard area, and the potential effect on the health of the exposed population. 
Modeling requires accurate information on:

* source term characteristics, properties (for example, vapor pressure, flash point, size of particles, persistency, and toxicity information), and rate of the agent release;
* temporal characteristics of the period of release (for example, whether the initial release of chemical agent occurred during daylight hours when it might rapidly disperse into the surface air or at night when differing dispersion patterns would exist depending on terrain and the height of the release);
* accurate collection of data that drive the meteorological models, such as temperature, humidity, barometric pressure, dew point, wind velocity and direction at varying altitudes, and other related measurements of weather conditions during the modeled period;
* data from global weather models to simulate large-scale weather patterns and from regional and localized weather models to simulate the weather in the area of the chemical agent release and throughout the area of dispersion; and
* information regarding the location of potentially exposed populations, animals, crops or other assets that may be affected by releases of the agent.

Types of Models Used:

The modeling of various chemical agent releases during the 1991 Persian Gulf War included global-scale models, such as the National Centers for Environmental Prediction Global Data Assimilation System (GDAS) and the Naval Operational Global Atmospheric Prediction System (NOGAPS). Regional and local weather models used included the Coupled Ocean-Atmosphere Mesoscale Prediction System (COAMPS), the Operational Multiscale Environment Model with Grid Adaptivity (OMEGA), and the Mesoscale Model Version 5 (MM5).

Transport and diffusion models (often simply called dispersion models) were also used. They project both the path of the chemical agents after release and the degree of hazard posed by the agents.
For example, the modeling of various releases during the 1991 Gulf War included dispersion models, such as the Second-order Closure Integrated Puff (SCIPUFF) model along with its Hazard Prediction and Assessment Capability (HPAC) component; the Vapor, Liquid, and Solid Tracking (VLSTRACK) model; the Non-Uniform Simple Surface Evaporation Model (NUSSE); and the Atmospheric Dispersion by Particle-in-Cell (ADPIC) model.

DOD's Conclusions Regarding the Extent of Exposure of U.S. Troops Are Highly Questionable:

DOD's conclusion as to the extent of U.S. troops' exposure--based on DOD and CIA plume modeling--is highly questionable because the results of the modeling are unreliable. The modeling conducted was not precise enough to draw definitive conclusions regarding the size and path of the plume. We found six reasons to question the conclusions: First, the models selected were not fully developed and validated. Second, the assumptions regarding the source term used in the modeling were not accurate. Third, the plume height was underestimated. Fourth, DOD modeling only considered the effects of a single bombing on health. Fifth, post-war field testing done at Dugway Proving Ground did not realistically simulate the actual conditions of bombing at any site. And, finally, there was a great divergence among the various models DOD selected with regard to the size and path of the plume.

The Models Selected Were Not Fully Developed and Validated:

DOD and CIA officials selected in-house models for use in plume modeling (see appendix I). In the case of Khamisiyah and other sites, DOD models--such as the VLSTRACK and HPAC/SCIPUFF dispersion models--were not fully developed and validated for environmental fallout at the time of their selection. In particular, these models were not appropriate for long-range tracking of chemical agents.
VLSTRACK was developed primarily as a tactical decision aid for predicting hazards resulting from the release of chemical and biological agents in a military environment. Modeling experts at the Naval Surface Center told us that the two-month DOD panel reanalysis and modeling was a developmental effort because existing models did not have the capability to perform the required projections. Considering potential illness from low-level exposure to nerve and blister agents accidentally released in Iraq required substantial extensions and modifications to some of the methodology in VLSTRACK.

HPAC was developed jointly by the Defense Intelligence Agency and the then Defense Special Weapons Agency (now known as DTRA) and was specifically tailored to do counterproliferation contingency planning. In a 1998 scientific review and evaluation of SCIPUFF, which is an integral part of HPAC, the National Oceanic and Atmospheric Administration's (NOAA's) Air Resources Laboratory stated that SCIPUFF is probably better suited for short-range (about 10 kilometers) dispersion applications than for long-range transport modeling. Among the limitations cautioned regarding the use of the HPAC model is that it does not provide a definitive answer due to uncertainties about transport, location, and weather.

In addition, based on the DOD modeling effort, it is evident that a group using the VLSTRACK model might receive a significantly different prediction from that of a group using the HPAC model. And neither of these models has sufficient fidelity--that is, reliability--to permit the conclusion that the actual hazard area--that is, path of the plume--is confined to the predicted hazard area.
In a September 1998 memo, the Deputy to the Secretary of Defense for Counterproliferation and Chemical/Biological Defense cited a DOD panel study team, which found that the VLSTRACK and HPAC models generate hazard predictions that are significantly different from each other. The memo noted, "This occurred even when the source terms and weather inputs are as simple and as identical as possible. In operational deployment, the average model user could obtain different answers for the same threat."

With regard to meteorological models, according to a 1997 memo from the Director of NOAA's Air Resources Laboratory to DOD, the selection of models was dominated by in-house, that is, DOD, models that were not well known outside of DOD. The Director noted that there were three mainstream mesoscale models available and well accepted for deriving site-specific flow conditions from large-scale meteorological information: MM5, RAMS, and Eta. At that time, OMEGA and COAMPS were too new and not well accepted outside of DOD circles. OMEGA was still under development, and a Peer Review Panel on the 1997 Khamisiyah modeling reported that there were major problems with the OMEGA model. For example, there were physically impossible aspects to the OMEGA model solutions and major errors in its simulations. For the analysis done for Khamisiyah and Al Muthanna, a DOD technical review panel found that OMEGA consistently under-predicted surface wind speeds by a factor of 2 to 3 when compared with actual observations collected at five World Meteorological stations in the area.

The Source Term Assumptions Were Not Accurate:

There were significant uncertainties in the source term used in the plume modeling at Khamisiyah. DOD and the CIA made assumptions about the source term based on field-testing, intelligence information, imagery, UNSCOM inspections, and Iraqi declarations to UNSCOM.
However, these assumptions were based on incomplete information, data that were not validated, and testing that did not realistically simulate the actual conditions at Khamisiyah. In its initial modeling of the demolition of chemical munitions at Khamisiyah, the CIA did not have accurate and precise information as to how rockets with chemical warheads would be affected by open pit demolition, compared with bunker demolition. This lack of information included the number of rockets, agent purity, and amount of agent released in the atmosphere, agent reaction in an open-pit demolition, and prevailing meteorological conditions. A DOD panel also found a lack of information,[Footnote 3] that is, substantial uncertainties regarding the number of damaged rockets that might have released chemical agents and how fast the nerve agents--sarin and cyclosarin, which were mixed together in the rockets--were released. Some of these agents may have leaked from rockets into the soil or into the wood of the boxes that contained the rockets and evaporated over time. The panel also found that the CIA and SAIC analyses used what were essentially guesses for the lack of data. For example, the numbers of rockets were based on what was known to be there before the demolition and what was found by the UNSCOM during their inspections, but, according to a DOD panel, the numbers varied by a factor of 5 or 6. In addition, this panel recognized that meteorological data were limited because there were relatively few observations, and these were made far from the Khamisiyah site. Observations were few because Iraq stopped reporting weather station measurement information to the World Meteorological Organization in 1981. As a result, data on the meteorological conditions during the Gulf War were sparse. The only data that were available were for the surface wind observation site, 80 to 90 kilometers away, and the upper atmospheric site, about 200 kilometers away. 
The panel also recognized that wind patterns could contain areas of bifurcation--lines where winds move in one direction on one side and in another direction on another side--which also move over time and are different at different altitudes. Source term assumptions on agents (sarin and cyclosarin) purity established for the four sites--Khamisiyah, as well as Al Muthanna, Muhammadiyat, and Ukhaydir--differed widely. Discrepancies between the Khamisiyah purity data and the Al Muthanna and Muhammadiyat data were not adequately resolved. The agents were assumed to be purer in February 1991 at Al Muthanna than in January at Muhammadiyat and purer still in March at Khamisiyah. In each case, agent purity was a key factor in the DOD and CIA methodology for determining the amount of agents released. Since the purity of the sarin and cyclosarin was used as a factor in calculating the amount of agents released, purity is critical in compounding the uncertainty of the modeling. For example, for modeling purposes, 10 tons of agent with a purity of 18 percent would be represented as only 1.8 tons of agent. The CIA did not independently validate or establish agent purity levels based on empirically driven analyses, and relied on UNSCOM reporting for these rates. This assessment of the agent purity rate at Al Muthanna was questioned by a DOD official, who noted in a memo, "Why we use the 18 percent purity instead of the 50 percent number available in public sources, and why we treat GF like GB when there are documents that mention the higher toxicity are not easily deferred with 'because the CIA says so.' I think the GF vs. GB numbers accepted by the EPA or CDC or whatever is the competent authority, but the purity number is problematic." We plan to examine the validity of the methodology used to calculated the rate of degradation. In addition, according to Iraqi production records obtained by UNSCOM, the agent purity at Khamisiyah, in early January 1991, was about 55 percent. 
The agent subsequently degraded to 10-percent purity by the time laboratory analysis had been completed on samples taken by UNSCOM from one of the rockets in October 1991. On the basis of the sample purity and indications that the degradation rate for sarin and cyclosarin are similar, the CIA assessed that the ratio of sarin to cyclosarin when the munitions were blown up in March 1991 was the same as that sampled in October 1991--3:1. According to the CIA, assuming a conservative exponential degradation of the sarin and cyclosarin, the purity on the date of demolition, 2 months after production, was calculated to be about 50 percent. At Al Muthanna, however, where the agent was stored in a bunker, the CIA estimated the chemical warfare agent had deteriorated to approximately 18 percent purity by the time that bunker 2 was destroyed, in early February 1991, leaving about 1600 kilograms (1.6 metric tons) of viable sarin. The CIA based its estimate on UNSCOM's analysis of Iraqi purity data and supporting information, which stated that the munitions were filled with the agent in 1988 and that the maximum purity for the 1988 agent was 18 percent in 1991. However, this assumption suggests knowledge of exact production dates and storage conditions that were not established. But UNSCOM and intelligence community reporting about the near-wartime capabilities of Iraq suggests that while the sarin produced was of poor quality, it had a maximum purity of 60 per cent. According to CIA documents, the total amount of agent modeled to have been released at Al Muthanna was 1 kg, but, to be conservative, the amount released was assumed to be 10 kg. The reasoning given for the low amounts discharged was the heat of the explosion. The CIA assessed that far less agent would have been released in the Al Muthanna bunker because, based on U.S. 
field-testing using simulated bunkers, heat would build up rapidly in Iraqi bunkers made of thick reinforced concrete ceiling and walls, thereby destroying most of the agent. However, these bumkers were targeted using high explosives, such as Tomahawk missiles and laser-guided and non-guided bombs, that detonate and produce instantaneous and extreme blast forces and shock and pressure waves, as well as heat. While the CIA analysts gave great credibility to the heat, no consideration was given to either the blast effects of the munitions or to the higher altitude plumes generated with the types of munitions used. For Muhammadiyat, DOD also provided details regarding how they derived source term characterizations for agent released using test data from Dugway Proving Grounds. However, the types of munitions used in the testing and, therefore, the resulting effects are not comparable to what munitions were actually used and their effects. At Dugway Proving Grounds, small explosive charges were placed on boxed rockets; at Muhammadiyat, the munitions were targeted using multiple high-explosive bombs. Agent purity at Muhammadiyat was estimated at 15 percent. The Plume Height Was Underestimated: Plume heights from the explosions could be significantly higher than the plume height assumptions provided for in the modeling of Khamisiyah and other Iraqi chemical warfare sites. The plume height data the CIA provided for the demolitions at the Khamisiyah pit was 0-100 meters. However, neither the DOD nor the CIA conducted testing to establish plume heights associated with the bombings of Al Muthanna, Muhammadiyat, or Ukhaydir. DOD modelers involved with the modeling efforts told us that they did not calculate the plume height or any of the other heat or blast effects associated with the bombings of these sites because DOD had provided the modelers these data. 
A modeling expert from the Defense Threat Reduction Agency (DTRA) told us that DOD data on plume height was inconsistent with other test data for the types of facilities bombed. The modeling expert cited test studies conducted at White Sands Proving Grounds in New Mexico, which demonstrated plume heights would range from 300 to 400 meters in height. Modeling experts from LLNL who participated only in the initial modeling at Khamisiyah also told us, citing studies, that they questioned how the plume height was estimated. In a pre-war analysis, LLNL projected that the smoke source cloud, immediately following the bombing of Iraqi chemical warfare agent facilities, would be characterized by a surface-based plume with a 54 meter (177 ft.) horizontal radius and a height of 493 meters (1,617 ft.). A Sandia Laboratory empirical study, performed in 1969, established a power law formula for calculating plume heights attributable to high-explosive detonations (see appendix II). Using this formula, an MK-84 or GBU-24 (942.6lb. of high explosives) bomb would generate a plume of 421 meters. DOD applied the same assumptions about the height of the plume at Khamisiyah to model other possible chemical releases at the Al Muthanna, Muhammadiyat, and Ukhaydir sites. At Muhammadiyat, for example, DOD established a release height of 0.5 meters (roughly half the bomb height) for nerve agent and a release height of 1.0 meters (roughly half of the median height of the various bomb stacks) for blister (mustard) agent destroyed at this location. Moreover, according to an internal DOD memo, an initial cloud size of 10 meters in both lateral and vertical directions was "arbitrarily" established. No efforts were made by DOD to validate these estimates by analyzing video images that were available showing some of the plume data, particularly those taken from ground level at Khamisiyah, were used to project the characteristics of the actual plumes. 
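The power-law relationship between explosive weight and plume height mentioned above can be sketched numerically. The testimony defers the formula itself to appendix II, so the form used here (height = coefficient x weight raised to an exponent), the coefficient of 76, and the exponent of 0.25 (height in meters, weight in pounds of high explosive) are assumptions, chosen because they reproduce the 421-meter figure quoted for 942.6 lb. of explosive. The function name is illustrative only.

```python
def plume_height_m(explosive_lb, coeff=76.0, exponent=0.25):
    """Estimate plume height in meters from high-explosive weight in pounds.

    Assumed power-law form: height = coeff * weight ** exponent.
    The default coeff and exponent are assumptions, not values stated in
    the testimony; they are chosen to match its 421-meter example.
    """
    if explosive_lb <= 0:
        raise ValueError("explosive weight must be positive")
    return coeff * explosive_lb ** exponent


if __name__ == "__main__":
    # 942.6 lb. is the high-explosive weight the testimony cites for an
    # MK-84 or GBU-24 bomb.
    height = plume_height_m(942.6)
    print(f"Estimated plume height: {height:.0f} m")

    # Compare with the 10-meter initial cloud size that the internal DOD
    # memo describes as arbitrarily established.
    print(f"Ratio to a 10 m cloud: {height / 10.0:.0f}x")
```

Under these assumed constants, the estimate for a single large bomb is roughly 40 times the 10-meter cloud size used in the modeling, which illustrates why the choice of plume height matters so much for how far the agent is projected to travel.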
As illustrated by figure 1, disparity in plume height source data could result in vastly differing projections regarding how far the plume travels and disperses, particularly during nighttime periods when a stable (nocturnal) boundary layer emerges.

Figure 1: Boundary Layer Characteristics: [See PDF for image] [End of figure]

As also shown in figure 1, above the surface layer, in the stable boundary layer, the winds often accelerate to higher speeds, in a phenomenon that is called the low-level or nocturnal jet. At altitudes on the order of 200 meters above the ground, winds may reach 10-30 meters per second (22-67.5 miles per hour) in the nocturnal jet. Higher plumes than those postulated by DOD, coupled with this phenomenon, could result in the rapid transport of chemical agents until disturbed by turbulence or the return of the mixed layer sometime after dawn. However, this possibility was not taken into consideration in any of the modeling performed. Consequently, the modeling may have resulted in underestimating the extent of plume coverage. (For a detailed discussion of this issue, see appendix II.)

In addition, plume geometry associated with high-explosive discharges shows that the majority of the mass of the plume is located toward the higher altitudes, suggesting that the majority of the mass of the plume would move to higher altitudes where it might be transported by these higher speed winds (see appendix III).

DOD Modeling Only Considered the Effects of a Single Bombing on Health:

Iraqi chemical warfare facilities were bombed on several occasions, but DOD and CIA modeling did not reflect the cumulative effects of these repeated bombings on the amounts of agents released and on the health of troops. For example, there were 17 distinct coalition air strikes on the Muhammadiyat ammunition storage depot.
While modeling was requested for the duration of 72 hours after the chemical release for Khamisiyah, DOD used only a 24-hour duration for its modeling of the bombing of Muhammadiyat. This was because at this site, unlike at others, DOD made the assumption that all of the nerve agent was released at one time and therefore modeled each air strike as if it were the only strike that caused a release. According to DOD, each model produced a freeze frame of the largest hazard area. The hazard area grows until it reaches its maximum size, which the modeling suggests is about 10-12 hours after the release.

Dugway Field-testing Did Not Realistically Simulate the Actual Bombing Conditions:

DOD and the CIA also conducted post-war field-testing at Dugway Proving Ground to simulate the actual bombing conditions at Khamisiyah to derive the source term data for use in modeling. From May 1997 through November 1999, the testing center at Dugway Proving Ground conducted seven field tests and two laboratory studies to obtain source term data for use in DOD and CIA modeling of Khamisiyah. For testing and simulation to be effective, the conditions have to be as close to the actual event as possible. However, the testing did not realistically simulate the conditions that existed during the demolition of 122-mm chemical-filled rockets in Khamisiyah and is therefore of questionable usefulness in providing input data for the modeling. The simulations took place under conditions that were not comparable to those that existed at Khamisiyah.
During the field-testing, there were differences in seasonal and meteorological conditions; in munition crate construction material; in rocket construction, including the use of concrete-filled pipes as rocket replacements to provide (inert) filler to simulate larger stacks; in the number of rockets (and therefore explosives), which was smaller in the simulations and may have suppressed a potential chain reaction of explosions; in the use of agent simulant (rather than real agent); and in soil. These differences result in multi-variable uncertainty that cannot be resolved. For example, the Dugway testing used a small sample of 32 rockets with simulant-filled warheads to conduct seven field tests: five were single-rocket demolitions and two involved multiple-rocket demolitions. One multiple-rocket trial demolition used nine functional rockets plus three dummy rockets, while the other multiple-rocket trial used 19 functional rockets and five dummy rockets. In contrast, at the Khamisiyah pit, stacks of 122 mm rockets, estimated to total about 1,250 rockets, were detonated. Moreover, Dugway testing officials did not know whether the 122 mm rockets used during the field tests were the same as those at the Khamisiyah pit. Dugway officials acknowledged that exploding a larger number of rockets would make a significant difference in the testing, and aerial bombing with a heavy load would have a far greater effect than was the case with the Dugway testing.

According to DOD and CIA analysts, the type of soil and wood can have a significant effect on the dispersion of the agent. However, a Dugway testing official told us that evaporation characteristics from the trials and models were uncertain. DOD and CIA estimates of the evaporation and retention rates of the chemical agent spilled on the soil may not be similar to what was actually evaporated from and retained in the pit sand at Khamisiyah.
This is because while Iraqi soil was available and used in the laboratory testing, it was not used during the field-testing. Similarly, DOD and CIA estimates of the amount of spilled agent that evaporated from and was retained in wooden crates are suspect because Dugway testing officials could not obtain actual wood from the Khamisiyah pit site for testing. The aged and possibly damp wood at Khamisiyah would absorb less agent than the new wood used at Dugway. DOD and the CIA determined that only about 32 percent of the agent was released and that most leaked into the soil and wood with 18 percent of the leakage becoming part of the plume (2 percent through aerosolization and 16 percent through evaporation).

Field tests were also conducted at a different time of the year and time of the day than the actual Khamisiyah pit event. According to Dugway officials, testing was done in May and in the early morning hours when drainage conditions prevail. The U.S. demolition of the Khamisiyah pit took place on March 10th, in the late afternoon during the presence of a mixing layer. Other demolitions took place during evening and nighttime hours when the stable (nocturnal) boundary layer emerges.

Despite the uncertainties in approximating the conditions that existed even at Khamisiyah, DOD and the CIA used these data not only for the Khamisiyah modeling, but also for the modeling of other sites. At all these sites, the chemical warfare munitions would have been destroyed by air strikes with much greater quantities of high-explosive charges and under differing meteorological conditions.

Divergence in Results among the Models:

DOD made no effort to resolve widely divergent modeling results among the models selected. Instead, a composite model approach was taken, which contributed to, rather than resolved, uncertainty. For example, the DOD panel tasked LLNL to conduct an analysis using DOD's MATHEW meteorological model with the ADPIC dispersion model.
During LLNL presentations to the DOD panel in November 1996 and February 1997, LLNL provided a 72-hour composite projection, assuming an instantaneous release of the contents of 550 rockets containing sarin. It shows the plume covering an area extending south-southeast from the release point to the Persian Gulf, then turning eastward at the Gulf coast, and then turning northeast over the Gulf and extending northeastward across central Iran. (For a more detailed discussion of this topic, see appendix IV.)

DOD models showed significant differences from the LLNL assessment. In contrast to the LLNL modeling simulations, analysis done with the DOD models--VLSTRACK with COAMPS meteorological models and HPAC/SCIPUFF with OMEGA meteorological forecasting models--showed the plume from an instantaneous release moving first southerly, and then turning to the west-southwest. See appendix V for a 72-hour plume overlay of those composite projections published by DOD.

According to the DOD panel, no effort was made to reconcile the differences between the DOD and LLNL modeling efforts. The panel determined that the results were so different that it would not be possible to determine the most affected areas or which U.S. forces were affected. Accordingly, the panel recommended that a composite of the DOD models be used to combine the hazard areas predicted by the models. Yet we observed that even among the models selected for use by DOD, widely differing paths were evident (see appendix VI). Assuming that a composite modeling effort is an appropriate methodology, a composite projection, including the above projections (DOD and CIA composite and LLNL), would encompass a far larger number of forces and seriously skew the outcome of any epidemiological studies done thus far, as shown in figure 2.

Figure 2: DOD Composite Projection and Lawrence Livermore National Laboratory Projection: [See PDF for image] [End of figure]

A clear divergence exists in the predictions of the models.
Further research was conducted to determine whether there were data available that might explain this divergence. As a result of this research, the DOD panel concluded that the divergence in the modeling outcomes may be explained by a line of diffluence (directional split) in the independently modeled 10-mm wind field data near Khamisiyah during the first 2 days of the modeling period. The precise location of this line was critical to which way the material would be transported by the wind. (See appendix VII for an illustration of this diffluence with three different data sets.)

In addition, DTRA officials told us that at the time of the modeling, they conducted data-validation runs of the various models against visible smoke plumes from the oil well fires in Kuwait; the runs showed a definite bias, as shown in figure 3. According to DTRA, this validation could mean that the uncertainty involved in using these models could result in an angular shift of 10 to 50 degrees to the west. In other words, the actual area covered could be from 10 to 50 degrees to the east of the area indicated by the model, meaning that it would cover a different population from the one in the model.

Figure 3: Validation Runs of Various Models: [See PDF for image] [End of figure]

DOD's Conclusion from the Epidemiological Studies Is Questionable:

Given that the DOD modeling was flawed, DOD's conclusion, from epidemiological studies based on this modeling with regard to rate of illness among exposed versus not exposed, is questionable. Nevertheless, the results of the modeling were used as a basis for determining the exposure classification--exposed versus not exposed to chemical agents--of the troops in population-based epidemiological studies.
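Why the exposure classification matters to such studies can be sketched with hypothetical numbers. The example below assumes two groups with different true illness rates and then labels a fraction of the truly exposed troops as not exposed. The group sizes, rates, and function name are illustrative assumptions only; none of them is drawn from DOD data or from any published study.

```python
def observed_rates(n_exposed, n_unexposed, rate_exposed, rate_unexposed,
                   fraction_mislabeled):
    """Illness rates as a study would observe them after one-way misclassification.

    A fraction of the truly exposed group is labeled "not exposed".
    Returns (rate in labeled-exposed group, rate in labeled-not-exposed group).
    """
    if not 0.0 <= fraction_mislabeled < 1.0:
        raise ValueError("fraction_mislabeled must be in [0, 1)")
    moved = n_exposed * fraction_mislabeled
    # The labeled-exposed group still contains only truly exposed troops,
    # so its rate does not change.
    labeled_exposed_rate = rate_exposed
    # The labeled-not-exposed group is now a mix of both populations.
    cases = n_unexposed * rate_unexposed + moved * rate_exposed
    labeled_unexposed_rate = cases / (n_unexposed + moved)
    return labeled_exposed_rate, labeled_unexposed_rate


if __name__ == "__main__":
    # Hypothetical inputs: true rates of 6 percent and 3 percent (ratio 2.0).
    exposed_rate, unexposed_rate = observed_rates(
        n_exposed=200_000, n_unexposed=400_000,
        rate_exposed=0.06, rate_unexposed=0.03,
        fraction_mislabeled=0.5,
    )
    print("observed rate ratio:", round(exposed_rate / unexposed_rate, 3))
```

With these made-up inputs, the rate in the group labeled exposed is unchanged, the rate in the group labeled not exposed rises, and the observed ratio between the two falls below the true ratio of 2.0. A real study could therefore understate an association that actually exists.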
As we noted in 1997, to ascertain the causes of veterans' illnesses, it is imperative that investigators have valid and reliable information on exposure, especially for low-level or intermittent exposures to chemical warfare agents.[Footnote 4] To the extent that veterans are misclassified regarding exposure, relationships would be obscured and conclusions would be misleading.

Misclassification of study subjects in the measurement of the variables being compared is a well-recognized methodological problem in epidemiological studies. Misclassification can have two types of effects. First, if misclassification affects both comparison groups equally (non-differential misclassification--equally in the exposed and unexposed groups), it may water down the results so that important associations are missed. Second, if misclassification affects one group more than the other (differential misclassification), it may introduce bias that obscures important associations or creates false associations. Consequently, the study misclassification resulted in confounding--that is, distorting--the results, making the conclusion questionable.

By combining the results from its individual modeling efforts, which showed different areas of coverage, and ignoring the results of the LLNL modeling, which showed much larger areas of coverage, DOD may have misclassified a large number of troops truly exposed to chemical warfare agents in the putatively non-exposed group. If exposure to chemical warfare agents truly caused adverse effects resulting in increased hospitalization or death, such one-way misclassification would tend to obscure the differences in hospitalization or death rates by falsely increasing the rates in the putatively non-exposed group while not affecting the rates in the exposed group.

Based on the June 1996 plume modeling, DOD officials initially stated that only 300 to 400 troops were exposed to chemical plumes.
Based on additional modeling, that number was revised to approximately 5,000 in September 1996; to approximately 20,000 on October 22, 1996; and to 98,910 on July 23, 1997. DOD's 2000 estimates place the number exposed at 101,752. The number from the October 22, 1997 plume model served as the basis for informing approximately 100,000 Gulf War veterans of possible exposure. This 1997 plume model was also used as the basis of at least two epidemiological studies that were published in peer-reviewed scientific journals.

In 2000, DOD announced that as a result of ongoing scientific analysis, DOD's Directorate for Deployment Health Support developed a new computer model that changed the location of the Khamisiyah plume footprint. The number of service members potentially exposed remained approximately 100,000. The new 2000 model reclassified 32,627 troops as unexposed who were previously classified as exposed and classified 35,771 troops as exposed who were previously classified as unexposed. Given the weaknesses in DOD modeling and the inconsistency of the data sets--representing these models--given to different researchers, there can be no confidence that the research conclusions based on these models have any validity.

Conclusions:

In evaluating the limitations of the plume modeling, we concluded that even under the best of the circumstances, the results from the modeling cannot be definitive. Plume modeling can allow one to estimate what might have happened when chemical warfare agents are released in the environment. Mathematical equations are used to predict the activities of an actual event, in this case, the direction and extent of the chemical warfare agent plume. However, in order to predict precisely, one needs to have accurate information on the source term and the meteorological conditions. DOD did not have accurate information on the source term or on meteorological conditions.
Given these modeling flaws, the DOD modeling results should not form the basis for determining the extent of exposure of U.S. troops during the Gulf War. The models selected were not fully developed and validated for environmental fallout, and the assumptions used to provide the input into the models exhibited a preferential bias for a particular and limited outcome. Yet even under these circumstances, the models failed to provide similar conclusions. In addition, many potential exposure events were not included. It is likely that if fully developed and validated models and more realistic data for source term were included in the modeling, particularly plume height and exposure duration, the exposure footprints would be much larger and would most likely cover most of the areas where U.S. and other coalition forces were deployed. However, given the weaknesses in the data available for any further analyses, any further modeling efforts on this issue would not be any more accurate or helpful.

In particular, source term data used for modeling the release of chemical warfare agents during the Gulf War were inadequate for any model to provide, with the desired accuracy and confidence, a single definitive simulation of dispersion. Several modeling experts told us that if source term inputs into modeling assessments are not accurate, the results of the modeling would not be reliable.

The development of source term data was not empirically driven, but rather driven by the subjective analyses of individual intelligence agencies. No empirically driven analyses were applied to determine plume height source data from the chemical warfare agent research, production, and storage sites subjected to air strikes, and no empirically driven calculations were disclosed regarding agent purity as it affected the rate of decay of the chemical warfare agent munitions that, according to intelligence agencies' reports, were produced immediately prior to the war.
Efforts to simulate events and define the source term through testing were unrealistic, conducted under inappropriate conditions and, in some cases, inappropriately applied to dissimilar events. The subjective and defective quality of much of the analyses conducted is best demonstrated by the dynamic nature of the source data over time. That is, repeated analyses resulted in continually changing conclusions and source data, despite the fact that no aspect of the actual events changed after their occurrence.

DOD completely disregarded the results from the LLNL model, which diverged from the results in the DOD and CIA modeling analysis. This occurred despite a high degree of divergence, even among the selected DOD models. Further, the precise plume projections of the LLNL model were excluded from DOD's composite modeling. Finally, in the DOD and CIA composite model, divergence from individual models was masked.

Despite all of the uncertainties that emerged from DOD and CIA modeling, the results of the modeling were used to serve as a basis for determining the exposure status--exposed versus not exposed to chemical agents--of the troops in population-based epidemiological studies. However, given the weaknesses in DOD modeling and the inconsistency of the data sets--representing these models--given to different researchers, there can be no confidence that the research conclusions based on these models have any validity.

Mr. Chairman, this concludes my statement. I will be happy to answer any questions you or Members of the Subcommittee may have.

Contacts and Acknowledgments:

Should you or your offices have any questions concerning this report, please contact me at (202) 512-6412 or Sushil Sharma, Ph.D., DrPH, at (202) 512-3460. We can also be reached by e-mail at rhodesk@gao.gov and sharmas@gao.gov. Individuals who made key contributions to this testimony were Jason Fong and Laurel Rabin. James J. Tuite III, a GAO consultant, provided technical expertise.
[End of section]

Appendix I: Khamisiyah Models:

On November 2, 1996, DOD requested the Institute for Defense Analyses to convene an independent panel of experts in meteorology, physics, chemistry, and related disciplines to review the Khamisiyah modeling analysis done by the CIA and its contractor, the Science Applications International Corporation. The DOD panel recommended conducting additional analyses using several DOD and non-DOD meteorological and dispersion models as shown in table 1.

Table 1: Meteorological and Dispersion Models Used in Modeling Khamisiyah:

Meteorological Model: Coupled Ocean-Atmosphere Mesoscale Prediction System (COAMPS); Developer/Sponsor: U.S. Navy; Dispersion Model: Hazard Prediction and Assessment Capability/Second Order Closure, Integrated Puff (HPAC/SCIPUFF); Developer/Sponsor: Defense Threat Reduction Agency.

Meteorological Model: Mass Consistent Wind Field (MATHEW); Developer/Sponsor: Department of Energy/Lawrence Livermore National Laboratory; Dispersion Model: Atmospheric Dispersion by Particle-in-cell (ADPIC); Developer/Sponsor: Department of Energy/Lawrence Livermore National Laboratory.

Meteorological Model: Mesoscale Model, Version 5 (MM5); Developer/Sponsor: National Center for Atmospheric Research; Dispersion Model: Non-Uniform Simple Surface Evaporation, Version 4 (NUSSE4); Developer/Sponsor: U.S. Army.

Meteorological Model: Naval Operational Global Atmospheric Prediction System (NOGAPS); Developer/Sponsor: U.S. Navy; Dispersion Model: Vapor Liquid Solid Tracking (VLSTRACK); Developer/Sponsor: U.S. Navy.

Meteorological Model: Operational Multi-scale Environment Model with Grid Adaptivity (OMEGA); Developer/Sponsor: Defense Threat Reduction Agency; Dispersion Model: [Empty]; Developer/Sponsor: [Empty].

Source: GAO.
[End of table]

[End of section]

Appendix II: Power Law Formula:

A Sandia Laboratory empirical study performed in 1969 established a power law formula for calculating plume heights attributable to high-explosive detonations. This power law formula was derived from data on 23 test shots, ranging from 140 to 2,242 lbs. of high explosives, at the U.S. Department of Energy's Nevada Test Site (National Exercise, Test, and Training Center) and provides a cloud top height at 2 minutes after detonation. Most of the shots were detonated during near neutral conditions, where the clouds continued to rise after 2 minutes; data for 5 minutes after detonation on some shots show tops rising to nearly double the 2-minute values. The 2-minute values better represent the final cloud top heights during stable conditions. This formula is represented as:

h = 76 × w^(1/4)

where h = height of plume in meters and w = weight of explosives in pounds.

Using this formula, an MK-84 or GBU-24 (942.6 lb. of high explosives) bomb would generate a plume of 421 meters:

h = 76 × (942.6)^(1/4)
h = 76 × 5.541
h ≈ 421 meters

Figure II.1 shows the plume height trend line obtained by using the formula to calculate plume heights resulting from the detonation of high explosives ranging in weight from 100 to 2,000 lbs.

Figure 4: Plume Height by Weight of Explosive: [See PDF for image] [End of figure]

[End of section]

Appendix III: Plume Geometries and Wind Transport:

As shown in figure III.1, plume geometry associated with high-explosive discharges shows that the majority of the mass of the plume is located towards the higher altitudes, suggesting that the majority of the mass of the plume would move to higher altitudes where it might be transported by higher speed winds.

Figure 5: Examples of Various Plume Geometries: [See PDF for image] [End of figure]

As shown in figure III.2, the distribution of the plume geometry may be affected by nocturnal jets.
Figure 6: Impact of Nocturnal Jets on Plume at Higher Altitudes: [See PDF for image] [End of figure]

In fact, empirical studies and actual reported and observed events tend to refute DOD and intelligence agencies' assumptions and support the alternative assumption of transport by low-level jets. First, empirical testing suggests that the plume heights were much higher than postulated in the source term data. Second, no massive casualties were claimed, reported, or observed in areas immediately surrounding the Iraqi chemical warfare research, production, and storage sites bombed by coalition forces. Third, since many of the bombings occurred at night, the explosive effects coupled with higher altitude plumes and the presence of a nocturnal boundary layer capable of moving hazardous materials hundreds of miles could easily account for this phenomenon, as well as the reports of chemical warfare agent detections in areas occupied by U.S. and coalition forces. Fourth, the dynamics of advection explained above may account for the reported wartime nighttime detections of very low levels of chemical agents associated with turbulence mixing the upper and lower level atmospheric layers resulting from aircraft-related sonic booms and incoming missiles and artillery.

[End of section]

Appendix IV: Lawrence Livermore National Laboratory Khamisiyah Simulation:

The Department of Energy's Lawrence Livermore National Laboratory (LLNL) Atmospheric Release Advisory Capability was tasked to conduct an analysis using its MATHEW meteorological model with the ADPIC dispersion model. Between 1979 and 2003, the LLNL modeling capability, known as the Atmospheric Release Advisory Capability (ARAC), now the National Atmospheric Release Advisory Center (NARAC), responded to more than 100 alerts, accidents, and disasters, and supported more than 1,000 exercises. These include assessments of nuclear accidents, fires, industrial chemical accidents, and terrorist threats.
During its presentations to the DOD panel in November 1996 and February 1997, scientists from Lawrence Livermore National Laboratory provided plume projections based on the data provided by the panel staff. A number of model projections were calculated and presented to the panel. Figure IV.1 shows the LLNL 72-hour composite projection, assuming an instantaneous release of the contents of 550 rockets containing sarin. It shows the plume covering an area extending south-southeast from the release point to the Persian Gulf, then turning eastward at the Gulf coast, and then turning northeast over the Gulf and extending northeastward across central Iran.

Figure 7: Lawrence Livermore National Laboratory Composite Projections: [See PDF for image] [End of figure]

LLNL's modeling assessment shows that the 72-hour exposure due to the instantaneous release of sarin from 550 rockets covers a large hazard area. According to LLNL, agent concentration in excess of the dosage amount expected to cause "minimal effects" or symptoms on individuals covered a 2,255 square kilometer area extending approximately 130 kilometers south-southeast from the release point.[Footnote 5] Dosages in excess of the amount that would be allowed for a worker exposed to sarin in the workplace, or the "occupational limit,[Footnote 6]" were predicted over a 114,468 square kilometer area, including Kuwait City, an approximately 200 kilometer-wide area across the Persian Gulf, and the higher elevations of the Zagros mountain range in Iran. The remaining area was determined to be at the "general population limit."[Footnote 7]

[End of section]

Appendix V: DOD Model Simulations:

A 72-hour plume overlay of those composite projections published by OSAGWI is shown in figure V.1.

Figure 8: DOD Composite Projection: [See PDF for image] Note: This projection includes the VLSTRACK and SCIPUFF/HPAC dispersion models with COAMPS, MM5, and OMEGA meteorological models.
[End of figure]

[End of section]

Appendix VI: Divergence among DOD Models:

Even among the models selected for use by the DOD panel, widely divergent directional outcomes were observed. As shown in figure VI.1, differences can be seen among various models for hazard areas during the first 2 days of the modeling period for Khamisiyah.

Figure 9: Divergence among Models Used in Constructing DOD and CIA Composite Analysis: [See PDF for image] [End of figure]

The March 10, 1991 graphic demonstrates a 40-45 degree divergence between the HPAC/OMEGA and the HPAC/COAMPS projections, while the March 11, 1991 graphic demonstrates approximately an 80 degree divergence. The uncertainty attributed to this divergence is not limited to the Khamisiyah modeling. According to a modeling analyst involved with the modeling of Al Muthanna, the weather models used, COAMPS and OMEGA, each showed the plume going in a different direction, at a 110-120 degree difference. The analyst said that COAMPS showed the plume going in a north-northwest direction, while OMEGA showed the plume going south. Similar divergence among model predictions was also observed in the modeling of Muhammadiyat, as shown in figure VI.2.

Figure 10: Divergence in DOD Models for Muhammadiyat: [See PDF for image] [End of figure]

[End of section]

Appendix VII: Divergence and Wind Field Models:

In figure VII.1, windfield vector divergence projections 6.0 meters above terrain are based on observational data processed by the Meteorological Data Interpolation Code (MEDIC) model.

Figure 11: Lawrence Livermore National Laboratory Diagnostic Wind Model Based on Observational Data: [See PDF for image] [End of figure]

In figure VII.2, the windfield vector model based on European Centre for Medium-Range Weather Forecasts (ECMWF) projections, processed by the Meteorological Data Interpolation Code (MEDIC) model, is shown.
Figure 12: Lawrence Livermore National Laboratory Diagnostic Wind Model Based on ECMWF Projections: [See PDF for image] [End of figure]

In figure VII.3, the windfield vector model is based on Coupled Ocean-Atmosphere Mesoscale Prediction System (COAMPS) simulations at the U.S. Naval Research Laboratories.

Figure 13: Windfield Vector Model Based on COAMPS: [See PDF for image] [End of figure]

[End of section]

FOOTNOTES

[1] Observations were few because Iraq stopped reporting weather station measurement information to the World Meteorological Organization in 1981. As a result, data on the meteorological conditions during the Gulf War were sparse. The only data that were available were for the surface wind observation site, 80 to 90 kilometers away, and the upper atmospheric site, about 200 kilometers away.

[2] The Presidential Advisory Committee on Gulf War Veterans' Illnesses was a panel established in August 1995 to provide oversight to Gulf War illnesses investigations.

[3] DOD had asked the Institute for Defense Analyses to set up a DOD-funded panel to review the modeling.

[4] GAO, Gulf War Illnesses: Improved Monitoring of Clinical Progress and Reexamination of Research Emphasis Are Needed, (GAO/NSIAD-97-163, June 23, 1997).

[5] Minimal effects is the lowest concentration level that would be expected to have noticeable effects on human beings.

[6] Occupational limit is about one-tenth of the minimal effects value and is the maximum concentration level that would be allowed for a worker who could become exposed to sarin in the course of his job duties.

[7] The general population limit represents the limit below which any member of the general population could be exposed (e.g., inhale) 7 days a week, every week, for a lifetime, without experiencing any adverse health effects.