ONS Safety Notices Issue No. 94-04 November 1994
Uninterruptible Power Supplies
Introduction

This notice is one in a series of publications issued by the Office of Nuclear and Facility Safety to share nuclear safety information throughout the Department of Energy complex. For more information, contact Dick Trevillian, Office of Operating Experience Analysis and Feedback, Office of Nuclear and Facility Safety, U.S. Department of Energy, Washington, DC 20585, telephone (301) 903-3074. No specific action or response is required solely as a result of this notice.

Safety Notices are distributed to U.S. Department of Energy Program Offices, Field Offices, and contractors who have responsibility for the operation and maintenance of nuclear and related facilities, and to other organizations involved in nuclear safety. Written requests to be added to or deleted from the distribution of Safety Notices should be sent to: Richard L. Trevillian, EH-33, Room E-460 GTN, U.S. Department of Energy, Washington, DC 20585.

The ESH Office of Information Management maintains a file of Safety Notices and supporting information. Copies can be obtained by contacting the Office of Information Management at (301) 903-0449 or by writing to the Office of Information Management, U.S. Department of Energy, EH-72/Suite 100, CXXI/3, Washington, DC 20585.

Notice Summary

This notice contains lessons learned regarding power supplied to vital components or processes for control of nuclear safety. Numerous events have occurred at Department of Energy (DOE) and other nuclear facilities that had notable effects on facility operations. Failure of an Uninterruptible Power Supply (UPS) has resulted in disruption of facility operations, personnel evacuation, and temporary loss of instrumentation and equipment important to safety. This notice contains information about the safety significance of UPS failures as well as corrective actions taken to ensure electrical power availability.

Applicability

This notice applies to all DOE facilities that use UPSs to ensure safe operations.
The Office of Nuclear and Facility Safety (NFS) advises operators of these facilities to understand potential hazards associated with UPS problems or failures and to be observant of conditions that may lead to such events. No specific action or response is required as a result of this notice.

Equipment Description

UPS units are used throughout industry when either the quality or the continuous availability of electrical power is important. UPSs are designed to provide stable, uninterruptible power to equipment that is important to safe operation of the facility. UPSs preclude the three most common power disturbances that affect reliable operation of equipment: (1) power line noise and harmonics, (2) power fluctuation, and (3) sudden loss of power.

Several UPS configurations are available. The simplest is the single UPS unit shown in Figure 1. The overall UPS system is designed for dual ac input sources: a normal source that feeds the rectifier charger and a bypass source that is routed to the UPS load panel via either the static transfer switch or the maintenance bypass switch. These two sources are usually derived from either a common or two separate power sources. If the bypass source voltage does not match the output voltage of the UPS inverter, a transformer must be used in the bypass feed (Figure 1). A standby generator may be used as an alternate input source, if the reliability requirement dictates the need for it. If a standby generator is used, it is automatically selected when power is lost. Commercial nuclear power plants typically use diesel generators as an alternate input source for safety related UPSs.

Please refer to the hard copy of the Safety Notice for the figure "UPS system block".

Events Summary

The following five events involving problems with or failures of UPS systems at DOE facilities and one event at a commercial nuclear power plant collectively represent the major causes for UPS difficulties.
These causes include electrical shorts, component aging and failure, inadequate design, inadequate procedures and operator error, configuration discrepancies, and environmental effects.

The first event occurred at the Hot Fuel Examination Facility at Argonne National Laboratory-West on August 21, 1993 [1]. On-site fire department personnel responded to a fire alarm in the control room of the Neutron Radiography Reactor. The firemen, accompanied by the on-duty shift technician, entered the control room, and, although there was a strong odor of burnt electrical components, they found no fire. The source of the odor was in the UPS cabinet. Workers opened the breaker that supplied ac power to the UPS. An instrument technician, with the aid of the shift technician, located the source of the problem in the UPS battery drawer, disconnected the drawer, and removed it from the building. The inside of the Lexan cover on the drawer was smoke-blackened, and an electrical short had damaged two of the batteries and associated wiring.

Engineers investigating the battery failure believed the most probable cause of failure was either (1) an electrolytic short or (2) excessive float voltage from the battery charger. A short may have been caused by electrolyte that collected on the batteries and reduced the dielectric value between adjacent battery terminals where the voltage differential was sufficient to establish current flow. The two terminal batteries were adjacent, which created an area of highest voltage potential (95-volt differential). The potential between side-by-side batteries is typically 6 to 12 volts, and somewhat higher between end-to-end batteries. The other possibility was voltage high enough to alter the electrolyte and allow excess current to flow in one area of the battery case, which could also melt the case of the adjacent battery. Float voltage should not exceed 2.3 volts per cell.
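The float-voltage limit cited above lends itself to a simple surveillance check. The following is a minimal sketch, not a facility procedure: it assumes the charger float voltage is measured across a series string with a known number of cells, and the function names and the 60-cell example string are hypothetical. Only the limit of 2.3 volts per cell comes from this notice.

```python
# Limit stated in this notice: float voltage should not exceed 2.3 V per cell.
FLOAT_LIMIT_V_PER_CELL = 2.3


def float_voltage_per_cell(string_voltage, cell_count):
    """Average float voltage per cell for a series-connected battery string."""
    if cell_count <= 0:
        raise ValueError("cell_count must be a positive integer")
    return string_voltage / cell_count


def exceeds_float_limit(string_voltage, cell_count,
                        limit=FLOAT_LIMIT_V_PER_CELL):
    """Return True if the average per-cell float voltage is above the limit."""
    return float_voltage_per_cell(string_voltage, cell_count) > limit


if __name__ == "__main__":
    # Hypothetical readings for an assumed 60-cell string.
    for measured_volts in (135.0, 144.0):
        per_cell = float_voltage_per_cell(measured_volts, 60)
        flag = exceeds_float_limit(measured_volts, 60)
        print(f"{measured_volts} V / 60 cells = {per_cell:.2f} V per cell, "
              f"over limit: {flag}")
```

An average across the string can hide an individual cell that is overcharging, so a check of this kind would supplement, not replace, per-cell surveillance readings.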
Investigators determined that maintenance and surveillance were performed on schedule and that surveillance and battery replacement exceeded manufacturer specifications. Facility personnel installed insulating material between the terminal batteries to reduce the possibility of an electrolytic short.

The second event occurred on November 21, 1993, at the Idaho Chemical Processing Plant when an internal circuit board failed in the UPS to the criticality alarm system for the CPP-602 Product Denitrator Area [2]. Investigators of the failure alarm on a remote panel for the criticality alarm system discovered the circuit board problem. They noted that there were no other alarms and that the alarm system appeared to be without power. Because there were no processing operations at the time, they took the criticality alarm system out of service and isolated it from the plant evacuation system. The following day, investigators reactivated the system to find the cause of failure and found that the UPS was completely shut down. The manager called in the UPS vendor, Exide Electronics Corporation, to help with the investigation, and the representative found a defective microprocessor chip (SAB 80535-N) in the internal circuit board. The Exide representative installed replacement chips in six UPSs at the facility. (About the same time this failure occurred, Exide discovered that a batch of chips manufactured by Siemens Corporation was defective and was preparing a field change to replace them.)

The safety significance of this event was limited because no processing operations were in progress. An alarm was initiated when the UPS failed, so if any processes had been operating, facility personnel could have shut them down in a safe, systematic manner. However, if a criticality had occurred while the criticality alarm system was out of service, the plant evacuation alarm system would not have automatically activated.
On December 26, 1993, at the Hanford Plutonium Finishing Plant, the UPS unit for the seismic shut-down system tripped, causing an automatic shift from main power to bypass power, which was supplied from a parallel UPS [3]. When the transfer occurred, normal building ventilation shut down because of low voltage, which caused the emergency steam turbines to start up and maintain building vacuum. Personnel were evacuated from radiologically controlled areas as a result. Maintenance personnel inspected the UPS and replaced printed circuit boards, an inverter transformer, a static by-pass module, and a cabinet fan. These defective components had contributed to the UPS trip.

Investigating engineers determined that the root cause for the UPS failure to maintain system voltage was an inadequate system design. The design called for power supply no. 1 to receive by-pass power from supply no. 2 (i.e., if UPS no. 1 fails, UPS no. 2 provides power to the system). However, for this design to be effective, the by-pass voltage must be in phase with the primary voltage when the circuitry is switched, and in this case the voltages were out of phase. A resulting short circuit between the two units lowered the power supply voltage to the system, which caused the ventilation system to shut down. When the inverter in UPS no. 1 shut down, the short circuit was removed and UPS no. 2 provided proper system power. Facility managers decided to remove the series connection between UPS no. 1 and UPS no. 2 and to establish switching procedures for a manual, bumpless (synchronized) transfer between normal power and back-up power.

On August 24, 1991, at the Rocky Flats Plutonium Processing Facility, lights on the stationary operating engineer control room panel in Building 707 went out for about 15 to 20 seconds [4]. The stationary operating engineer responded immediately and noticed that the over voltage/under voltage light was on and the transfer switch at the UPS power unit was in the manual position.
He discovered that work had been performed on the UPS system two days earlier in accordance with a maintenance work package and a maintenance procedure. The work package did not contain a step to return the UPS system to automatic mode, but it referenced the site procedure, which included the step. Because maintenance workers did not comply with all instructions and steps in the work package, the UPS system did not automatically switch to emergency power on demand. The haste with which maintenance procedures and work packages were written created the conditions for steps and instructions to be omitted. Facility managers determined that a stationary operating engineer or the utilities manager should have had more input into the development of the work package. Subject matter experts should be involved in the development of work packages and procedures.

A potentially serious event occurred on August 22, 1991, at Idaho National Engineering Laboratory when personnel reporting to work in Building CFA-609 detected a strong, rotten-egg odor [5]. Investigators determined that the UPS batteries had failed and were bulging and emitting a strong odor of toxic gas. A combustible gas indicator reading showed slightly positive for explosive gas. They notified authorities and opened the doors to the building to allow the emissions to dissipate. Industrial Hygiene technicians took air sample readings at the point of emission, but none were above acceptable limits for toxic gases. Three employees complained of nausea and headaches, and technicians sent them to the dispensary. Dispensary personnel released two employees and held one for observation. They later released him.

Investigators determined that the temperature in the battery room was elevated and the batteries were swollen and emitting acid mist. They thought the most likely cause was a faulty switching mechanism in the UPS unit that had allowed the batteries to overcharge.
Manufacturer representatives checked the UPS unit and replaced the batteries. They determined the cause of failure to be significant temperature differentials between the two rooms that contained the UPS battery racks. One battery rack was located in a generator room where normal temperatures were about 60 degrees Fahrenheit. The other rack was in a room with transformers, where temperatures of 90 degrees were normal and could exceed 100 degrees in the summer months. Facility personnel installed ventilation equipment that would keep the temperature differential between the rooms below 20 degrees. UPS batteries are often rated for only 85 degrees.

Five UPSs failed simultaneously at Unit 2 of Nine Mile Point, a commercial nuclear power plant. A U.S. Nuclear Regulatory Commission (NRC) incident investigation team investigated the event and documented their findings in NUREG-1455 [6]. Shortly before shift change on the morning of August 13, 1991, an internal failure in the main plant transformer caused a turbine trip and reactor scram. Before automatic protection features isolated the transformer, depressed voltages occurred on the transmission system and on the in-plant electrical distribution system. In less than a second, the degraded voltage caused a simultaneous common-mode loss of five non-safety-related UPSs that powered important control room instrumentation and plant equipment. The power loss affected reactor control rod position indicators, some reactor power and water level indicators, control room annunciators, the plant communications system, the plant process computer, and lighting at some locations. As a result, managers declared a site emergency. Automatic reactor protection systems, including the reactor scram, functioned properly. All necessary engineered safety features were available and used as needed.
However, the difficulty experienced by operators because of the loss of many normally available plant status indicators and equipment severely complicated the plant transient condition and underscored the importance of the lost power supplies. Licensee and NRC investigators found that the root causes of the UPS failures were a common-mode design deficiency and a common-cause maintenance deficiency. Each UPS contains a control logic unit that is essential to its operation. The UPSs were lost because the power for these control logic units was from a source that was affected by the transformer failure. The units could have been supplied with back-up power from internal batteries; however, the batteries were dead. If either deficiency had been corrected, the UPSs would not have been lost. The NRC investigation team made the following findings.
Potential Causes

UPS failures may be attributed to six primary root causes; however, it should be recognized that most events involving UPS failures at DOE facilities and other industries have a variety of contributing causes. This is illustrated by the events described in this safety notice. The primary root cause categories are:
Maintenance Inadequacies

Lack of preventive maintenance or improper or inadequate maintenance contributed to a number of UPS failures. In many cases, maintenance is performed infrequently or not at all, or maintenance requirements are inadequate. Often, no provisions exist to replace aging components such as capacitors. Failure to label all installed UPS systems and identify their associated functions also resulted in lack of maintenance.

Personnel Errors

Events related to personnel error typically involved failure to comply with procedures, inattention to detail, and improper UPS operation because of lack of system knowledge. Most contributing factors were inadequate maintenance and testing procedures and deficient practices. Inadequate planning, training, and verification of surveillance, maintenance, and testing are other contributing factors.

Procedural Inadequacies

Several UPS failures resulted from inadequate or incorrect maintenance, test, and surveillance procedures. Such procedures are sometimes the result of inadequate vendor documentation or failure to ensure that work packages comply with upper-tier procedures. Often, pre-test and post-test requirements are undefined.

Design and Installation Inadequacies

In many cases, UPS units were not correctly designed, specified, or installed to comply with overall system operational requirements, environmental conditions, and codes. This resulted in incompatibility between actual service conditions and design service conditions. These deficiencies may be related to a lack of standard procurement specifications or guidelines for purchasing the systems. UPS and battery system failures were often the result of high ambient temperature and excess humidity (i.e., inadequate ventilation), improper electrical connections, physical arrangements of components, or system transients.
Component Failure

A review performed for DOE by the Nuclear Operations Analysis Center evaluated about 250 events involving a loss of normal power or emergency power system components from September 1990 through mid-September 1991 [7]. In the majority of these events, the direct cause of the UPS system failure could be attributed to component failure. This result was substantiated by a review performed at Rocky Flats in July 1993 [8]. This type of failure was generally characterized by its isolation or unpredictability and was typical of electrical components such as capacitors, breakers, circuit boards, and static switches. Component age may have been a factor in many instances, but other contributing factors were also generally present. Unpredictable component or equipment failure was often caused by external phenomena such as lightning.

The Analysis Center review indicated that a large percentage of UPS failures were related to batteries. Common problems associated with batteries included dead cells, internal and external short circuits, and degradation caused by excessive heat. For many failures involving batteries, however, failure can be predicted and therefore prevented. These failures were generally the result of multiple root causes such as equipment failure combined with inadequate maintenance.

Management Inadequacies

Failure of managers to properly assign priorities to UPS system maintenance tasks and to make resources available for system maintenance contributed to many failures. Inadequate identification and control of facility configuration and design documentation was also a management deficiency.

Corrective Actions

The goal of a back-up power system program is to achieve increased system reliability. ONS recommends the following actions to minimize UPS degradation or failure.
Industry Practice

Because of the number and significance of UPS problems, managers of DOE facilities and other industries have begun to address issues related to failure of emergency and back-up power supplies. Rocky Flats [8] and Lawrence Livermore Laboratory [9] are examples of DOE facilities that have programs dedicated to increasing UPS reliability. Managers at both facilities reviewed events related to UPS failure, performed root cause analyses, and developed recommendations to increase reliability. Managers at the DOE sites have begun to implement the programs. Lawrence Livermore personnel also developed a standard procurement specification for UPS systems and for lead-acid storage batteries as well as a maintenance and test standard for stationary lead-acid batteries [9].

The DOE Office of Defense Programs developed a standard that establishes fundamental requirements and guidance for back-up and emergency power sources, including UPSs [10]. They also published an evaluation report on emergency and back-up power supplies at DOE facilities [11]. The DOE Back-up Power Working Group is another source for help or information regarding UPS problems [12]. Additional information on application and testing of UPSs can be found in ANSI/IEEE Standard 944 [13].

Personnel from the NRC Office for Analysis and Evaluation of Operational Data performed an engineering evaluation of electrical inverter operating experience in the commercial nuclear industry from 1985 to 1992 [14]. The number of electrical inverter failures has decreased in the last 7 1/2 years. Component failure continued to be the dominant root cause of these failures, while human error was the second most common root cause. Other root causes were incorrect setpoints, lack of maintenance, and inadequate procedures. Capacitors were the component that failed most often. Other failed components were transformers, silicon-controlled rectifiers, and transfer switches.
The decrease in electrical inverter failures was largely the result of three factors: (1) better cooling units, (2) more preventive maintenance, and (3) more frequent inverter replacement.
http://tis-hq.eh.doe.gov/web/oeaf/lessons_learned/ons/sn9404.html
Last modified: Wednesday, 15-Jan-97 14:01:00