Safety Management Through Analysis ONS Safety Notices
Issue No. 94-04
November 1994
Office of Nuclear and Facility Safety
Director, Office of Nuclear and Facility Safety U.S. Department of Energy Washington, DC 20585
DOE/EH-0436

Uninterruptible Power Supplies


Introduction

This notice is one in a series of publications issued by the Office of Nuclear and Facility Safety to share nuclear safety information throughout the Department of Energy complex. For more information, contact Dick Trevillian, Office of Operating Experience Analysis and Feedback, Office of Nuclear and Facility Safety, U.S. Department of Energy, Washington, DC 20585, telephone (301) 903-3074. No specific action or response is required solely as a result of this notice.

Safety Notices are distributed to U.S. Department of Energy Program Offices, Field Offices, and contractors who have responsibility for the operation and maintenance of nuclear and related facilities, and to other organizations involved in nuclear safety. Written requests to be added to or deleted from the distribution of Safety Notices should be sent to: Richard L. Trevillian, EH-33, Room E-460 GTN, U.S. Department of Energy, Washington, DC 20585.

The ESH Office of Information Management maintains a file of Safety Notices and supporting information. Copies can be obtained by contacting the Office of Information Management at (301) 903-0449 or by writing to the Office of Information Management, U.S. Department of Energy, EH-72/Suite 100, CXXI/3, Washington, DC 20585.


Notice Summary

This notice contains lessons learned regarding power supplied to vital components or processes for control of nuclear safety. Numerous events have occurred at Department of Energy (DOE) and other nuclear facilities that had notable effects on facility operations. Failure of an Uninterruptible Power Supply (UPS) has resulted in disruption of facility operations, personnel evacuation, and temporary loss of instrumentation and equipment important to safety. This notice contains information about the safety significance of UPS failures as well as corrective actions taken to ensure electrical power availability.


Applicability

This notice applies to all DOE facilities that use UPSs to ensure safe operations. The Office of Nuclear and Facility Safety (NFS) advises operators of these facilities to understand potential hazards associated with UPS problems or failures and to be observant of conditions that may lead to such events. No specific action or response is required as a result of this notice.


Equipment Description

UPS units are used throughout industry when either the quality or the continuous availability of electrical power is important. UPSs are designed to provide stable, uninterruptible power to equipment that is important to safe operation of the facility. UPSs preclude the three most common power disturbances that affect reliable operation of equipment: (1) power line noise and harmonics, (2) power fluctuation, and (3) sudden loss of power. Several UPS configurations are available. The simplest is the single UPS unit shown in Figure 1.

The overall UPS system is designed for dual ac input sources: a normal source that feeds the rectifier charger and a bypass source that is routed to the UPS load panel via either the static transfer switch or the maintenance bypass switch. These two sources are usually derived from either a common or two separate power sources. If the bypass source voltage does not match the output voltage of the UPS inverter, a transformer must be used in the bypass feed (Figure 1).

A standby generator may be used as an alternate input source, if the reliability requirement dictates the need for it. If a standby generator is used, it is automatically selected when power is lost. Commercial nuclear power plants typically use diesel generators as an alternate input source for safety related UPSs.

Please refer to the hard copy of the Safety Notice for Figure 1, "UPS system block."


Events Summary

The following five events involving problems with or failures of UPS systems at DOE facilities and one event at a commercial nuclear power plant collectively represent the major causes for UPS difficulties. These causes include electrical shorts, component aging and failure, inadequate design, inadequate procedures and operator error, configuration discrepancies, and environmental effects.

The first event occurred at the Hot Fuel Examination Facility at Argonne National Laboratory-West on August 21, 1993 [1]. On-site fire department personnel responded to a fire alarm in the control room of the Neutron Radiography Reactor. The firemen, accompanied by the on-duty shift technician, entered the control room, and, although there was a strong odor of burnt electrical components, they found no fire. The source of the odor was in the UPS cabinet.

Workers opened the breaker that supplied ac power to the UPS. An instrument technician, with the aid of the shift technician, located the source of the problem in the UPS battery drawer, disconnected the drawer, and removed it from the building. The inside of the Lexan cover on the drawer was smoke-blackened, and an electrical short had damaged two of the batteries and associated wiring.

Engineers investigating the battery failure believed the most probable cause was either (1) an electrolytic short or (2) excessive float voltage from the battery charger. A short may have been caused by electrolyte that collected on the batteries and reduced the dielectric value between adjacent battery terminals where the voltage differential was sufficient to establish current flow. The two terminal batteries were adjacent, which created the area of highest voltage potential (95-volt differential). The potential between most batteries adjacent side to side is 6 to 12 volts, and it is somewhat higher between end-to-end batteries.

The other possibility was voltage high enough to alter the electrolyte and allow excess current to flow in one area of the battery case, which could also melt the case of the adjacent battery. Float voltage should not exceed 2.3 volts per cell.
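The float voltage limit above lends itself to a simple surveillance check. The following is a minimal sketch, not a procedure from this notice: the function names and the example string of 60 cells are hypothetical, and only the 2.3 volts per cell figure comes from the text.

```python
# Illustrative sketch only. The 2.3 V/cell limit is the figure quoted in this
# notice; the function names and example values are assumptions.
MAX_FLOAT_VOLTS_PER_CELL = 2.3


def float_voltage_per_cell(string_voltage, cell_count):
    """Return the average float voltage per cell for a series battery string."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return string_voltage / cell_count


def float_voltage_ok(string_voltage, cell_count, limit=MAX_FLOAT_VOLTS_PER_CELL):
    """Return True when the average per-cell float voltage is within the limit."""
    return float_voltage_per_cell(string_voltage, cell_count) <= limit


if __name__ == "__main__":
    # Hypothetical 60-cell string measured at the charger output.
    for measured in (135.0, 141.0):
        status = "within limit" if float_voltage_ok(measured, 60) else "EXCEEDS limit"
        print(f"{measured:.1f} V across 60 cells: {status}")
```

An average per-cell value can hide a single overcharged cell, so a check like this would supplement, not replace, individual cell readings where the battery design allows them.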

Investigators determined that maintenance and surveillance were performed on schedule and that surveillance and battery replacement exceeded manufacturer specifications. Facility personnel installed insulating material between the terminal batteries to reduce the possibility of an electrolytic short.

The second event occurred on November 21, 1993, at the Idaho Chemical Processing Plant when an internal circuit board failed in the UPS to the criticality alarm system for the CPP-602 Product Denitrator Area [2].

Personnel investigating a failure alarm on a remote panel for the criticality alarm system discovered the circuit board problem. They noted that there were no other alarms and that the alarm system appeared to be without power. Because there were no processing operations at the time, they took the criticality alarm system out of service and isolated it from the plant evacuation system. The following day, investigators reactivated the system to find the cause of failure and found that the UPS was completely shut down. The manager called in the UPS vendor, Exide Electronics Corporation, to help with the investigation, and the representative found a defective microprocessor chip (SAB 80535-N) in the internal circuit board. The Exide representative installed replacement chips in six UPSs at the facility. (About the same time this failure occurred, Exide discovered that a batch of chips manufactured by Siemens Corporation was defective and was preparing a field change to replace them.)

The safety significance of this event was limited because no processing operations were in progress. An alarm was initiated when the UPS failed, so if any processes had been operating, facility personnel could have shut them down in a safe, systematic manner. However, if a criticality had occurred while the criticality alarm system was out of service, the plant evacuation alarm system would not have automatically activated.

On December 26, 1993, at the Hanford Plutonium Finishing Plant, the UPS unit for the seismic shut-down system tripped, causing an automatic shift from main power to bypass power, which was supplied from a parallel UPS [3]. When the transfer occurred, normal building ventilation shut down because of low voltage, which caused the emergency steam turbines to start up and maintain building vacuum. Personnel were evacuated from radiological controlled areas as a result. Maintenance personnel inspected the UPS and replaced printed circuit boards, an inverter transformer, a static by-pass module, and a cabinet fan. These defective components had contributed to the UPS trip.

Investigating engineers determined that the root cause for the UPS failure to maintain system voltage was an inadequate system design. The design called for power supply no. 1 to receive by-pass power from supply no. 2 (i.e., if UPS no. 1 fails, UPS no. 2 provides power to the system). However, for this design to be effective, the by-pass voltage must be in phase with the primary voltage when the circuitry is switched, and in this case the voltages were out of phase. A resulting short circuit between the two units lowered the power supply voltage to the system, which caused the ventilation system to shut down. When the inverter in UPS no. 1 shut down, the short circuit was removed and UPS no. 2 provided proper system power. Facility managers decided to remove the series connection between UPS no. 1 and UPS no. 2 and to establish switching procedures for a manual, bumpless (synchronized) transfer between normal power and back-up power.
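The in-phase condition described above can be expressed as a permissive check before a transfer. The sketch below is illustrative only: the function names are hypothetical, and the 10-degree tolerance is a placeholder assumption, not a value from this notice or from any vendor. Actual limits would come from the UPS manufacturer.

```python
# Illustrative sketch only. The 10-degree tolerance is a placeholder
# assumption; real synchronization limits come from the UPS vendor.
def phase_difference_deg(primary_deg, bypass_deg):
    """Return the signed phase angle from primary to bypass in (-180, 180]."""
    diff = (bypass_deg - primary_deg) % 360.0
    if diff > 180.0:
        diff -= 360.0
    return diff


def transfer_permitted(primary_deg, bypass_deg, max_phase_error_deg=10.0):
    """Return True when the two sources are close enough in phase to transfer."""
    return abs(phase_difference_deg(primary_deg, bypass_deg)) <= max_phase_error_deg


if __name__ == "__main__":
    # Hypothetical phase readings (degrees) for the primary and bypass sources.
    for primary, bypass in ((359.0, 2.0), (0.0, 120.0)):
        verdict = "permit" if transfer_permitted(primary, bypass) else "block"
        print(f"primary={primary} bypass={bypass}: {verdict} transfer")
```

The modulo step matters because phase angles wrap around: readings of 359 and 2 degrees are only 3 degrees apart, not 357.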

On August 24, 1991, at the Rocky Flats Plutonium Processing Facility, lights on the stationary operating engineer control room panel in Building 707 went out for about 15 to 20 seconds [4]. The stationary operating engineer responded immediately, noticing that the over voltage/under voltage light was on and the transfer switch at the UPS power unit was in manual position. He discovered that work had been performed on the UPS system two days earlier in accordance with a maintenance work package and a maintenance procedure. The work package did not contain a step to return the UPS system to automatic mode, but it referenced the site procedure, which included the step. Because maintenance workers did not comply with all instructions and steps in the work package, the UPS system did not automatically switch to emergency power on demand.

The haste with which maintenance procedures and work packages were written created the conditions for steps and instructions to be omitted. Facility managers determined that a stationary operating engineer or the utilities manager should have had more input into the development of the work package. Subject matter experts should be involved in the development of work packages and procedures.

A potentially serious event occurred on August 22, 1991, at Idaho National Engineering Laboratory when personnel reporting to work in Building CFA-609 detected a strong, rotten-egg odor [5]. Investigators determined that the UPS batteries had failed and were bulging and emitting a strong odor of toxic gas. A combustible gas indicator reading showed slightly positive for explosive gas. They notified authorities and opened the doors to the building to allow the emissions to dissipate.

Industrial Hygiene technicians took air sample readings at the point of emission, but none were above acceptable limits for toxic gases. Three employees complained of nausea and headaches and technicians sent them to the dispensary. Dispensary personnel released two employees and held one for observation. They later released him.

Investigators determined that the temperature in the battery room was elevated and the batteries were swollen and emitting acid mist. They thought the most likely cause was a faulty switching mechanism in the UPS unit that had allowed the batteries to overcharge.

Manufacturer representatives checked the UPS unit and replaced the batteries. They determined the cause of failure to be a significant temperature differential between the two rooms that contained the UPS battery racks. One battery rack was located in a generator room where normal temperatures were about 60 degrees Fahrenheit. The other rack was in a room with transformers, where temperatures of 90 degrees were normal and could exceed 100 degrees in the summer months. Facility personnel installed ventilation equipment to keep the temperature differential between the rooms below 20 degrees. UPS batteries are often rated for only 85 degrees.
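The two temperature criteria mentioned in this event (a room-to-room differential below 20 degrees and a typical battery rating of 85 degrees Fahrenheit) can be checked together. This is a minimal sketch under those two figures; the function name, room labels, and message wording are hypothetical, and the rating for a given battery should be taken from its manufacturer.

```python
# Illustrative sketch only. The 20-degree differential and 85-degree rating
# are the figures quoted in this notice; everything else is an assumption.
MAX_ROOM_DIFFERENTIAL_F = 20.0
TYPICAL_BATTERY_RATING_F = 85.0


def battery_room_findings(room_temps_f,
                          max_differential_f=MAX_ROOM_DIFFERENTIAL_F,
                          rating_f=TYPICAL_BATTERY_RATING_F):
    """Return a list of text findings for rooms that hold racks of one UPS battery."""
    if not room_temps_f:
        raise ValueError("at least one room temperature is required")
    findings = []
    spread = max(room_temps_f.values()) - min(room_temps_f.values())
    if spread >= max_differential_f:
        findings.append(f"room differential {spread:.0f} F is not below {max_differential_f:.0f} F")
    for room, temp in sorted(room_temps_f.items()):
        if temp > rating_f:
            findings.append(f"{room} at {temp:.0f} F exceeds {rating_f:.0f} F rating")
    return findings


if __name__ == "__main__":
    # Normal temperatures reported for the two rooms in this event.
    for finding in battery_room_findings({"generator room": 60.0, "transformer room": 90.0}):
        print(finding)
```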

Five UPSs failed simultaneously at Unit 2 of Nine Mile Point, a commercial nuclear power plant. A U.S. Nuclear Regulatory Commission (NRC) incident investigation team investigated the event and documented their findings in NUREG-1455 [6]. Shortly before shift change on the morning of August 13, 1991, an internal failure in the main plant transformer caused a turbine trip and reactor scram. Before automatic protection features isolated the transformer, depressed voltages on the transmission system and on the in-plant electrical distribution system occurred. In less than a second, the degraded voltage caused a simultaneous common-mode loss of five non-safety-related UPSs that powered important control room instrumentation and plant equipment. The power loss affected reactor control rod position indicators, some reactor power and water level indicators, control room annunciators, the plant communications system, the plant process computer, and lighting at some locations. As a result, managers declared a site emergency.

Automatic reactor protection systems, including the reactor scram, functioned properly. All necessary engineered safety features were available and used as needed. However, the difficulty experienced by operators because of the loss of many normally available plant status indicators and equipment severely complicated the plant transient condition and underscored the importance of the lost power supplies.

Licensee and NRC investigators found that the root causes of the UPS failures were a common-mode design deficiency and a common-cause maintenance deficiency. Each UPS contains a control logic unit that is essential to its operation. The UPSs were lost because the power for these control logic units was from a source that was affected by the transformer failure. The units could have been supplied with back-up power from internal batteries; however, the batteries were dead. If either deficiency had been corrected, the UPSs would not have been lost.
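The statement that correcting either deficiency would have prevented the loss can be read as a simple redundancy condition: the control logic stays powered if either of its two sources is healthy. The sketch below is a deliberately simplified model with hypothetical names; it assumes, as the findings indicate, that all units shared the same design and the same battery condition.

```python
# Illustrative, simplified model only. Names are hypothetical; it assumes all
# units share one design and one maintenance condition.
def control_logic_powered(preferred_source_ok, logic_battery_ok):
    """Control logic stays up if either its preferred source or its battery is healthy."""
    return preferred_source_ok or logic_battery_ok


def ups_units_lost(unit_count, preferred_source_ok, logic_batteries_ok):
    """Return how many identical UPS units are lost for a shared condition."""
    if control_logic_powered(preferred_source_ok, logic_batteries_ok):
        return 0
    return unit_count


if __name__ == "__main__":
    # Degraded preferred source with dead batteries: every unit is lost.
    print(ups_units_lost(5, preferred_source_ok=False, logic_batteries_ok=False))
    # Same disturbance with healthy batteries: no unit is lost.
    print(ups_units_lost(5, preferred_source_ok=False, logic_batteries_ok=True))
```

Because every unit shares the same inputs in this model, the result is all or nothing, which is the defining feature of a common-mode failure.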

The NRC investigation team made the following findings.

  • The designs of the five UPS units were identical; therefore, all were vulnerable to the degraded voltage from the failed transformer. Maintenance practices for the five UPS units were also identical. Internal control logic batteries were dead in all five UPSs. Although they were charged continuously, they had degraded from age and should have been replaced.

  • The text of the UPS technical manual was inconsistent with the drawings for the power source of the internal control logic. The units were wired in accordance with the drawings, which showed the control logic connected to the maintenance power supply as the preferred source. The technical manual did not clearly state the function or the importance of the control logic batteries.


Significance of Events

Potential consequences of the loss or degradation of emergency or back-up power supplies include the following.

  • Loss of equipment and systems important to safety, including facility and system status indicators and alarms

  • Unnecessary actuation of safety systems

  • Improper control system response

  • Personnel hazards such as fire, explosion, smoke inhalation, toxic gas exposure, and electrical shocks

  • Challenges to facility operators and to remaining functional equipment

  • Interruption of normal facility operations with severe system transients

  • Mechanical equipment and facility damage

In general, a single UPS failure does not lead to multiple occurrences of these conditions (except in the case of common-mode failures such as the Nine Mile Point incident). The seriousness of possible consequences and the number of reported failures suggest that measures to prevent such events are warranted.


Potential Causes

UPS failures may be attributed to six primary root causes; however, it should be recognized that most events involving UPS failures at DOE facilities and other industries have a variety of contributing causes. This is illustrated by the events described in this safety notice. The primary root cause categories are:

  1. maintenance inadequacies,

  2. personnel errors,

  3. procedural inadequacies,

  4. design and installation inadequacies,

  5. component failure, and

  6. management inadequacies.

Maintenance Inadequacies

Lack of preventive maintenance or improper or inadequate maintenance contributed to a number of UPS failures. In many cases, maintenance is performed infrequently or not at all, or maintenance requirements are inadequate. Often, no provisions exist to replace aging components such as capacitors. Failure to label all installed UPS systems and identify their associated functions also resulted in lack of maintenance.

Personnel Errors

Events related to personnel error typically involved failure to comply with procedures, inattention to detail, and improper UPS operation because of lack of system knowledge. Most contributing factors were inadequate maintenance and testing procedures and deficient practices. Inadequate planning, training, and verification of surveillance, maintenance, and testing are other contributing factors.

Procedural Inadequacies

Several UPS failures resulted from inadequate or incorrect maintenance, test, and surveillance procedures. Such procedures are sometimes the result of inadequate vendor documentation or failure to ensure that work packages comply with upper-tier procedures. Often, pre-test and post-test requirements are undefined.

Design and Installation Inadequacies

In many cases, UPS units were not correctly designed, specified, or installed to comply with overall system operational requirements, environmental conditions, and codes. This resulted in incompatibility between actual service conditions and design service conditions. These deficiencies may be related to a lack of standard procurement specifications or guidelines for purchasing the systems. UPS and battery system failures were often the result of high ambient temperature and excess humidity (i.e., inadequate ventilation), improper electrical connections, physical arrangements of components, or system transients.

Component Failure

A review performed for DOE by the Nuclear Operations Analysis Center evaluated about 250 events involving a loss of normal power or emergency power system components from September 1990 through mid-September 1991 [7]. In the majority of these events, the direct cause of the UPS system failure could be attributed to component failure. This result was substantiated by a review performed at Rocky Flats in July 1993 [8]. This type of failure was generally characterized by its isolation or unpredictability and was typical of electrical components such as capacitors, breakers, circuit boards, and static switches. Component age may have been a factor in many instances, but other contributing factors were generally present also. Unpredictable component or equipment failure was often caused by external phenomena such as lightning.

The Analysis Center review indicated that a large percentage of UPS failures were related to batteries. Common problems associated with batteries included dead cells, internal and external short circuits, and degradation caused by excessive heat. For many failures involving batteries, however, failure can be predicted and therefore prevented. These failures were generally the result of multiple root causes such as equipment failure combined with inadequate maintenance.

Management Inadequacies

Failure of managers to properly assign priorities to UPS system maintenance tasks and to make resources available for system maintenance contributed to many failures. Inadequate identification and control of facility configuration and design documentation was also a management deficiency.


Corrective Actions

The goal of a back-up power system program is to achieve increased system reliability. ONS recommends the following actions to minimize UPS degradation or failure.

  1. Provide a dedicated individual or centralized group to be aware of and responsible for UPS issues and problems.

  2. Identify all UPSs installed in each facility and their functions, and consider removing any unnecessary UPS.

    • Ensure that UPSs are properly labeled as to function and operation.

    • Ensure that current or updated vendor documentation is available and addresses design and maintenance requirements of the system.

  3. Establish a preventive maintenance program for UPSs, taking into account system design, environment, and vendor recommendations.

    • Provide sufficient resources to comply with the preventive maintenance program.

    • Identify components that may be affected by age (e.g., capacitors, batteries, and diodes) and schedule replacement to preclude failure.

    • Schedule replacement of older, less reliable UPSs if feasible.

    • Schedule inspections to ensure that batteries, battery charging, inverter, and related circuits are working properly.

    • Use performance indicators to allow prediction of system performance from one surveillance or preventive maintenance period to the next.

    • Establish methods to receive, control and utilize information from vendors or industry bulletins on manufacturer defects.

    • Perform and document data trends.

    • Disseminate pertinent information to responsible personnel.

  4. Ensure that work planning, personnel training, and verification of testing and UPS maintenance are specific and take vendor recommendations into account.

    • Ensure that personnel are trained to operate UPSs.

    • Verify that someone familiar with the operation of the UPS has performed a point-by-point review of the vendor technical manuals to confirm that every step in the manual is included in maintenance and surveillance instructions.

  5. Ensure that maintenance, test, and surveillance procedures are adequate.

    • Schedule reviews of procedures by cognizant engineering and management personnel before release.

    • Verify that the sequence of maintenance, surveillance, and test steps is correct.

    • Ensure that procedures contain pre-test and post-test requirements and comply with vendor recommendations.

    • Ensure that surveillance criteria contain action points to notify appropriate personnel of adverse trends before a UPS actually fails a surveillance.

    • Ensure that procedures contain detailed performance requirements and acceptance criteria with action instructions if criteria are not met. Record and review as-found performance to determine if it is declining and if it will be acceptable at the next scheduled surveillance.

    • Replace generic procedures for maintenance, testing, and preventive maintenance of UPSs with unit-specific procedures to take advantage of operating experience, vendor recommendations, and unique performance criteria.

    • Consider the use of new technologies, such as thermography, to identify components with high temperatures.

  6. Review the design basis for UPSs to ensure compatibility between design conditions and actual service conditions. If discrepancies are found, perform an engineering evaluation.

    • Verify that environmental conditions such as temperature, humidity, and ventilation are acceptable. Give special consideration to battery system requirements.

    • Verify that installation is in accordance with design specifications and that arrangement of components will preclude accidental short circuits or maintenance/testing/surveillance complications.

    • Ensure that the UPS design minimizes common mode failure.

    • Establish standard procurement specifications for UPSs.

  7. Establish a system to notify management of UPS issues and problems.

  8. Provide adequate resources to address UPS problems and support a surveillance and preventive maintenance program.

  9. Incorporate applicable lessons learned from occurrence reports, DOE and industry operating experience, and vendor information.
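The recommendations on performance indicators and as-found trending (items 3 and 5 above) can be illustrated with a straight-line projection to the next surveillance date. This is a minimal sketch, not a method prescribed by this notice: it assumes a roughly linear decline, and the function names, the capacity readings, and the acceptance value in the example are all hypothetical.

```python
# Illustrative sketch only. Assumes a roughly linear decline between
# surveillances; names, readings, and acceptance values are hypothetical.
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired readings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("readings must be taken at different times")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def projected_value(days, readings, next_day):
    """Project the as-found reading expected at the next surveillance."""
    slope, intercept = linear_fit(days, readings)
    return slope * next_day + intercept


def needs_action(days, readings, next_day, minimum_acceptable):
    """Return True when the projection falls below the acceptance criterion."""
    return projected_value(days, readings, next_day) < minimum_acceptable


if __name__ == "__main__":
    # Hypothetical battery capacity readings (percent) at quarterly surveillances.
    days, capacity = [0, 90, 180], [100.0, 96.0, 92.0]
    print(f"projected at day 270: {projected_value(days, capacity, 270):.1f} percent")
    print("action needed:", needs_action(days, capacity, 270, minimum_acceptable=90.0))
```

The point of the projection is to trigger the action before a surveillance is actually failed. Battery degradation often accelerates near end of life, so a linear projection may be optimistic and should be treated as a screening aid only.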


Industry Practice

Because of the number and significance of UPS problems, managers of DOE facilities and other industries have begun to address issues related to failure of emergency and back-up power supplies. Rocky Flats [8] and Lawrence Livermore Laboratory [9] are examples of DOE facilities that have programs dedicated to increasing UPS reliability. Managers at both facilities reviewed events related to UPS failure, performed root cause analyses, and developed recommendations to increase reliability. Managers at the DOE sites have begun to implement the programs.

Lawrence Livermore personnel also developed a standard procurement specification for UPS systems and for lead-acid storage batteries as well as a maintenance and test standard for stationary lead-acid batteries [9].

The DOE Office of Defense Programs developed a standard that establishes fundamental requirements and guidance for back-up and emergency power sources, including UPSs [10]. They also published an evaluation report on emergency and back-up power supplies at DOE facilities [11]. The DOE Back-up Power Working Group is another source for help or information regarding UPS problems [12]. Additional information on application and testing of UPSs can be found in ANSI/IEEE Standard 944 [13].

Personnel from the NRC Office for Analysis and Evaluation of Operational Data performed an engineering evaluation of electrical inverter operating experience in the commercial nuclear industry from 1985 to 1992 [14]. The number of electrical inverter failures has decreased in the last 7 1/2 years. Component failure continued to be the dominant root cause of these failures, while human error was the second most common root cause. Other root causes were incorrect setpoints, lack of maintenance, and inadequate procedures. Capacitors were the component that failed most often. Other failed components were transformers, silicon-controlled rectifiers, and transfer switches. The decrease in electrical inverter failures was largely the result of three factors: (1) better cooling units, (2) more preventive maintenance, and (3) more frequent inverter replacement.


References

  1. DOE Final Occurrence Report CH-AA-ANLW-NRAD-1993-0003, "Electrical Short Circuit in UPS Battery Supply," September 15, 1993.

  2. DOE Final Occurrence Report ID--WINC-LANDLORD-1993-0025, "Loss of Power to Data Acquisition System Number 3," January 18, 1994.

  3. DOE Final Occurrence Report RL--WHC-PFP-1993-0065, "The UPS for the Seismic Shutdown System Shifted from Normal to Bypass Mode which Resulted in a Loss of Normal Ventilation," April 11, 1994.

  4. DOE Final Occurrence Report RFO--EGGR-PUFAB-1991-1070, "Tracking No. 1291: The Uninterruptible Power Supply (UPS) Failed to Activate When a Power Bump Occurred," October 15, 1992.

  5. DOE 10-Day Occurrence Report ID--PTI-INELAREA3-1991-1004, "System Deficiency Failure of Class A Equipment," September 4, 1991.

  6. NRC Report NUREG-1455, Transformer Failure and Common-Mode Loss of Instrument Power at Nine Mile Point Unit 2 on August 13, 1991, October 1991.

  7. Letter from W. P. Poore, Nuclear Operations Analysis Center, to Mark Williams, U.S. Department of Energy, subject: "Interim Report on Review of Normal and Emergency Power System Failures," October 24, 1991.

  8. Transmittal letter WSB-171-93 from W. S. Bennett, EG&G Rocky Flats, to Karen McElhaney, Oak Ridge National Laboratory, subject: "Collective Significance Evaluation of Sitewide Uninterruptible Power Supply (UPS) Issues," September 24, 1993.

  9. Letter from Anita Gursahani, Lawrence Livermore National Laboratory, to Karen McElhaney, Oak Ridge National Laboratory, subject: "Battery and UPS System Installation Upgrades at LLNL," October 1, 1993.

  10. Standard DOE-STD-3003-94, Backup Power Sources for DOE DP Facilities.

  11. DOE/DP-0124T, "Augmented Evaluation Team Final Report - Emergency and Backup Power Supplies at Department of Energy Facilities," November 1993.

  12. DOE Backup Power Working Group contact, John Fredlund, DP-31, (301) 903-3059.

  13. ANSI/IEEE Standard 944, IEEE Recommended Practice for the Application and Testing of Uninterruptible Power Supplies for Power Generating Stations, American National Standards Institute, 11 W. 42nd Street, 13th Floor, New York, NY 10036.

  14. NRC Report AEOD/E93-03, "Engineering Evaluation Report - Electrical Inverter Operating Experience - 1985 to 1992," December 1993.

http://tis-hq.eh.doe.gov/web/oeaf/lessons_learned/ons/sn9404.html
Last modified: Wednesday, 15-Jan-97 14:01:00