D0 Note #1922

Alarm Display User's Manual

Laura A. Paterno and Stuart Fuess
D0 Collaboration

Table of Contents

Chapter 1 - Introduction
  1.1 Development Environment
    1.1.1 Languages and Operating Systems
      1.1.1.1 Software Tools
      1.1.1.2 Motif and X Windows
      1.1.1.3 InterTask Communication Package (ITC)
      1.1.1.4 Client-Server Package (CLSPKG)
      1.1.1.5 Code Management
  1.2 Overview of this Manual
Chapter 2 - The D0 Alarm System
  2.1 Central Event Distributor
  2.2 Event Logger
  2.3 Event Scanner
  2.4 Smart Alarm Processor
  2.5 Alarm Display
Chapter 3 - Running the D0 Alarm Display
Chapter 4 - Configuration Files
  4.1 Filter
  4.2 Groups
  4.3 Configuration File Format
    4.3.1 Filter Format
      4.3.1.1 FILTER
      4.3.1.2 NAME
      4.3.1.3 ATTR
      4.3.1.4 PRIORITY
      4.3.1.5 SUBSYSTEM
      4.3.1.6 PATH
      4.3.1.7 SYSTEM_ID
      4.3.1.8 ANTI
    4.3.2 Group Format
      4.3.2.1 GROUP
      4.3.2.2 GROUP USED
      4.3.2.3 FILTER USED
    4.3.3 Examples of Configuration Files
      4.3.3.1 All Alarms Configuration File
      4.3.3.2 High Priority Configuration File
Chapter 5 - Alarm Display Main Window
  5.1 Message Area
  5.2 Control Button Area
    5.2.1 EXIT
    5.2.2 ALARMS
    5.2.3 HEARTBEATS
Chapter 6 - Alarms Window
  6.1 Event Message Types
    6.1.1 Bad Event Messages
    6.1.2 Acknowledged Event Messages
    6.1.3 Good Event Messages
  6.2 Display Refresh Cycle
    6.2.1 Effect on Oscillating Events
  6.3 COMMANDS Menu
  6.4 CONTROL Menu
Chapter 7 - Alarm List Windows
  7.1 Scrolling List
  7.2 List Window Buttons
    7.2.1 CLOSE Button
    7.2.2 SHOW Button
    7.2.3 LIST Button
    7.2.4 PRINT Button
    7.2.5 ACKNOWLEDGE Button
    7.2.6 ACKNOWLEDGE ALL Button
    7.2.7 UNACKNOWLEDGE Button
    7.2.8 UNACKNOWLEDGE ALL Button
Chapter 8 - Alarm Detail Windows
  8.1 Push Buttons in the Detail Window
Chapter 9 - Manage Groups Display Window
  9.1 COMMANDS Menu
  9.2 EDIT Menu
  9.3 CONTROL Menu
  9.4 Groups Available List
  9.5 Groups Applied List
  9.6 Using and Unusing Groups
Chapter 10 - Heartbeats Display Window
  10.1 COMMANDS Menu
Appendix A - Glossary of Terms
References

List of Figures

Figure 1 - Filter Format
Figure 2 - Group Format
Figure 3 - All Alarms Configuration File
Figure 4 - High Priority Configuration File
Figure 5 - Alarm Display Main Window
Figure 6 - File Selection Window
Figure 7 - Alarms Window
Figure 8 - BAD Alarm List Window
Figure 9 - Detail Window of an Analog Device Attribute
Figure 10 - Detail Window of a Binary Device Attribute
Figure 11 - Detail Window of a Comment Device Attribute
Figure 12 - Manage Groups Display Window
Figure 13 - Heartbeats Display Window

List of Tables

Table 1 - SYSTEM_ID Values

Alarm Display User's Manual Version 1.0

Chapter 1 - Introduction

The D0 Alarm Display is a part of the D0 Alarm System. It is used to show the current state of the detector hardware and online data acquisition software processes during physics data acquisition. The purpose of this manual is to describe the functions of the D0 Alarm Display.

1.1 Development Environment

1.1.1 Languages and Operating Systems

1.1.1.1 Software Tools
The Alarm Display program was developed with a variety of software tools and packages. Some of these packages were designed at D0, while others are commercial products. The following sections describe the tools used.

1.1.1.2 Motif and X Windows
The windowing environment in the D0 Alarm Display was designed using Digital Equipment Corporation's version of Motif and X Windows[1].

1.1.1.3 InterTask Communication Package (ITC)
ITC is a package written by members of D0 that allows two or more processes to communicate with each other across a network. A detailed description of the ITC package is available in the ITC user's guide[2].

1.1.1.4 Client-Server Package (CLSPKG)
The Client-Server package (CLSPKG) is an internal D0 product that allows users to create client programs which talk to a centralized server process to obtain information. It uses ITC as its underlying communication protocol between client and server processes. The Alarm Display is a client program that connects to the central event distribution server.

1.1.1.5 Code Management
The Alarm Display code was managed using Digital Equipment Corporation's Code Management System (CMS)[3].

1.2 Overview of this Manual
This is the user reference manual for the D0 Alarm Display. Chapter 2 contains an overview of the entire D0 Alarm System. Chapter 3 describes how to set up and run the D0 Alarm Display. Chapter 4 explains the contents of a configuration file and how it is used to define the Alarms window. Chapter 5 describes the main window of the display program. Chapter 6 explains the Alarms window. Chapter 7 describes the Alarm List windows and Chapter 8 the Alarm Detail windows. Chapter 9 explains how to interactively use and unuse groups; a detailed description of creating new filters and groups of filters while running the display program is deferred to a later version of this manual. Chapter 10 describes the Heartbeats Display window. Appendix A contains a glossary of useful terms that are used throughout this manual.

Chapter 2 - The D0 Alarm System

An important section of the monitoring and control system for the D0 experiment is dedicated to the production, distribution, display, and logging of alarms, or more generally any significant monitoring events. On each cycle of the 15 Hz data acquisition, each front-end processor compares the current readings of all accessible hardware devices to a local database of nominal and tolerance values. Upon a change of state (a device has gone outside its tolerance or returned to its nominal value), the front-end processor[4] generates an asynchronous alarm message.
This message is placed on the D0 token ring LAN[4] and received by a D0 Gateway[4] process that is connected to the central event distributor task.

2.1 Central Event Distributor
The principal component of the D0 Alarm System is the central event distributor task. It is configured as a server to clients which may either send or receive events. It maintains a list of all D0 devices which are out of tolerance at the current time. It also monitors special events designated as heartbeats from critically connected tasks, and internally generates alarms when a task has not sent a heartbeat message after a specified period of time. Other significant event messages, such as the beginning and ending of data acquisition runs, are also distributed via the central event distributor task.

2.2 Event Logger
The Event Logger is a client of the central event distributor task. It archives significant event messages to files on disk that are later used by Event Scan processes. It also sends a periodic heartbeat message to the central event distributor so that the distributor knows that it is running.

2.3 Event Scanner
An Event Scan process searches through the archive files created by the Event Logger process. It can selectively filter on certain types of significant event messages and display the filtered events to a user. The filtering can be done by date, time, device type, priority of the event, or any combination of these.

2.4 Smart Alarm Processor
The Smart Alarm Processor, currently in the prototype stage, processes significant event messages. At present it only determines whether physics data acquisition should be halted because critical hardware devices or software processes are out of tolerance. It is hoped that this process will eventually take more corrective actions when hardware or software is found to be out of tolerance.

2.5 Alarm Display
An Alarm Display is also a client of the central event distributor task.
Filtered event requests similar to those of the Event Scan processes are sent to the central event distributor, and the corresponding filtered event messages are returned to the process to be displayed. The rest of this document is devoted to a description of the various components of the D0 Alarm Display.

Chapter 3 - Running the D0 Alarm Display

To run the D0 Alarm Display you must first log into an account on one of the D0 Clusters (FNALD0, D0SFT, or D0). Once you have logged in, type the following at the DCL prompt:

    $ LIBTEST ALL        (if this is not in your LOGIN.COM file)
    $ D0SETUP HMON

This defines the following logicals:

    ALD_CONFIG_DIR        - directory of all the Alarm Display configuration files
    ALARM_DISPLAY_CONFIG  - restore/save default filename
    ALARM_DISPLAY_LIST    - file where the command LIST ALARMS writes out the current list of alarms for a group (default is ALARMS.LIST)

and symbols:

    ALARM_DISPLAY         - runs the D0 Alarm Display program

Before typing the symbol ALARM_DISPLAY to run the program, make sure that your X server knows where to pop up windows. Check this as follows:

    $ SHOW DISPLAY
        Device:    WSA1:  [exec]
        Node:      0
        Transport: LOCAL
        Server:    0
        Screen:    0

If the SHOW DISPLAY command returns an error saying that the DISPLAY is undefined, define the DISPLAY with the SET DISPLAY command. The display should be set to the node you are currently running on (that is, the node you are actually sitting in front of at the time you run the program). For the example below, it is assumed that you are logged into node D0HS15 on the D0 Host Cluster.

    $ SET DISPLAY /CREATE /NODE=D0HS15 /TRANS=TCPIP

If the node only has DECNET available as its transport protocol, replace TCPIP with DECNET. At the DCL prompt type:

    $ ALARM_DISPLAY

This will bring up the main window of the D0 Alarm Display. The display will work on either a black and white or color monitor.
Multiple displays may be run on separate windows. The display can only be run on a VAX/VMS system and that system

Chapter 4 - Configuration Files

The Alarms window is defined by a configuration file that contains a set of filter and group definitions. A filter defines what events should be sent to the display program from the central event distributor, also known as the Alarm server. A group is a logical combination of a set of filters.

4.1 Filter
A filter is made up of a filter name, device name, attribute name, priority[5], subsystem and path identifiers[6], system id and an anti (negation) flag. Everything but the filter name and anti flag appears in each event sent to the Alarm Display program. An event must match all parts of a filter if it is to pass the filter and be displayed by the program for a group containing that filter.

4.2 Groups
A group is composed of a group name and a list of filters associated with the group. The group name is a 32 character name that must be UNIQUE. However, a filter name and a group name can be identical.

4.3 Configuration File Format
Each configuration file contains three elements: comments, a set of filter definitions and a set of group definitions used by the Alarm Display program to define the Alarms window. Comments must be preceded by an "!". Filter and group definitions can be mixed together, but a filter must be defined before it is used in a group definition. By convention all configuration files have the file extension ".CONFIG".

4.3.1 Filter Format
A filter has the following format:

    FILTER    : 32 character
    NAME      : 12 character
    ATTR      : 4 character
    PRIORITY  : 2 HEX
    SUBSYSTEM : 8 HEX
    PATH      : 8 HEX
    SYSTEM_ID : 4 HEX
    ANTI      : boolean

Figure 1 - Filter Format

4.3.1.1 FILTER
The FILTER field of the filter format contains the name for the filter. The filter name is a 32 character identifier.
Many filters can be defined in the Alarm Display, but each one must have a UNIQUE name for that display process. However, a filter name can be the same as a group name.

4.3.1.2 NAME
NAME indicates the hardware device name in the filter format. The device name is a 12 character identifier. A wildcard (*) may be used anywhere in the device name to allow for further flexibility in the filter. If the device name is just the wildcard symbol "*", any device name is allowed as long as the event passes the other parts of the filter.

4.3.1.3 ATTR
The ATTR field indicates a device attribute name. The attribute name is a 4 character identifier. A wildcard can be used for it as well.

4.3.1.4 PRIORITY
The PRIORITY field indicates the minimum priority an event needs to pass the filter. The priority indicates the severity of an event. It is represented as a hexadecimal number and ranges from 0 to FF.

4.3.1.5 SUBSYSTEM
The SUBSYSTEM identifier is used to indicate what subsystem a device is associated with. It is represented as an 8-digit hexadecimal number. A "00000000" indicates that any subsystem identifier passes the SUBSYSTEM field of the filter.

4.3.1.6 PATH
The PATH identifier gives a hierarchical location of a device in the D0 Detector. It is represented as an 8-digit hexadecimal number. A "00000000" indicates that any path identifier passes the PATH field of the filter.

4.3.1.7 SYSTEM_ID
The SYSTEM_ID represents either a front-end node number or a unique identifier associated with various critical online software processes. It is represented as a 4-digit hexadecimal number. Table 1 shows the current values that may be used for SYSTEM_ID.

    System ID      Process
    -----------    ----------------------------------------------
    0000 - EFFF    Reserved for Front-End
    F000 - F0FF    Reserved for central
    F000           Mask for Software Systems (filter mask applied)
    F001           Begin/End Run messages
    F100           Data Logger/Tape
    F200           COOR
    F300           COMM_TKR
    F400 - F5FF    EXAMINEs
    F600           Clock Servers
    F700           Gateways
    F800           Level 1
    F900           Level 2
    FA00           High Voltage
    FB00           Global Monitoring
    FC00           Frontland/DBL3 Servers
    FD00           spare
    FEFF           HDB_Server
    FF00           Smart Alarms (Alarm
    FFFE           Software Systems (filter target specification)
    FFFF           All Systems

Table 1 - SYSTEM_ID Values

4.3.1.8 ANTI
The ANTI flag inverts a filter. FALSE means you want all events which pass the other filter fields. TRUE means you want every event except those that pass the other filter fields.

4.3.2 Group Format
The following is the format for a group definition:

    GROUP       : 32 character
    GROUP USED  : YES or NO
    FILTER USED : 32 character
                *
                *
                *

Figure 2 - Group Format

4.3.2.1 GROUP
The GROUP field contains the name of the group. It is a UNIQUE 32 character name. However, the group name can be the same as a filter name. There is no limit to the number of groups that can be defined as long as the group names are unique.

4.3.2.2 GROUP USED
The GROUP USED field indicates whether the group should be displayed once the configuration file is read in and the Alarms window appears. If it is YES, the group will automatically appear in the Alarms window when it is displayed. If NO, the group will be defined but not displayed automatically. However, it can be displayed later by using the group in the Manage Groups window (see Chapter 9).

4.3.2.3 FILTER USED
The FILTER USED field indicates which filters belong to a group. If the same filter is listed more than once in a group, the program keeps only one copy of it in the group definition. There is no limit to the number of filters that can be used by a group.
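
The filter fields of Section 4.3.1 can be read as a matching rule applied to each event. The Python sketch below is only an illustration of that rule and is not taken from the Alarm Display code: the `Event` and `Filter` classes and the `passes` function are hypothetical names, a non-zero SUBSYSTEM or PATH is assumed to require an exact match, and SYSTEM_ID is left out because this manual does not spell out how its mask values are compared.

```python
import re
from dataclasses import dataclass


@dataclass
class Event:
    """The parts of an event that a filter inspects (hypothetical layout)."""
    name: str        # device name, up to 12 characters
    attr: str        # attribute name, up to 4 characters
    priority: int    # 0x00 to 0xFF
    subsystem: int   # 8-digit hexadecimal identifier
    path: int        # 8-digit hexadecimal identifier


@dataclass
class Filter:
    """One FILTER definition; the defaults mirror the ALL filter of Section 4.3.3.1."""
    filter_name: str
    name: str = "*"
    attr: str = "*"
    priority: int = 0x00          # minimum priority needed to pass
    subsystem: int = 0x00000000   # 00000000 means any subsystem
    path: int = 0x00000000        # 00000000 means any path
    anti: bool = False


def wildcard_match(pattern: str, text: str) -> bool:
    """Match text against a pattern in which '*' stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


def passes(flt: Filter, ev: Event) -> bool:
    """Return True if the event passes the filter, honouring the ANTI flag."""
    matched = (
        wildcard_match(flt.name, ev.name)
        and wildcard_match(flt.attr, ev.attr)
        and ev.priority >= flt.priority
        # Assumption: a non-zero identifier must match exactly.
        and (flt.subsystem == 0 or ev.subsystem == flt.subsystem)
        and (flt.path == 0 or ev.path == flt.path)
    )
    # ANTI = TRUE inverts the result: everything except the matching events.
    return matched != flt.anti


# Hypothetical events; the device and attribute names are invented for illustration.
high_priority = Filter("HIGH PRIORITY", priority=0x80)
urgent = Event("CD_HV_01", "VOLT", 0x90, 0x00000001, 0x00000002)
minor = Event("CD_HV_01", "VOLT", 0x10, 0x00000001, 0x00000002)
print(passes(high_priority, urgent), passes(high_priority, minor))
```

With these inputs the first event passes the HIGH PRIORITY filter and the second does not, because its priority is below 80 HEX.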

4.3.3 Examples of Configuration Files
The following subsections show a variety of configuration files and explain what is meant by the filter definitions in each file.

4.3.3.1 All Alarms Configuration File
A configuration file called ALL_ALARMS.CONFIG contains a filter and group definition that will send all events to the Alarms window.

    !
    ! ALL Alarms
    !
    FILTER      : ALL
    NAME        : *
    ATTR        : *
    PRIORITY    : 0
    SUBSYSTEM   : 00000000
    PATH        : 00000000
    SYSTEM_ID   : 0000
    ANTI        : FALSE
    GROUP       : ALL
    GROUP USED  : YES
    FILTER USED : ALL

Figure 3 - All Alarms Configuration File

This file indicates that any device attribute, with any priority, path, subsystem and system id, will pass this filter and be displayed in the ALL group in the Alarms window.

4.3.3.2 High Priority Configuration File
There are many devices in the D0 Detector that are critical to a data taking run. If one of these devices happens to generate an event, it is important to be able to see that event immediately. A configuration file called HIGH_PRIORITY.CONFIG has been defined for this purpose. This configuration file contains the following filter and group definitions:

    !
    ! HIGH PRIORITY Alarms
    !
    FILTER      : HIGH PRIORITY
    NAME        : *
    ATTR        : *
    PRIORITY    : 80
    SUBSYSTEM   : 00000000
    PATH        : 00000000
    SYSTEM_ID   : 0000
    ANTI        : FALSE
    GROUP       : HIGH PRIORITY
    GROUP USED  : YES
    FILTER USED : HIGH PRIORITY

Figure 4 - High Priority Configuration File

This configuration file says that any device attribute (NAME = *, ATTR = *) with a priority equal to or greater than 80 HEX (PRIORITY = 80), with any path or subsystem identifier (00000000), coming from any system (front-end processors or software processes) will be displayed in the Alarms window in the HIGH PRIORITY group.

Chapter 5 - Alarm Display Main Window

The main window of the D0 Alarm Display (see Figure 5) contains a scrolling message area and a control button area.

Figure 5 - Alarm Display Main Window

5.1 Message Area
The message area of the main window of the D0 Alarm Display displays any error or informational messages that arise while your display process is running. The message window also contains a vertical scroll bar that allows you to see any previous messages that were displayed. The message window retains 24 lines' worth of messages.

Several types of messages may be displayed in the message area. Some of these are informational messages and others are error messages corresponding to invalid user input of some kind. When the Alarms window first appears, three informational messages appear in the Message Area. The first message informs you that the Alarm Display is connected to the ALARM_SERVER process (central event distributor). The second tells you that a connection has been established with the HDB_SERVER. The HDB_SERVER provides the link to the central hardware database, where more information about an alarm is automatically obtained. If either of these two messages says that the Alarm Display is NOT connected to the process in question, the program will automatically try to reconnect to that process every minute and a reconnection message will appear in the Message Area.

The third message (as seen in Figure 5) tells you whether the Alarm Display is also a server process for the VAX Parameter Page (PARPAGE)[7]. If it says that it is a server for PARPAGE, then any VAX Parameter Page may connect to the Alarm Display and request information on any bad alarms for any group defined. If it is not a server, then no VAX Parameter Page processes can connect to it. If you define the ALARM_DISPLAY_NAME logical before running the display program, you will make the Alarm Display process a server for VAX Parameter Page processes.

5.2 Control Button Area
To select one of the buttons in the control button area, move the mouse pointer over the desired button and press MB1.

5.2.1 EXIT
The EXIT button disconnects your Alarm Display process from the central event distributor, destroys all the display windows currently running, and exits back to the DCL prompt.

5.2.2 ALARMS
The ALARMS button brings up a file selection window that allows you to select the configuration file which will define the groups of alarms you wish to display in the Alarms window (see Chapter 6).

Figure 6 - File Selection Window

The default FILTER for the file selection box is defined by the ALD_CONFIG_DIR logical that is created when you do D0SETUP HMON. In Figure 6, ALD_CONFIG_DIR was set to D0$BETA:[HMON.ALARM_DISPLAY]. Each configuration file contains a set of filter and group descriptions that define what will be displayed in the Alarms window. The Alarms window appears once a configuration file has been selected.

5.2.3 HEARTBEATS
The HEARTBEATS button displays the Heartbeats Display window (see Chapter 10). The Heartbeats Display window contains the list of all the online data acquisition software processes that send heartbeat messages to the central event distributor.

Chapter 6 - Alarms Window

The Alarms window consists of a menu bar and a scrolling window. At the top of the scrolling window is a heartbeat that is associated with the central event distributor. Its color indicates whether the Alarm server is connected to the display process: RED indicates that the Alarm server is not currently connected and GREEN indicates that it is connected. Following the Alarm server heartbeat is the list of groups that was created when the configuration file was chosen from the file selection window. Each group has three buttons associated with it. These buttons contain a count of the number of bad, acknowledged and good event messages associated with the group.

Figure 7 - Alarms Window

When the Alarms window appears, a connection to the central event distributor is automatically made and all the current events which pass the filters defined by the groups that are used are sent to the display. Updates to the group event count buttons only occur when a change of state occurs for an event that passes one of the filters defined for a group.

6.1 Event Message Types

6.1.1 Bad Event Messages
Bad event messages indicate that a hardware device is out of tolerance or that a data acquisition software process has an error or is no longer sending its heartbeat message. A list of the bad event messages may be seen by pressing MB1 on the button containing the bad event message count for the group that you are interested in. Multiple bad event lists can be displayed simultaneously.

6.1.2 Acknowledged Event Messages
An acknowledged event message was originally a bad event message. You acknowledge bad event messages when you know the cause of the bad event message but it has no effect on your physics data acquisition. Some bad event messages cause an automatic acknowledgement of other bad event messages. An example of this is when a low voltage supply trips off. When this happens a bad event message is generated saying that the supply has tripped off. However, once the supply trips off, all of the currents and voltages of the supply drop below their tolerance level and generate bad event messages. These messages will automatically be acknowledged if the tripped-off bad event message was sent. Automatic acknowledgement of devices is done by adding certain information to the central hardware database and then downloading this information to the front-end processors[8].

6.1.3 Good Event Messages
Good event messages indicate that a hardware or software problem has been fixed. These messages time out after about ten minutes, as seen by the large number of zeros in the good button counts.
A list of good event messages can be seen by pressing MB1 on the good event message count button for a particular group.

6.2 Display Refresh Cycle
The D0 Alarm Display program is updated periodically and not immediately when a device generates an event message. The refresh cycle for the program is set to every 10 seconds.

6.2.1 Effect on Oscillating Events
If a device is on the edge of its tolerance value, it can generate oscillating event messages: a bad event message, followed by a good, followed by a bad, and so on. Since the display only refreshes itself every 10 seconds, this oscillation will still be seen but at a reduced rate.

6.3 COMMANDS Menu
The COMMANDS menu of the Alarms window contains the following options: MANAGE GROUPS, RESTORE GROUP CONFIGURATIONS, SAVE GROUP CONFIGURATIONS and CLOSE.

The MANAGE GROUPS option brings up the Manage Groups Display window. Refer to Chapter 9 for more on the Manage Groups Display window.

The RESTORE GROUP CONFIGURATIONS option reads in a previously saved or newly created configuration file. It brings up a File Selection window to allow you to select the configuration file to restore. Any definitions in the file that duplicate ones in the current set of definitions are automatically ignored.

The SAVE GROUP CONFIGURATIONS option saves the current set of defined filters and groups to a file. It also brings up a File Selection window that allows you to save the file to any directory and filename that you want.

The CLOSE option closes the Alarms window.

6.4 CONTROL Menu
The CONTROL menu consists of two options: SEND ALARMS and KEEP ALARMS. The SEND ALARMS option sends a message to the central event distributor telling it to send all events that pass the filters defined by the input configuration file to the display program. The KEEP ALARMS option sends a message to the central event distributor telling it to stop sending events to the display program.
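
The effect of the refresh cycle on oscillating events (Section 6.2.1) can be pictured with a small simulation. The Python sketch below is an illustration only: it assumes that at each refresh the display shows the most recent state received for a device, which is one plausible reading of the behavior described above and is not taken from the program itself. The function name `displayed_states` and the 3-second oscillation are hypothetical.

```python
REFRESH_SECONDS = 10  # refresh period stated in Section 6.2


def displayed_states(messages, duration, period=REFRESH_SECONDS):
    """Return the (refresh_time, state) pairs a display would show.

    messages: list of (time_in_seconds, state) tuples sorted by time,
              where state is "BAD" or "GOOD".
    duration: length of the simulated interval in seconds.
    """
    shown = []
    latest = None
    index = 0
    for tick in range(period, duration + 1, period):
        # Consume every message that arrived up to this refresh.
        while index < len(messages) and messages[index][0] <= tick:
            latest = messages[index][1]
            index += 1
        if latest is not None:
            shown.append((tick, latest))
    return shown


def count_changes(states):
    """Count how many times consecutive states differ."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Hypothetical device that flips between BAD and GOOD every 3 seconds for 30 seconds.
raw = [(t, "BAD" if (t // 3) % 2 == 0 else "GOOD") for t in range(0, 31, 3)]
seen = displayed_states(raw, duration=30)

print(count_changes([state for _, state in raw]))   # changes sent by the device
print(count_changes([state for _, state in seen]))  # changes visible on the display
print(seen)
```

In this run the device changes state ten times while the display, sampling at 10, 20 and 30 seconds, shows only one change.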

Chapter 7 - Alarm List Windows

Each BAD, ACK and GOOD count button in the Alarms window can be selected to display the list of events that make up the current count for that button. Figure 8 shows the alarm list window that would appear if the BAD count button was selected for the CD ALARMS group.

Figure 8 - BAD Alarm List Window

The Alarm List window consists of a scrolling list of events and a set of control buttons for the list of events.

7.1 Scrolling List
The scrolling list contains all the events associated with the BAD, ACK or GOOD button that defined the Alarm List window. An event may be selected by pressing MB1 on the event name in the scrolling list. More than one event can be selected by pressing MB1 on each event name. If MB1 is pressed twice in very quick succession, a Detail window will appear for that event. Refer to the next chapter for more information on Detail windows.

7.2 List Window Buttons
Each list window has a set of push buttons associated with it. All three list types, BAD, ACK, and GOOD, have a CLOSE, SHOW, LIST, and PRINT button. The BAD list also has an ACKNOWLEDGE and an ACKNOWLEDGE ALL button, and the ACK list has an UNACKNOWLEDGE and an UNACKNOWLEDGE ALL button.

7.2.1 CLOSE Button
The CLOSE button in any of the three list windows will "unmanage" the alarm list window in which the CLOSE button was selected. This means that the window is no longer visible to you, but if you select the appropriate BAD, ACK or GOOD count button the window will automatically reappear (be "managed").

7.2.2 SHOW Button
If you wish to see more information about a specific event, select the event in the alarm list and then activate (press MB1 on) the SHOW button. This will bring up a Detail window about the event. Refer to the next chapter for more information on the Detail window.

7.2.3 LIST Button
When the LIST button is selected, all the events in the list are written to a file. The file is named by the logical ALARM_DISPLAY_LIST, which is defined when you do D0SETUP HMON. By default the ALARM_DISPLAY_LIST file is set to ALARMS.LIST. If you select this button in an ACK list window, not only are the events written out, but the time each event was acknowledged, the user who acknowledged it and why it was acknowledged are also written to the file.

7.2.4 PRINT Button
The PRINT button works in the same way as the LIST button except that it prints the list of devices to the printer designated by SYS$PRINT. Be sure to define SYS$PRINT to be a print queue that is a TEXT queue and not a POSTSCRIPT queue before you run the Alarm Display program.

7.2.5 ACKNOWLEDGE Button
If you find an event in a BAD list window that you wish to acknowledge, select the event from the list of events and activate the ACKNOWLEDGE button. When you activate the ACKNOWLEDGE button a window will appear asking who you are and why you are acknowledging the event. Enter your name and a comment indicating the reason for acknowledging the event. Be sure to fill in this information. Then activate the OK button. This will move the event from the BAD list window of a group to the ACK list window of that group. You may also select more than one event, and they will all be acknowledged when you activate the ACKNOWLEDGE button. If you make a mistake, activate the CANCEL button. This will stop the acknowledgement of the event you selected. You will have to reselect the event and acknowledge it again.

7.2.6 ACKNOWLEDGE ALL Button
When the ACKNOWLEDGE ALL button is selected, all the events in the BAD list window will automatically be acknowledged after you have filled in the information about who you are and why you are acknowledging the events.
If you activate the CANCEL button instead of the OK button, none of the events will be acknowledged. The difference between the ACKNOWLEDGE and ACKNOWLEDGE ALL buttons is that with ACKNOWLEDGE ALL you do not have to select any events from the event list.

7.2.7 UNACKNOWLEDGE Button
The UNACKNOWLEDGE button in the ACK List window works in exactly the same manner as the ACKNOWLEDGE button does in the BAD List window. The only difference is that the event is moved from the ACK List window to the BAD List window and not vice versa.

7.2.8 UNACKNOWLEDGE ALL Button
The UNACKNOWLEDGE ALL button works in exactly the same manner as the ACKNOWLEDGE ALL button, with the same difference as for the UNACKNOWLEDGE button.

Chapter 8 - Alarm Detail Windows

An Alarm Detail window will appear if you double click on an event (or single click and press the SHOW button) in an Alarm List window. The Detail window consists of a title containing the name of the event and then a series of text strings containing information about the event, based on the type of device attribute which generated the event. A device attribute can be either an analog[8], binary[8] or comment attribute. A comment attribute is an attribute that was sent from an online software process indicating that some sort of change occurred in the software. The change may be good or bad. Figures 9, 10 and 11 show examples of these three attribute types respectively.

Figure 9 - Detail Window of an Analog Device Attribute

Figure 9 shows a Detail window of an analog device attribute that generated a bad event.
It shows the time that the event was generated, the front-end node and channel number of the device attribute, the priority for the event, the number of times (since the front-end was last rebooted) the event for this device attribute was generated, its subsystem and path identifiers, a series of text descriptions of the device from the central hardware database, the nominal and tolerance values of the device as defined in the database, and then the actual reading of the device when the event was generated. It also indicates that the event has not been acknowledged.

Figure 10 - Detail Window of a Binary Device Attribute

Figure 10 shows the details of a binary device attribute. It shows basically the same information as that of the analog attribute, with the exception that the cause of the event being generated is not shown. The "State" part of the window indicates the nominal state that the device attribute should always be in. Since this is a bad event message, one can deduce from the state that the device attribute is OFF and not ON.

Figure 11 - Detail Window of a Comment Device Attribute

Figure 11 contains a comment event generated by the Level 2 Monitor process. The device name of a comment event indicates the name of the process which sent the event. The attribute indicates what is happening in the process. Note again that the window contains the same system specific information as do the analog and binary Detail windows. However, note that the trip count is 0. This field is not updated for comment event messages. Only front-end generated messages will have an updated trip count. As a comment event comes from a process and not a device attribute, no information will ever appear from the central hardware database concerning the software process. The only other information you will see is a comment indicating why the event was generated by the software process.
8.1 Push Buttons in the Detail Window

There are several push buttons which appear in a Detail window. Currently only the CLOSE button is functional; the other buttons do not yet have a function associated with them. The CLOSE button unmanages (closes) the Detail window.

Chapter 9 - Manage Groups Display Window

The Manage Groups Display window appears when the MANAGE GROUPS option is selected from the Alarms window COMMANDS menu. The window contains a menu bar, two scrolling lists and two push buttons.

Figure 12 - Manage Groups Display Window

9.1 COMMANDS Menu

The COMMANDS menu consists of one option, CLOSE. The CLOSE option unmanages (closes) the Manage Groups Display window.

9.2 EDIT Menu

The EDIT menu consists of five options: CREATE GROUP, MODIFY GROUP, DELETE GROUP, RESTORE GROUP CONFIGURATIONS and SAVE GROUP CONFIGURATIONS.

The CREATE GROUP option brings up the Create Group window. This window allows you to enter new filter and group definitions. A detailed description of the Create Group window will be included in a later version of this manual.

The MODIFY GROUP option first requires that you select a group from the Groups Available list with MB1. Once you have selected a group, select the MODIFY GROUP option to bring up the Modify Group window. This window is identical to the Create Group window. A detailed description of the Modify Group window will be included in a later version of this manual.

The DELETE GROUP option deletes a group from the list of available groups. If the group is in the Groups Applied list, it will NOT be deleted. Instead, a message will appear in the Message Area of the main window indicating that the group is currently being used and cannot be deleted.

The RESTORE GROUP CONFIGURATIONS and SAVE GROUP CONFIGURATIONS options are identical to those in the Alarms window COMMANDS menu (see section 6.3).

9.3 CONTROL Menu

The CONTROL menu is identical to the Alarms window CONTROL menu (see section 6.4).
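The DELETE GROUP rule described in section 9.2 can be pictured with the following minimal Python sketch. It is not the Alarm Display implementation: the GroupManager class, its attributes and the group names used with it are hypothetical, and the messages list merely stands in for the Message Area of the main window.

```python
class GroupManager:
    """Hypothetical model of the Groups Available and Groups Applied lists."""

    def __init__(self, available):
        self.available = list(available)  # all currently defined groups
        self.applied = []                 # groups currently in use
        self.messages = []                # stand-in for the Message Area

    def delete_group(self, name):
        """Delete a group unless it is currently applied.

        Returns True if the group was deleted, False otherwise.
        """
        if name in self.applied:
            # An applied group is never deleted; the user is told why.
            self.messages.append(
                f"Group {name} is currently being used and cannot be deleted"
            )
            return False
        if name in self.available:
            self.available.remove(name)
            return True
        return False  # unknown group: nothing to delete
```

In this model a group that appears in the applied list must first be unused before delete_group will remove it from the available list.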
9.4 Groups Available List

The Groups Available list contains all of the groups that are currently defined. These groups may or may not be in use.

9.5 Groups Applied List

The Groups Applied list contains all of the groups that are currently being used to define the Alarms window.

9.6 Using and Unusing Groups

To use a group that is not currently in the Groups Applied list, double click on the group name in the Groups Available list using MB1. This will cause the group to appear in the Groups Applied list. Any group added to the Groups Applied list is automatically used, and the Alarms window will now contain a new row for the selected group. You can also use a group by selecting its name with a single MB1 click and then pressing the arrow button pointing toward the Groups Applied list. To unuse a group, follow the same procedure with the roles of the two lists reversed.

Chapter 10 - Heartbeats Display Window

The Heartbeats Display window contains a menu bar and a display area which shows all the online data acquisition processes that should be connected to the central event distributor when data acquisition is in progress.

Figure 13 - Heartbeats Display Window

The default Heartbeats Display window is defined from a text file containing the names of the processes which should be sending heartbeats. This file is pointed to by the logical ALARM_HEARTBEAT_DATA, which is defined when you do D0SETUP HMON. When a process name is in reverse video (highlighted red on color stations), the process is no longer sending heartbeats or it is not running. If it is no longer sending heartbeats, an event message will be sent from the central event distributor to any display requesting events on missing heartbeat messages. The timestamp associated with a process that is no longer sending heartbeat messages will also be displayed in reverse video (highlighted red on color stations).
The timestamp indicates the last time the process sent a heartbeat message. A process that is in reverse video (highlighted red) with no timestamp was not running at the time the Heartbeats Display window appeared. If the process starts after the display has appeared, the process name will revert to normal text (green text on color stations) and a corresponding timestamp will appear. If the process was running and then stopped (gracefully disconnected from the Alarm server), the name will appear in red and the value STOPPED will appear where the timestamp would be.

10.1 COMMANDS Menu

The COMMANDS menu of the Heartbeats Display window currently contains only one option, CLOSE. This option closes the Heartbeats Display window and sends a message to the central event distributor informing it that the Alarm Display process no longer wishes to receive heartbeat information. This does not mean that missing heartbeat event messages will no longer be sent. It only means that the periodic heartbeat messages that are sent to the central event distributor will no longer be forwarded to your Alarm Display process.

Appendix A - Glossary of Terms

event message - message sent from the front-end processors or the online data acquisition software processes to the central event distributor (see chapter 2). It contains information on a change of state somewhere in the D0 detector or software environment.

heartbeat message - periodic message that comes from online data acquisition software processes. It is used to indicate to the central event distributor that a process is still running. It contains the name of the process, an identifier for the process and the time that the message was sent.

MB1 - the left mouse button on a right-handed mouse and the right mouse button on a left-handed mouse.
References

[1] VMS DECwindows/Motif Programming, Volumes 1A,B,C,2,2A,B,C
[2] John Featherly, Inter Task Communications Package (ITC) User's Guide, D0 Note X (1989) unpublished
[3] Guide to VAX DEC/Code Management System (CMS)
[4] J.F. Bartlett, et al, Control and Monitoring of the D0 Detector, D0 Note 1927 (1993) unpublished
[5] Fred Borcherding, D0 Alarm Priorities, D0 Note 1064 (1990) unpublished
[6] A.M. Jonckheere and S. Krzywdzinski, Hardware Database Device Names, Subsystems and Hierarchical Path Designations, D0 Note 1098A (1993) unpublished
[7] Rajendran Raja, The D0 Parameter Page, D0 Note 1513 (1992) unpublished
[8] Laura A. Paterno, Harrison Prosper, and Rajendran Raja, HDB_Entry User's Manual, D0 Note 1512 (1993) unpublished