NAME
RPM - Reprogrammable Performance Monitoring (RPM) counters.
SYNTAX
#include <i860paragon/rpm.h>
DESCRIPTION
The /usr/include/i860paragon/rpm.h file specifies the pro-
gramming interface to the Reprogrammable Performance Moni-
toring (RPM) hardware counters and the Mach kernel software
(RPM soft) counters. Both the hardware and soft counters are
memory mapped into every process. The counters may be read
but not written. By defining a pointer to the structures
defined in rpm.h and then setting the pointer to a specific
address, the RPM hardware and soft counters can be accessed.
The RPM hardware counters are specified by the structure
rpm, which contains the following fields:
rpm_control
32-bit control register used to reset the
counters. All counters except rpm_time can be
reset. The rpm_time global clock is reset via the
diagnostic station. The process must have root
access and perform an authentication process to
gain access to a writable page in order to write
to the control register.
rpm_time 10 MHz 56-bit global clock accurate to 100
nanoseconds local to the node and 1 microsecond
across all nodes in a system.
rpm_cpu0 Number of 50 MHz cycles that CPU 0 is bus master.
rpm_cpu1 Number of 50 MHz cycles that CPU 1 is bus master.
rpm_ltu Number of 50 MHz cycles that the ltu is bus master.
rpm_exp Number of 50 MHz cycles that the expansion card is
bus master.
rpm_north One count for every 64 bytes moving North on the
mesh.
rpm_south One count for every 64 bytes moving South on the
mesh.
rpm_east One count for every 64 bytes moving East on the
mesh.
rpm_west One count for every 64 bytes moving West on the
mesh.
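The units given in the field descriptions above can be turned into
physical quantities with a few small helpers. The following is a
minimal sketch, not part of rpm.h: the helper and macro names are
hypothetical, and the counters are assumed to be read as 32-bit
unsigned values (the bus and mesh counters are described as 32-bit
under LIMITATIONS AND WORKAROUNDS).

```c
#include <stdint.h>

/* Rates and sizes taken from the field descriptions above. */
#define RPM_GLOBAL_CLOCK_HZ      10000000.0 /* rpm_time ticks per second */
#define RPM_BUS_CLOCK_HZ         50000000.0 /* bus-master cycles per second */
#define RPM_MESH_BYTES_PER_COUNT 64u        /* bytes per mesh count */

/* Hypothetical helper: bytes represented by a mesh counter value
 * (rpm_north, rpm_south, rpm_east, or rpm_west). */
static uint64_t rpm_mesh_bytes(uint32_t count)
{
    return (uint64_t)count * RPM_MESH_BYTES_PER_COUNT;
}

/* Hypothetical helper: seconds represented by a bus-master cycle
 * count (rpm_cpu0, rpm_cpu1, rpm_ltu, or rpm_exp). */
static double rpm_bus_seconds(uint32_t cycles)
{
    return (double)cycles / RPM_BUS_CLOCK_HZ;
}

/* Hypothetical helper: seconds represented by a global clock tick
 * count that has already been extracted as an integer. */
static double rpm_ticks_to_seconds(uint64_t ticks)
{
    return (double)ticks / RPM_GLOBAL_CLOCK_HZ;
}
```

For example, a mesh counter that advanced by 1000 between two reads
corresponds to 64000 bytes moved in that direction.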
To access the RPM hardware counters, define a pointer to the
structure rpm, set the pointer to the address
RPM_BASE_VADDR, and access the counter via the structure.
For example:
struct rpm *rpm;
rpm_timer_t global_time;
rpm = (struct rpm *) RPM_BASE_VADDR;
global_time = rpm->rpm_time;
The rpm->rpm_control register is used to reset the RPM
counters to zero. A task must first obtain a writable page
to the RPM counters. Normally the RPM_BASE_VADDR address
specifies a read-only page mapped into every task. In order
to obtain a writable page, the task must have root authority
(executed by the superuser) and perform an authentication
process. The authentication steps are to open the NORMA dev-
ice rpm0, obtain a pager port for the mapped rpm device, and
map the pager port into the task's address space. The end
result is an address to a writable page that is substituted
for the RPM_BASE_VADDR address. The RPM counters can then be
reset to zero by setting the rpm->rpm_control register to
0xFFFF0000.
NOTE
The SPV data collection daemons reset the RPM counters
once every second (see Limitations and Workarounds).
The rpm->rpm_time global clock cannot be reset by the
rpm->rpm_control register. The global clock is reset at the
diagnostic station using the diagnostic commands
/u/paragon/diag/gclock and /u/paragon/diag/greset. The RPM
global clock is also used by the NX dclock() call which
returns a double-precision time interval in seconds since
the system was booted.
The RPM soft counters are maintained by the Mach kernel on
each CPU configured into the kernel. The RPM soft counters
are specified by the structure rpmsoft, which contains the
following fields:
rpms_idle Number of double precision seconds the CPU has
been idle.
Large grain trap handler statistics:
rpms_alltraps
Number of traps.
rpms_it Number of trap instructions.
rpms_int Number of interrupts.
rpms_iat Number of instruction access traps.
rpms_dat Number of data access traps.
rpms_ft Number of floating point traps.
Data access trap statistics:
rpms_datld
Number of data access traps on ld.x.
rpms_datst
Number of data access traps on st.x.
rpms_datfldfst
Number of data access traps on fld.x or fst.x.
rpms_datpst
Number of data access traps on pst.
rpms_datpfld
Number of data access traps on pfld.y.
rpms_datauto
Number of data access traps on fld.x++, fst.x++,
and pfld.x++.
Data access page fault statistics:
rpms_notdirty
Number of page faults for a store to a clean page.
rpms_notref
Number of page faults for an access to an unrefer-
enced page while locked.
rpms_notwr
Number of page faults for a store to a read-only
page.
rpms_pdenotu
Number of page directory entry access violations.
rpms_ptenotu
Number of page table entry access violations.
rpms_pdenotp
Number of page directory entry invalid traps.
rpms_ptenotp
Number of page table entry invalid traps.
Locked sequence related trap statistics:
rpms_lockseq
Number of traps while in a locked sequence.
rpms_lockres
Number of restarted locked sequences.
rpms_lockexp
Number of expired locked sequences.
Floating-point exception statistics:
rpms_fpe_si
Number of floating-point sticky inexact excep-
tions.
rpms_fpe_se
Number of floating-point source exceptions.
rpms_fpe_mu
Number of floating-point multiplier underflow
exceptions.
rpms_fpe_mo
Number of floating-point multiplier overflow
exceptions.
rpms_fpe_mi
Number of floating-point multiplier inexact excep-
tions.
rpms_fpe_ma
Number of floating-point multiplier add-one excep-
tions.
rpms_fpe_au
Number of floating-point underflow exceptions.
rpms_fpe_ao
Number of floating-point overflow exceptions.
rpms_fpe_ai
Number of floating-point inexact exceptions.
rpms_fpe_aa
Number of floating-point add one exceptions.
For every CPU configured into the Mach kernel there is a
corresponding rpmsoft structure that contains the statistics
for the CPU. At the address RPMSOFT_BASE_VADDR is an array
of structures, one for each CPU configured into the Mach
kernel. The number of CPUs configured into the Mach kernel
can be found by using the host_info() kernel call. To access
the RPM soft data, increment a pointer through the array of
the rpmsoft structures. For example:
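struct rpmsoft *rpmsoft;
double idle_sum_time; /* rpms_idle is in double-precision seconds */
int i;
/* num_cpus: number of configured CPUs, obtained via host_info() */
rpmsoft = (struct rpmsoft *) RPMSOFT_BASE_VADDR;
idle_sum_time = 0.0;
for ( i = 0; i < num_cpus; ++i ) {
idle_sum_time += (rpmsoft + i)->rpms_idle;
}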
The RPM soft counters cannot be written or reset. The
counters represent summed statistics since the system was
booted.
For a GP node, the first rpmsoft structure in the array
contains the statistics for the application CPU, while the
second rpmsoft structure in the array contains the
statistics for the message coprocessor.
EXAMPLES
The following example reads the RPM global clock and con-
verts the 56-bit time to double-precision seconds (the
dclock() functionality).
#define RPM_CLOCK_FREQ (10000000)
#define _2_to_52d (4503599627370496.0)
#define OR_EXPONENT (0x4330)
#define MASK_EXPONENT (0x000F)
double hz;
struct rpm *rpm;
union {
unsigned short wordwise[4];
double value;
} t;
rpm = (struct rpm *) RPM_BASE_VADDR;
t.value = rpm->rpm_time;
hz = 1.0/RPM_CLOCK_FREQ;
t.wordwise[3] = (t.wordwise[3] & MASK_EXPONENT) | OR_EXPONENT;
t.value = hz * (t.value - _2_to_52d);
This code converts the 56-bit integer count into a 64-bit
double-precision value representing seconds. The code
ignores the highest 4 bits of the 56-bit counter (a 52-bit
counter counting at 10 MHz can count for 14.28 years before
wraparound occurs).
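The wraparound figure can be checked with a short calculation. This
is a sketch only: the function names are hypothetical, and a 365-day
year is assumed because that is the convention that reproduces the
14.28 figure.

```c
#include <stdint.h>

/* Seconds until a counter of the given width wraps when it counts
 * at hz ticks per second. Valid for widths below 64 bits. */
static double rpm_wrap_seconds(unsigned bits, double hz)
{
    return (double)((uint64_t)1 << bits) / hz;
}

/* Convert seconds to years, assuming 365-day years. */
static double rpm_seconds_to_years(double seconds)
{
    return seconds / (365.0 * 24.0 * 3600.0);
}
```

Calling rpm_seconds_to_years(rpm_wrap_seconds(52, 10000000.0)) gives
a value of about 14.28.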
Consider the representation of a double. Doubles have three
fields: a sign field, an exponent field and a fraction
field. The sign field is a single bit, 0 for positive and 1
for negative. The exponent field is bits 62 to 52, which can
hold integers from 0 to 2047. The actual value of the
exponent is the value in the exponent field minus 1023. The
fraction field is bits 51 to 0. The actual value of the
fraction is 1.f, where f is the value of the integer in the
fraction field. (For information on the hidden 1, refer to
IEEE standard 854 for radix-independent floating-point
arithmetic.)
To convert the 52-bit integer to floating point, the value
of the exponent must be set to 52, and the value 1 x 2**52
(the hidden 1) must be subtracted. To set the value of the
exponent to 52, the value of the exponent field is set to
1023 + 52, or 1075 (0x433 hex). To subtract the hidden 1,
the value 4503599627370496.0 (1 x 2**52) is subtracted. At
this point the 52-bit number has been converted to a
floating-point representation of the same number. To convert
the floating-point representation of the 52-bit (10 MHz)
counter to seconds, divide by 10,000,000 (the example
multiplies by 1.0/RPM_CLOCK_FREQ, which is equivalent).
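The exponent technique described above can also be written without
the union of 16-bit words. The following is a minimal sketch, not
the dclock() implementation: it assumes the raw counter has already
been obtained as a 64-bit unsigned integer, that double is an IEEE
754 64-bit type, and the function and macro names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define RPM_TICKS_PER_SECOND 10000000.0            /* 10 MHz global clock */
#define RPM_TWO_TO_52        4503599627370496.0    /* 1 x 2**52 */
#define RPM_FRACTION_MASK    0x000FFFFFFFFFFFFFULL /* low 52 bits */
#define RPM_EXPONENT_1075    0x4330000000000000ULL /* sign 0, exponent field 0x433 */

/* Hypothetical helper: convert a raw tick count to seconds by
 * building a double whose fraction field holds the low 52 bits. */
static double rpm_raw_to_seconds(uint64_t raw)
{
    uint64_t bits;
    double d;

    /* Keep the low 52 bits and set the exponent field to 1023 + 52,
     * so the double's value is 2**52 + (raw mod 2**52). */
    bits = (raw & RPM_FRACTION_MASK) | RPM_EXPONENT_1075;

    /* Reinterpret the bit pattern as a double without aliasing issues. */
    memcpy(&d, &bits, sizeof d);

    /* Subtract the hidden 1 (2**52), then scale ticks to seconds. */
    return (d - RPM_TWO_TO_52) / RPM_TICKS_PER_SECOND;
}
```

A raw count of 10000000 yields 1.0 second, and any bits above bit 51
are ignored, matching the behavior described for the example.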
LIMITATIONS AND WORKAROUNDS
The SPV tool uses the RPM hardware to collect mesh and
memory bus utilization information. Once every second the
RPM hardware counters are collected and reset on every node
in the system. The SPV data collection daemon must be
stopped if you want an individual application to collect and
interpret the RPM hardware counters. To stop the SPV daemon
the root user must either select the SPV File menu Data col-
lection command to temporarily stop the SPV data collection
or stop the SPV daemon itself. This is done by invoking
/etc/init.d/spv stop or modifying the /etc/init.d/spv script
to not start the daemon during system boot.
The RPM hardware counters wrap around to zero when the
maximum 32-bit count value has been reached.
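An application that samples a counter twice can tolerate a single
wraparound by taking the difference in unsigned 32-bit arithmetic.
This is a sketch under the assumption that the counter is read as a
32-bit unsigned value and wraps at most once between samples; the
function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: counts accumulated between two samples of a
 * 32-bit counter. Unsigned subtraction is performed modulo 2**32,
 * so the result is correct across one wraparound. */
static uint32_t rpm_counter_delta(uint32_t earlier, uint32_t later)
{
    return (uint32_t)(later - earlier);
}
```

A counter that advances on every 50 MHz cycle wraps roughly every 86
seconds (2**32 / 50,000,000), so samples must be taken more often
than that for the difference to be meaningful. The difference is
also not meaningful if the counters were reset between the samples.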
The original intent of the RPM bus counters (rpm_cpu0,
rpm_cpu1, rpm_ltu, and rpm_exp) was to report bus utiliza-
tion information. The original concept was to count bus
cycles when a module becomes a bus master. Subsequent per-
formance investigations indicated that becoming a bus master
is an expensive operation in terms of bus cycles. Thus,
CPU 0 is the default bus master, and rpm_cpu0 counts
whether the CPU is using the bus or not. The rpm_cpu1,
rpm_ltu, and rpm_exp counters correctly denote the amount of
bus utilization for the message coprocessor, the ltu, and
the expansion card when each is bus master. However, the
rpm_cpu0 bus counter does not correctly reflect the bus
usage of the application CPU.
The total utilization of the bus counters (rpm_cpu0,
rpm_cpu1, rpm_ltu, and rpm_exp) is always a little over 97%
because about 2% of the bus is consumed for memory refresh.
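A utilization fraction for one bus counter can be estimated by
dividing the cycles counted during an interval by the number of
50 MHz cycles in that interval. This is a sketch only; the function
and macro names are hypothetical, and the cycle count is assumed to
be a difference between two samples of the same counter.

```c
#include <stdint.h>

#define RPM_BUS_HZ 50000000.0 /* bus counters count 50 MHz cycles */

/* Hypothetical helper: fraction of the interval (0.0 to 1.0) during
 * which a module was bus master, given its cycle count over
 * interval_seconds. */
static double rpm_bus_share(uint32_t cycles, double interval_seconds)
{
    return (double)cycles / (RPM_BUS_HZ * interval_seconds);
}
```

For example, 25,000,000 cycles counted over one second corresponds
to a share of 0.5. Because CPU 0 is the default bus master, a share
computed from rpm_cpu0 overstates the application CPU's actual bus
usage.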
SEE ALSO
spv, dclock()
System Performance Visualization Tool User's Guide
C System Calls Reference Manual