Monitoring use case: Host selection for computationally intensive applications

A researcher intends to start a long-running computational task. Suitable computational resources within the grid need to be discovered by the researcher and/or a proxy acting on the researcher's behalf (e.g. GRAPPA). Potential discriminating information may include historical values of various compute node characteristics. By analyzing historical trends, the researcher can better predict the likely running time of the task and identify the "best" node on which to execute it. The appropriate sensors, the archival system, and an interface to the MDS will be developed.

Patrick McGuigan, mcguigan@cse.uta.edu

CPU load, available memory, available free secondary storage, and network interface utilization.
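
As one illustration, these metrics could be gathered on a Linux host by reading the /proc filesystem and the filesystem statistics. The Python sketch below is not the actual sensor implementation; the choice of language, the network interface name "eth0", and the filesystem path "/" are assumptions made purely for the example.

    import os

    def read_cpu_load():
        # 1-minute load average, first field of /proc/loadavg
        with open("/proc/loadavg") as f:
            return float(f.read().split()[0])

    def read_free_memory_kb():
        # "MemFree" entry of /proc/meminfo, reported in kB
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemFree:"):
                    return int(line.split()[1])
        return 0

    def read_free_disk_kb(path="/"):
        # Free secondary storage on the filesystem containing `path`
        st = os.statvfs(path)
        return st.f_bavail * st.f_frsize // 1024

    def read_interface_bytes(iface="eth0"):
        # Cumulative receive/transmit byte counts from /proc/net/dev;
        # utilization is the difference between successive samples.
        with open("/proc/net/dev") as f:
            for line in f:
                if line.strip().startswith(iface + ":"):
                    fields = line.split(":", 1)[1].split()
                    return int(fields[0]), int(fields[8])
        return 0, 0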

The running time of computationally intensive tasks may be predicted by analyzing the data contained in the measurement archives.
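
One simple form such a prediction might take is to scale a task's nominal CPU time by the average load recorded in the archive. The scaling model below is an assumption used only to illustrate the idea, not the method that will actually be deployed.

    def estimate_runtime(nominal_cpu_seconds, load_samples):
        # Scale the task's nominal (unloaded) CPU time by the average
        # 1-minute load observed in the archived samples: each unit of
        # load roughly corresponds to one competing runnable process.
        if not load_samples:
            return nominal_cpu_seconds
        avg_load = sum(load_samples) / len(load_samples)
        return nominal_cpu_seconds * (1.0 + avg_load)

    # A 2-hour task on a host averaging a load of 0.5 over the archived
    # window would be estimated at roughly 3 hours.
    print(estimate_runtime(7200, [0.4, 0.5, 0.6]))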

The collection, storage and retrieval of measurement data should impose a minimal load on the host.

At this time, the measurement frequency is expected to be twice per minute (one sample every 30 seconds).
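
A sensor honoring this rate might be structured as a simple periodic loop; the collect and store callables below are hypothetical placeholders for the sensor readings and the archive write, respectively.

    import time

    SAMPLE_INTERVAL_SECONDS = 30   # twice per minute

    def run_sensor(collect, store):
        # `collect` gathers one measurement record; `store` writes it
        # into the measurement archive.
        while True:
            store(collect())
            time.sleep(SAMPLE_INTERVAL_SECONDS)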

The frequency at which the data will be accessed has an upper limit based on the frequency of ATLAS job submissions for computationally intensive tasks.

For this scenario, timeliness is very important.

Scaling becomes an issue if many potential hosts are asked for archival information. It is possible that the list of potential hosts can be reduced based on other factors (e.g. OS, architecture, installed software base).
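
A pre-filter of this kind might look like the sketch below; the dictionary representation of a host record, the specific keys, and the default values are assumptions, since the real attribute names would come from MDS.

    def prefilter_hosts(hosts, os_name="Linux", arch="i686",
                        required_software=("VDT 1.0",)):
        # Discard hosts that cannot run the job at all, so that only the
        # remaining candidates are queried for archival measurements.
        return [h for h in hosts
                if h["os"] == os_name
                and h["arch"] == arch
                and all(s in h["software"] for s in required_software)]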

Users of the system will need to have valid grid credentials for accessing the archives via MDS.

The largest concern is failure of the RDBMS and its host. If information providers cannot access the database, then requests for data from MDS clients will fail and suitable hosts may not be found for jobs.


At this time, an archive should contain 1 week’s worth of measurements.
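
One way the one-week retention window might be enforced is a periodic pruning job. In the sketch below, the table name, schema, and the use of sqlite3 in place of the actual RDBMS are all assumptions for illustration.

    import sqlite3
    import time

    ONE_WEEK_SECONDS = 7 * 24 * 60 * 60

    def prune_archive(db_path="archive.db"):
        # Delete measurements older than one week; intended to run
        # periodically (e.g. from cron) so the archive holds roughly
        # one week of samples per host.
        cutoff = time.time() - ONE_WEEK_SECONDS
        conn = sqlite3.connect(db_path)
        with conn:
            conn.execute("DELETE FROM measurements WHERE timestamp < ?",
                         (cutoff,))
        conn.close()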

VDT 1.0 on Linux