Steve White investigated Sunday (21-May) and found:
ALL dbservers and everything else in the nameservice were affected by this. The SAM shifter restarted the SAM dbservers and Calibration dbservers.
It would be helpful if the trigger db client's error message for db server failures told the user to contact D0 SAM Admin and D0 Database Support whenever there is a problem connecting to any db server.
Robert restarted the servers. Steve Kovich is investigating machine problems.
Since this has happened twice in the last 2 weeks (dbserv5 hung), Steve increased the number of 'interrupts' from 20 to 30. If this increase does not resolve the issue, the failover should be improved to use something other than a ping to assess server status.
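
One possible shape for a check that goes beyond a ping is sketched below. It assumes that "ping" here means a network-level reachability test, which a machine can still answer while the server process itself is hung. The sketch instead requires the server process to answer a request within a timeout. The function names, the probe bytes, and the idea that a dbserver would answer such a probe are all hypothetical; the real dbserver protocol would need its own lightweight request.

```python
import socket


def server_is_responsive(host, port, probe=b"PING\n", timeout=5.0):
    """Return True only if the server process answers a probe in time.

    Hypothetical sketch: the probe bytes are a placeholder, not the
    real dbserver protocol. A hung process may still accept the TCP
    connection (the kernel completes the handshake), so the check
    waits for actual reply data rather than trusting the connect.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(probe)
            # An empty read means the peer closed without answering.
            return len(sock.recv(1)) > 0
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def pick_server(servers, check=server_is_responsive):
    """Return the first (host, port) that passes the check, else None."""
    for host, port in servers:
        if check(host, port):
            return (host, port)
    return None
```

With a check of this kind, a server that is reachable on the network but no longer answering requests would be skipped by the failover instead of being treated as healthy.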