Overview:
The "broken" status of the enstore library managers can occur when the tape robot reports its status to enstore as offline, or when there are multiple consecutive errors in enstore. You will not get this status if the library managers are drained. Prior to the error, enstore is running normally and the status of the library managers is unlocked.
Procedure:
Example: Enstore log entry
10:40:18 stkensrv4.fnal.gov 031976 enstore I STKMC FINISHED mount VO1979 0,0,10,0 returned ('ok', 0, 'mount VO1979 0,0,10,0 => 0,Mount: VO1979 mounted on 0, 0,10, 0')
10:40:19 stkenmvr1a.fnal.gov 013865 root I EA11MV MSG_TYPE=MC_LOAD_DONE mounted VO1979
10:40:19 stkensrv4.fnal.gov 017897 enstore I STKMC query drive 0,0,10,6 => 0,0,0,10,6 online in use VO1218 9840
10:40:25 stkensrv0.fnal.gov 002092 enstore A VOLSRV MSG_TYPE=ALARM {'label': 'VO1979', 'root_error': 'NOACCESS', 'severity': 3}
10:40:25 stkensrv0.fnal.gov 002092 enstore I VOLSRV VO1979 system inhibit set to NOACCESS
10:40:25 stkensrv0.fnal.gov 002092 enstore I VOLSRV pause library_managers for stk.media_changer media_changerare paused due to too many volumes set to NOACCESS
10:40:25 stkenmvr1a.fnal.gov 013865 root A EA11MV MSG_TYPE=ALARM {'root_error': 'volume VO1979 already labeled VO1979', 'severity': 1}
10:40:25 stkenmvr1a.fnal.gov 013865 root E EA11MV marking VO1979 noaccess
10:40:25 stkenmvr1a.fnal.gov 013865 root E EA11MV transfer failed WRITE_VOL1_WRONG volume VO1979 already labeled VO1979 volume=VO1979 location=0
10:40:25 sdssdp7.fnal.gov 002662 jhendry W ENCP transfer file EXfer error: ('fd_xfer write error', 32, 'Broken pipe', 2662)
10:40:25 sdssdp7.fnal.gov 002662 jhendry E ENCP INFILE=/opdb/d2/spool/products/acacia-THROUGH-lss.IRIX-ONLY.Part01.20011010.tar OUTFILE=/pnfs/sdss/products2/acacia-THROUGH-lss.IRIX-ONLY.Part01.20011010.tar FILESIZE=279490560 LABEL=VO1979 LOCATION= DRIVE=stkenmvr1a:/dev/rmt/tps0d1n DRIVE_SN=3310000195 TRANSFER_TIME=6.47 SEEK_TIME=0.00 MOUNT_TIME=14.24 QWAIT_TIME=17.37 TIME2NOW=0.00 STATUS=TOO MANY RETRIES ('WRITE_VOL1_WRONG', 'volume VO1979 already labeled VO1979')
10:40:25 sdssdp7.fnal.gov 002662 jhendry I ENCP Error after transferring 0 bytes in 1 files in 169.174152017 sec. Overall rate = 0 MB/sec. Drive rate = 0 MB/sec. Network rate = 0 MB/sec. Exit status = 1.
10:40:25 stkensrv4.fnal.gov 020246 enstore I EAGLBM mover_error updated suspect volume list for VO1979
10:40:29 stkensrv4.fnal.gov 017897 enstore I STKMC dismount VOLUME 0,0,10,6 force => 0,Dismount: Forced dismount of VO1218 from 0, 0,10, 6
10:40:29 stkensrv4.fnal.gov 031976 enstore I STKMC FINISHED dismount VO1218 0,0,10,6 returned ('ok', 0, 'dismount VOLUME 0,0,10,6 force => 0,Dismount: Forced dismount of VO1218 from 0, 0,10, 6')
10:40:29 stkensrv0.fnal.gov 002092 enstore E VOLSRV library_managers for stk.media_changer media_changerare paused due to too many volumes set to NOACCESS
10:40:29 stkensrv4.fnal.gov 020246 enstore A EAGLBM MSG_TYPE=ALARM {'root_error': 'LM eagle.library_manager goes to BROKEN state', 'severity': 1}
10:40:31 stkenmvr1a.fnal.gov 013865 root I EA11MV dismounting VO1979
10:40:31 stkensrv4.fnal.gov 031976 enstore I STKMC REQUESTED dismount VO1979 0,0,10,0
10:40:32 stkensrv4.fnal.gov 017906 enstore I STKMC query drive 0,0,10,0 => 0,0,0,10,0 online in use VO1979 9840
10:40:41 stkensrv4.fnal.gov 017906 enstore I STKMC dismount VOLUME 0,0,10,0 force => 0,Dismount: Forced dismount of VO1979 from 0, 0,10, 0
10:40:42 stkensrv4.fnal.gov 031976 enstore I STKMC FINISHED dismount VO1979 0,0,10,0 returned ('ok', 0, 'dismount VOLUME 0,0,10,0 force => 0,Dismount: Forced dismount of VO1979 from 0, 0,10, 0')
10:40:42 stkensrv2.fnal.gov 020694 enstore A Enstore_Up_Down MSG_TYPE=ALARM {'Reason': "['eagle.library_manager down']", 'severity': 4, 'root_error': 'ENSTORE BALL IS RED'}
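When scanning a long log for this failure pattern, it can help to pull out only the entries that matter: which volumes were set to NOACCESS and which library manager went to BROKEN. The following is a minimal Python sketch, not part of enstore. It assumes each entry has the layout seen in the example above (time, host, process id, user, one-letter severity, server name, message); the field names, the function names, and the regular expressions are illustrative assumptions inferred from that example only.

```python
import re

# Assumed start of an entry: "HH:MM:SS host pid user SEVERITY " (inferred from the example).
ENTRY_START = re.compile(r"\d{2}:\d{2}:\d{2} \S+ \d+ \S+ [A-Z] ")

# Assumed field layout of one entry; the group names are hypothetical labels.
ENTRY_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<host>\S+) (?P<pid>\d+) "
    r"(?P<user>\S+) (?P<severity>[A-Z]) (?P<server>\S+) (?P<message>.*)"
)


def parse_entries(text):
    """Split log text at each entry start (entries may share a line) and parse them."""
    starts = [m.start() for m in ENTRY_START.finditer(text)]
    ends = starts[1:] + [len(text)]
    entries = []
    for begin, end in zip(starts, ends):
        match = ENTRY_RE.match(text[begin:end].strip())
        if match:
            entries.append(match.groupdict())
    return entries


def noaccess_volumes(entries):
    """Return labels of volumes whose system inhibit was set to NOACCESS, in log order."""
    pattern = re.compile(r"(\S+) system inhibit set to NOACCESS")
    labels = []
    for entry in entries:
        match = pattern.search(entry["message"])
        if match:
            labels.append(match.group(1))
    return labels


def broken_library_managers(entries):
    """Return names of library managers reported as going to the BROKEN state."""
    pattern = re.compile(r"LM (\S+) goes to BROKEN state")
    names = []
    for entry in entries:
        match = pattern.search(entry["message"])
        if match:
            names.append(match.group(1))
    return names


# Two entries copied from the example log, deliberately run together on one line.
SAMPLE = (
    "10:40:25 stkensrv0.fnal.gov 002092 enstore I VOLSRV VO1979 system inhibit set to NOACCESS "
    "10:40:29 stkensrv4.fnal.gov 020246 enstore A EAGLBM MSG_TYPE=ALARM "
    "{'root_error': 'LM eagle.library_manager goes to BROKEN state', 'severity': 1}"
)

entries = parse_entries(SAMPLE)
print(noaccess_volumes(entries))
print(broken_library_managers(entries))
```

Running the same functions over a longer stretch of the log lists every volume that was set to NOACCESS before the library manager broke, which is the list of tapes to check when looking for the root cause.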
In the example above, VO1979 was the fourth tape in a row on which an attempt to write a volume serial number failed because the tape already had a label. This happened because the tape was defined to enstore without the "VOL1OK" option. The repeated failures caused the Library Manager to go into the "broken" state. The root cause of the problem must be fixed before the Library Manager can be reset. To fix this problem: