ENCP release notes, from v2_18 to v2_19

Encp changes:
=============

Encp now works on 64-bit Alpha (OSF1) machines. The previous version had a problem converting 64-bit unsigned longs to 32-bit unsigned ints.

Moved a corrupted-filesystem check that listed the entire contents of a directory. This check is now done only if an error occurs first. For directories with a large number of files it was a large performance hit.

Encp now checks the input file's size on reads against the recorded file size. This catches files that have zero length in pnfs but were written to tape successfully without any other errors.

Numerous log file messages were changed and/or added. Most improve the use of the unique id as a search string when reading the log files manually.

When using encp via dcache, files and directories can now have the name "root".

In the log file, "vendor" was spelled "venor". This is now corrected.

Misc.:
======

There is a program called ecrc that will calculate the CRC for a local file.

Detailed cvs commit logs

========== ./doc/WWW/Makefile ====================================================================================

add making route files

========== ./doc/WWW/index.html ====================================================================================

get rid of bad colon

========== ./doc/WWW/talks.html ====================================================================================

add route.ps

========== ./etc/stk.conf ====================================================================================

Added 100 tapes to the miniboone allocation.
discipline ketchup Updated MINOS quota Moved 994081 and 91 back to production eagle42 back to production added test.cw quota Put the test library quota back to 10 Increase test quota from 10 to 20 (from the right server node this time) Increase test quota from 10 to 20 move eagle42 to test lib for cern_wrapper testing, correctly this time using eagle42 for cern_wrapper testing enable DAQ access for miniboone Swapped 9940b drives to alternative controllers Take 9940 81 and 91 back to evaluation library restrict host access for chimichanga, snickers and mustard is set back to 3 Moved 9940B10 to eval-b from eval-bb library change quota for T1 restrict host access for chimichanga, snickers and mustard is set to 5 added sg for test tape 9940 81 and 91 to 9940 lib once more remove discipline restrictions for snickers and chimichanga remove colon assign 9940B10 mover to eval-bb library give sdss more tapes Put d0lib-archive's quota back to 75. Try to get d0lib-archive quota to work. increased allocation for d0lib-archive increase the buffer sizes for 9940B to 1.3Gb uppercased b in 9940b in block size def down to 500Mb of max_buffers for 9940b drives downgraded buffers to 1Gb from 1.35Gb for 9940B returned eagle42 to eagle library from test added cern wrapper info eagle42 to test library for cern wrapper testing Switched 9940B10 to tps2d0n Modify dev entry for 9940B 10 and 11 added quota for eval-a and eval-b Fixed "NUL1BM" typo to "NULLBM" Corrected 9940b rate eval-a and eval-b libraries created LE10 and LE11 9940B drives setup 9940 movers 81 and 91 moved to test library for performance testing allow 3 simultaneous requests from chimichanga added user allocation for cdf-sam of 85 9940 tapes - tj

========== ./etc/enstore_alarm_search.html ====================================================================================

sam.conf

========== ./etc/enstore_cambot.html ====================================================================================

add .fnal.gov to
adiccam links

========== ./etc/enstore_log_file_search.html ====================================================================================

sam.conf

========== ./etc/d0en.enstore.k5login ====================================================================================

added new users added d0enmvr25a add enstore on hppc Added d0endca3a to these files. add d0enmvr4a and 7a back into d0en

========== ./etc/enstore_user.html ====================================================================================

sam.conf

========== ./etc/plotHelp.html ====================================================================================

update

========== ./etc/rip.conf ====================================================================================

remove colon

========== ./etc/rip.enstore.k5login ====================================================================================

added new users

========== ./etc/root.k5login ====================================================================================

added new users added d0enmvr25a Added d0endca3a to these files. Added stken movers 10 and 11, and stkenout 1 and 2 add user principals to allow ksu to root

========== ./etc/sam.conf ====================================================================================

added d0enmvr25a to the mezsilo library Remove sammam library, for real this time Moved D30A, B, C into samlto library Removed sammam library Added D30A, B, C LTO drives Updated d0enmvr19a configuration Removed DI36, 37, 44, 45 Removed DC03 and DC04 at Frank's request disable write access to sammam and sam-m2 libraries 994025 added mezsilo remove colon sam.conf typo in prev DC03 dev entry DC03 and DC04 /dev/ entries have changed.. no idea why.. firmware reloaded perhaps?
change dismount delay time applied discipline to d0bbin and fnd+ nodes per Jon's request add support for extra links on plot page set adminpri for d0ola,b,c set update_interval for LTO movers to 5 s changed dismount_delay and max_dismount_delay to 10s for LTO and 9940 movers Modified logname for meztest.library_manager to MZTSTLM changed library for d0enmvr4a and d0enmvr7a added d0enmvr4a, port 7606 - added d0enmvr7a, port 7607 and meztest.library_manager port 2524

========== ./etc/stken.enstore.k5login ====================================================================================

added new users add enstore on hppc add stkensrv5 Added stken movers 10 and 11, and stkenout 1 and 2

========== ./etc/auth_stk.conf ====================================================================================

more allocate for sdss steve authorized sdss to write to the eagles Added user allocation for cdf-sam of twenty tapes. added e835 and e907 allocations

========== ./etc/enstore_system_info.html ====================================================================================

remove user command section sam.conf add inventory summary

========== ./etc/cdfen.enstore.k5login ====================================================================================

added new users add enstore on hppc

========== ./etc/hosts ====================================================================================

Added d0endca3a to these files.
add mvr10a, 11a, out1, out2 add d0enmvr4a and d0enmvr7a back into hosts correctly

========== ./etc/cdf.conf ====================================================================================

remove colon set online priority for user stager stager coming from fcdfsgi1

========== ./etc/enstore_system_html_d0ensrv2 ====================================================================================

add user data bytes count

========== ./etc/enstore_system_html_stkensrv2 ====================================================================================

add user data bytes count

========== ./etc/enstore_system_top.html ====================================================================================

add user data bytes count

========== ./etc/make_enstore_system_html ====================================================================================

only cat file if it exists

========== ./etc/enstore_system_html_cdfensrv2 ====================================================================================

add user data bytes count

========== ./etc/enstore_system_middle.html ====================================================================================

add link to ngop monitoring of enstore sam.conf production page on www-isd now add user data bytes count

========== ./modules/.cvsignore ====================================================================================

ignore ecrc binary

========== ./modules/EXfer.c ====================================================================================

On alpha machines we need to be careful about using unsigned longs. They are actually 64-bit values, which doesn't always play nicely when coercing down to 32 bits. Since we only need 32 bits for the adler32 algorithm, just use unsigned int. The 64-bit unsigned long size problem is now fixed for the threaded version of this code. Changed the crc variable from an unsigned long to unsigned int.
On true 64-bit machines, where the long was 8 bytes, this created a conversion error to 32 bits. This should work since all supported platforms define unsigned int to be 4 bytes. However, that is not necessarily going to remain true. C99 defines a header file called stdint.h that defines various int types. One of them is uint32_t. Unfortunately, only Linux 6x and 7x have this header file. It might be a while until other platforms have releases with this functionality.

========== ./modules/Makefile ====================================================================================

added ecrc

========== ./sbin/.cvsignore ====================================================================================

ignore ecrc link

========== ./sbin/encpCut ====================================================================================

add ecrc include encp_t

========== ./sbin/release-notes ====================================================================================

better parsing left to do: automatically specify current and previous version - it is hard coded now - which is terrible.

========== ./sbin/routes ====================================================================================

I've added d0endca3a to the file.
add stkenmvr10a,11a,out1,out2, clean up private lan

========== ./sbin/ADICDrvBusy ====================================================================================

remove DLT plots

========== ./sbin/checkPNFS ====================================================================================

move rc to inside fail loop output result to stdout and not separate file that gets lost

========== ./sbin/netscan ====================================================================================

cut can be in 2 different places on linux machines remove check for processors and disk space add option to not check ipmi stuff allow sendmail on srv2 from now on - needed for automated processing of helpdesk ticket summary more attempts to ignore normal monitoring items

========== ./sbin/ntpset ====================================================================================

cut can be in 2 different places on linux machines allow ntpdc for rh7 systems

========== ./sbin/readDcache ====================================================================================

correct typo on dccp file name correct { } errors remove rm of files in pnfs written via dcache because this is causing problems - rm before written to enstore fix weak read test on cdf, fix different ports on cdf/stk no alarms on errors, but final error code

========== ./sbin/silo-check ====================================================================================

allow tape to be in any library if sg is test

========== ./sbin/choose_ran_file ====================================================================================

Added the list of active volumes to the output at the request of ISA.
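The ecrc note and the ./modules/EXfer.c entries above both concern a checksum that must fit in 32 bits regardless of the platform's native integer width. The following is a minimal Python sketch of that idea, not the ecrc or EXfer.c code itself: it computes an Adler-32 checksum of a local file and masks the result to an unsigned 32-bit value. The function name, the chunk size, and the seed default (zlib's standard value of 1) are assumptions made for illustration; the seed Enstore uses is not stated in these notes.

```python
import zlib


def adler32_of_file(path, seed=1, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as an unsigned 32-bit integer.

    Hypothetical helper for illustration only. The seed default is
    zlib's standard starting value and is an assumption here.
    """
    crc = seed
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running checksum back in so the result does not
            # depend on how the file is split into chunks.
            crc = zlib.adler32(chunk, crc)
    # Mask to 32 bits so the value is the same on every platform.
    return crc & 0xFFFFFFFF
```

Masking with 0xFFFFFFFF plays the same role as the switch from unsigned long to unsigned int in the C code: the checksum is kept as a 4-byte quantity no matter how wide the native long is.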
========== ./sbin/keytab_check ====================================================================================

cut can be in 2 different places on linux machines

========== ./sbin/tapes-burn-rate.py ====================================================================================

complete path to binaries changes needed to make this run on the production nodes and not airedale

========== ./sbin/tapes-plot-sg.py ====================================================================================

complete-r path to binaries complete path to binaries

========== ./src/Makefile ====================================================================================

remove a bad comment

========== ./src/alarm.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/atomic.py ====================================================================================

Moved the filesystem-is-corrupted test from encp.py to here to remove a listdir for each transfer. Also, fixed the way default errnos are determined. Better error detection for file creation problems.

========== ./src/delete_at_exit.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/e_errors.py ====================================================================================

Fixed a bug in is_ok(). Added various is_XXX() functions. These test for retriable, non-retriable, alarmable and resendable error conditions. Also includes a test for OK.

========== ./src/encp.py ====================================================================================

bumping version to v2_19 because of encpCut

Fixed more potential problems with errno usage. Included a check on the input file for reads to make sure the os filesize and the pnfs filesize match. Fixed a bug when writing to enstore from dcache.
If the file only had read permissions then an unnecessary test for write permissions (file and directory) would falsely fail the transfer. Fixed a dcache interfacing problem. Encp was trying to make sure the output file had write privileges. When dcache is involved it shouldn't care what the output file's permissions are. It cannot be assumed that all system exceptions contain the attribute errno. Defaults are now in place where needed. Moved the FSCORRUPTED test to atomic.py. This removed doing a listdir for each write transfer. Moved a call to os.listdir() outside of a loop. This could be a performance hit for large directories and multi-file transfers. Added the unique id to some more log messages... Added some log messages. These are to help trace when encp does certain things. Others are to make the unique_id more usable. Cleaned up the code to use new functions from e_errors for checking the status fields of tickets. Spelling fix: recieved -> received. Created the setup_signal_handling() function. Moved code from "__main__" to do so. Added a log message that associates unique_id with filenames. Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in anticipation of using python 2.2.

bumping version to x2_18_2 because of encpCut

Fixed an internal comment. Made changes to handle problems with 64-bit OSF nodes making encp requests. These machines could handle file sizes larger than 2GB-1 within a C int type variable. When the integer was placed in the stringified dictionary it did not carry the trailing L for the long type. It was this missing L that caused the problem for the library manager, since it runs on a 32-bit Linux node. venor is now spelled vendor. "unable to registor bfid" error is now a warning, not informational. Bug fix to the internal error handling mechanism.

bumping version to x2_18_1 because of encpCut

move non enstore import functions to enstore_functions2 Change to pass mylint.py. Three bug fixes.
1) Empty permissions left after read.
2) OS file size zero and pnfs layer 4 size correct after writes.
3) Files written under 2) were allowed to be read.

========== ./src/enstore_constants.py ====================================================================================

add D_MPD_FILE no trace levels less than 6 add extra page support to plot page check more the mover and lm status tickets add fields for lm ping node before rcping

========== ./src/enstore_files.py ====================================================================================

add handling of timestamp from client on log line add mounts/drive type plots move non enstore import functions to enstore_functions2 remove enstore_files import, use enstore_functions2 check more the mover and lm status tickets add import type check for pending queue as [] check if lm queue is valid first

========== ./src/enstore_html.py ====================================================================================

add mounts/drive type plots bug fixes change functions to functions2 add enstatusonlypage back in remove enstore_files import, use enstore_functions2 add volume_audit to alarm page remove safe_dict make sure lm queue is a dict

========== ./src/enstore_make_plot_page.py ====================================================================================

add label for chk_prod_code add comma add user_bytes

========== ./src/enstore_saag.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/enstore_status.py ====================================================================================

add volume name to messages move non enstore import functions to enstore_functions2 check more the mover and lm status tickets get lm state correct get movers state right check for drive_id

========== ./src/enstore_up_down.py ====================================================================================

move non enstore import functions to
enstore_functions2

========== ./src/entv.py ====================================================================================

Now uses a .entvrc file to record the geometry of the window when it is closed. It is also possible to change the background color with the .entvrc file. Typing on the command line after invoking entv as "entv.py d0en" instead of "entv.py d0en &" would cause the status and message threads to abort (as if a Ctrl-C had happened). Now only a Ctrl-C will kill entv. Working on reducing CPU usage. Doesn't crash when resized. Child window for mover status seems stable. move non enstore import functions to enstore_functions2 Removed a debug return that prevented unused clients from being deleted. Volume background now disappears along with the text. Other little fixes. Everything seems to work *correctly* now. Volume class gone. Trace class used. Debugging output (mostly) gone. debug Added a sleep call. This seems to limit the amount of CPU it uses. Faster startup. With new movers, entv can determine the client machine at startup. Death of entv is handled more gracefully. Speed up the initialization. Put mover timeouts in during initial status check. Before, movers 0 through k were positioned for each k; now when k is positioned, that is the only one positioned. Also, changed where the movers get drawn to. This includes some general fixes, mostly having to do with cleaner starting and stopping, but some with location of graphics. major cleanup. better threading. better startup. ability to reinitialize.

========== ./src/espion.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/enstore_functions.py ====================================================================================

Changed is_ok() to accept a status (aka a tuple of length two) or a dictionary with an element named 'status' that is a tuple of length two.
Previously, it only accepted the dictionary. move read_erc to enstore_erc_functions move non enstore import functions to enstore_functions2 add subscribe for new config msg bug in format ping is different on d0ensrv2 ping node before rcping

========== ./src/pnfs.py ====================================================================================

Fixed a bug that occurred if a directory named "root" was in the directory path of a command that took a pnfs id as argument. Enstore was confusing this 'root' directory with the "root" pnfs directory. Added two new options, --tagchmod and --tagchown. revert to previous version if layer 4 is missing, instantiate it as a new file take care of moved file handle missing field in initialization fix File.set_size() again fix File.set_size() Modified to use option.check_correct_count() to prevent the user from entering too many options by mistake. Removed erroneous retry attempts from the pnfs "No such file or directory" bug. Fixed the --pnfs-state command. Determines if the user is currently in a pnfs directory before allowing a tag to be written/read.
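The is_ok() change recorded under ./src/enstore_functions.py above (accepting either a status tuple or a ticket dictionary carrying a 'status' element) can be sketched as follows. This is an illustrative reconstruction, not the Enstore source: the value of OK and the handling of malformed input are assumptions.

```python
OK = "ok"  # assumed value of the success status string


def is_ok(status_or_ticket):
    """Return True if the given status, or the ticket's status, is OK.

    Accepts either a status tuple of length two, e.g. ("ok", None),
    or a dictionary with a 'status' element holding such a tuple.
    """
    if isinstance(status_or_ticket, dict):
        # A ticket without a 'status' element is treated as not OK.
        status = status_or_ticket.get("status", (None, None))
    else:
        status = status_or_ticket
    try:
        return len(status) == 2 and status[0] == OK
    except TypeError:
        # Not a sequence at all, so it cannot be a valid status.
        return False
```

With this shape, callers can write is_ok(ticket) or is_ok(ticket['status']) interchangeably, which is what lets code that checks ticket status fields be simplified.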
========== ./src/ftt_driver.py ====================================================================================

make diagnostic messages reflect what seek does

========== ./src/inquisitor.py ====================================================================================

move non enstore import functions to enstore_functions2 check more the mover and lm status tickets check ticket from mover for completeness mover state check remove safe_dict remove old interface docs

========== ./src/inquisitor_plots.py ====================================================================================

add mounts/drive type plots add import of enstore_functions2 move non enstore import functions to enstore_functions2 check if node is up before rcp

========== ./src/inventory.py ====================================================================================

move non enstore import functions to enstore_functions2 also output a file to be read in to create the total bytes counter

========== ./src/makeplot.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/monitored_server.py ====================================================================================

check more the mover and lm status tickets add self and default add fields for lm check ticket from mover for completeness

========== ./src/mover.py ====================================================================================

made the same change to disk mover do not calculate CRC in the tape thread if net thread detected a transfer failure added volume name to alarm message generate an alarm if mover is too long in certain states fixed a bug added log message after dismount completes 1. in assert_volume check a volume label and then decide if it is correct even if there is no label at all. 2. do not dismount tape on ENCP_GONE.
replace current_work_ticket with wrapper_dict in the vol_labels call added trace for complete CRC removed rewriting label for 9940B because firmware was fixed Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2. use create_wrapper_dict to get an expected by wrapper dictionary added handle_error moved log message use a vol_labels wrapper method to label blank tapes to match with whatever wrapper requires Due to a HW problem a driver does not report a label on the mounted tape, which was causing a traceback during a forced dismount. modified restart, added timestamps for mounting and mounted log messages attempt to fix an attribute error when sending status while reinitializing a buffer fixed bug changes related to volume assert and drive type in the mounting and mount log messages 1st running code for the volume assert volume assert bugs fixed volume assert added fixed lint complaint changed message in stop_draining better way of dealing with specifics of 9940B drive, actually it is a bug in the firmware Added hack to write label twice. On initial tape density conversion in a 9940B, the drive fails to write the filemark. Rewriting the label and filemark covers up the problem.
Silly bug in pad subtraction fixed and neat Wayne's shortcut for pad calculation read last block with padding and remove pad length from CRC checks this is due to an ILI/ENOMEM on 9940B when you try to read less than full block return client_ip in the status message move non enstore import functions to enstore_functions2 added client hostname to status info

========== ./src/null_wrapper.py ====================================================================================

added vol_labels

========== ./src/plotter.py ====================================================================================

add debug messages add extra page support to plot page move non enstore import functions to enstore_functions2

========== ./src/verify_db.py ====================================================================================

make mylint happy use low level cursor to make it much faster

========== ./src/volume_clerk.py ====================================================================================

fix check veto list guard inquire_vol() against external_label == None log when listing all volumes add a missing return

========== ./src/file_clerk_client.py ====================================================================================

survive premature failure of get_brand()

========== ./src/option.py ====================================================================================

Added the --get-asserts option for the library manager. Added the --mover-timeout option for the volume_assert. Added two new options, --tagchmod and --tagchown. Added the function check_correct_count(). This allows the code to check if the user specified extra options that were not expected.
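The idea behind check_correct_count() can be illustrated with a small standalone sketch. The real function lives in option.py and its signature is not given in these notes, so the parameters below (a list of leftover arguments and an expected count) and the choice of raising ValueError are hypothetical, chosen only to show the kind of check described.

```python
def check_correct_count(args, expected=0):
    """Reject command lines that carry more arguments than expected.

    Hypothetical standalone version for illustration: 'args' is the
    list of arguments left over after option parsing, and 'expected'
    is how many of them the selected option legitimately consumes.
    """
    extra = args[expected:]
    if extra:
        raise ValueError("unexpected extra arguments: %s" % " ".join(extra))
```

A caller would run this after parsing, so that a mistyped command line with stray arguments fails with a clear message rather than silently ignoring the extras.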
========== ./src/configuration_client.py ====================================================================================

bump up trace severities remove print add subscribe for new config msg

========== ./src/event_relay.py ====================================================================================

remove some log messages

========== ./src/file_clerk.py ====================================================================================

log when listing the tape ignore pnfs_mapname if the record does not have it

========== ./src/enstore_display.py ====================================================================================

Negative positions are possible. The code did not take this into account. The regular expressions parsing the geometry now handle negative positions. Now uses a .entvrc file to record the geometry of the window when it is closed. It is also possible to change the background color with the .entvrc file. The checkbutton on the menubar is now set to on initially. Also, the resize event -- from a user -- waits a finite amount of time until the window contents are redrawn. Should extra resize events occur, the wait time is reset to the full wait time. This should cut down on the number of times the canvas is redrawn when the user is resizing the window. Added the menubar with the option to turn off animation. Working on reducing CPU usage. Fixed the MoverDisplay windows. User can click on a connection and it will change color. Other general cleanups. Doesn't crash when resized. Child window for mover status seems stable. move non enstore import functions to enstore_functions2 Removed a debug return that prevented unused clients from being deleted. Volume background now disappears along with the text. Other little fixes. When a connection is terminated the line is removed from the display. Everything seems to work *correctly* now. Volume class gone. Trace class used. Debugging output (mostly) gone. Font sizes are good.
Some font color changes too. Changes to the timer. Code cleanup. The font selection for the mover text will select a size to fit in the designated space. Geometry selection is cleaner. Bug fix for font size selection. Faster startup. With new movers, entv can determine the client machine at startup. Death of entv is handled more gracefully. Speed up the initialization. Put mover timeouts in during initial status check. Before, movers 0 through k were positioned for each k; now when k is positioned, that is the only one positioned. Also, changed where the movers get drawn to. This includes some general fixes, mostly having to do with cleaner starting and stopping, but some with location of graphics. major cleanup. better threading. better startup. ability to reinitialize.

========== ./src/library_manager.py ====================================================================================

fixed a bug in processing write requests This change returns a status to volume_assert.py on the initial request. Undo the previous changes. Made changes to the volume assert code. The largest of these is adding the assert pending and active queues to the output from --get-queue. These changes are for the volume assert test. queue requests with restricted host access, do not ignore them fixed a bug introduced in the previous rev. expanded functionality of restrict_host_access more flexible match in restrict_host_access explicitly delete a 3rd argument returned by discipline for restrict_version_access added diagnostic messages do not process adminpri write requests if at least one for the given volume family has been processed modified postponed queue add subscribe for new config msg change affecting only disk movers

========== ./src/host_config.py ====================================================================================

Fixed a problem with the file passing mylint.py/pychecker.
This file was not correctly handling the case where a hostip line was listed in the enstore.conf file, but no interfaces were listed.

========== ./src/volume_clerk_client.py ====================================================================================

fix --set-comment argument counting bug sort file list according to location cookie

========== ./src/cpio_odc_wrapper.py ====================================================================================

make ticket optional add vol_labels procedure

========== ./src/cern_wrapper.py ====================================================================================

fix default for declaration date fix vol_labels call fix indenting remove unused DEVICE add fermi specific info fix bug in getting info from ticket lots of changes, backup bug fixes

========== ./src/discipline.py ====================================================================================

make deepcopy of arguments to return, because if not and args are modified, the original args are modified as well discard changes made in the previous release do not reread configuration information

========== ./src/ratekeeper.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/show_volume_cgi.py ====================================================================================

calculate bytes written

========== ./src/generic_server.py ====================================================================================

check for msg existence first add subscribe for new config msg

========== ./src/scanfiles.py ====================================================================================

skip symbolic links ignore .removed directories ignore .bad directory from top-down scan use alternative path if necessary.
take alternative path into consideration while comparing the path make path mismatch to be warning only fix drive comparison fix typo deal with missing field when missing layer 4 take care of no layer 4 exceptions fix a typo now does batch take care of missing layer 1 and/or layer 4 take care of very large file size make mylint and pychecker happy consistent treatment to symbolic links protected for missing keys skip volmap directories guard for non enstore file

========== ./src/monitor_client.py ====================================================================================

If the node name cannot be resolved into an ip, skip the node. Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2. move non enstore import functions to enstore_functions2

========== ./src/enstore_overall_status.py ====================================================================================

bug fixes change output directory remove enstore_files import, use enstore_functions2 use enstore_functions2 make enstore_overall_status run on hppc lint fix reduce frequency of emails send mail when can't rcp from a node for overall status page ping node before rcping

========== ./src/monitor_server.py ====================================================================================

Changed FNCTL.O_NONBLOCK to os.O_NONBLOCK in preparation for python 2.2. move non enstore import functions to enstore_functions2

========== ./src/enstore_saag_network.py ====================================================================================

move non enstore import functions to enstore_functions2

========== ./src/library_manager_client.py ====================================================================================

Undo the previous changes. Made changes to the volume assert code. The largest of these is adding the assert pending and active queues to the output from --get-queue. These changes are for the volume assert test.
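Several entries above (encp.py, mover.py, monitor_client.py, monitor_server.py) record the switch from FNCTL.O_NONBLOCK to os.O_NONBLOCK. As a minimal sketch of how that constant is typically used, written in modern Python for Unix-like systems rather than taken from the Enstore code, the helper below puts a file descriptor into non-blocking mode. The function name is hypothetical.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Turn on O_NONBLOCK for a file descriptor and return the new flags.

    Hypothetical helper for illustration; requires a Unix-like system,
    since the fcntl module is not available elsewhere.
    """
    # Read the current status flags so the other bits are preserved.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
    return fcntl.fcntl(fd, fcntl.F_GETFL)
```

Taking the constant from the os module avoids depending on the older FCNTL module, which is the motivation the commit messages give for the change.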
========== ./src/enstore_plots.py ====================================================================================

do not traceback when there are no xfers in a day add mounts/drive type plots move non enstore import functions to enstore_functions2

========== ./src/manage_queue.py ====================================================================================

Undo the previous changes. Made changes to the volume assert code. The largest of these is adding the assert pending and active queues to the output from --get-queue.

========== ./tools/pychecker/Config.py ====================================================================================

Updating to pychecker 0.8.6.

========== ./tools/pychecker/OP.py ====================================================================================

Updating to pychecker 0.8.6.

========== ./tools/pychecker/Stack.py ====================================================================================

Updating to pychecker 0.8.6.

========== ./tools/pychecker/__init__.py ====================================================================================

Updating to pychecker 0.8.6.

========== ./tools/pychecker/checker.py ====================================================================================

Updating to pychecker 0.8.6.

========== ./tools/pychecker/warn.py ====================================================================================

Updating to pychecker 0.8.6.