Enstore Server Upgrades

If you have a driver disk, hit YES.
Remove the kickstart floppy disk.
Insert the appropriate driver disk and hit YES again. The driver disk will begin to load.
When the disk has loaded, you will be prompted again.
If you wish to load additional driver disks, continue.
If you do not have additional driver disks, tab to NO and hit enter.
Otherwise, load the next disk and hit enter.
You do not need to reinsert the Enstore kickstart disk.
You may remove the driver disk.

Poking around during the install

When the install Completes

Post install Reboot

Troubleshooting

Fdisk Errors

You are unable to log in

SYSCONNECT NIC Errors:

Instructions & Additions to troubleshoot sysconnect

the sk98lin.o modules file

Unwind Motherboard Vendor Bios Firmware

Downtime rules:

Partial Server Upgrade Plan

General List

Chih-Hao's notes:

-- Chih-Hao writes:
In light of April 3 being:

the first working day after the daylight saving time change.
the first working day after Dan Ryan reconstruction begins ...

this is what I'll do for d0 upgrade:

I will start the jobs from home ...

I assume that I will get an e-mail notice of d0en being paused by 8:00 a.m. Please do not touch d0ensrv[036] ... until I send out a notification
I'll wait for the 8:10 backup to run its natural course.
After the backup finishes (should be in 10 minutes), I'll stop file_clerk, volume_clerk, info_server, accounting_server, and drivestat_server.
I'll dump the current databases ... should be done in half an hour.
I'll shut down the database servers.
I'll send out an e-mail notification to the ring master, with a cc: to enstore-admin.
Then, ISA may shut down the machines and do the OS upgrade.
I'll beat the traffic to get here ...
After getting the go-ahead for d0ensrv[036], I'll do the rest. The estimated time is about 4 hours.

Upgrade meeting notes from 03/08/2006 11:55 AM

d0en Apr 3,4 upgrade list from the board
Monday and Tuesday:

Start 8am

backup pnfs database (vp)
backup f/v database (ch)
backup acc database (ch)
backup servers' state to srv2 raid (TJ+MZ)
backup servers' state to srv3 raid (TJ+MZ)
Upgrade srv4 (IA)
After (acc db bup) upgrade srv6 (IA+MZ)
After (f/v db bup) upgrade srv0 (IA+MZ)
After (pnfs db bup) upgrade srv1 (IA+MZ) - don't delay this
After (up srv6) upgrade pg srv6 (CH)
After (up srv0) upgrade pg srv0 (CH)
After (up srv1) upgrade pg srv1 (VP)
upgrade srv2 (IA)
upgrade srv3 (IA)
upgrade postgres clients (CH)
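
The "After (...)" entries in the list above form a small dependency graph. As a rough sketch (not part of the original plan), the ordering can be checked with Python's standard `graphlib` module. The task names are shorthand invented for this example, and only the dependencies stated explicitly in the list are encoded; tasks with no stated prerequisite (srv2, srv3, srv4, the state backups, the postgres clients) are left out.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Names are illustrative shorthand for the entries in the board list.
DEPENDS_ON = {
    "backup acc db": set(),
    "backup f/v db": set(),
    "backup pnfs db": set(),
    "upgrade srv6": {"backup acc db"},
    "upgrade srv0": {"backup f/v db"},
    "upgrade srv1": {"backup pnfs db"},
    "upgrade pg srv6": {"upgrade srv6"},
    "upgrade pg srv0": {"upgrade srv0"},
    "upgrade pg srv1": {"upgrade srv1"},
}


def waves(depends_on):
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the plan is circular
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


if __name__ == "__main__":
    for number, tasks in enumerate(waves(DEPENDS_ON), start=1):
        print(f"wave {number}: {', '.join(tasks)}")
```

With the dependencies as encoded here, the three database backups come first, then the three OS upgrades, then the three postgres upgrades. Real scheduling would also depend on who is available for each step, which this sketch ignores.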

Tuesday:
LTO bin installation in ADIC robot

srv5 and srv7 to be upgraded independently at another time (before or after)

David - QA on upgrades
Pre-stuff

Send out email about home areas and ask to clean up home areas on the srv machines (TJ)
write backup script (TJ+MZ)
kickstart cleanup (TJ+MZ)
HW inventory (TJ+MZ)
Procedure for each SRV (TJ+MZ)

~srv0

~srv1

Issues to verify after a ~srv1 install

stop dcache

start dcache

update farmlets

~srv2

What issues remain with remedy_api

apache

Copy the correct files over

Copy cgi scripts

~srv3

Things tweaked after install of stkensrv3.

CRON and histograms

We forgot to make a fresh copy of the ~enstore/CRON and ~root/CRON files from the old to the new stkensrv3 systems. I have copied over the output files that had changed between 12/9 and today, and that hadn't already been superseded. And I've merged the histogram file data.
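
The selection described above (files changed since the cutoff date and not already superseded on the new system) could be sketched roughly as follows. This is an illustration only, not the procedure that was actually used: the function name `files_to_copy` and its parameters are hypothetical, and "superseded" is assumed here to mean that the new system already holds a copy with the same or a later modification time.

```python
import os


def files_to_copy(old_root, new_root, cutoff):
    """List files under old_root that still need copying to new_root.

    A file qualifies if it was modified after `cutoff` (seconds since the
    epoch) and new_root has no copy that is at least as recent.
    Returned paths are relative to old_root.
    """
    selected = []
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            rel = os.path.relpath(old_path, old_root)
            old_mtime = os.path.getmtime(old_path)
            if old_mtime <= cutoff:
                continue  # not changed since the cutoff
            new_path = os.path.join(new_root, rel)
            if os.path.exists(new_path) and os.path.getmtime(new_path) >= old_mtime:
                continue  # already superseded on the new system
            selected.append(rel)
    return sorted(selected)
```

The function only reports candidates; copying them, and merging data such as the histogram files, would remain a manual step.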

dCache install issues:

pageDcacheCms*

dcache_page_dccpcms

PageDcacheSRM & pageDcacheKftp

dcap and kftp products

globus - grid certs

total 64
drwxrwxr-x   13 enstore enstore      4096 May 28 2003 .
drwxr-xr-x   27 enstore enstore      4096 Dec 9 15:49 ..
drwxrwxr-x    2 enstore enstore      4096 May 28 2003 bin
drwxrwxr-x    6 enstore enstore      4096 May 28 2003 etc
-rw-r--r--    1 enstore enstore      6715 Apr 24 2002 GLOBUS_LICENSE
drwxrwxr-x    4 enstore enstore      4096 May 28 2003 include
drwxrwxr-x    3 enstore enstore      8192 May 28 2003 lib
drwxrwxr-x    3 enstore enstore      4096 May 28 2003 libexec
drwxrwxr-x    6 enstore enstore      4096 May 28 2003 man
drwxrwxr-x    2 enstore enstore      4096 May 28 2003 sbin
drwxrwxr-x    3 enstore enstore      4096 May 28 2003 setup
drwxrwxr-x    5 enstore enstore      4096 May 28 2003 share
drwxrwxrwx    2 enstore enstore      4096 May 28 2003 tmp
drwxrwxr-x    2 enstore enstore      4096 May 28 2003 var

Certificates

Installs for pageDcache cronjobs

~srv4

~srv6

~srv5 & ~srv7

Outstanding Questions?

tcp-wrappers are installed

Are we setting hosts.allow correctly?

# Loopback interface
ALL: localhost 127.0.0.0/255.0.0.0: banners /etc/banners

# FermiLab Network
ALL: .fnal.gov: banners /etc/banners
ALL: 131.225.0.0/255.255.0.0: banners /etc/banners

# Minos Soudan (only needed for STKEn)
ALL: 198.124.212.0/255.255.255.0: banners /etc/banners
ALL: 198.124.213.0/255.255.255.0: banners /etc/banners

# Enstore Private Network
ALL: 192.168.19.0/255.255.255.0: banners /etc/banners

I have sent this note to Troy and Connie. We may have questions about
the sendmail config files. D0enmvr7a uses the installed defaults.
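
One way to sanity-check the address-based hosts.allow entries above is to test sample client addresses against the listed networks. The sketch below uses Python's standard `ipaddress` module. It covers only the numeric network/netmask entries; the name-based patterns (`localhost`, `.fnal.gov`) are matched by tcp-wrappers through name lookups and are not modeled. The function name `is_allowed` is made up for this example.

```python
import ipaddress

# Networks copied from the address/netmask entries in hosts.allow above.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/255.0.0.0"),          # loopback
    ipaddress.ip_network("131.225.0.0/255.255.0.0"),      # Fermilab network
    ipaddress.ip_network("198.124.212.0/255.255.255.0"),  # Minos Soudan
    ipaddress.ip_network("198.124.213.0/255.255.255.0"),  # Minos Soudan
    ipaddress.ip_network("192.168.19.0/255.255.255.0"),   # Enstore private network
]


def is_allowed(address):
    """Return True if the IPv4 address falls inside any listed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for sample in ("131.225.1.1", "192.168.19.5", "192.168.20.5"):
        print(sample, is_allowed(sample))
```

This checks only the allow list as written. What happens to a client that matches none of the entries depends on hosts.deny, which is not shown here.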

KickStart HowTo

twiki log

How to use the install CD:

If you use an unmodified CD:

Using an Enstore Kickstart CD:

Using an Enstore Floppy install disk:

Before Rebooting:

Begin Install:

Floppy disk

CD