Integration plan for the off-line database servers
===================================================
August 4, 1999

Current configuration (July '99)
--------------------------------
The current configurations of the production and development Oracle servers
are outlined below. The production server is a Sun E3500, which has 5
backplane slots, and the development server is a Sun E4500 with 8 slots. We
originally deployed the larger machine as the development server, intending
to test various configurations of CPU, I/O, and memory. The two machines have
approximately 80 GB (development) and 100 GB (production) of storage on
fast/wide SCSI. Both systems run Sun Solaris 2.6.

There are currently instances of Oracle 8 and 8i running on the development
server, and production and integration instances of Oracle 8 on the
production machine. One Apache web server runs on the development machine,
and two Apache servers (integration and production) run on the production
server.

A full hot backup of the production database is done daily, and cold backups
of the development database instances are done once a week. The current
backup strategy uses the DLT 4000 stacker connected to d0ora2. The hot
backups use Oracle's RMAN to write a backup image to disk; FMB is then used
to copy the disk image to tape.

Following is a summary of each hardware configuration.

d0ora1: E4500 development server
  Storage  16 slots of a shared 24-bay 3.5" disk chassis
           scsi#1 has 5 x 9 GB drives
           scsi#2 has 4 x 9 GB drives
  CPU      2 x UltraSPARC I
  Memory   1 GB
  I/O      2 x I/O S-Bus boards w/ 2 x f/w diff. SCSI host adapters
           (each I/O card has 2 S-Bus channels, with 1 on one and 2 on
           the other)

d0ora2: E3500 production server
  Storage  8 slots of a shared 24-bay 3.5" disk chassis
           scsi#1 has 5 x 9 GB drives
           scsi#2 has 1 x DLT 4000 stacker (for backup)
                    1 x 24 GB drive in a shoebox
           scsi#3 has a 9-bay LVD chassis on a Rancho converter
                    1 x 36 GB drive
  CPU      4 x UltraSPARC I
  Memory   1 GB
  I/O      1 x I/O S-Bus board w/ 4 x f/w diff. SCSI host adapters
           (each I/O card has 2 S-Bus channels, with 1 on one and 2 on
           the other)

Plan through March 1, 2000
--------------------------
We plan to continue to grow and test the system over the next several months.
We will add a FC RAID disk subsystem and a backup robot to the E4500, and
reconfigure the current SCSI disk so that it is primarily on the E3500.

The RAID disk system will consist of a Fibre Channel attached, dual RAID
controller system with Fibre Channel disks. The system will be configured
with RAID 0, 1, and 5 components suitable for the range of performance needed
to meet the database requirements. We will conduct systematic tests to
confirm the redundancy, RAID disk rebuild, and hot spare features, and to
better understand the memory, CPU, and I/O needs.

The current plan is to have 3 x (4+1) RAID 5 sets (216 GB usable), one 18 GB
RAID 1 mirrored set (18 GB usable), and 3 hot spares, for a total of 20 x 18
GB disks (see the capacity check below). In addition, the planned system will
have 12 x 9 GB disks, which will be configured into 4 mirrored sets and a
2-disk stripe. The 18 GB spares can also be used as spares for the 9 GB
drives if one of them fails.

The backup robot will be a 120-slot Exabyte robot with 2 Mammoth I drives. It
is expandable to 200 slots and 10 Mammoth II drives. All of the tape drives
will be attached to the E4500, and the Legato software will be installed so
that this machine controls the robot. Backup of the E3500 will be done over
the network.
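The capacity check referred to above is the following short Python sketch.
The disk counts and sizes are those stated in the plan; the calculation is
purely illustrative.

    # Rough usable-capacity check for the proposed FC RAID layout
    # (illustrative only; disk counts and sizes come from the plan above).
    DISK_GB = 18                                 # one Fibre Channel disk

    raid5_sets = 3                               # 3 x (4+1) RAID 5 sets
    raid5_usable = raid5_sets * 4 * DISK_GB      # parity excluded -> 216 GB
    raid1_usable = 1 * DISK_GB                   # one mirrored pair -> 18 GB
    hot_spares = 3                               # contribute no usable space

    disks_needed = raid5_sets * 5 + 2 + hot_spares   # 20 x 18 GB disks
    print(f"RAID 5 usable space: {raid5_usable} GB")
    print(f"RAID 1 usable space: {raid1_usable} GB")
    print(f"18 GB disks required: {disks_needed}")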
If the 10/100 Base-T network interface proves too small, we may need to
upgrade it.

Following is a rough plan of activities to be pursued over the next 9 months.
Additional details about the systems and strategies are included in the next
section.

8/1/99   Develop test suite to explore memory, CPU, and I/O usage.
         Evaluate hardware configurations and review all components.

9/1/99   Detailed plan for
         -E4500 initial RAID configuration
         -E4500 tape drive and robot configuration
         -E3500 SCSI disk configuration

10/1/99  RAID system delivered.
         -Connect to E4500 machine.
         -Configure RAID 0, 1, and 5 disk.
         -Test various configurations for throughput.
         -Test dual redundancy.
         -Test RAID fault recovery.
         -Test hot swap features.
         Backup ATL delivered.
         -Install on E4500 and test.
         Web server node delivered or identified.
         -Prepare web server node.

11/1/99  E4500 - RAID system and ATL ready for use.
         -Deploy RAID disk sets.
         -Begin setting up backup to use ATL and Legato software.
         -Begin CPU, memory, I/O, and transactions-per-second tests.
         -Populate large test db.
         E3500
         -Re-deploy current SCSI disk as it becomes available.
          (Work toward moving all SCSI disk to the E3500 so the disk chassis
          is no longer shared between the 2 machines. A plan is needed; this
          may be difficult.)

12/1/99  Backup strategy using Legato in place for E3500 and E4500.
         E3500 has sufficient SCSI disk for production needs through March
         2000.

1/15/00  Make preliminary estimates for memory, CPU, and I/O needs.
         Start working on upgrade plans for the E4500.
         Determine if additional servers are needed.

3/15/00  Begin E3500 <-> E4500, production <-> development transition.

4/1/00   Start requisitioning additional CPU, memory, et cetera.
         E4500 transitioned to production and integration server.
         E3500 transitioned to development server.

4/15/00  Revisit db size requirements.
         -Determine how much additional disk is needed in FY00.
         -Start requisitioning disk.

The following studies are needed to properly understand the system and plan
for additional hardware (a sketch of a simple load-generation harness for
items 1 and 2 follows this list):
1. Tests to understand CPU needs. Submit multiple transactions/queries.
2. Tests to understand memory needs. Submit multiple transactions/queries.
3. Sizing and use of the integration db.
4. Understand the resources needed to run the web servers.
5. Determine which additional servers, SAM and other, need to run on the
   production server machine.
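The CPU and memory studies above amount to submitting many concurrent
transactions and queries against a test instance and observing the machine.
As a rough illustration of the kind of harness we have in mind, the following
Python sketch submits a placeholder query from several concurrent clients and
reports elapsed times. The connection details, schema, and query are
hypothetical, and any Oracle client interface could be substituted; this is
not the actual test suite.

    # Illustrative load-generation harness (not the actual D0 test suite).
    # Connection details and the test query are placeholders.
    import time
    import threading
    import oracledb  # any Oracle client library would do

    DSN = "d0ora1-test/orcl"                    # hypothetical dev instance
    TEST_QUERY = "SELECT COUNT(*) FROM events"  # placeholder query
    N_CLIENTS = 10                              # concurrent users simulated
    N_QUERIES = 20                              # queries per simulated user

    def worker(results, idx):
        # Each simulated user opens its own connection and runs the query
        # repeatedly, recording its total elapsed time.
        conn = oracledb.connect(user="test", password="test", dsn=DSN)
        cur = conn.cursor()
        start = time.time()
        for _ in range(N_QUERIES):
            cur.execute(TEST_QUERY)
            cur.fetchall()
        results[idx] = time.time() - start
        conn.close()

    results = [0.0] * N_CLIENTS
    threads = [threading.Thread(target=worker, args=(results, i))
               for i in range(N_CLIENTS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(f"{N_CLIENTS * N_QUERIES} queries from {N_CLIENTS} clients")
    print(f"mean per-client elapsed: {sum(results) / N_CLIENTS:.1f} s")

While such a harness runs, CPU, memory, and I/O usage on the server would be
recorded with the usual Solaris monitoring tools.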
Three year growth and usage plan
--------------------------------
Following is a description of the plan for the storage, backup, CPU, memory,
and I/O for the Ora systems. The production server will be a critical device
for the D0 off-line, and its availability target is 99.9%, not including
carefully scheduled down times for upgrades. In order to achieve this level
of availability, and to make the system more easily maintained, we are paying
close attention to redundancy (dual power supplies, dual redundant RAID
controllers, dual I/O boards, multiple CPU boards, etc.). We also plan to
have the entire system on a UPS (uninterruptible power supply). In order to
reduce security risks, we need to move the web servers off of the production
Oracle machine to a dedicated server node.

Growth needs for the system are projected in Table I. There are still many
poorly understood and/or missing details. However, this is an attempt to draw
a road map for the next three years, which we will continue to update as we
better understand the problem.

DISK
----
We plan to add a FC RAID disk system to the Oracle servers. This system will
be equipped with dual RAID controllers and configured to remain fully
functional, at reduced throughput, if any element of one FC I/O channel (RAID
controller, FC connection, host adapter, I/O board, etc.) fails.

All data and index table space will be RAID 5 with a 4+1 (4 data, 1 parity)
disk configuration. The total usable RAID 5 space needed is projected to be
1.5 TB by the end of the running period. The size of the RAID 1 areas should
not grow beyond their initial allocations, unless we decide that certain
heavily used table or index spaces require higher performance than RAID 5
offers.

Oracle requires the following special administrative storage locations. This
disk should be mirrored (RAID 1) to provide a highly available and lower
maintenance system.

Control files:  Files of a few hundred MB which can be configured to have
                copies on multiple disks. The typical arrangement is to use
                3 disks.
Redo logs:      Very small (~2 MB) files, typically copied to two disks.
                These disks can be shared with the control file disks.
Archived redo:  The archived version of the redo logs, used to recover the db
                in case of failure. Usually kept for 2 or 3 days. Size
                depends on activity level and can be large, perhaps many GB.
                An 18 GB disk is good; 9 GB may be adequate.
Rollback:       Storage for uncommitted data. May be a few hundred MB.

Temporary storage is also needed for the temporary table space. This disk is
used as swap space when memory is exceeded, so it should be highly performant
and consequently might be 2 or more disks striped together. The system disk
will also be mirrored (exact configuration to be determined).

Backup
------
The data will be fully backed up on a weekly basis, with daily incrementals.
There will be 4 sets of weekly backup tapes, so every 5th week the media used
4 weeks prior will be reused. If the database grows to 1.5 TB, each full
backup will require 30 tapes, and the full set of backup tapes for the entire
4-week cycle will be 120 tapes. A similar rotation will be used for the daily
backups. A single 50 GB cartridge is expected to be sufficient for each daily
backup, so there will be 24 daily tapes. Therefore, a tape library with at
least 144 slots (120 full-backup tapes plus 24 daily tapes), plus a few for
cleaning tapes and possibly system backup, will be needed. The tape and
duration arithmetic is sketched below.

We plan to use Oracle RMAN (Recovery Manager), which enables us to perform
"hot" backups and also reduces the storage needed significantly (the exact
factor is not yet known). This configuration requires that a separate machine
running the RMAN repository (catalog) server be provided and maintained. This
is not a major resource machine, but it must be reliable and on UPS
(uninterruptible power supply).

We are currently purchasing Legato software to manage the file and volume
locations in the library for the backup. The software being purchased
includes the following products (my understanding of this is limited, so
details are sketchy and may be incorrect):
1. Network Edition for Unix, with update service
   This allows one host (the E4500) to back up data stored on it and, over
   the network, data stored on up to 5 clients.
2. Auto-changer Software Module, 1-256 slots, with update service
   This is needed to use the 200-slot robot.
3. Oracle Module - UNIX client, with update service
   This is the part which interfaces with Oracle. One license is needed per
   client (per machine, I believe). We need 2 licenses.
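As a cross-check of the tape counts above and of the backup rates and
durations listed in Table I, the short Python sketch below recomputes them
from the stated assumptions (50 GB cartridges, a 4-week rotation, 24 daily
tapes, and the aggregate drive rates given in the table). Nothing here is
measured.

    # Backup sizing cross-check (illustrative; numbers come from the plan
    # text and Table I, not from measurements).
    CARTRIDGE_GB = 50          # assumed capacity of one tape cartridge
    FULL_SETS = 4              # weekly full backups kept in rotation
    DAILY_TAPES = 24           # 6 daily tapes in a 4-week rotation

    def tapes_per_full(db_gb):
        """Whole tapes needed for one full backup (ceiling division)."""
        return -(-db_gb // CARTRIDGE_GB)

    def backup_hours(db_gb, rate_mb_s):
        """Hours to stream db_gb at an aggregate drive rate of rate_mb_s."""
        return db_gb * 1024.0 / rate_mb_s / 3600.0

    phases = [("Phase 1", 20, 12), ("Phase 2", 100, 2 * 12),
              ("Phase 3", 1000, 10 * 12), ("Phase 4", 1500, 10 * 12)]
    for phase, db_gb, rate in phases:
        full = tapes_per_full(db_gb)
        slots = full * FULL_SETS + DAILY_TAPES
        print(f"{phase}: {full} tapes/full, {slots} library slots, "
              f"{backup_hours(db_gb, rate):.1f} h per full backup")

For the 1.5 TB Phase 4 database this reproduces 30 tapes per full backup,
144 library slots, and roughly 3.5 hours per full backup, in agreement with
Table I.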
All of the backup drives in the robot will be connected via SCSI (LVD Ultra2
SCSI for Mammoth II??). The E3500 (to be used for development) server will be
backed up via the network. Additional servers can be backed up over the
network for an additional license fee of about $5k each. (The hardware
configuration for the tape drives needs to be worked out; we need to be sure
there are enough SCSI busses for the drive throughput.)

CPU, Memory, and I/O
--------------------
We can only guess at the needs for CPU, memory, and I/O. Some details about
the anticipated usage patterns are included in the table, but it is difficult
to understand how these translate into actual hardware without testing. In
the next several months, we plan to rigorously explore our needs in these
areas by processing the test transactions and queries discussed previously.

We can upgrade the UltraSPARC 336 MHz CPUs to UltraSPARC II 400 MHz CPUs, but
the two cannot be mixed in the same machine. Two CPUs and up to 1 GB of
memory can be added to each CPU/memory board. Therefore, one might imagine a
maximally configured E4500 with 2 I/O boards and 6 CPU/memory boards, giving
12 CPUs and 6 GB of memory.

Growth strategy and usage patterns
----------------------------------
Table I shows the anticipated growth strategy and estimated usage patterns
for the next 3 years. Phase 1 is the period leading up to the run, during
which the system will be used for Monte Carlo data and testing. By Phase 2,
the system will be populated with a few hundred million events in 100,000
files, and backup and performance will be critical. Phase 3 will begin to
push the size limitations of the system and really test the performance for
reads and writes. The final phase is expected to begin in January of 2003, at
which point most of the activity should be reading the database; however,
should the data-taking run continue beyond that point, this will change.

The table shows several parameters which characterize the size and usage
patterns. The sizes are estimates based on Oracle's Designer 2000 tool, using
the estimated number of rows per table and the current table design for the
SAM event and file database. For example, the events table, which is
calculated as holding 999,999,999 rows, maxes out at 692 GB, which is about
half of all the space required. It is assumed that the additional storage
needed for the on-line, calibration, trigger, etc. databases will be
relatively small, probably less than 100 GB total.

Usage applications are divided into three categories: transaction processing,
batch processing, and query processing. Typical examples of each are included
below the table. Additional work is needed to understand how these
applications translate into disk I/Os and throughput. The number of
concurrent users is a very rough guess, but it accounts for the fact that we
expect most of the db access to be through middle-ware servers (for example
the sam_db_server), reducing the number of concurrent database users from the
number one might naively expect.

Table I. Growth and usage patterns for the D0 off-line database.

Parameter           Phase 1      Phase 2      Phase 3      Phase 4
                    Pre-Run      Early Run    Late Run     Post Run
                    (Jul '00)    (Jan '01)    (Jan '02)    (Jan '03 ->)
------------------------------------------------------------------------------
USERS (concurrent)
==================
 Conc. Users        10           20           30           20

TABLE & INDEX SPACE (Production db)
===================================
(RAID 5 Disk)
Total
 total table (GB)   20           100          1000         1500
 data table (GB)    10           50           500          750
 index table (GB)   10           50           500          750
 #tables            100          100          100          100
 #datafiles         12           50           500          750
Largest Table (events)
 Size (GB)          10           50           500          750
 #rows (@300 B)     10^5         10^8         10^9         1.5*10^9
 #datafiles         4            25           250          400
2nd Largest (data_files)
 Size (GB)          2            10           100          150
 #rows (@2 kB)      10^4         5*10^5       5*10^6       7*10^6
 #datafiles         1            5            50           75
3rd Largest (processes)
 Size (GB)          2            10           100          150
 #rows (@2 kB)      10^4         5*10^5       5*10^6       7*10^6
 #datafiles         1            5            50           75

TABLE & INDEX SPACE (Integration db - assume 10% of production db)
==================================================================
(RAID 5 Disk)
 total table (GB)   2            10           100          150
 data table (GB)    1            5            50           75
 index table (GB)   1            5            50           75
 #tables            100          100          100          100
 #datafiles         12           50           500          750

BACKUP
======
Full Backup (hot, 1/week, tapes in 4-week rotation)
(assume recovery takes 2x longer than backup)
 size (GB)          20           100          1000         1500
 tapes (@50 GB)     1            2            20           30
 rate (MB/s)        12           2x12         10x12        10x12
 duration (hr)      0.5          1.2          2.3          3.5
Incr. Backup (hot, 1/day, 6 daily tapes in 4-week rotation)
 size (GB)          20           20           20           20
 tapes (@50 GB)     1            1            1            1
 rate (MB/s)        12           12           12           12
 duration (hr)      0.5          0.5          0.5          0.5

Admin. DISK
===========
(RAID 1 - mirrored)
 System             9 GB         9            9            9
 Control Files      3x9 GB       3x9          3x9          3x9
 Redo logs          2x9 (w/ CF)  2x9 (w/ CF)  2x9 (w/ CF)  2x9 (w/ CF)
 Archived redo      18 GB        18           18           18
 Rollback           9 GB         9            9            9
(RAID 0 - striped)
 Temp table         2x18 GB      2x18         2x18         2x18

APPLICATIONS
============
Transactions (1) (OLTP)
 Rate (/min)        <1           10           >10          <1
 Size
 #Tables/columns
Batch (2)
 Rate (/day)        1            10           >10          <1
 Size
 #Tables/columns
Queries (3) (Decision Support)
 Rate (/min)        <1           1            5            1
 Size
 #Tables/columns

CPU & MEMORY
============
 UltraSPARC CPUs    4            4            6            6
 Memory est. (GB)   1            2            3            3

Notes:

1. Typical transactions (assume worst case for transfer: blk size x #tables x 2)

   a. Adding raw data files, including events
      -Transaction rate = 1/min
      -Data transfer = 1.2 MB + 8k x 10 x 2 = 1.36 MB
      -Required bandwidth =
      -Each transaction will add to the following tables:
                                 avg. Bytes  Columns
       +events:                  300         10      (4000 events = 1.2 MB)
       +data_files:              84          19
       +file_locations:          30          1
       +file_lineage:            22          5
       +processes:               51          11
       +import_processes??:      85          8
       +data_file_locations:     24          5
       +event_catalogs:          21          3
       +physical_data_streams:   23          3
       +volumes:                 53?
       +trigger_lists?:
       +trigger_streams?:
      -Total transfer ~750 bytes in 11 tables

   b. Adding processed files with processing information for consumers
      -Transaction rate = 10/min
      -Data transfer = 8k x 10 x 2 = 160 kB
      -Required bandwidth =
      -Each transaction will add to the following tables:
                                 avg. Bytes  Columns
       +data_files:              84
       +file_locations:          30
       +file_lineage:            22
       +processes:               51
       +analyse_processes:       11
       +consumers:               22
       +project_files:           15
       +consumed_files:          35
       +volumes:                 53?
      -Total transfer ~300 bytes in 9 tables

   c. Imported data from remote sites
      -Transaction rate = 1/day
      -Data transfer = 8k x 10 x 2 = 160 kB
      -Required bandwidth =
      -Each transaction will add to or update the following tables:
                                 Bytes       Columns
       +runs:                    38
       +data_files:              84
       +file_locations:          30
       +file_lineage:            22
       +processes:               51
       +analyse_processes:       11
       +volumes:                 53?
       +import_processes:        85
      -Total transfer ~300 bytes in 8 tables

   d. Adding and/or manipulating projects
      -Transaction rate = 10/day
      -Data transfer =
      -Required bandwidth =
      -Each transaction will add to or update one of the following tables:
                                 Bytes       Columns
       +project_definition:      135
       +project_snapshots:       67
       +analysis_projects:       69
      -Total transfer ~100 bytes in 1 table

2. Batch activities
   a. Incorporating on-line db info
      -Transaction rate = 8/day (assume run = 4 hrs)
      -Data transfer =
      -Required bandwidth =
      -Each transaction will add to or update the following tables:

   b. Synchronizing SAM db with Enstore
      -Transaction rate = 1/day
      -Data transfer =
      -Required bandwidth =
      -Each transaction will add to or update the following tables:

   c. Data processing summaries
      -Transaction rate = 1/day
      -Data transfer =
      -Required bandwidth =
      -Each transaction will add to or update the following tables:

3. Typical queries

   a. File query
      -Rate = 1/min
      -Data transfer =
      -Required bandwidth =
      -Each query may touch any table in the SAM db, primarily:
       +runs:
       +data_files:
       +data_file_locations:
       +physical_data_streams:
       +application_family:
       +physical_dataset:
       +file_lineage:
       +processes:

   b. Event query (assume run partitioning, searching 10% of data)
      -Rate = 1/hr
      -Data transfer =
      -Required bandwidth =
      -Each query will touch primarily one table:
       +events table: 10^8 rows x 300 B = 30,000 MB

   c. Calibration data access
      -Rate = 1/min (??) assuming one access per farm-processed file
      -Rate = 1/hr (??) assuming cached and updated only when needed
      -Data transfer = 200 MB
      -Required bandwidth =
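The "-Required bandwidth =" entries above remain to be filled in once the
transfer sizes are pinned down. Purely as an illustration of how they would
follow from the worst-case rule in Note 1 (8 kB block size x 10 tables x 2
per transaction, plus the 1.2 MB event payload for case 1a), the following
Python sketch computes the average sustained bandwidth for the transaction
types whose rates are given. These are back-of-the-envelope averages, not
peak requirements.

    # Worst-case transfer/bandwidth estimate per the rule in Note 1:
    # data transfer = 8 kB block size x 10 tables x 2, plus any bulk payload
    # (1.2 MB of event rows for transaction 1a). Rates come from the notes.
    BLOCK_KB = 8
    TABLES = 10
    PER_TXN_KB = BLOCK_KB * TABLES * 2      # 160 kB, as used in the notes

    cases = [
        # (label, extra payload in kB, transactions per minute)
        ("1a raw data files",  1200, 1.0),
        ("1b processed files",    0, 10.0),
        ("1c remote imports",     0, 1.0 / (24 * 60)),
    ]

    for label, extra_kb, per_min in cases:
        transfer_kb = PER_TXN_KB + extra_kb
        bandwidth_kb_s = transfer_kb * per_min / 60.0
        print(f"{label:20s} {transfer_kb:6.0f} kB/transaction, "
              f"avg {bandwidth_kb_s:8.3f} kB/s")

Even the heaviest case (1a at 1/min) averages only a few tens of kB/s, which
suggests that peak and concurrent loads, not averages, will drive the I/O
requirements; the tests planned above should measure those directly.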