File Replication Service Usage Plans - CMS
==========================================

CMS collaborators are active in several areas directly related to file replication:

1) CMS is undertaking a large Monte Carlo simulation and reconstruction production run in fall 2000, with of order 1 million events planned to be generated by several different physics groups. The processing of each event involves several stages, each to be performed at a different location, primarily Caltech, Wisconsin, FNAL and CERN. The processed events will be accessed and analyzed by physicists at those and several other locations. This task will be supported by Globus-based ORCA file replication services being developed by researchers at Caltech and CERN, in collaboration with the European Commission DataGrid project. These services will be implemented as a first prototype in time for the fall 2000 production. The prototype will allow replication of the data and metadata in streaming or on-demand modes (see the first sketch at the end of this note). Once replicas of the produced events have been made, additional processing steps will be executed at the primary sites, followed by further replication of the new results. At that stage, the results can be analyzed by the distributed groups of CMS physicists.

Contact: Irwin Gaines/FNAL, Jim Amundsen/FNAL, Vladimir Litvin/Caltech, Harvey Newman/Caltech, Asad Samar/Caltech

2) Caltech and UCSD are preparing a plan for Tier2 prototypes and Tier1 interaction, which will involve the purchase and installation of hardware and software. ORCA database file replication between CERN, FNAL and the prototype Tier2 servers at Caltech and UCSD will be one of the first tasks. The database files are each typically several hundred MBytes in size. The Tier2 prototypes will probably offer ~2 TByte of online disk storage. It is hard at this stage to estimate the WAN traffic from CERN or FNAL to the Tier2 servers accurately. However, we can postulate a half-fill of the available capacity at each site over a couple of days at the start of an analysis or re-reconstruction task, i.e. an average of ~50 Mbits/sec to both sites (see the worked estimate at the end of this note), followed by replication between the two sites to fill the remainder of the available capacity. This second phase will soak up available bandwidth on the SDSC-Caltech link.

Contact: Harvey Newman/Caltech, Julian Bunn/Caltech, Reagan Moore/SDSC

3) Objectivity-container and user-collection transport R&D at Caltech is focusing on CMS (and SDSS) data. A prototype replication system using new algorithms and based on ORCA, Globus and PPDG middleware will be ready in time for a demonstration at SuperComputing 2000 in November. This R&D is progressing in close collaboration with Globus and Johns Hopkins University.

Contact: Koen Holtman/Caltech

4) There are ongoing tests of high-performance networking between the various sites (Caltech, FNAL, CERN, and several other sites active in CMS software and computing), most notably between SLAC and Caltech in relation to the 100 MBytes/sec PPDG milestone. It is hoped to use ORCA Objectivity database files (as well as BaBar files) once the network is tuned. Coordination between networking experts at all the collaborating institutes is important to ensure that TCP/IP capabilities match and that routes etc. are set correctly (see the window-sizing example at the end of this note).

Contact: Les Cottrell/SLAC, Davide Salomoni/SLAC, Julian Bunn/Caltech, James Patton/Caltech
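The streaming and on-demand modes mentioned in item 1 can be illustrated with a minimal, hypothetical Python sketch. Local file copies stand in for the wide-area transfers, and none of the names below are part of the ORCA or Globus interfaces; only the two triggering disciplines are what the sketch is meant to show.

    import shutil
    from pathlib import Path

    def stream_replicate(produced_files, source_dir, replica_dir):
        # Streaming mode: push each database file to the replica site
        # as soon as production has written it.
        for name in produced_files:
            dst = Path(replica_dir) / name
            if not dst.exists():
                shutil.copy2(Path(source_dir) / name, dst)

    def fetch_on_demand(name, source_dir, replica_dir):
        # On-demand mode: pull a file only when an analysis job first
        # asks for it; later reads use the local replica.
        dst = Path(replica_dir) / name
        if not dst.exists():
            shutil.copy2(Path(source_dir) / name, dst)
        return dst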
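The ~50 Mbits/sec figure quoted in item 2 follows from simple arithmetic on the assumed numbers (half of a ~2 TByte Tier2 disk filled over roughly two days):

    bytes_to_move = 1e12                  # half of ~2 TByte per Tier2 site
    seconds       = 2 * 24 * 3600         # a couple of days
    rate_mbits    = bytes_to_move * 8 / seconds / 1e6
    print(round(rate_mbits), "Mbits/sec")  # ~46 Mbits/sec, i.e. of order 50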
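For the TCP/IP coordination mentioned in item 4, one concrete quantity the sites need to agree on is the socket/window size, which must cover the bandwidth-delay product of the path. A rough sketch follows; the round-trip time is purely an assumed value for illustration, not a measured SLAC-Caltech figure.

    rate_bytes_per_sec = 100e6            # 100 MBytes/sec PPDG milestone
    rtt_sec            = 0.020            # assumed 20 ms round-trip time
    window_bytes       = rate_bytes_per_sec * rtt_sec
    print(window_bytes / 1e6, "MBytes")   # ~2 MBytes of TCP window per stream

If the configured windows at either end are smaller than this, the achievable rate is capped at window/RTT regardless of the raw link capacity, which is why the window settings need to match at all the collaborating sites.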