Trip report to Brookhaven National Laboratory
Yujun Wu
10-29-2001

After the second US LHC Computing Facilities Workshop in July 2001 at Fermilab, the people at both the ATLAS and CMS user facilities recognized the importance of studying disk storage performance and reliability. Since Brookhaven National Laboratory (BNL) has a lot of experience with large-scale RAID disk storage systems, I went there to look at their RAID storage systems; Vivian and Rich arranged the visit. Another purpose of my visit was to show them the disk performance benchmark tools I have collected.

It was a very exciting, busy week for me at BNL. The system administrators who work on the disk storage systems at BNL are Maurice Askinazi and Dave Free. During the whole week, Maurice and Dave were very warm-hearted and showed me a lot, including their storage systems and how they configure them. I learned a lot from their experience. I got a basic idea of what kind of disk storage they are using and how to configure the disk arrays.

Maurice and Dave also taught me how to deal with vendors. They got free evaluation units from the vendors and did not even have to pay the shipping cost. They showed me their tricks for dealing with the companies and how to run a bid. I believe their practice has saved BNL a lot of money. I learned how they do their benchmark tests, and I showed Maurice the disk performance benchmark tools I have collected.

In the meantime, I also made contacts with people from MTI and LSI and learned how the companies' engineers tune and configure their products. Also, with Rich's help, I got a chance to exchange ideas with Jason Smith about monitoring tools. BNL has done a lot of work on this, and we can probably learn from them too.

In summary, the whole trip was very useful for me.
First, collaboration among the national laboratories can help in bargaining with vendors. Maurice, Dave and I all agree that even just the presence of people from other national laboratories in front of the vendors is worth a lot. This way, vendors will not think that the sale to BNL/FNAL is the final deal; they have to work hard in order to get more deals from other national laboratories. Second, I learned the basic ideas about the BNL disk array setup, configuration and usage. Third, I had a chance to have direct contact with vendors. And last, but not least, I believe it is important to learn from the bargaining experience of BNL. I think we need not pay extra if we can learn from Maurice's experience. Should we invite Maurice as our consultant?

Overall, I think this was a very successful trip, and I hope that we can invite Maurice and Dave to Fermilab sometime.

***********************************************************

The following were the daily activities:

Sunday (10/21/01):
I arrived at BNL on Sunday night around 10:20pm. The secretary had forgotten to inform the Security Department to put my name on their visitors' list, but I managed to get into the facility without trouble using my old NSLS ID, which is still valid.

Monday (10/22/01):
I met Maurice and Dave around 8:30am. Maurice told me their zzyzx disk had gone bad on Sunday night. He had to come in to try to fix the problem and stayed until 1:00am. That morning, they decided to restore the failed SCSI disk channels. They had to shut down their RHIC NFS servers to do this. Once the channels were okay, we rebuilt the disk drives and restored the raidsets. Everything then worked fine. I learned that the zzyzx performance is OK, but the system is not very reliable. They do like the management GUI.

In the afternoon, Maurice and Dave showed me several of their disk array management tools, and the advantages and disadvantages of each disk system. Around 5:30pm, MTI finally delivered their test system to BNL.
It sat overnight to acclimate to the temperature and humidity of the room.

Tuesday (10/23/01):
Several engineers from MTI came to help with the configuration of their disk array system. They first powered their system up and then downloaded new controller firmware. The firmware took many attempts to install because it was a pre-pre-release version. They also wrote new management software on the fly to accommodate the firmware changes. Then they decided to verify the disk firmware revisions. The verification took a long time and was eventually abandoned. The MTI people worked quite late and finally got one controller to reach the specified performance, while the other reached only half of it. I watched part of this process, although I could not help much.

Wednesday (10/24/01):
The MTI engineers continued their configuration efforts to get their system working well. They thought the performance problem was due to latency resulting from having a hub on the Fibre Channel back-end loop. The hub is necessary to accommodate the increased disk count that makes the product affordable. Maurice and Dave were not very happy with this: the salesperson from MTI had told Maurice that the system would run without much effort after powering up. One engineer from MTI had to leave in the morning. Maurice told the MTI people that he had to decide that afternoon whether to accept their unit.

In the late afternoon, it was obvious that the engineers from MTI could not tune their system to the specified performance. They were also unable to get it to work on the Fibre Channel switch, and several environmental problems had popped up. A short telephone meeting was held between Maurice, the MTI engineers and their corporate people. Maurice was very dissatisfied with the MTI unit, but gave the MTI engineers an extra 3 hours to see if they could tune their system to the specified performance level and work out the compatibility problems.
The MTI engineers started the entire configuration over. They tried their best, but could not reach the specified performance; they actually reached only half of what they expected. At 9:00pm, they were told to give up. Everybody was tired and frustrated at failing after such a hard effort.

Thursday (10/25/01):
Because of the hard work on Tuesday and Wednesday, Maurice and Dave were both tired, so it was an easy day. In the morning, there was a group/department meeting, which Maurice invited me to attend. Various issues, status and progress concerning their computing facilities were addressed during the meeting. After the meeting, Maurice, Dave and I prepared for the LSI people, who were coming on Friday to upgrade their controller firmware.

In the afternoon, I spent some time looking at the monitoring system at BNL. I mainly talked with Jason Smith, to whom Rich had introduced me. I showed them Iosif's monitoring tool (idea) and MRTG. Jason showed me the monitoring tool written by Tony Chan. This tool currently monitors about 200 nodes at BNL and uses a MySQL database to store the log information. He also showed me several other popular monitoring tools.

Then I downloaded a package of my disk performance benchmark tools and gave them to Maurice. We compiled the dt and tiobench tools and ran them on their system. Maurice has a very nice way of doing his benchmarks, and I plan to put it into my package as well.

It was a very busy late afternoon and evening. Joe broadcast that the wonder data disk would be rebuilt, so I had to copy out my data.

Friday (10/26/01):
The LSI/OSSI engineers came to upgrade the firmware on the controller. They decided that the upgrade was inappropriate for legal reasons. Instead, they installed their controller failover program onto the BNL test server, despite previous declarations that it wouldn't work in their environment. It took 2-3 hours of work. It seemed to work, though a new problem appeared regarding HBA compatibility with the Brocade switch and the Metastor controllers.
Dave showed me their serial console switch from Baytech. It is very powerful and convenient, and the price is very reasonable. Maurice and Dave also told me ways to deal with salespeople to get free evaluation units.

I left BNL around 1:30pm to catch my 4:15pm flight from Islip.