ESnet outage 23 Mar

From: RCF/USAtlas Staff (rcfstaff@bnl.gov)
Date: Thu Mar 22 2001 - 12:53:42 EST

    This message is to give you the earliest notice possible of a planned total
    outage of the Qwest 550 ATM backbone, upon which ESnet rides. The outage
    is planned to start at 7PM, tomorrow, 23 Mar, and conclude within 10
    hours. IT WILL TAKE DOWN ESSENTIALLY ALL OF ESNET DURING THAT TIME!!

    The driving event is an error in the Lucent ATM switches that Qwest is
    using to supply services to ESnet. This "bug" causes an event called a
    cascading re-route, whereby all of the PVCs (virtual connections) in the
    ATM backbone are re-routed, causing massive changes in latency and packet
    loss. We have been hit with this event several times, most recently
    twice this past weekend. The scheduled downtime is to be used to reload
    all of the high-end 550 Lucent ATM switches with new code. The procedure
    advised by Lucent is to take all the switches off-line during this reload,
    estimated to take 10 hours.

    Clearly we (ESnet) are very unhappy about this and have pushed for a less
    ugly solution or at least more notice - i.e. delay the outage. Qwest
    claims they have no options and must move forward - they have deployed
    people on airplanes, coordinated schedules, shipped parts, notified other
    customers, etc.

    I was just notified about this late yesterday afternoon and had hoped to
    push for a better solution, but was unable to do so in this morning's
    conference call. So I am sending this notice based on what I know at the
    moment.

    I anticipate this may have serious implications for some sites, so I would
    ask the ESnet Site Coordinators to forward this information as quickly as
    possible to allow people to undertake whatever advance preparations they can.

    Needless to say, I am extremely unhappy about this situation, but at this
    point I think our choice is between a scheduled outage (with very short
    notice) and the likelihood of an unscheduled outage with no advance notice
    and without the resources that will be available for this event.

    As soon as we are able to do further planning, I will update you
    with any additional pertinent information.

    JFL

    --
    This message forwarded from the RCF announcements page.
    Recent messages are available at:
    http://www.rhic.bnl.gov/RCF/Announcements/announce.html
    



    This archive was generated by hypermail 2b29 : Thu Mar 22 2001 - 12:55:51 EST