Major ESnet outage scheduled for Mar 23 7PM PST

From: Francesca Verdier (fverdier_at_lbl.gov)
Date: 03/22/2001


Dear NERSC users,

We have received the message below from ESnet.  Tomorrow, Friday March
23, a major Qwest/ESnet outage is scheduled from 7PM Pacific Time
through 5AM PST Saturday March 24.  Internet access to NERSC machines
may not be available if there is a Qwest link between your site and
NERSC.  NERSC is not directly attached to Qwest at this time.

Sincerely,
Francesca Verdier -- NERSC User Services

-------- Original Message --------
Subject: FW: URGENT: Major Qwest/ESnet outage
Date: Thu, 22 Mar 2001 10:18:54 -0800
From: "Howard Walter" <hawalter@lbl.gov>
To: <nstaff@nersc.gov>

The timezone referenced was PST, i.e. the outage is planned from 7PM to
5AM PST starting on Friday night.

JFL

-----Original Message-----
From: James F. Leighton [mailto:jfl@es.net]
Sent: Thursday, March 22, 2001 9:51 AM
To: essc@es.net; escc@es.net; esnet-status@es.net
Cc: routing@es.net; merola@lbl.gov; cwmccurdy@lbl.gov
Subject: URGENT: Major Qwest/ESnet outage


This message is to give you the earliest notice possible of a planned
total outage of the Qwest 550 ATM backbone, upon which ESnet rides.  The
outage is planned to start at 7PM, tomorrow, 23 Mar, and conclude within
10 hours.  IT WILL TAKE DOWN ESSENTIALLY ALL OF ESNET DURING THAT TIME!!

The driving event is an error in the Lucent ATM switches that Qwest is
using to supply services to ESnet.  This "bug" causes an event called a
cascading re-route, whereby all of the PVCs (virtual connections) in the
ATM backbone are re-routed, causing massive changes in latency and
packet loss.  We have been hit with this event several times, most
recently twice this past weekend.  The scheduled downtime is to be used
to reload all of the high-end 550 Lucent ATM switches with new code.
The procedure advised by Lucent is to take all the switches off-line
during this reload, estimated to take 10 hours.

Clearly we (ESnet) are very unhappy about this and have pushed for a
less ugly solution or at least more notice - i.e. delay the outage.
Qwest claims they have no options and must move forward - they have
deployed people on airplanes, coordinated schedules, shipped parts,
notified other customers, etc.

I was just notified about this late yesterday afternoon and had hoped to
push for a better solution, but was unable to do so in this morning's
conference call.  So I am sending this notice based on what I know at
the moment.

I anticipate this may have serious implications for some sites, so would
ask the ESnet Site Coordinators to forward this information as quickly
as possible to allow people to undertake whatever advance preparations
they can.

Needless to say I am extremely unhappy about this situation, but at this
point I think our choice is to accept a scheduled (with very short
notice) outage, or face the likelihood of an unscheduled outage with no
advance notice, and without the resources available that this event will
have.

As soon as we are able to do further planning, I will keep you
up-to-date with any additional pertinent information.

JFL
