Network Outage at Los Angeles Airport
On 11th August 2007, more than 17,000 people were stranded at Los Angeles Airport because of a software failure. The problem hit the computer systems of the U.S. Customs and Border Protection agency. The outage was caused by a network card at the airport which, instead of shutting down as expected, kept sending incorrect data across the network. This brought the whole U.S. Customs and Border Protection system at the airport to a halt. Because the management did not know who was where, the agency shut down passenger processing at the airport, and passengers were stranded for up to ten hours. During that time, people were not allowed to enter or leave the U.S. through this airport.
- Faulty software embedded on the network card
The network card could not be controlled because of faulty software embedded on it. Therefore, instead of sending correct data to the Customs and Border Protection network, the card sent out faulty data, and the whole system came to an abrupt halt.
- Use of old computer systems whose unique language was well understood only by technicians who had already retired
ERAM (En Route Automation Modernization) systems are better than the old computer systems because they allow controllers at the various en route centers to identify and control planes that are at high altitude. At the time of the incident, the Federal Aviation Administration was planning to upgrade air traffic control systems from old radar-based systems to satellite-based ERAM systems, but the rollout of this program was many years behind schedule because it required a lot of money to execute. In addition, it is probable that the new technicians who took over the system after the old technicians retired did not fully understand how it operated and therefore could not fix even minor software issues. This may explain why it took so long to correct the system.
Chronology of events
- How it happened
At about 1.00 p.m. Pacific Daylight Time on 11th August 2007, the U.S. Customs and Border Protection network at Los Angeles airport started to experience an outage. At first, the Customs and Border Protection computer systems experienced delays in processing passengers' documents. The response time during the outage was approximately 2 to 3 minutes, whereas under normal conditions it was less than five seconds. From 4.16 p.m. Pacific Daylight Time, Customs and Border Protection officers could not perform any query on the remote database, and therefore arriving passengers could not be processed using the remote system at the airport. In addition, deployment of the full backup system was delayed for an hour because field technicians were not on site at the time of the outage.
About 17,000 passengers were affected by the outage because of the delay in passenger processing. The international arrivals area was full of passengers, and the airport's fire marshal restricted the number of people in the jetways and waiting areas. As a result, arriving passengers were not allowed to disembark and had to sit for a long time in over 60 planes on the tarmac. In addition, international departures were affected and incoming flights were diverted to Ontario airport in California, which is over 55 miles away.
- Actions taken during the network outage
The communication vendor Sprint and staff at U.S. Customs and Border Protection took various actions as they tried to resolve the network outage at Los Angeles airport. In particular, officials at the remote locations and at the airport worked together to identify the cause of the outage and restore passenger processing.
12.50 p.m. to 2.00 p.m. Pacific Daylight Time
At 12.50 p.m. Pacific Daylight Time on 11th August 2007, U.S. Customs and Border Protection staff at Los Angeles reported delayed responses when using the Treasury Enforcement Communications System. In addition, the network operations center at U.S. Customs and Border Protection was alerted that the Los Angeles airport router was inaccessible. At 1.16 p.m., the network operations center reported the problem of accessing the router to the communication vendor Sprint. The Customs and Border Protection duty officers and help desk were notified about the problem. The network operations center instructed the Customs and Border Protection onsite technician at the airport to restart the communication devices, and the devices were turned off and on again. Nevertheless, restarting the communication devices did not solve the problem. At this time, the Customs and Border Protection technicians reported an average query response time of 2 to 3 minutes, which was 31 times slower than usual. At 1.55 p.m., Sprint reported to the network operations center that the routers at Los Angeles airport were responding electronically and that the communication lines were active. Sprint proposed that the Customs and Border Protection communication system be restarted.
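The slowdown figure above can be checked with simple arithmetic. The following is a minimal sketch, assuming the midpoint of the reported 2 to 3 minute range (150 seconds) as the outage response time; the function names are illustrative and do not come from any agency system. It shows that a 31-fold slowdown implies a normal response time of just under five seconds, which agrees with the "less than five seconds" figure reported for normal operation.

```python
def slowdown_factor(outage_seconds: float, baseline_seconds: float) -> float:
    """How many times slower the outage response is than the normal response."""
    return outage_seconds / baseline_seconds


def implied_baseline(outage_seconds: float, factor: float) -> float:
    """Normal response time implied by an outage response time and a slowdown factor."""
    return outage_seconds / factor


# Assumption: use the midpoint of the reported 2 to 3 minute range.
midpoint_seconds = (2 * 60 + 3 * 60) / 2  # 150.0 seconds

print(round(implied_baseline(midpoint_seconds, 31), 2))  # 4.84
```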
2.00 p.m. to 3.00 p.m. Pacific Daylight Time
The network operations center asked the Customs and Border Protection onsite technician to restart the device and verify the flow to the router. Sprint confirmed that the circuits were not disconnected, and the Customs and Border Protection duty officer was notified. At 2.28 p.m., the network operations center verified that there was power to the Customs and Border Protection communication system and alerted Sprint. The network operations center was to call Sprint back once the Customs and Border Protection technician had verified that the Sprint routers were powered.
3.00 p.m. to 4.00 p.m. Pacific Daylight Time
Customs and Border Protection onsite staff confirmed that the router had power and requested a status update. The network operations center set up a teleconference with the Customs and Border Protection onsite officer and Sprint. At 3.57 p.m., the network operations center reported to Sprint that the router had power and asked Sprint to send a technician to Los Angeles airport.
4.00 p.m. to 5.00 p.m. Pacific Daylight Time
The Customs and Border Protection acting port director and the local area network field manager requested a status update. The last query from Los Angeles airport to the Customs and Border Protection database was made at 4.16 p.m. A Sprint technician was sent to Los Angeles airport at 4.20 p.m.
5.00 p.m. to 6.00 p.m. Pacific Daylight Time
A second Customs and Border Protection onsite technician arrived at Los Angeles airport and started assisting in terminals 4 and 5. The network operations center provided outage status information to the Customs and Border Protection duty officer and the local area network administrator at the airport.
6.00 p.m. to 7.00 p.m. Pacific Daylight Time
Customs and Border Protection staff started to assist in restoring Bradley terminal. The Customs and Border Protection duty officer initiated a conference call about the outage and provided the number to the network operations center and Sprint. The Customs and Border Protection deputy field officer at Los Angeles airport, the Office of Field Operations duty officer, the area manager for Southern California and the deputy area manager were given the status update. The Sprint technician arrived at Los Angeles airport and verified that the communication equipment was working well. However, although the Sprint router was responding electronically, Sprint could not administer it remotely. The router restarted in a busy state when the Sprint technician powered it on. When the local area network was disconnected and the router was restarted, Sprint was able to administer the router remotely. To the surprise of many, the problem turned out to be in the Customs and Border Protection local area network at the airport and not in Sprint's systems.
7.00 p.m. to 9.00 p.m. Pacific Daylight Time
Support staff at Sprint's remote network center held a conference call with the Customs and Border Protection onsite technician, and together they started to identify the problem with the local area network at the airport. The support staff instructed the Sprint technician to connect a laptop with a modem to the Customs and Border Protection switch. Communicating through the modem, the support staff evaluated traffic on the Customs and Border Protection switch using the HyperTerminal software on the laptop.
9.00 p.m. to 11.00 p.m. Pacific Daylight Time
At 9.00 p.m., all terminals except Bradley terminal started to process passengers using the Customs and Border Protection database. A Customs and Border Protection technician tried to isolate the problem by disconnecting the media converter and the wireless network from the network, though this action did not solve the outage. A Customs and Border Protection field technician then troubleshot at Bradley terminal by hot swapping components. A communication device burned out and filled the room with smoke. However, the process continued by restoring the functionality of a decommissioned switch that the technicians found at the airport.
11.00 p.m. to 11.45 p.m. Pacific Daylight Time
Customs and Border Protection field technicians disconnected a circuit containing 12 devices at Bradley terminal. A database query was then performed successfully. The 12 devices at Bradley terminal remained disconnected from the local area network of Los Angeles airport, and the whole system resumed functioning at 11.40 p.m.
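The isolation approach described in the last two time periods, where parts of the local area network were disconnected one at a time and a database query was retried, can be sketched as a simple loop. This is only an illustrative model under stated assumptions: the segment names are labels chosen here for the media converter, the wireless network and the 12-device Bradley circuit, and `network_ok` is a hypothetical stand-in for "perform a database query and see whether it succeeds". It is not the procedure the technicians actually scripted.

```python
from typing import Callable, List, Optional


def find_faulty_segment(
    segments: List[str],
    network_ok: Callable[[List[str]], bool],
) -> Optional[str]:
    """Disconnect one segment at a time and return the first segment
    whose removal makes the network healthy again, or None if no
    single disconnection helps."""
    for segment in segments:
        still_connected = [s for s in segments if s != segment]
        if network_ok(still_connected):
            return segment
    return None


# Hypothetical simulation: the network works only when the faulty
# circuit is not connected.
segments = ["media converter", "wireless network", "Bradley 12-device circuit"]


def simulated_query(connected: List[str]) -> bool:
    return "Bradley 12-device circuit" not in connected


print(find_faulty_segment(segments, simulated_query))
```

In this simplified model, disconnecting the media converter or the wireless network leaves the simulated query failing, which mirrors the report that those first attempts did not resolve the outage.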