Air Traffic still in control after failure
- Paul McRae
- Sep 5, 2023
- 6 min read
Updated: Sep 6, 2023
Criticism may not be justified on closer inspection
It’s the last thing you want to hear when you are preparing to make your way home - that your flight has been cancelled. The dread that washes over you. How long is it delayed for? What am I going to do now? Where will I sleep? What about my family? What about work?
Thousands of travellers had that same feeling on Monday 28th August 2023, when a fault forced Britain's air traffic controllers to fall back on manual procedures, reducing air traffic flow to approximately 75% of its normal volume.
The after-effects were still being experienced several days after the event, but what exactly caused the problem and could it happen again?

Calm on the outside, chaotic on the inside
It had to happen on a Bank Holiday Monday. Not that there’s ever a good time for a problem like this to crop up. Having worked in and around an airport for over 20 years, I’ve witnessed several aviation events up close that brought aircraft to a standstill, leaving a very eerie, subdued atmosphere on the runway while causing widespread operational disruption in and around the terminal building:
9/11
The volcano ash cloud from Iceland in 2010
The Beast from the East in 2018
I had left for pastures new by the time the Covid pandemic struck and closed airports globally, but I know how hard, and for how long, it hit all airports and other industries.
This particular event at the end of August came out of the blue and also had a big impact on all those affected, leading to widespread disruption and leaving thousands of passengers stranded at home and abroad. According to the aviation data firm Cirium, 790 departing flights were cancelled on the Bank Holiday Monday, about a quarter of all departures, with 785 incoming flights also wiped from the schedule - 1,575 flights in total.
Of course, all airlines flying in and out of the UK were affected, meaning carriers could not simply switch passengers to alternative planes operated by other companies. No such alternative arrangements were possible. Cue panicked, frustrated, angry and emotional passengers lining up at Airline and Handling Agent Customer Service Desks up and down the UK and abroad, desperately trying to arrange alternative flights for the coming days. Men, women and children wandering the check-in areas and departure lounges or sleeping on the floor, seats or anything else where they could find a space. An all too familiar sight in situations like these.
Aside from the personal and professional disruption for passengers, the event is reported to have potentially caused more than £100m worth of losses, placing intense scrutiny on National Air Traffic Services (NATS) as, despite being at the centre of the disruption, it is not liable for any of the alleged costs incurred.
So what happened and how did this issue have such a big impact on the system and on flights?
Flight Plan
Let’s start from the beginning, with flight plans. Airlines are required to submit flight plans of their routes to NATS in advance of departure, and these must be approved before take-off to ensure airspace is managed safely. For context, NATS process around 6,000 flight plans every day, or 2.5 million per year. These flight plans are fed into NATS systems and processed in certain formats so that the many flights per day can be managed. It appears that one particular flight plan added to the system has, for some reason not yet disclosed, corrupted part of that system, causing air traffic controllers to stop using automation and switch to a manual-processing contingency mode, which significantly restricted the number of flights that could be processed.
National Air Traffic Services chief executive Martin Rolfe explained that initial investigations into the problem show it did relate to some of the flight data received.
"Our systems, both primary and the back-ups, responded by suspending automatic processing to ensure that no incorrect safety-related information could be presented to an air traffic controller or impact the rest of the air traffic system."
Sounds to me like the system did its job, exactly as designed. Not processing “erroneous” data, so that mistakes cannot be made, is surely common sense.
However, because manual processing had to continue for a sustained period while the fault was investigated, understood and rectified, the disruption lasted a long time, causing huge operational and financial impact. Questions are inevitably being asked about how this erroneous data was allowed to cause such a long-lasting impact. As with all incidents, however critical, the quicker they are resolved, the less operational, reputational and financial impact they are likely to cause.
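The fail-safe behaviour described in that statement can be illustrated with a very rough sketch. To be clear, this is not how the NATS systems actually work: the class, the validity check and the sample flight plans below are all made up for illustration. The only point is the principle of stopping automation rather than passing on doubtful data.

```python
from dataclasses import dataclass, field


def is_valid(plan: dict) -> bool:
    # Hypothetical check: a plan needs a callsign and at least two waypoints.
    waypoints = plan.get("waypoints")
    return (
        bool(plan.get("callsign"))
        and isinstance(waypoints, list)
        and len(waypoints) >= 2
    )


@dataclass
class FlightPlanProcessor:
    """Toy fail-safe processor (illustrative only, not the real design)."""

    automatic: bool = True
    accepted: list = field(default_factory=list)
    manual_queue: list = field(default_factory=list)

    def submit(self, plan: dict) -> str:
        if self.automatic and is_valid(plan):
            self.accepted.append(plan)
            return "processed automatically"
        # Fail safe: suspend automation instead of trusting doubtful data.
        # Everything from here on waits for a human to handle it.
        self.automatic = False
        self.manual_queue.append(plan)
        return "queued for manual processing"


# Made-up sample data
good_plan = {"callsign": "TEST01", "waypoints": ["POINT_A", "POINT_B"]}
bad_plan = {"callsign": "TEST02", "waypoints": []}

processor = FlightPlanProcessor()
print(processor.submit(good_plan))  # processed automatically
print(processor.submit(bad_plan))   # queued for manual processing
print(processor.submit(good_plan))  # queued for manual processing
```

Notice that the third plan is perfectly good but still ends up in the manual queue. That is the trade-off in a nutshell: once automation is suspended, throughput drops for everyone until the fault is understood, but nothing questionable reaches a controller's screen.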
Consider this
Now this was, of course, a highly disruptive and unfortunate incident, but what was the alternative? For the system to let the rogue data through, process it and potentially create an adverse safety or operational situation?
I’ve managed enough incidents in my time to know that any mitigation critics suggest after an event like this, however “simple” it seems to implement, rarely is. These “black swan” moments are extremely rare for that very reason. Sometimes mitigations just aren’t possible and incidents like these can and will happen, often because safety is prioritised. It comes first, above everything else, whether financial, operational or reputational. As a business, you can recover financially and operationally from the aftermath of an incident, but recovering reputationally after a serious safety incident is much more difficult.
As I stated in a previous blog earlier this year reporting on the 999 outage, these systems, 99.99% of the time, enable an incredible amount of information to be distributed around their infrastructure, enabling a standard of service, be it routing emergency telephone calls or managing and controlling air traffic flow safely, that would not otherwise be possible. Advances in automation and processing power allow these tasks to be handled at far higher volumes than controllers could manage by processing data manually.
This is reported to be the worst disruption to UK air traffic control in almost a decade. However, let’s consider that for a moment.
Even if we take the last 10 years as the timeframe for analysis and take the restriction on the Bank Holiday Monday in question to be 25% of normal traffic flow, then for 3,650 days automation has let us process 25% of normal daily traffic that could not have been handled had flights been managed manually. On the day, 3,049 flights were due to depart UK airports and 3,054 were scheduled to arrive, according to further analysis by aviation analytics firm Cirium.
This equates to more than 540,000 seats on departing planes and 543,000 on arriving planes, an average of 177 seats per departing flight.
Apply that 25% advantage to the seat numbers and it averages out at 44 extra passengers able to fly per flight. For a single flight operating every day, that is 1,328 extra passengers per month, 16,151 extra per year and 161,512 extra over the last 10 years.
Yes, the airlines pay for the service and should expect a level of service and value for the fees paid. However, these volumes would not be possible without the technology in place running so consistently and reliably over the last decade.
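For anyone who wants to check the sums, here is a minimal sketch of the arithmetic behind those figures. It uses only the Cirium numbers quoted above, and it rests on two simplifying assumptions of mine: that manual handling would always cost a flat 25% of normal capacity, and that every seat is filled. The variable names are my own.

```python
# Figures quoted in the article (Cirium)
departing_flights = 3049
departing_seats = 540_000

# Assumption: manual handling would cost a flat 25% of normal capacity
share_lost_to_manual = 0.25

# Average seats on one departing flight, rounded to a whole seat
seats_per_flight = round(departing_seats / departing_flights)

# Extra passengers for one flight operating every day (assumes full planes)
extra_per_day = seats_per_flight * share_lost_to_manual
extra_per_month = extra_per_day * 30
extra_per_year = extra_per_day * 365
extra_per_decade = extra_per_day * 3650

print(f"Seats per departing flight: {seats_per_flight}")
print(f"Extra per day:    {extra_per_day}")
print(f"Extra per month:  {extra_per_month}")
print(f"Extra per year:   {extra_per_year}")
print(f"Extra per decade: {extra_per_decade}")
```

These are rough, per-flight averages rather than a forecast. Real load factors, aircraft sizes and schedules vary from day to day, so treat the output as an order-of-magnitude illustration.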

Conclusion
Again, as always with an incident of this magnitude, there will be an independent review. The Civil Aviation Authority will be putting together a report in the coming days and weeks which will be reviewed to see whether there are lessons to learn for the future.
In the meantime, Martin Rolfe concluded: "We understand the way the system didn't handle the data… the way it failed, if you like. So we have put in place, already, procedures to make sure if that happens again, we can resolve it very, very quickly."
This will be welcomed by airlines and passengers. While the scrutiny continues for now, before it eventually and inevitably subsides, spare a thought for the fact that the system in place not only enabled significantly greater volumes of passengers to be processed up until last week, but also that on Monday 28th August, when it did encounter a problem, safety was not compromised. Surely that is the most important part of the story here.