The Times Australia

Optus has revealed the cause of the major outage. Could it happen again?

  • Written by Mark A Gregory, Associate Professor, School of Engineering, RMIT University

Around 4.05am on Wednesday November 8 2023, Optus suffered a nationwide network outage lasting well into the evening, more than 12 hours later.

Now, Optus has released some information on what happened, stating[1] “we now know what the cause was and have taken steps to ensure it will not happen again”.

As a telecommunications expert, I believe we should have no confidence in this statement, because the poorly worded explanation leaves many questions unanswered.

Could a similar outage happen again? We don’t know – but there are ways to make it less likely.

Read more: Optus blackout explained: what is a ‘deep network’ outage and what may have caused it?[2]

How did the outage unfold?

The Optus outage caused all services to go offline. Landlines, mobile phones, home internet, small business and enterprise, and cloud connections all dropped out.

The most serious impact of the outage was that Optus landlines couldn’t dial 000[3] and Optus mobile phones were unable to connect to the 000 emergency call service unless the connection occurred through Telstra or Vodafone infrastructure.

More than 10 million Optus customers were affected by the outage, which brought Melbourne’s trains to a halt[4] and left Optus’s small business customers unable to carry out EFTPOS transactions[5].

So, what went wrong with Optus?

Optus has revealed that a “routine software upgrade” triggered a cascading failure in the Optus internet protocol[6] (IP) core network – the central backbone of their network that authorises device access and provides customer management.

Optus has provided a brief answer on why the entire network went offline[7]:

“These routing information changes propagated through multiple layers in our network and exceeded preset safety levels on key routers which could not handle these. This resulted in those routers disconnecting from the Optus IP Core network to protect themselves.”

Routing information is used to find a path from one location on the internet to another – a router is a device that manages the traffic flows.

The explanation provided by Optus points to human error. This confirms what industry experts suspected had happened[8]. The resulting flood of “routing information changes” overwhelmed[9] key routers in the core network, causing them to disconnect and bringing the entire network to a halt.
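To make the mechanism concrete, here is a minimal sketch of how a “preset safety level” could take routers offline. Optus has not published the actual limits or router behaviour, so everything here is an assumption for illustration: the `Router` class, the `max_routes` threshold of 1,000 and the address ranges are all hypothetical. The idea is similar in spirit to the maximum-prefix limits that can be configured on routers, where a session is shut down if a neighbour sends more routes than expected.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    """Toy router that drops off the core if it learns too many routes."""
    name: str
    max_routes: int                      # hypothetical "preset safety level"
    routes: set = field(default_factory=set)
    connected: bool = True

    def receive(self, prefixes):
        """Accept route updates; disconnect if the limit is exceeded."""
        if not self.connected:
            return
        self.routes.update(prefixes)
        if len(self.routes) > self.max_routes:
            # Self-protection: leave the network rather than risk overload.
            self.connected = False
            self.routes.clear()


def propagate(routers, prefixes):
    """Send the same batch of updates to every router; return the survivors."""
    for router in routers:
        router.receive(prefixes)
    return [r.name for r in routers if r.connected]


# Three hypothetical core routers sharing the same limit.
core = [Router(f"core-{i}", max_routes=1000) for i in range(3)]

# A normal batch of 200 routes, then a flood of 5,000 routing changes.
normal = {f"10.0.{i}.0/24" for i in range(200)}
flood = {f"172.16.{i // 256}.{i % 256}/32" for i in range(5000)}

survivors_after_normal = propagate(core, normal)
survivors_after_flood = propagate(core, flood)
```

In this toy model every router stays connected after the normal batch, but all three disconnect after the flood because they share the same limit and receive the same updates. A protection that is sensible for one router can therefore take down the whole core when it triggers everywhere at once.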

Read more: Explainer: what is the 'core network' that was crucial to the Optus outage?[10]

Your internet router is a home version of a device that manages data traffic flow. Teerasan Phutthigorn/Shutterstock[11]

Should the outage have been preventable?

Outages of this kind are not uncommon – human error has led to major companies going offline in the past.

But an entire telecommunications network going offline is unusual. The network should be designed in such a way that redundancy (backups) and resiliency are built in from the outset.

Before a software upgrade occurs, there should be modelling, testing and several layers of sign-off.

In case something goes wrong, there should be infrastructure and system redundancy. An automated or manual procedure should exist to ensure the redundant systems become operational within a few minutes.
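As a rough sketch of what an automated failover procedure might look like, the snippet below picks the first healthy system from a priority list. The system names, the `health` table and the `choose_active` function are hypothetical. A real network would rely on live health probes and far more elaborate switching logic.

```python
def choose_active(systems, is_healthy):
    """Return the first healthy system in priority order, or None if all are down."""
    for name in systems:
        if is_healthy(name):
            return name
    return None


# Hypothetical health status after a failed upgrade on the primary system.
health = {"primary-core": False, "standby-core": True}
priority = ["primary-core", "standby-core"]

active = choose_active(priority, lambda name: health[name])
```

Here `active` ends up as the standby system. The important design point is that the standby must not share the failure that took down the primary, otherwise the check finds nothing healthy to switch to.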

In 2021, Facebook, WhatsApp and Instagram[12] disappeared from the internet for roughly six hours due to an incorrect routing configuration.

Meta’s lengthy and informative statement[13] at the time provides an example of the level of detail that we should expect Optus to provide.

In nearly every case, including the Optus outage, major outages of this kind were preventable and highlighted deficiencies in the organisation.

Read more: In a crisis, Optus appears to be ignoring Communications 101[14]

What should Optus do now?

The national outage means the Optus network is not fit for purpose[15].

It can be assumed Optus has a number of deficiencies, such as problems with engineering capability, testing, procedures, network redundancy and resilience.

Optus states they are “committed to learning from what has occurred” and will continue to work to “increase the resilience” of their network.

For this to lead to an effective outcome, Optus will need to carry out a review and put in place new processes, infrastructure and systems to prevent a similar outage in the future.

How do we know a similar outage won’t happen again?

We don’t.

We need enhanced government regulation of the Australian telecommunications network operators to provide improved visibility of the redundancy and resilience of their networks. The Senate has commenced an inquiry[16] into the Optus outage.

Telecommunications is an essential service. Australians should be able to connect to the 000 emergency call service at all times. Reliable access to medical services, EFTPOS and the internet is vital.

If necessary, penalties should be introduced into the Telecommunications Act 1997[17] to ensure telecommunications network operators implement and maintain “best practice” related to network operation, redundancy and resilience.

References

  1. stating (www.optus.com.au)
  2. Optus blackout explained: what is a ‘deep network’ outage and what may have caused it? (theconversation.com)
  3. couldn’t dial 000 (www.sbs.com.au)
  4. trains to a halt (www.9news.com.au)
  5. EFTPOS transactions (www.abc.net.au)
  6. internet protocol (www.cloudflare.com)
  7. why the entire network went offline (www.sbs.com.au)
  8. suspected had happened (www.scimex.org)
  9. overwhelmed (www.theguardian.com)
  10. Explainer: what is the 'core network' that was crucial to the Optus outage? (theconversation.com)
  11. Teerasan Phutthigorn/Shutterstock (www.shutterstock.com)
  12. Facebook, WhatsApp and Instagram (blog.cloudflare.com)
  13. Meta’s lengthy and informative statement (www.facebook.com)
  14. In a crisis, Optus appears to be ignoring Communications 101 (theconversation.com)
  15. not fit for purpose (www.abc.net.au)
  16. commenced an inquiry (www.aph.gov.au)
  17. Telecommunications Act 1997 (legislation.gov.au)

Read more https://theconversation.com/optus-has-revealed-the-cause-of-the-major-outage-could-it-happen-again-217564
