Planning For Business Continuity & Service Affecting Issues

Tony Leary

Chief Information Security Officer | Kerv Digital

Published 24/10/22

In the post-Covid era, the lessons of the last few years are clear. Maintaining business continuity forced workforces across the globe to adapt quickly and, in some cases, very dramatically. Does that mean this process should now begin to slow down? Absolutely not.

For those who are keen to lay out a roadmap for the future, in this blog our Chief Information Security Officer, Tony Leary, sets out the fundamentals that you’ll need to bear in mind.

First and foremost, always remember that planning for service-affecting issues is good practice and may keep you in business! 

Pre-pandemic, disaster recovery and business continuity were topics many found contrived, and perhaps even pointless. But those organisations that had a documented and tested ‘work from home’ business continuity plan, perhaps to mitigate the loss of a key building, likely coped with the first Covid lockdown better than those that didn’t.

Time is of the Essence

Business continuity, as ‘availability’, forms part of the information security ‘CIA triad’ along with confidentiality and integrity, so it’s very much part of IT security architecture and governance practice. Availability is usually expressed as a percentage of uptime over a period: for example, 99.9% measured monthly means that a service may have up to 0.1% (a bit under 45 minutes) of unplanned downtime a month. As availability is usually backed by a contractual agreement, IT suppliers must be confident that they can comfortably meet this figure for their service. Confident rather than certain, as availability is perhaps the most expensive risk to manage, given that component failures are usually mitigated by adding spare capacity that may only be used if a failure occurs.
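To make the arithmetic behind an availability figure concrete, here is a minimal Python sketch that converts a percentage target into a downtime allowance. The function name `allowed_downtime` is purely illustrative, and the sketch assumes the measurement period is a calendar month of 30 or 31 days; a real contract may define the period, and what counts as downtime, differently.

```python
from datetime import timedelta


def allowed_downtime(availability_pct: float, period: timedelta) -> timedelta:
    """Return the unplanned downtime an availability target permits over a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    # The downtime allowance is simply the unavailable fraction of the period.
    return period * ((100 - availability_pct) / 100)


# Assumption: a 'month' is taken as 30 or 31 days for illustration.
for days in (30, 31):
    budget = allowed_downtime(99.9, timedelta(days=days))
    print(f"{days}-day month: {budget.total_seconds() / 60:.1f} minutes")
# 30-day month: 43.2 minutes
# 31-day month: 44.6 minutes
```

Both results sit a little under 45 minutes, in line with the figure quoted above. Tightening the target to 99.99% would cut the allowance to roughly a tenth of that, which is part of why each extra ‘nine’ tends to be costly.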

Breaking Down the Breakdowns

In common with every other aspect of IT, there is a plethora of initialisms. Besides availability, business continuity is often referred to as disaster recovery (DR) or service continuity. Some other terms that it’s good to be aware of are:

  • RPO: Recovery Point Objective: in simple terms, this represents the maximum amount of data a service consumer is willing to lose if the service fails, so a 1hr RPO means up to an hour of data would be lost.
  • RTO: Recovery Time Objective: how long it should take to recover a service following an incident that impacted its availability.
  • MTPD: Maximum Tolerable Period of Disruption: the time an organisation can tolerate the loss of a service, given any other processes, such as rekeying in that 1hr of lost data, which may be necessary following service recovery.

The relationship between these three terms is typically RPO < RTO < MTPD. For example, while up to one hour of data may be lost, up to 24 hours may be needed to restore the whole system (a 24-hour RTO), and there may then be a further 12 hours allowed to key in data recorded elsewhere (perhaps even on paper) while the service was down, giving an MTPD of 36 hours.
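The worked example above can be sketched in a few lines of Python. The class and field names here (`RecoveryObjectives`, `catch_up`) are hypothetical, and treating MTPD as RTO plus a catch-up allowance is a simplification taken from this example only, not a general definition.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    rpo: timedelta       # maximum data loss, expressed as time
    rto: timedelta       # time allowed to restore the service
    catch_up: timedelta  # assumed allowance for rekeying data after recovery

    @property
    def mtpd(self) -> timedelta:
        # Simplifying assumption: total disruption = restore time + catch-up time.
        return self.rto + self.catch_up

    def follows_typical_ordering(self) -> bool:
        # The typical (not mandatory) relationship RPO < RTO < MTPD.
        return self.rpo < self.rto < self.mtpd


example = RecoveryObjectives(
    rpo=timedelta(hours=1),
    rto=timedelta(hours=24),
    catch_up=timedelta(hours=12),
)
print(example.mtpd.total_seconds() / 3600)   # 36.0
print(example.follows_typical_ordering())    # True
```

A check like this is most useful in the other direction: starting from the MTPD that stakeholders can tolerate and seeing how much of it is left for the RTO once catch-up work has been accounted for.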

So where do these values come from? First, business stakeholders must provide the MTPD envelope that their service is required to operate within. Next would be any constraints from third parties and/or vendor technologies: enterprises rarely work in isolation, and are often dependent on other, existing services or platforms. Once the service is built, but before it is ‘live’, testing is vital to prove that the requirements can be met.

It’s easy to see this is an area that is critical to understand for any new service. Quantifying customer risk appetite helps architects narrow down architectural options, whether they are building a service, or selecting one from a third party who may be willing to commit to RPO/RTO figures in a contract.

The Cloud Continuity Conundrum

The emergence of cloud services has altered the IT and security landscape in lots of ways, so it shouldn’t be surprising that approaches to business continuity need to change too.

Cloud services are built from the ground up to be highly resilient and are closely monitored by vendor support teams, so they are likely to be more reliable than the majority of traditional, on-premises services that customers may run themselves.

Of the three main types of cloud product, infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS), which are based on discrete components, usually do offer options around resilience, whereas software-as-a-service (SaaS) is typically provided as a fully managed service and ‘sold as seen’ with only an availability SLA. There are cloud services (usually IaaS products) offering RPO/RTO SLAs from Azure and AWS, but the vast majority of services only offer an availability SLA.

While you may not get a contracted RPO/RTO SLA from a cloud provider, self-assurance through testing may be possible, for example by simulating a failover through redirecting or blocking network traffic, or by disabling components.
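One way to turn such a test into a number is to poll the service while the failure is induced and measure how long it stays unhealthy. The sketch below is a minimal, assumption-laden illustration: the probe data is made up, `observed_recovery_time` is a hypothetical helper, and how the health checks are collected (and how the failover is triggered) is left outside the code.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# A probe is (timestamp, healthy?) as recorded by whatever monitoring is in use.
Probe = Tuple[datetime, bool]


def observed_recovery_time(probes: Iterable[Probe]) -> Optional[timedelta]:
    """Time from the first failed probe to the next healthy one.

    Returns None if no failure was seen or the service never recovered.
    """
    outage_start: Optional[datetime] = None
    for timestamp, healthy in sorted(probes, key=lambda p: p[0]):
        if not healthy and outage_start is None:
            outage_start = timestamp
        elif healthy and outage_start is not None:
            return timestamp - outage_start
    return None


# Illustrative data only: probes every 10 seconds around a simulated failover.
start = datetime(2022, 10, 24, 9, 0, 0)
results = [True, False, False, False, True, True]
probes = [(start + timedelta(seconds=10 * i), ok) for i, ok in enumerate(results)]
print(observed_recovery_time(probes))  # 0:00:30
```

The result is only as precise as the probe interval, and a single run says little about worst-case behaviour, so a figure obtained this way is better treated as evidence towards an achievable RTO than as a guarantee.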

This approach has its limits, however. Some services are so abstracted that they provide no way for a cloud consumer to force any kind of failover. It’s therefore possible (and even likely when using PaaS and SaaS) to build a service from cloud components that offer neither an RPO/RTO SLA, nor the means to establish one manually through testing.

RT-Au Revoir?

Where does this leave an industry-standard resilience metric such as RTO? Based on the services currently on offer, its relevance is likely to fade as more services move to the cloud. Conversely, cloud providers that need both to attract on-premises hold-outs and to differentiate themselves may see an opportunity in providing RPO and RTO SLAs in the future. In the meantime, it’s vital for architects and stakeholders to take such constraints into account as early in the project lifecycle as possible.
