I came across an interesting blog post on Finextra that made me think about a topic that has been on my mind for a while now: the systemic risks of cloud computing concentration. It seems like everyone has moved or is moving from maintaining large, expensive data centers to letting Amazon, Microsoft, or Google worry about buildings, infrastructure, and hardware. I can’t say I blame them, especially since getting new servers and other hardware has become a much more difficult and time-consuming process now that all of our supply chains seem to have been broken.
But there’s also a downside. When one of the big cloud providers is having a bad day, people notice, because most of the websites and services that we rely on depend on at least one of these providers. And there have been major outages over the past year. So far, these outages have had no systemic impact on the financial system. At least, not yet.
While the big cloud providers have all sorts of options to make the systems in their perimeter fault tolerant to some degree, we’ve seen provider-level outages that have disrupted the internet. In order to achieve true resilience when one of these events occurs, organizations need to think about true multi-cloud solutions – and there are significant hurdles that must be overcome to achieve this.
The biggest hurdle is cloud providers’ tempting managed offerings — managed Kubernetes clusters, databases, serverless services — which are great for getting new services up and running quickly, but make multi-cloud operation difficult, if not impossible. Even if another vendor has the same type of managed database, it will be just different enough from your primary vendor to make porting your systems expensive and time-consuming. It’s not a bug – it’s a feature. Salespeople want to lock customers into their product (and who can blame them?).
In the financial world, regulators are taking note, and institutions and their service providers (as well as cloud providers) need to think about true multi-cloud resiliency solutions before the next big outage occurs.
If you’re at the start of your cloud journey and your application is mission critical, design it to be multi-cloud from day one – it’ll be waaaaaay less expensive and complex than trying to fix the problem after reaching one million customers.
When making architectural decisions, consider the benefits – and costs – of adopting core services specific to your primary cloud provider. Consider how you could/would replicate them in another vendor’s environment BEFORE you lock yourself in.
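One way to keep that option open is to have the application talk to a small provider-neutral interface instead of a vendor SDK directly. The sketch below is only an illustration of that idea, not a production design: `ObjectStore`, `InMemoryStore`, and `MultiCloudStore` are hypothetical names, and the in-memory class stands in for adapters that would wrap each vendor's real storage service.

```python
from typing import Protocol


class ObjectStore(Protocol):
    """Hypothetical provider-neutral interface the application codes against."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a provider-specific adapter (assumption: a real one wraps a vendor SDK)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True  # set to False to simulate a provider outage
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._objects[key]


class MultiCloudStore:
    """Writes to every provider and reads from the first one that responds."""

    def __init__(self, stores: list[ObjectStore]) -> None:
        self._stores = stores

    def put(self, key: str, data: bytes) -> None:
        failures = 0
        for store in self._stores:
            try:
                store.put(key, data)
            except ConnectionError:
                failures += 1  # tolerate an outage as long as one provider succeeds
        if failures == len(self._stores):
            raise ConnectionError("no provider accepted the write")

    def get(self, key: str) -> bytes:
        for store in self._stores:
            try:
                return store.get(key)
            except ConnectionError:
                continue  # fall through to the next provider
        raise ConnectionError("no provider could serve the read")


if __name__ == "__main__":
    primary = InMemoryStore("provider-a")
    secondary = InMemoryStore("provider-b")
    store = MultiCloudStore([primary, secondary])

    store.put("ledger/2022-05-21", b"balances")
    primary.available = False  # simulate the primary provider having a bad day
    print(store.get("ledger/2022-05-21"))
```

The sketch ignores the hard parts (reconciling providers after an outage, consistency, cost), which is exactly where provider-specific managed services tend to diverge. The point is only that the lock-in decision sits behind one interface rather than being scattered through the codebase.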
Given the increasing automation and speed we are seeing in financial services, it is only a matter of time before there is an event that really galvanizes regulators’ attention. Now is the time to think about diversifying your cloud infrastructure.
*** This is a Security Bloggers Network syndicated blog from The paranoid prose of Al Berg, written by Al Berg. Read the original post at: https://paranoidprose.blog/2022/05/21/cloud-computing-concentration-and-systemic-risk/