What We Call Security: Simulation for Remediation and Design (6/7)

I personally feel the field of storage has come a long way over the last 20 years. What used to be thought of as mundane tape backups has arguably become one of the most innovative spaces in IT, with ongoing innovation in solutions for cloud, virtual, and on-premises environments alike. Within it are some powerful opportunities for security practitioners that I feel deserve to be better known and understood.

Today I want to focus particularly on how Thin Digital Twins can be leveraged to get so much more out of storage (and recovery). They were what started my deeper interest in storage from a security perspective. My reason for looking into them? To increase my ability to change the unchangeable.

Most security professionals are more than a little familiar with the frustrations of dealing with legacy systems, applications, and networks (we’ll refer to all of these as “systems” going forward): systems for which they need to mitigate risks in ever more complicated ways because of their inflexibility, lack of built-in security, absence of support, and sometimes a lack of documentation too.

Note: Fairly new virtual and cloud-based systems can fall into the same troublesome legacy category we usually associate with more traditional on-prem systems if the architecture of the systems, applications, and processes was poor!

Worse still, legacy systems are frequently associated with fundamental (and critical) business processes. This often results in strong resistance by the business to us doing anything even remotely invasive with them for fear of disaster.

In an earlier instalment of this series, I talked about how a significant part of the purpose of a security programme is to define how things should be built so that new systems meet our security requirements. The same applies here, but for many of these legacy systems it’s likely to be years before they are replaced. As a result, security teams spend vast amounts of resource adding compensating controls and managing risk around them.

But what if we could wipe out that hesitancy by the business? What if we could modify network configurations, code, and system design, update software to versions we don’t even know will work, run aggressive, potentially destructive penetration tests, and deploy patches with abandon, all in a production environment context, without needing to be concerned about the business impact or even to go through change control?

Remediation that would take years could be done in weeks. Certainly in less time than it would take just to roll out another “compensating control” when not allowed to touch the systems, networks, or applications themselves.

Of course, the business impact and risk appetite of the organisation would never allow that. But what if there was no impact? What if the worst possible outcome of our intervention into these legacy systems didn’t disrupt a thing?

The business wouldn’t really care then, would they? But how would you do that? Those two things are diametrically opposed, mutually exclusive; you can’t have it both ways.

Or can you? Enter the world of Thin Digital Twins.

In this article I am referring to Thin Digital Twins in the context of VM2020’s CyberVR, but the general concept of a Digital Twin is a functional digital copy of a system (which can be a complex system of systems, such as an entire environment) that can be used for simulation, validation, and modelling… under exactly the same conditions as the production one.

The “Thin” bit is the clever part and involves some really interesting patents. It allows you to functionally recreate complex systems (and whole environments) using only a small fraction of the resources.

And that’s where the practicality comes in. Most organisations do not have the spare compute, memory, or storage to allocate in order to create full copies of their environment, but by using Thin Digital Twins it becomes possible to create functional replicas using only a small amount of spare capacity.

This means environmental changes can be tested under conditions equivalent to full production by using clones of production that replicate all environmental factors.
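CyberVR’s actual mechanics are proprietary and patented, so to make the general idea concrete, here is a purely illustrative, hypothetical Python sketch of the copy-on-write principle that thin cloning is broadly built on: the twin references the original’s blocks and stores only what it changes, so it consumes extra capacity only in proportion to the modifications made to it (none of the names below come from VM2020’s product).

```python
# Purely illustrative: a toy copy-on-write "thin clone".
# The twin shares the parent's blocks and stores only the blocks it modifies,
# which is why a functional twin needs only a small fraction of the capacity.

class ThinClone:
    def __init__(self, parent_blocks):
        self.parent = parent_blocks   # read-only view of the "production" image
        self.delta = {}               # only blocks written on the twin live here

    def read(self, block_id):
        # Changed blocks come from the delta; everything else falls through to the parent.
        return self.delta.get(block_id, self.parent[block_id])

    def write(self, block_id, data):
        # Writes never touch the parent, so production is never at risk.
        self.delta[block_id] = data

    def footprint(self):
        # Extra capacity consumed is proportional to the changes, not the system size.
        return len(self.delta)


production = {i: f"block-{i}" for i in range(1_000_000)}   # pretend 1M-block system image
twin = ThinClone(production)
twin.write(42, "patched-block-42")                          # aggressive change, zero production impact
print(twin.read(42), twin.read(43), twin.footprint())      # -> patched-block-42 block-43 1
```

Reads from the twin fall through to the production data unless a block has been rewritten, which is how the twin can stay materially identical while remaining cheap to hold.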

 

What does this mean to us?

Well, in the case of legacy systems, we can now generate a materially identical instance of any system (which we can pull from storage, as our backups are essentially “dead” copies of our environments waiting to be brought to life) and do anything we want to it without consequence.

We can try integrating new functionality, making code changes, testing the impact of updates or patches, or even ripping out a system entirely and replacing it with something else to see if there’s any impact on the business process. We can alter one system at a time to see if there are impacts on other replicated systems, or we can make comprehensive changes to our whole environment.

We can find out if configuration hardening changes have any negative impact on the system’s needed functionality, or even aggressively pen-test “production” systems without any fear of consequence and gain insights we never could before.

In previous instalments we talked about how reactive security practices (and having to put in “compensating controls” is definitely one of those) keep us from working on long-term strategic changes. The kind that can sustainably improve an organisation’s security posture to not only reduce risk but also reduce the cost to the business and our workload.

We discussed how outputs from business processes resulted in more and more risk to manage (and, consequently, work) due to defects caused by a lack of security thinking and integration in the creation of those processes.

If you think of removing the security defects of these processes as picking up stones on the road that are damaging passing cars, then legacy systems are a bit like boulders: we can clean up the smaller things relatively easily and make sure new stones don’t end up on the roadway, but the boulders can’t be moved and we’re going to have to do all sorts of things to keep them from causing incidents.

Thin Digital Twins can give us the leverage and power to clear these boulders, often our biggest obstacles, much faster.

They’re also powerful tools for any kind of modelling of new systems, or any changes or interventions to existing systems that usually incur some resistance.

They help us get ahead of the problem by enabling us to reduce existing security risks more aggressively (without creating business risks) and to better ensure that new systems aren’t introducing new ones.

In a breach recovery scenario, the ability of Thin Digital Twins to replicate an environment with fractional resource allows us to perform forensics in parallel to recovering the full environment.

This means we can immediately proceed to recovery and then focus only on cleaning up the parts of the environment that need it. In the past, forensics and clean-up had to happen before recovery could run because there simply wouldn’t be enough spare computing resource to run a full parallel environment.

This parallelisation of forensics and recovery saves time, and as we know, during a recovery operation, time is everything.

There’s another element of parallelisation where the VM2020 offering is unequalled. It relates to just how fast data can be pulled out of your backups.

In past instalments I’ve mentioned recovery times of four hours as an example. These are not what most people would consider realistic figures, due to the incredible I/O loads involved in moving data out of immutable storage. Recovery times of two to three weeks are likely more typical, though they would include several days of forensics before recovery (restoring data) was started in earnest.

 

So why have I been making arguments with unrealistic figures?

Well, they aren’t unrealistic anymore, because VM2020’s solution is so effective at optimising this process that in recent tests using Hitachi Vantara’s Ops Center Protector the recovery of more than 1,500 virtual machines with over 100TB of data was achieved in 70 minutes.

70 minutes. Think about how being able to recover that much in such a short time changes our risk management calculations! Think about how much that limits the impact of any risk, and the chance of any disruption becoming a major one.
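To put that figure in context, here is a quick back-of-the-envelope calculation in Python using only the numbers quoted above; it assumes nothing about how the throughput was actually achieved.

```python
# Back-of-the-envelope arithmetic implied by the quoted test figures.
data_tb = 100        # quoted data volume (TB)
minutes = 70         # quoted recovery time (minutes)
vms = 1_500          # quoted number of virtual machines

tb_per_minute = data_tb / minutes                     # ~1.43 TB restored per minute
gb_per_second = data_tb * 1_000 / (minutes * 60)      # ~23.8 GB/s sustained
seconds_per_vm = minutes * 60 / vms                   # ~2.8 seconds per VM on average
print(f"{tb_per_minute:.2f} TB/min, {gb_per_second:.1f} GB/s, {seconds_per_vm:.1f} s/VM")
```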

Note: While those figures were achieved on virtual machines, similar performance is possible both for cloud and physical systems due to patented technology allowing CyberVR to instantly virtualise and “devirtualise” systems and apply the same process.

As mentioned previously, that makes the combination of Hitachi Vantara and VM2020 the fastest recovery solution in the world. Read for yourself using the links below the article.

https://www.hitachivantara.com/en-us/pdf/solution-profile/worlds-fastest-ransomware-recovery-from-immutable-snapshots-vmware-environments.pdf

https://www.hitachivantara.com/en-us/insights/using-thin-digital-twins-to-gang-up-on-ransomware.html

And that seems like a pretty good place for me to end this instalment. I’ll leave you with a few links to find out more about VM2020 Solutions, and I hope you’ll join me for our final, perhaps more conventional, instalment, where we look at what a good recovery looks like and how we should prepare to execute one and leverage all the benefits at our disposal.

Next article: What We Call Security: The Importance of Recovering Well (7/7)

Make the shift today towards proven cyber resilience

If you’re ready to prove the impact your cyber initiatives are having in a business context through evidence-based solutions, we’re ready to show you.
