Tom Hext, AI Product Manager at Redstor

Is AI vital to keeping backup data safe from malware?

Every day more than 350,000 new types of malware are unleashed on the Internet. The scale of the problem is so great that traditional anti-virus software, which defends only against known threats, is no longer enough.

We spoke to Tom Hext, AI Product Manager at Redstor, about the problem of ‘zero-day threats’, why new, advanced protection technologies, using artificial intelligence, are needed and how machine learning is leading the way.

Redstor is a leading cloud data management provider delivering automated cloud backup, DR & archiving to businesses. Tom Hext, AI Product Manager at the company, had this to say:

What cyber threats is AI helping to combat?
Malware frequently targets backup data. It will hide undetected inside business networks for longer than any retention policy, seeking out and infecting all backups, making malware-free recoveries hugely challenging.

Even with the very best security countermeasures in place, organisations cannot afford to rule out the prospect that one day soon their live environment will be compromised.

Once that happens, it quickly becomes a race against time to detect and halt an attack – before the impact becomes catastrophic.

In today’s on-demand world, making a speedy recovery is of paramount importance – but the race may already be lost if backup data is compromised too.

Security tools typically trap threats by matching malware signatures to databases of known harmful code, but more sophisticated threats avoid signature detection.

Malicious authors have quickly realised they can wreak havoc by writing single-use malware, never seen before by the security community.

Zero-day threats are particularly hard to spot because the very fact that they will not have been seen before means they will not match any known malware signatures.
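The limitation described above can be illustrated with a minimal sketch. It assumes the simplest possible form of signature, a SHA-256 digest of the whole file, and uses a made-up sample and a hypothetical `KNOWN_SIGNATURES` set; real security tools use richer signatures, so this shows the idea rather than any vendor's implementation.

```python
import hashlib

# Hypothetical sample standing in for malware already known to the security community.
KNOWN_SAMPLE = b"example payload previously seen by the security community"

# Hypothetical signature database: SHA-256 digests of known malicious samples.
KNOWN_SIGNATURES = {hashlib.sha256(KNOWN_SAMPLE).hexdigest()}


def matches_known_signature(data: bytes) -> bool:
    """Return True if the SHA-256 digest of `data` is in the signature set."""
    return hashlib.sha256(data).hexdigest() in KNOWN_SIGNATURES


# A sample that has been seen before is caught.
print(matches_known_signature(KNOWN_SAMPLE))         # True

# A never-seen variant (here, one extra byte) has a different digest and is missed.
print(matches_known_signature(KNOWN_SAMPLE + b"!"))  # False
```

Because the check only answers "have I seen exactly this before?", single-use malware passes straight through it.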

This is where machine learning has a vital role to play.

What exactly is machine learning?
Machine learning is exactly what it says on the tin. Whereas most forms of artificial intelligence involve teaching a machine to follow a set of rules or spot patterns, machine learning takes this one step further and teaches a machine to adapt, grow and become more intelligent by constantly presenting it with new data and challenges.

A machine learning model is able to respond to new results and unseen variables, and in a short space of time can learn to recognise these as, in this case, a threat. Teaching a machine to use data to reason and make decisions can automate a vast number of complex processes and help to spot characteristics within data that would be impossible to spot with the human eye.

When applied to malware detection, the hidden properties that make malware so hard to detect are the very things that a machine learning model is trained to recognise and constantly re-learn. The model combines all of the characteristics it has found and makes a decision on whether that piece of data could become harmful.
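As a rough illustration of a model learning from characteristics and then deciding on data it has not seen, here is a toy perceptron in plain Python. The three characteristics, the training samples and the function names are all hypothetical, and a perceptron is used only because it is one of the simplest learning algorithms; it is not a description of how any real product works.

```python
from typing import List, Sequence, Tuple

# Each sample: (characteristics, label). Label 1 = malicious, 0 = benign.
# Hypothetical characteristics, in order:
#   encrypts_many_files, runs_at_startup, opens_network_connection
Sample = Tuple[Sequence[int], int]

TRAINING_DATA: List[Sample] = [
    ((1, 1, 0), 1),
    ((1, 0, 1), 1),
    ((0, 0, 0), 0),
    ((0, 1, 0), 0),
    ((0, 0, 1), 0),
]


def predict(weights: Sequence[float], bias: float, features: Sequence[int]) -> int:
    """Combine the weighted characteristics into a single yes/no decision."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples: List[Sample], epochs: int = 20) -> Tuple[List[float], float]:
    """Adjust weights whenever the model gets a training sample wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)
            if error:
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(TRAINING_DATA)

# Two combinations of characteristics that never appeared in the training data.
print(predict(weights, bias, (1, 1, 1)))  # 1 (flagged)
print(predict(weights, bias, (0, 1, 1)))  # 0 (not flagged)
```

Re-learning, in this sketch, would simply mean adding newly labelled samples to `TRAINING_DATA` and calling `train_perceptron` again.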

How does machine learning help protect networks from zero-day threats?
Zero-day threats, or those that are not commonly known, are harder to detect because a machine has never seen them before and is therefore, in most cases, not trained to recognise their characteristics.

By training the model constantly with new ‘virus definitions’, or characteristics, we can help our model to recognise the newest forms of threat and detect them before they are able to cause damage.

Today, ML can draw on a wide range of data from host, network and cloud-based anti-malware components, training itself with better accuracy than ever before.

This is crucial because malware is growing in sophistication as well as scope, and the risk of it inflicting huge operational and reputational issues on an organisation continues to rise.

To combat this, all good anti-malware software these days employs types of heuristic algorithms.

Good heuristics can prevent zero-day attacks, and a fine example of heuristic technology is machine-learning malware analysis.
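To make the idea of a heuristic concrete, the sketch below scores a file by a property of its content rather than by matching it against known samples. It assumes one commonly cited heuristic, byte entropy, on the grounds that packed or encrypted content tends to look close to random. The 7.2 bits-per-byte threshold and the function names are illustrative assumptions, and a single hand-written rule like this is far simpler than machine-learning malware analysis.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_suspicious(data: bytes, threshold: float = 7.2) -> bool:
    """Flag content whose entropy exceeds an (assumed, illustrative) threshold."""
    return shannon_entropy(data) > threshold


# Ordinary repetitive text has low entropy.
plain_text = b"quarterly sales report " * 200
print(looks_suspicious(plain_text))     # False

# Bytes spread evenly over all 256 values stand in for packed or encrypted content.
random_like = bytes(range(256)) * 16
print(looks_suspicious(random_like))    # True
```

A rule like this needs no prior sighting of the file, which is why heuristics can catch what signatures miss, although on its own it would also flag legitimate compressed or encrypted files.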

Malware is evolving rapidly, so the algorithms must evolve rapidly as well. It’s a constant, ongoing process.

How important is ML as an additional layer of protection?
The National Cyber Security Centre strongly recommends deploying a multi-layer security strategy as the best way to thwart the increasing number of attacks that target both primary and backup copies of data.

When considering products to keep networks safe, security features that utilise machine learning and artificial intelligence should be high up on the list.

Organisations are now deploying ML to detect and remove malware after every backup from servers, laptops and the cloud.

Ringfencing backup data in this way provides additional protection – and much needed peace of mind.

No CEO or head of IT wants to be left waiting nervously for confirmation that backups are in a safe state.

So when the future of a business rests firmly on an organisation’s capability to restore mission-critical files, ML can help provide that extra reassurance.

Why is machine learning more prevalent now?
At the beginning of last year, the digital universe consisted of an estimated 44 zettabytes of data – by 2025, more than 10 times that amount is expected to be created every 24 hours.

Collecting and filtering such huge volumes of information is too cumbersome a task for even a large workforce to undertake.

However, this age of ‘big data’ and massive computing allows artificial intelligence to learn through brute force.

Machine-learning anti-malware software can never be client driven, because even the PCs and mobile devices of the largest corporations are only exposed to small, limited samples of malware.

Proper ML requires ‘big data’ processing and cloud-based systems – and it is deployed a lot more frequently these days because effective technology is much cheaper.

Now that cloud servers are more available, ML malware analysis is more accessible too.

Is machine learning coming up with new ways of hunting malware?
Machine learning takes a variety of approaches to a solution rather than relying on a single method.

Another way in which ML enables improved detection is by hunting malware based on behaviour modelling.

Bad-behaviour modelling looks at actions such as accessing saved passwords, local documents, browsing history, or contacts.

Bad-behaviour modelling limits malware detection tools to acting only on what they are programmed to do, whereas hunting models that use good-behaviour modelling are much harder to circumvent.

For instance, machine learning will determine when an employee is most likely to log in to a network or access certain file shares.

So anything outside the norm will be flagged up, such as when:

  • An employee or device transfers huge volumes of data.
  • A connection is made to another network or device outside normal use or normal hours.
  • An employee uses programs or tools that do not fit with their remit e.g. a finance worker runs a network scan late at night.
  • An employee or device uses an excessive amount of computer resources such as CPU, GPU, or memory.
  • Human error is responsible for accidentally deleting data in a way that is out of context for normal behaviour.

For machine learning to be effective, good-behaviour modelling requires the capturing, analysis, and processing of massive amounts of data – and cloud-based services have made the processing power to do that far more affordable.
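The first item in the list above, an unusually large data transfer, can be sketched as a comparison against a baseline of normal behaviour. The example assumes a simple statistical baseline (mean and standard deviation of past daily transfer volumes) standing in for a learned model; the figures, the threshold of three standard deviations and the `is_anomalous` helper are all hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def is_anomalous(history: Sequence[float], observed: float, z_threshold: float = 3.0) -> bool:
    """Flag `observed` if it is more than `z_threshold` standard deviations from the norm."""
    if len(history) < 2:
        return False  # not enough data to establish normal behaviour
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu  # any deviation from a perfectly constant baseline
    return abs(observed - mu) / sigma > z_threshold


# Hypothetical daily transfer volumes (MB) for one employee on normal working days.
baseline_mb = [120, 135, 110, 128, 140, 115, 125, 130, 118, 122]

print(is_anomalous(baseline_mb, 131))   # False: within the usual range
print(is_anomalous(baseline_mb, 5000))  # True: far outside the norm
```

Nothing here describes what an attack looks like. The check only knows what normal looks like, which is why this style of modelling is harder for new malware to sidestep, and also why it depends on collecting enough data to build a trustworthy baseline.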

What should data controllers do next?
While the threat of malware is constantly evolving, the ML to combat it is too – and Redstor is already leveraging the latest technology to protect backup data.

When customers purchase automated malware detection as an added feature, every backup from a server, laptop and any other end-point machine or device will be checked for files that resemble malware in appearance or behaviour.

This provides a powerful additional layer of protection that complements existing antivirus software.

Users have nothing to configure, install or upgrade, and there is no impact on internal resources. Redstor preserves the sanctity of customer data, which is encrypted at source, in transit and at rest.

When a suspicious file is detected, a notification then gives the user the option to delete the file, revert to a previous safe version, mark it as safe or leave it in quarantine.
