Machine Learning Models: A Dangerous New Attack Vector

Threat actors can hijack machine learning (ML) models that power artificial intelligence (AI) to deploy malware and move laterally across corporate networks, researchers have found. These models, which are often publicly available, serve as a new launchpad for a range of attacks that can also poison an organization’s supply chain – and businesses need to be prepared.

Researchers from the SAI team at HiddenLayer have developed a proof-of-concept (POC) attack showing how a malicious actor can use ML models – the decision-making systems at the heart of almost all modern AI-powered solutions – to infiltrate corporate networks, they revealed in a blog post published on December 6. The research is credited to Tom Bonner, senior director of adversarial threat research at HiddenLayer; Marta Janus, principal adversarial threat researcher; and Eoin Wickens, senior adversarial threat researcher.

A recent report by CompTIA found that over 86% of CEOs surveyed said their respective companies were using ML as a mainstream technology in 2021. Indeed, solutions as broad and varied as self-driving cars, robots, medical equipment, missile guidance systems, chatbots, digital assistants, facial recognition systems, and online recommendation systems depend on ML to function.
Due to the complexity of deploying these models and the limited computing resources of most companies, organizations often turn to open source model-sharing repositories when deploying ML models – and that is where the problem lies, the researchers said.

“These repositories often lack comprehensive security controls, which ultimately passes the risk to the end user – and attackers rely on it,” they wrote in the post.

Anyone who uses pre-trained machine learning models obtained from untrusted sources or public model repositories is potentially exposed to the type of attack demonstrated by the researchers, Marta Janus, principal ML researcher at HiddenLayer, tells Dark Reading.

“Additionally, businesses and individuals that rely on trusted third-party models may also be exposed to supply chain attacks, in which the provided model has been hijacked,” she says.

An advanced attack vector

The researchers demonstrated how such an attack would work in a POC focused on the open-source PyTorch framework, also showing how it could be extended to target other popular ML libraries, such as TensorFlow, scikit-learn, and Keras.

Specifically, the researchers embedded a ransomware executable into the model weights and biases using a technique akin to steganography; that is, they replaced the least significant bits of each float in one of the model’s neural layers, Janus explains.
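HiddenLayer has not published its exact tooling, but the basic idea can be sketched in a few lines of NumPy. The function names and the choice of overwriting the lowest byte of each float32 weight are illustrative assumptions, not the researchers' actual implementation:

```python
import numpy as np

def embed_lsb(weights: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide payload bytes in the least significant byte of each float32 weight."""
    flat = weights.astype(np.float32).ravel().copy()
    if len(payload) > flat.size:
        raise ValueError("payload too large for this layer")
    raw = flat.view(np.uint8).reshape(-1, 4)  # little-endian: column 0 = lowest byte
    raw[: len(payload), 0] = np.frombuffer(payload, dtype=np.uint8)
    return flat.reshape(weights.shape)

def extract_lsb(weights: np.ndarray, length: int) -> bytes:
    """Recover `length` payload bytes from a layer modified by embed_lsb."""
    raw = np.ascontiguousarray(weights.ravel()).view(np.uint8).reshape(-1, 4)
    return raw[:length, 0].tobytes()
```

This also explains why accuracy barely suffers: a float32 mantissa has 23 bits, so clobbering its low 8 bits perturbs a weight by at most roughly 2^-15 of its magnitude.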

Then, to decode the binary and run it, the team used a flaw in the PyTorch/pickle serialization format that allows arbitrary Python modules and methods to be loaded at deserialization time. They did this by injecting a small Python script at the start of one of the model files, preceded by an instruction to run the script, Janus explains.
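The underlying pickle weakness is easy to demonstrate in isolation. When a pickled object is loaded, whatever callable its `__reduce__` method names is invoked with attacker-controlled arguments. Here a harmless `eval` stands in for a malicious loader; the class name and payload are illustrative, not the HiddenLayer POC itself:

```python
import pickle

class MaliciousStub:
    """On unpickling, pickle calls eval("6 * 7") -- any callable would do."""
    def __reduce__(self):
        return (eval, ("6 * 7",))

blob = pickle.dumps(MaliciousStub())
result = pickle.loads(blob)  # executes attacker-chosen code; returns 42
```

In a real attack, the callable would reconstruct and launch the payload hidden in the weights rather than evaluate an arithmetic expression.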

“The script itself reconstructs the payload from the tensor and injects it into memory, without dumping it to disk,” she says. “The hijacked model is still functional and its accuracy is not visibly affected by any of these changes.”

The resulting weaponized model evades current detection by antivirus and endpoint detection and response (EDR) solutions while suffering only an insignificant loss in effectiveness, the researchers said. Indeed, today’s most popular anti-malware solutions provide little or no support for detecting ML-based threats, they said.

In the demo, the researchers deployed a 64-bit sample of the Quantum ransomware to a Windows 10 system, but noted that any bespoke payload can be distributed this way, tailored to different operating systems – Windows, Linux, and Mac – and to different architectures, such as x86/64.

The business risk

For an attacker to leverage ML models to target organizations, they must first obtain a copy of the model they wish to hijack – which, in the case of publicly available models, is as simple as downloading it from a website or extracting it from an application that uses it.

“In one possible scenario, an attacker could access a public model repository (such as Hugging Face or TensorFlow Hub) and replace a legitimate benign model with its Trojan version that will run the embedded ransomware,” says Janus. “Until the breach is detected, anyone who downloads the Trojan model and loads it onto a local machine will be ransomed.”

An attacker could also use this method to carry out a supply chain attack by hijacking a service provider’s supply chain to distribute a Trojan model to all service subscribers, she adds. “The backdoor model could provide a foothold for further lateral movement and allow adversaries to exfiltrate sensitive data or deploy other malware,” Janus said.

The business implications for a company vary, but can be serious, the researchers said. They range from the initial compromise of a network and subsequent lateral movement to the deployment of ransomware, spyware, or other types of malware. Attackers can steal data and intellectual property, launch denial of service attacks or even, as mentioned, compromise an entire supply chain.

Mitigation and Recommendations

The research is a warning to any organization using pre-trained ML models downloaded from the internet or provided by a third party to treat them “like any untrusted software,” Janus says.

Such models should be scanned for malicious code – although few products currently offer this functionality – and undergo thorough evaluation in a secure environment before being run on a physical machine or put into production, she says.
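As a rough illustration of what such scanning can look like for pickle-based model files, the sketch below (a simplified stand-in in the spirit of open source tools like picklescan, not anything the researchers endorse) walks the pickle opcode stream and flags opcodes capable of importing modules or invoking callables, without ever loading the file:

```python
import pickletools

# Opcodes that can import a module or call an arbitrary object at load time.
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def scan_pickle(data: bytes) -> list[str]:
    """Return one finding per potentially dangerous opcode in the stream."""
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS:
            findings.append(f"offset {pos}: {opcode.name} {arg or ''}".strip())
    return findings
```

Plain data structures pickle without any of these opcodes, so a clean weights file scans empty, while anything that pulls in a callable gets flagged for review.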

Additionally, anyone producing machine learning models should use secure storage formats – for example, formats that do not allow code execution – and cryptographically sign all their models so that they cannot be altered without breaking the signature.
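A minimal sketch of that sign-and-verify workflow, using only Python's standard library. Real deployments would use asymmetric signatures (e.g. Ed25519) so that model consumers need only a public key; the HMAC here is a simplified stand-in, and all names are illustrative:

```python
import hashlib
import hmac

def sign_model(model_bytes: bytes, key: bytes) -> str:
    """Produce a tamper-evident tag over the serialized model."""
    return hmac.new(key, model_bytes, hashlib.sha256).hexdigest()

def verify_model(model_bytes: bytes, key: bytes, signature: str) -> bool:
    """Reject any model whose bytes no longer match the signature."""
    return hmac.compare_digest(sign_model(model_bytes, key), signature)
```

A consumer would call verify_model before ever deserializing the file, so a hijacked model is rejected before its payload has a chance to run.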

“Cryptographic signing can ensure model integrity the same way it does for software,” says Janus.

Overall, the researchers said that adopting a security posture built on understanding risks, addressing blind spots, and identifying areas for improvement around the ML models deployed in an enterprise can also help mitigate attacks via this vector.
