6K-plus AI models may be affected by critical RCE vulnerability

A critical vulnerability in a popular Python package for large-language models (LLMs) may affect more than 6,000 models and could lead to supply chain attacks.

Click for more special coverage

The open-source llama-cpp-python package was found to be vulnerable to server-side template injection, which could lead to remote code execution (RCE). The flaw, tracked as CVE-2024-34359, was discovered by Patrick Peng, a security researcher and developer who goes by retro0reg online.

The llama-cpp-python package provides Python bindings for the widely popular llama.cpp library; llama.cpp is a C++ library to run LLMs like Meta’s LLaMA and models from Mitral AI on one’s own personal computer. The llama-cpp-python package further enables developers to integrate these open-source models into Python.

CVE-2024-34359, which has a critical CVSS score of 9.7, risks RCE due to an improper implementation of the Jinja2 template engine. The flaw allows chat templates stored in metadata to be parsed by Jinja2 without sanitization or sandboxing, creating an opening for attackers to inject malicious templates, Peng explained in a blog post.

Peng uploaded a proof-of-concept exploit for the vulnerability on Hugging Face, demonstrating how a model compromised with a malicious template could execute arbitrary code upon loading or initiating a chat session. Peng’s blog post also describes how the malicious code can be injected into model downloaded as a .gguf file, a common file format for sharing natural language processing (NLP) models on open-source hubs such as Hugging Face.

More than 6,000 models on Hugging Face use llama_cpp_python, Jinja2 and the gguf format, according to Checkmarx. A threat actor could potentially download a vulnerable model, inject the .gguf metadata with their own malicious template, and redistribute the model for supply chain attacks on unsuspecting AI developers.

A fix for CVE-2024-34359 was added in version 0.2.72 of llama_cpp_python last week. This version adds input validation and sandboxing measures while rendering templates.

“The discovery of CVE-2024-34359 servers as a stark reminder of the vulnerabilities that can arise at the confluence of AI and supply chain security. It highlights the need for vigilant security practices throughout the lifecycle of AI systems and their components,” the Checkmarx blog post concludes. “As AI technology becomes more embedded in critical applications, ensuring these systems are built and maintained with a security-first approach is vital to safeguard against potential threats that could undermine the technology’s benefits.”