Securing artificial intelligence (AI) presents a complex and ever-shifting challenge. As AI technology evolves, so do the risks posed by new attack surfaces and new attack strategies. This requires an adaptive approach, with developers, organizations and governments continually assessing and updating security measures. As an industry, we must also recognize and counteract the risks inherent in rapid technological development, where security often becomes a secondary concern.
New guidance from NIST advances our understanding of AI risk today by offering an overview of current attack techniques, establishing a shared taxonomy, and reviewing current approaches to mitigation. However, as NIST computer scientist and publication contributor Apostol Vassilev noted, available defenses “lack robust assurances that they fully mitigate the risks” and better approaches are needed.
Toward this end, the security industry should advocate for a two-tiered approach that encompasses both predictive and proactive security to create safe and trustworthy AI systems. AI developers should anticipate and preemptively address potential attacks in the initial design phase by incorporating robust security measures into the AI system itself. Additionally, we recommend employing a novel framework that uses AI itself to proactively identify flaws in new AI systems and devise a resilient defense.
Mitigating known risks starts with defining specific security measures and protocols to guide AI development and deployment. For example, consider deploying a natural language processing (NLP) AI model for customer support chatbots in an e-commerce setting. In this scenario, implementing robust security measures within the core mechanisms of the NLP model can help prevent potential exploitation and abuse.
Examples of best-practice security measures for an AI customer support chatbot include:
Input validation and sanitization: Ensure the AI system incorporates stringent input validation mechanisms to sanitize user inputs effectively. This helps prevent malicious actors from injecting harmful commands or manipulating the system through carefully crafted inputs.
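As a minimal sketch of this idea, the function below combines a length cap, control-character stripping, and a deny-list check for common prompt-injection phrases. The specific patterns and limits are illustrative assumptions; a production chatbot would pair simple rules like these with a maintained ruleset or a trained classifier.

```python
import re

# Illustrative deny-list of prompt-injection phrases (an assumption for this
# sketch, not an exhaustive or recommended ruleset).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"you are now", re.IGNORECASE),
    re.compile(r"system prompt", re.IGNORECASE),
]

MAX_INPUT_LENGTH = 2000  # cap input size before it reaches the model

def sanitize_input(text: str) -> str:
    """Validate and sanitize a chat message before passing it to the model."""
    # Strip control characters that could confuse downstream parsers,
    # keeping ordinary whitespace.
    text = "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")
    # Enforce a length limit.
    text = text[:MAX_INPUT_LENGTH]
    # Reject messages matching known injection patterns.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            raise ValueError("input rejected: possible prompt injection")
    return text
```

Rejecting suspicious input outright, as here, is one design choice; an alternative is to flag it for the monitoring layer while still answering benign parts of the request.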
Adversarial testing for NLP models: Implement thorough adversarial testing specifically tailored for NLP models. This involves exposing the model to intentionally crafted inputs designed to exploit vulnerabilities. By subjecting the AI system to various adversarial scenarios, developers can identify and fortify potential weak points, enhancing the model's resilience.
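A toy illustration of this testing loop: perturb an input many times and measure how often the model's prediction flips. The `intent_classifier` below is a hypothetical stand-in for the chatbot's real intent model, and the character-swap perturbation is deliberately simple; real adversarial testing would use stronger attacks (synonym substitution, paraphrasing, gradient-based methods).

```python
import random

def intent_classifier(text: str) -> str:
    """Hypothetical stand-in for the chatbot's real intent model."""
    return "refund" if "refund" in text.lower() else "other"

def perturb(text: str, rng: random.Random) -> str:
    """Apply a simple character-level perturbation: swap two adjacent chars."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    chars = list(text)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def adversarial_test(text: str, trials: int = 100, seed: int = 0) -> float:
    """Return the fraction of perturbed inputs whose predicted intent flips."""
    rng = random.Random(seed)
    baseline = intent_classifier(text)
    flips = sum(
        intent_classifier(perturb(text, rng)) != baseline for _ in range(trials)
    )
    return flips / trials
```

A high flip rate on trivial perturbations like these is a signal that the model needs hardening, for example through adversarial training on the failing inputs.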
Continuous monitoring and anomaly detection: Establish a continuous monitoring system equipped with anomaly detection algorithms. This enables real-time identification of unusual patterns or deviations from the norm in the AI's behavior. Rapid detection allows for prompt mitigation of potential security threats, minimizing the impact of any malicious activities.
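One simple way to sketch such a detector is a rolling z-score over a behavioral metric, such as response length or request rate. The window size, warm-up period, and 3-sigma threshold below are illustrative choices, not recommendations.

```python
from collections import deque
import statistics

class AnomalyDetector:
    """Flag metric values that deviate sharply from the recent norm."""

    def __init__(self, window: int = 50, threshold: float = 3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold  # z-score cutoff (3-sigma here)

    def observe(self, value: float) -> bool:
        """Record a metric value; return True if it looks anomalous."""
        anomalous = False
        if len(self.history) >= 10:  # need a baseline before judging
            mean = statistics.fmean(self.history)
            stdev = statistics.pstdev(self.history) or 1e-9
            anomalous = abs(value - mean) / stdev > self.threshold
        self.history.append(value)
        return anomalous
```

In practice a detector like this would feed an alerting pipeline, and the flagged interactions would be logged for human review rather than silently dropped.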
By incorporating these concrete security measures into the very fabric of the NLP AI model, developers can significantly enhance the security posture of the system.
One inhibitor of AI security has been the complexity of today’s massive, multilayer neural network models and the mammoth size of their training datasets. As a result, large language models (LLMs) and other AI products exceed any human-scale effort to explore where vulnerabilities may lurk.
We can achieve proactive security for these AI systems by using a framework of distinct AI components that do what is beyond the scope of a human team. The goal of this innovative approach is to create, with a one-time investment, a robust security system applicable to each new AI an organization introduces, fostering a continuous cycle of enhancement and fortification across its AI portfolio. The framework components are:
- New AI: The emerging AI system we aim to secure.
- Interpreter and Simulator AIs: The Interpreter AI analyzes the New AI's mechanism and educates both the Red and Blue Team AIs about it. The Simulator AI mimics the New AI, adding an extra layer of safety by letting the Red and Blue Team AIs run their attacks and defenses without harming the actual New AI.
- Red Team AI: An AI that takes on the role of an assailant, actively probing the New AI's mechanism for flaws. This proactive approach lets us discover and rectify vulnerabilities before real-world threats can exploit them.
- Blue Team AI: An AI focused on countering Red Team attacks. It is designed to counteract the attacks the Red Team identifies, forming a resilient defense against potential threats. It also offers valuable insights to the New AI's developers, fostering a continuous feedback loop for ongoing improvement.
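The interaction between these components can be sketched as a simple feedback loop. Everything below is illustrative Python with toy, rule-based stand-ins for the Simulator, Red Team, and Blue Team; in the framework described above, each component would be a learned AI system, not a stub.

```python
def security_cycle(simulator, red_team, blue_team, rounds: int = 3):
    """Run Red Team attacks against a Simulator of the New AI, apply Blue
    Team defenses, and return findings for the New AI's developers."""
    findings = []
    for _ in range(rounds):
        for attack in red_team.generate_attacks(simulator):
            if simulator.is_vulnerable_to(attack):
                defense = blue_team.devise_defense(attack)
                simulator.apply_defense(defense)
                findings.append({"attack": attack, "defense": defense})
    return findings

class ToySimulator:
    """Toy stand-in: vulnerable to any attack not yet patched."""
    def __init__(self):
        self.patched = set()
    def is_vulnerable_to(self, attack):
        return attack not in self.patched
    def apply_defense(self, defense):
        self.patched.add(defense["against"])

class ToyRedTeam:
    """Toy stand-in: emits a fixed list of attack names."""
    def generate_attacks(self, simulator):
        return ["prompt_injection", "data_poisoning"]

class ToyBlueTeam:
    """Toy stand-in: maps each attack to a named countermeasure."""
    def devise_defense(self, attack):
        return {"against": attack, "action": f"filter_{attack}"}
```

Running the cycle with these stubs patches each vulnerability once, and later rounds find nothing new; the returned findings are the feedback delivered to the New AI's developers.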
As AI technology evolves, so must the strategies to secure it. This requires an adaptive, collaborative approach across the AI and security communities, continually assessing and updating security measures in response to new threats and AI technological advancements.
Amir Shachar, chief AI scientist, Skyhawk Security