NIST report unveils the tactics behind machine learning exploits

The National Institute of Standards and Technology (NIST) recently published a comprehensive report titled Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations (NIST AI 100-2e2025), which addresses the increasing threats posed to artificial intelligence (AI) systems by adversarial machine learning (AML) techniques. The report provides a thorough analysis of the different types of attacks that can be levelled against predictive AI (PredAI) and generative AI (GenAI) systems. It also offers insights into effective strategies for mitigating these attacks, aiming to create a framework for improving the robustness and security of AI applications.

The growing reliance on AI, from autonomous vehicles to financial services and healthcare, makes understanding and managing these risks essential. With AI models becoming increasingly complex and integrated into real-world systems, adversarial attacks—such as those manipulating training data, exploiting model vulnerabilities, or breaching privacy—pose significant threats. These attacks have been shown to manipulate outcomes in ways that are harmful not only to the systems themselves but also to the organisations and individuals who depend on them.

What does the report reveal?

The report unveils a comprehensive taxonomy for categorising adversarial machine learning attacks. It breaks down the types of adversarial behaviours that can compromise AI models, providing a clear structure for understanding how and when these attacks occur. The attacks are classified based on key factors such as:

  • Stage of the machine learning lifecycle: Whether the attack is launched during the training phase or once the model is deployed; each phase presents unique vulnerabilities.
  • Attacker objectives: These range from availability breakdowns (disrupting system functionality) to integrity violations (forcing models to misclassify or malfunction) and privacy compromises (exposing sensitive data).
  • Capabilities and knowledge: Attackers are categorised by the tools and knowledge at their disposal, from ‘white-box’ attackers with full knowledge of the system to ‘black-box’ attackers who only have limited access; a minimal sketch of these dimensions follows this list.
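
As a purely illustrative aid (not part of the report or any NIST API), the minimal Python sketch below shows one way these three dimensions could be encoded as a simple data structure; the enum values paraphrase the report’s categories and the example entries are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class LifecycleStage(Enum):
    """Phase of the ML lifecycle in which the attack is launched."""
    TRAINING = "training"      # e.g. data or model poisoning
    DEPLOYMENT = "deployment"  # e.g. evasion, prompt injection, privacy attacks


class AttackerObjective(Enum):
    """What the adversary is trying to compromise."""
    AVAILABILITY = "availability"  # disrupt system functionality
    INTEGRITY = "integrity"        # force misclassification or malfunction
    PRIVACY = "privacy"            # expose sensitive data


class AttackerKnowledge(Enum):
    """How much the adversary knows about the target system."""
    WHITE_BOX = "white-box"  # full knowledge of architecture and parameters
    GRAY_BOX = "gray-box"    # partial knowledge, e.g. architecture only
    BLACK_BOX = "black-box"  # limited access, e.g. queries only


@dataclass
class AMLAttack:
    """A single entry in a simplified attack taxonomy."""
    name: str
    stage: LifecycleStage
    objective: AttackerObjective
    knowledge: AttackerKnowledge


# Hypothetical example entries, loosely following the report's categories
evasion = AMLAttack("evasion", LifecycleStage.DEPLOYMENT,
                    AttackerObjective.INTEGRITY, AttackerKnowledge.BLACK_BOX)
poisoning = AMLAttack("data poisoning", LifecycleStage.TRAINING,
                      AttackerObjective.AVAILABILITY, AttackerKnowledge.GRAY_BOX)
```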

The report draws attention to the importance of security, resilience, and robustness in AI systems, as these are key components of trustworthy AI. Its methodology aligns with the NIST AI Risk Management Framework, making it directly relevant to businesses and organisations that are increasingly deploying AI technology.

Key highlights:

  1. Comprehensive attack taxonomy:
    • The report introduces a well-organised taxonomy of adversarial machine learning (AML) attacks, distinguishing between attacks on PredAI systems (such as evasion, poisoning, and privacy breaches) and GenAI systems (which include poisoning and prompt injection).
    • It also outlines the stages at which attacks can occur (during training, at deployment, or post-deployment), helping organisations pinpoint vulnerable stages in the AI lifecycle.
  2. Mitigation strategies:
    • Along with categorising attacks, the report provides an array of mitigations for each type of attack, including data poisoning countermeasures, adversarial training techniques (illustrated in the sketch after this list), and methods for improving model robustness.
    • These mitigations are crucial in ensuring the security of AI models, especially as attacks evolve to bypass existing defences.
  3. Attack classification based on knowledge:
    • One of the report’s unique contributions is its breakdown of adversarial attacks by the level of knowledge the attacker has—white-box, black-box, and gray-box. This classification helps security professionals understand the range of risks posed by different types of adversaries and plan defence strategies accordingly.
  4. Multimodal attack insights:
    • The report touches upon the emerging threat of multimodal attacks, which involve adversarial manipulation across multiple data types, such as combining image, text, and audio data. This insight highlights the growing complexity of AI models and the need for more robust defences that can handle diverse attack vectors.
  5. Real-world case studies:
    • The report offers real-world examples of evasion attacks, such as those targeting face recognition systems and phishing detection models. These case studies underline the tangible risks of adversarial machine learning and the importance of proactive security measures.
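
To make the evasion and adversarial-training points above concrete, here is a minimal sketch of the widely used Fast Gradient Sign Method (FGSM), assuming PyTorch, a generic image classifier `model`, and placeholder hyperparameters; it illustrates the general technique rather than any specific method prescribed by the NIST report.

```python
import torch
import torch.nn.functional as F


def fgsm_example(model, x, y, epsilon=0.03):
    """Craft an adversarial example by nudging x in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()    # small, worst-case perturbation
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixel values in a valid range


def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One training step on a mix of clean and adversarially perturbed inputs."""
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, defenders combine this basic recipe with stronger attack models and additional safeguards, which is one reason the report stresses layered mitigations rather than any single defence.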

Why does it matter?

As AI continues to shape industries, ensuring its integrity, privacy, and availability is no longer optional—it’s essential. The NIST report matters for several reasons:

Regulatory and standards development: By defining a common language for AML, the report lays the groundwork for future regulatory frameworks and industry standards that will ensure AI’s responsible and secure deployment.

Security of critical systems: Adversarial attacks have already caused real-world disruptions in sectors like healthcare, autonomous driving, and finance. A breach in these areas can lead to catastrophic consequences, from data leaks to system failures.

Framework for security: The report provides a framework for understanding and mitigating AML attacks, which is crucial for developers, security teams, and policymakers striving to create secure AI systems.

Evolving threat landscape: The report acknowledges the rapid evolution of adversarial techniques, making it clear that traditional methods of securing software are insufficient for AI systems. New, AI-specific solutions are necessary to stay ahead of these emerging threats.
