What are AI adversarial attacks?

AI adversarial attacks are deliberate manipulations of AI system inputs designed to cause incorrect outputs, extract sensitive information, or degrade model performance. They include techniques like adversarial examples, prompt injection, data poisoning, and model evasion.

How have AI attacks evolved recently?

AI attacks have evolved from academic proof-of-concepts to sophisticated real-world threats. Recent trends include multi-turn prompt injection against conversational AI, cross-model attacks that transfer between LLMs, supply chain attacks on ML pipelines, and automated adversarial tools available to low-skilled attackers.

How can organisations protect against AI adversarial attacks?

Protection against AI adversarial attacks requires a layered approach: input validation and filtering, adversarial training, model monitoring for anomalous behaviour, red team testing guided by the OWASP Top 10 for LLM Applications, and AI-specific incident response procedures.

AI Adversarial Attacks: What Changed in 2025

The Evolution of Adversarial Attacks

2025 marked a turning point in adversarial machine learning. As AI systems became more prevalent in critical infrastructure, the sophistication and frequency of adversarial attacks increased dramatically. Adversarial attacks on AI involve deliberately crafted inputs designed to cause machine learning models to make incorrect predictions or behave unexpectedly. Unlike traditional cyberattacks that exploit software bugs or misconfigurations, adversarial attacks target the model's decision-making logic itself. In 2025, we saw real-world incidents including adversarial images that bypassed facial recognition systems in physical access control, manipulated loan application data that evaded fraud detection models, and prompt injection attacks that subverted customer-facing chatbots into revealing internal system prompts. Traditional security tools such as firewalls, intrusion detection systems, and endpoint protection are not designed to detect these threats because the attack payload is often a perfectly valid input that happens to fall on the wrong side of the model's decision boundary.

Key Attack Trends from 2025

1. Model Poisoning at Scale

Supply chain attacks targeting training data became a major concern. Attackers developed more sophisticated techniques for injecting poisoned data into large-scale training pipelines.

2. Multimodal Attack Vectors

As multimodal AI systems gained adoption, attackers began exploiting vulnerabilities across different data types—combining image, text, and audio inputs to bypass traditional defenses.

3. Automated Attack Tooling

The emergence of automated adversarial attack frameworks lowered the barrier to entry, allowing less sophisticated actors to execute complex attacks.

Notable Incidents

Financial Sector: Several banks reported adversarial attempts against fraud detection systems
Autonomous Vehicles: Research demonstrated successful attacks against vision systems in varied conditions
Healthcare: Medical imaging AI systems showed vulnerability to adversarial perturbations

Defense Advancements

2025 also saw significant progress in defensive techniques:

Improved adversarial training methods
Better detection systems for identifying adversarial inputs
Standardization of AI security testing frameworks

Preparing for 2026

Organizations deploying AI systems must prioritize adversarial robustness through structured, repeatable testing. This starts with adversarial testing during model development, where security teams generate adversarial examples against the model and measure its resilience. From there, organizations should conduct regular red team exercises that simulate real-world attack scenarios specific to their deployment context. A financial services firm using ML for credit scoring, for example, needs testing that covers adversarial manipulation of applicant data, model extraction through API probing, and data poisoning scenarios where attackers influence future training data. Continuous monitoring in production completes the picture: tracking model inputs for statistical anomalies, logging prediction confidence scores, and alerting when drift patterns suggest an active adversarial campaign. In Singapore, the Cyber Security Agency (CSA) has been pushing for more rigorous AI security testing, and financial institutions face additional scrutiny under MAS guidelines on technology risk management.
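The production-monitoring step described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class name `InputMonitor`, the z-score threshold of 4.0, the confidence floor of 0.6, and the 20% alert rate are all assumptions chosen for the example, and a real deployment would tune them against its own traffic.

```python
from collections import deque
from statistics import mean, pstdev


class InputMonitor:
    """Flags inputs that sit far from a training-time baseline (hypothetical helper)."""

    def __init__(self, baseline_rows, z_threshold=4.0, min_confidence=0.6,
                 window=100, alert_rate=0.2):
        columns = list(zip(*baseline_rows))
        self.means = [mean(c) for c in columns]
        # Fall back to 1.0 for constant features so the z-score stays finite.
        self.stds = [pstdev(c) or 1.0 for c in columns]
        self.z_threshold = z_threshold
        self.min_confidence = min_confidence
        self.alert_rate = alert_rate
        self.recent_flags = deque(maxlen=window)  # rolling window of flags

    def max_z(self, row):
        """Largest per-feature z-score of this input against the baseline."""
        return max(abs(x - m) / s
                   for x, m, s in zip(row, self.means, self.stds))

    def observe(self, row, confidence):
        """Log one prediction; return True if the input or its confidence looks unusual."""
        flagged = (self.max_z(row) > self.z_threshold
                   or confidence < self.min_confidence)
        self.recent_flags.append(flagged)
        return flagged

    def campaign_suspected(self):
        """True when the share of flagged inputs in the window exceeds alert_rate."""
        if not self.recent_flags:
            return False
        return sum(self.recent_flags) / len(self.recent_flags) > self.alert_rate
```

In use, `observe` would be called once per scored request, and `campaign_suspected` would feed whatever alerting pipeline the organization already runs. A per-feature z-score is a deliberately simple stand-in for the statistical anomaly checks mentioned above.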

Attack Techniques That Defined 2025

Three attack categories dominated the adversarial ML landscape in 2025: evasion, model extraction, and data poisoning. Evasion attacks, where inputs are perturbed just enough to cross a model's decision boundary, moved beyond image classifiers into tabular data and time-series models. Researchers demonstrated that adding carefully crafted noise to loan application features (income, debt ratio, employment length) could flip a fraud detection model's output from "flagged" to "approved" with a success rate above 80%. These results did not depend on contrived lab conditions: the perturbations were small enough to stay within realistic input ranges, making detection by business rules nearly impossible. Organizations running ML security assessments need to test whether their models hold up against adversarial examples crafted for their specific feature distributions and decision thresholds.
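To make the evasion idea concrete, here is a small sketch of a bounded perturbation against a tabular scorer. Everything in it is hypothetical: the logistic-regression weights, the feature scaling to [0, 1], and the 0.5 decision threshold are invented for illustration and do not describe any real fraud model. It uses a single signed-gradient step, which for a linear model reduces to stepping against the sign of each weight.

```python
import math

# Hypothetical fraud scorer: logistic regression over three scaled features.
WEIGHTS = {"income": -1.2, "debt_ratio": 2.0, "employment_years": -0.8}
BIAS = 0.1
# Assumed plausible range for each feature after scaling.
BOUNDS = {name: (0.0, 1.0) for name in WEIGHTS}


def fraud_probability(x):
    """Score an applicant; values above 0.5 are treated as 'flagged'."""
    z = BIAS + sum(WEIGHTS[k] * x[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def evade(x, epsilon):
    """Shift every feature by at most epsilon in the score-lowering direction.

    For a logistic model, the gradient of the score with respect to a
    feature has the same sign as that feature's weight, so stepping
    against the weight's sign lowers the fraud score.
    """
    adversarial = {}
    for name, weight in WEIGHTS.items():
        direction = (weight > 0) - (weight < 0)  # sign of the weight
        low, high = BOUNDS[name]
        # Clip so the perturbed value stays inside the realistic range.
        adversarial[name] = min(high, max(low, x[name] - epsilon * direction))
    return adversarial


applicant = {"income": 0.40, "debt_ratio": 0.40, "employment_years": 0.30}
adversarial = evade(applicant, epsilon=0.05)

print(fraud_probability(applicant) > 0.5)    # original applicant is flagged
print(fraud_probability(adversarial) > 0.5)  # perturbed applicant is not
```

Each feature moves by only 0.05 on the scaled range, which is the property that lets such inputs slip past range-based business rules. A robustness test would run the same kind of search against the organization's own model and thresholds.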

Model extraction attacks also accelerated. By systematically querying an API endpoint with targeted inputs and recording the outputs, attackers can reconstruct a functional copy of a proprietary model. In 2025, several SaaS companies discovered their classification models had been cloned through API probing, with competitors deploying near-identical models within weeks. Data poisoning attacks targeting the supply chain proved equally damaging: attackers submitted manipulated samples to crowd-sourced labeling platforms, corrupting training datasets before the model was even built. Defense against supply chain poisoning requires auditing data provenance, running anomaly detection on incoming training samples, and maintaining strict controls over who can contribute to training pipelines.
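The API-probing form of model extraction described at the start of this section can be illustrated with a deliberately simple case. The sketch below assumes the victim is a logistic regression that returns full-precision probabilities. `make_victim_api` and its secret weights are hypothetical stand-ins for a remote endpoint. Under those assumptions the attacker needs only one query per feature plus one more.

```python
import math


def make_victim_api():
    """Stand-in for a remote prediction endpoint; the weights are hidden from callers."""
    secret_weights, secret_bias = [1.5, -2.0, 0.7], -0.3

    def predict(x):
        z = secret_bias + sum(w * xi for w, xi in zip(secret_weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    return predict


def logit(p):
    """Inverse of the sigmoid: recovers the linear score from a probability."""
    return math.log(p / (1.0 - p))


def extract_linear_model(predict, n_features):
    """Recover bias and weights using n_features + 1 queries.

    The all-zeros input reveals the bias; each one-hot probe then
    reveals bias + weight_i.
    """
    bias = logit(predict([0.0] * n_features))
    weights = []
    for i in range(n_features):
        probe = [0.0] * n_features
        probe[i] = 1.0
        weights.append(logit(predict(probe)) - bias)
    return weights, bias


api = make_victim_api()
stolen_weights, stolen_bias = extract_linear_model(api, n_features=3)
```

Real targets are rarely this easy. Non-linear models typically require many more queries and a surrogate trained on the recorded input-output pairs, but the principle of learning the model from its own answers is the same.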

Regulatory Response in Singapore

Singapore's regulatory bodies have started treating AI security as a governance obligation rather than a technical afterthought. The Monetary Authority of Singapore (MAS) updated its Technology Risk Management (TRM) guidelines to address AI and ML model risk explicitly, requiring financial institutions to validate model integrity, monitor for adversarial manipulation, and maintain incident response procedures for AI-specific attacks. Banks and payment processors deploying ML for credit scoring, transaction monitoring, or customer onboarding must now demonstrate that their models have been tested against adversarial inputs — not just for accuracy, but for resilience.

The Cyber Security Agency of Singapore (CSA) released advisory guidance on securing AI systems, covering threat modeling for ML deployments, supply chain integrity for training data, and the importance of adversarial red team testing. For organizations in healthcare, the Synapxe (formerly IHiS) framework for healthtech systems imposes similar requirements on any ML model processing patient data. Government agencies deploying AI for citizen-facing services face scrutiny under the Smart Nation digital governance standards. The pattern is consistent: regulators expect evidence of adversarial testing, not just model accuracy metrics.

Industry-Specific Impact

Adversarial attacks don't affect every sector the same way. In fintech, the attack surface centres on fraud detection and credit decisioning models. A successful evasion attack against a real-time transaction monitoring system means fraudulent transfers go through undetected. In healthcare, adversarial perturbations to medical imaging inputs (MRI scans, X-rays, pathology slides) can cause diagnostic models to miss tumors or misclassify conditions. The stakes are clinical, not just financial. Several research teams demonstrated that adversarial noise invisible to radiologists could cause chest X-ray classifiers to flip pneumothorax diagnoses with success rates above 90%.

Government and critical infrastructure face a different flavour of risk. AI models used for threat intelligence correlation, network anomaly detection, or physical security surveillance can be manipulated to generate false negatives during actual attacks. A nation-state adversary doesn't need to breach the network if they can blind the monitoring AI. Organizations running operational technology (OT) environments with AI-based anomaly detection should consider red team exercises that specifically test whether adversarial inputs can suppress alerts during a simulated intrusion. The common thread across industries: if a model makes decisions that affect safety, money, or access, it needs adversarial testing proportional to the consequences of a wrong decision.

Practical Defense Measures

Defending against adversarial attacks requires a different mindset than traditional cybersecurity. Patching a software vulnerability replaces a bug with a fix. Hardening a model against adversarial inputs is an ongoing process because the attack surface shifts as the model encounters new data distributions. The most effective defense strategy combines three layers. First, adversarial training: augmenting the training dataset with adversarial examples so the model learns to classify them correctly. This raises the cost of attack but doesn't eliminate it, because attackers can generate new adversarial examples against the hardened model. Second, runtime monitoring: tracking input statistics and feature value distributions in production to flag statistically unusual inputs before the model processes them, and tracking prediction confidence distributions to catch suspicious behaviour after it does.
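The first layer, adversarial training, can be sketched on a toy problem. The example below is an illustration under stated assumptions rather than a production recipe: it uses a hand-written logistic regression, synthetic two-feature data from the hypothetical `make_toy_data`, and the fast gradient sign method (FGSM) as the attack, none of which come from the incidents discussed in this article.

```python
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so exp() stays in range
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, epsilon):
    """Move each feature by epsilon in the loss-increasing direction.

    For logistic loss the gradient with respect to x_i is (p - y) * w_i.
    """
    p = predict(w, b, x)
    return [xi + epsilon * sign((p - y) * wi) for xi, wi in zip(x, w)]


def train(data, epsilon=0.0, epochs=100, lr=0.5, seed=0):
    """SGD on logistic loss; if epsilon > 0, add FGSM examples each epoch."""
    rng = random.Random(seed)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        batch = list(data)
        if epsilon > 0:
            # Adversarial copies are crafted against the current weights.
            batch += [(fgsm(w, b, x, y, epsilon), y) for x, y in data]
        rng.shuffle(batch)
        for x, y in batch:
            err = predict(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def accuracy(w, b, data):
    return sum((predict(w, b, x) > 0.5) == bool(y) for x, y in data) / len(data)


def make_toy_data(seed=1, n=20):
    """Two well-separated synthetic clusters (illustrative only)."""
    rng = random.Random(seed)
    low = [([rng.gauss(0.25, 0.05), rng.gauss(0.25, 0.05)], 0) for _ in range(n)]
    high = [([rng.gauss(0.75, 0.05), rng.gauss(0.75, 0.05)], 1) for _ in range(n)]
    return low + high


data = make_toy_data()
w_std, b_std = train(data)               # standard training
w_adv, b_adv = train(data, epsilon=0.1)  # adversarial training
```

Because the adversarial copies are regenerated against the current weights every epoch, the model keeps seeing fresh worst-case inputs. That moving target is also why an attacker can still craft new examples against the finished model.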

Third, structural defenses that limit what an attacker can achieve even if they successfully craft an adversarial input. These include input preprocessing (quantization, compression, or randomization that disrupts the carefully crafted perturbation), ensemble methods where multiple models vote on the prediction, and output calibration that refuses to return high-confidence predictions on inputs unlike anything the model saw during training. No single defense is sufficient. The goal is to make attacks expensive enough that adversaries move to easier targets. Organizations should treat adversarial testing as a continuous discipline that is integrated into the ML development lifecycle, not bolted on before deployment. A structured approach that follows an ML security assessment methodology makes this repeatable and measurable.
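Two of the structural defenses above, input quantization and ensemble voting with abstention, can be combined in a few lines. This is a rough sketch under stated assumptions: the three threshold rules in `MODELS` are hypothetical placeholders for independently trained models, and the grid step of 0.1 and the unanimous-agreement default are arbitrary illustrative choices.

```python
def quantize(x, step=0.1):
    """Snap each feature to a coarse grid.

    Perturbations that do not push a value across a rounding boundary
    are discarded; larger ones survive, so this is only a partial defense.
    """
    return [round(xi / step) * step for xi in x]


def ensemble_predict(models, x, min_agreement=1.0):
    """Majority vote; returns None (abstain) when agreement is too low."""
    votes = [model(x) for model in models]
    top = max(set(votes), key=votes.count)
    if votes.count(top) / len(votes) < min_agreement:
        return None
    return top


def defended_predict(models, x, step=0.1, min_agreement=1.0):
    """Quantize the input first, then let the ensemble vote on it."""
    return ensemble_predict(models, quantize(x, step), min_agreement)


# Hypothetical stand-ins for three independently trained classifiers.
MODELS = [
    lambda x: int(x[0] + x[1] > 1.0),
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
]

print(defended_predict(MODELS, [0.82, 0.77]))  # small noise is rounded away
print(defended_predict(MODELS, [0.9, 0.3]))    # models disagree, so abstain
```

Returning `None` instead of a forced label gives downstream systems a hook for manual review. The trade-off is that legitimate borderline inputs will sometimes be deferred as well.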

Need Help Securing Your AI Systems?

Bravix Infosecurity provides AI red teaming and adversarial testing for organizations in Singapore. CREST-certified consultants, real attack simulation, actionable findings.