AI SEC2026· 6 min read

AI on the Attack Surface: What Security Teams Need to Know

Every major technology wave expands the attack surface. AI is expanding it in two directions simultaneously: new categories of attack that target AI systems directly, and AI-powered offensive capabilities that make existing attack techniques faster, cheaper, and more scalable. Security teams that are waiting for this threat landscape to stabilize before acting are already behind.

New Threat Categories: Attacking AI Systems

The deployment of large language models and AI systems into enterprise workflows has introduced attack categories that most security teams have no established playbooks for. These aren't hypothetical future threats — they're being actively explored and, in some cases, actively exploited against production systems today.

// PROMPT INJECTION

Malicious instructions embedded in user input or external data that override the AI system's intended behavior. Analogous to SQL injection but targeting model instructions rather than database queries. Particularly dangerous when LLMs have access to tools, APIs, or sensitive data.

// MODEL THEFT

Extracting a proprietary model's behavior and weights through systematic querying — effectively creating a functional copy without access to training data or architecture. Concerns are highest for organizations whose competitive advantage is embedded in model quality.

// DATA POISONING

Corrupting training data to influence model behavior at inference time. Backdoor attacks insert patterns that cause specific inputs to trigger malicious outputs while appearing normal otherwise. Supply chain attacks targeting public training datasets are a growing concern.

// ADVERSARIAL INPUTS

Carefully crafted inputs that cause misclassification or unexpected behavior in ML models. In security contexts, adversarial examples can be used to evade malware detectors, bypass content moderation, or fool computer vision systems used in physical security.

Indirect prompt injection deserves particular attention as enterprise LLM deployments proliferate. When an AI assistant is given access to external data — emails, documents, web content — an attacker can embed instructions in that external content that the model treats as legitimate commands. A malicious document that instructs the AI to exfiltrate calendar data when summarized, or a webpage that instructs the AI assistant to forward sensitive conversations, represents a new class of supply chain attack that traditional security controls are not designed to catch.

Prompt injection is the SQL injection of the AI era. The underlying dynamic is identical: user-controlled input is being interpreted as executable instructions rather than treated as data. The mitigation principles are also analogous — strict separation between instructions and data, input validation, and least-privilege access for AI systems.

LLM-Powered Offensive Capabilities

The defensive community focuses heavily on AI as a target, but the more immediate and measurable impact on most organizations is AI as an offensive tool. The barrier to entry for sophisticated social engineering attacks has collapsed. LLMs can generate grammatically flawless, contextually appropriate phishing emails at scale — tailored to specific targets, written in the target's native language, referencing real organizational context scraped from public sources. The volume, quality, and personalization of phishing that security awareness programs were designed for a year ago no longer matches what's landing in inboxes today.

Voice deepfakes represent a rapidly maturing threat that organizations are largely unprepared for. The CEO fraud variant — where an attacker calls a finance employee impersonating the CFO to authorize a wire transfer — is well documented and has been successful for years using simple voice-over-IP spoofing. With voice cloning technology now accessible via consumer tools using seconds of source audio, the fidelity of these attacks has improved dramatically. Documented cases of successful deepfake-based social engineering attacks against enterprises are no longer rare outliers. Verification protocols for high-value financial transactions, wire transfers, and credential resets need to be redesigned with this threat model in mind.

Automated vulnerability research and exploit development is another area where AI capabilities are materially improving attacker efficiency. While AI systems have not yet been shown to independently discover novel zero-day vulnerabilities at scale, they have demonstrably accelerated the pipeline from vulnerability disclosure to working proof-of-concept exploit — compressing timelines that security teams have historically relied on for patch deployment. The time between CVE publication and weaponized exploit has trended shorter for years; AI-assisted exploit development is likely to continue that trend.

Defending AI Systems

Security controls for AI systems are an emerging discipline without established industry standards, but several principles from conventional security translate directly. Input validation — treating all user-supplied content as untrusted and validating it against expected formats before passing it to a model — applies to LLM deployments just as it does to web applications. Output filtering — monitoring model outputs for sensitive data, unexpected content, or signs of prompt injection success — is analogous to output encoding in web security. Neither is sufficient alone; defense in depth requires both.

Access control for AI systems requires particular care. An LLM with access to a company's entire document store, email system, and API surface is a high-value target for indirect prompt injection attacks. The principle of least privilege applies directly: AI systems should have access only to the data and tools necessary for their specific function. An AI assistant for customer support doesn't need access to HR records or financial systems. Designing these boundaries requires explicit security review at deployment time, not as an afterthought after the system is in production.

AI-Native Security Tools and What Security Teams Should Do Now

The security vendor ecosystem has moved quickly to integrate AI into defensive tooling. AI-assisted alert triage, automated investigation playbooks, natural language query interfaces for SIEM data, and AI-generated detection rules are now mainstream features rather than differentiators. The practical value of these tools varies considerably — some genuinely improve analyst throughput and detection quality, others are primarily marketing features. Evaluating AI-powered security tools requires the same rigor as evaluating any security tool: what specific problem does this solve, how does it perform against realistic data, and what does the false positive rate look like in production?

For organizations deploying their own AI systems, security review should begin at the design phase. Key questions: what data does the model have access to, what actions can it take, how is user input separated from system instructions, how are outputs validated before acting on them, and how is model behavior monitored for anomalous changes? These questions don't require AI expertise to ask — they're applications of standard security design review principles to a new technology surface. Organizations that treat AI security as a specialized domain separate from their existing security practice will miss straightforward controls. Organizations that apply their existing security discipline to AI deployments will find they have a significant head start.

AI Security Prompt Injection LLM Deepfakes Adversarial ML Data Poisoning Social Engineering Threat Intelligence 2026

👨‍💻
Mayur Rele
Senior Director, IT & Information Security · Parachute Health

15+ years in DevOps, cloud, and cybersecurity. 700+ research citations. Scientist of the Year 2024.

← Back to all articles