The Silent Backdoor: Why AI Prompt Injection Could Be Your Next Data Breach

aarongreenman3
Aug 15
5 min read

Updated: Sep 15

Author: Aaron Greenman, Managing Director

AI adoption is accelerating through embedded in chatbots, internal knowledge tools, and agentic workflows. But as adoption grows, so do the vulnerabilities.

Among the most urgent is prompt injection: the manipulation of AI through seemingly benign input that causes it to reveal sensitive data, perform unauthorised actions, or bypass critical controls. Businesses need to be vigilant, as the risks posed by prompt injection are similar to the SQL injection crisis of the early web era.

What Is Prompt Injection and the Escalating Risk

At its core, prompt injection exploits an AI’s inability to distinguish between legitimate prompts and malicious input. Through crafted language, attackers can hijack the model’s behavior, often without code. This technique has been recognised by OWASP as the top Large Language Model (LLM) security risk in its 2025 update (source: https://en.wikipedia.org/wiki/Prompt_injection).

Moreover, indirect prompt injection, where malicious instructions are embedded in external content (e.g., email, documents and even emojis), poses a significant threat. When such content is processed by AI, instructions can be executed unwittingly, leading to data exfiltration, operational disruption, or misinformation. Microsoft has documented cases like resumes manipulated to bypass applicant screening, or email summaries engineered to leak sensitive information (source: https://techcommunity.microsoft.com/blog/microsoft-security-blog/architecting-secure-gen-ai-applications-preventing-indirect-prompt-injection-att/4221859).

The Hidden Insider Risk

Most discussions of prompt injection focus on external threats, but insider risk is a powerful vector:

Malicious insiders might inject compromised prompts into internal systems or documents that AI tools process.
Unwitting employees could introduce indirect prompt injections via shared files or plugins.
Combined with agentic architectures, such manipulations can trigger lateral movement, privilege escalation, or covert data exfiltration.

For example, an insider could alter internal templates or knowledge documents that AI tools use as context, triggering actions that compromise sensitive data. This highlights the need for strong internal governance, not just external perimeter controls.

The Role, and Risk of Model Context Protocol (MCP)

What is MCP?

MCP, introduced by Anthropic in November 2024, standardises how AI models connect to external tools and data sources. Think of it as the “USB‑C port for AI”, a universal interface that simplifies integration with services like file systems, Slack, CRMs, or databases.

Why is it used?

Fast, scalable integrations are key to modern AI systems. MCP enables development teams to plug in AI services without building bespoke connectors for each tool, accelerating deployment and enabling more sophisticated agentic workflows

Security concerns

Unfortunately, all this convenience comes with serious risks:

Expanded attack surface: Each MCP server grants AI agents access to tools/data, potentially exposing systems to broad access and misuse
Tool poisoning and lookalikes: Malicious MCP servers can masquerade as trusted providers, exfiltrating data or executing unauthorised commands
Cross-server exfiltration: Recent research demonstrated how even minimal, seemingly benign MCP servers (e.g., weather data) can be chained to discover and exploit banking tools, enabling data theft by relatively unskilled attackers
Token theft and unauthorised access: Without robust IAM and validation, MCP systems may expose credentials or elevate access beyond intended scope

In response, organisations are beginning to build platforms and governance around MCP deployments. For instance, Archestra is developing an open-source, security-first platform for enterprise MCP use and Obot AI has launched a gateway to securely manage MCP server adoption. However, adoption of such tools is limited, and many organisations may be unaware of their current MCP exposure.

Recent Context: Promptware & Expanding AI Attack Landscape

Prompt injection isn’t just a niche technical vulnerability, it’s a business risk with enterprise-wide consequences. The same AI integrations that boost productivity and customer engagement are often given direct access to sensitive systems, proprietary data, and automated decision-making. If those AI channels are compromised, attackers or even careless insiders can use them as a backdoor into the very heart of the business.

The external threat landscape is evolving rapidly. Recently, researchers uncovered a vulnerability in Google’s Gemini AI, where malicious Google Calendar invites called “promptware” can trigger hidden prompts that disclose user data or enable spam (source: https://www.tomshardware.com/tech-industry/cyber-security/googles-ai-could-be-tricked-into-enabling-spam-revealing-a-users-location-and-leaking-private-correspondence-with-a-calendar-invite-promptware-targets-llm-interface-to-trigger-malicious-activity).

Strengthened Defense-in-Depth: Four Key Layers

Protecting against AI prompt injection and MCP-related threats requires a security model that is as dynamic as the risks themselves. Rather than relying on a single “magic bullet” tool, organisations need continue expanding security layers to integrate AI technical safeguards, process controls, and governance measures across the entire ecosystem.

This means reinforcing defences at the web application level, within the AI models themselves, at the data and integration layers, and through strong insider threat governance. Each layer plays a distinct role, and together they create a resilient barrier against evolving attack methods:

1. Web & Application Layer

Enforce input validation and output encoding.
Patch API gateways and restrict browser-level vulnerabilities.

2. AI Model Layer

Deploy AI guardrails / classifiers on both inbound prompts and outbound responses.
Incorporate detection strategies for indirect prompt injection, e.g., monitor documents and data sources processed by AI.

3. Data & Tools Layer

Enforce least privilege on API and MCP tool access including restricting write access where it is not required.
Conduct tool vetting, token scope restriction, and continuous logging.

4. Governance & Insider Controls

Implement access reviews on MCP servers and AI-fed data sources.
Monitor human and AI agent behaviour for anomalous modifications to internal prompts or documents.
Educate employees on safe content submission and versioning.

Board-Level Actions

The nature of prompt injection risk means it can’t be treated as a purely technical demands executive visibility and ownership, both because of the scale of potential impact and the cross-functional nature of the response.

Boards and leadership teams must ensure AI security is embedded into governance, risk, and compliance structures, with clear accountability and the resources to execute.

These actions provide a practical starting point for turning strategy into measurable protection.

Initiate organisation-wide discovery of AI tools and MCP usage (including shadow IT).
Establish AI-specific security policies and governance, managed cross-functionally.
Integrate AI prompt injection and MCP risk scenarios into red teaming and pen testing (including insider threat simulations).
Monitor AI outputs and agent activity using real-time detection systems.
Assign clear security ownership, no AI project should proceed without InfoSec oversight.

Final Thoughts

Prompt injection is not a future risk, it’s here, and it's evolving. When combined with agentic AI systems and MCP, it magnifies the threat dramatically.

But this is not unsolvable. A well-crafted, layered approach grounded in defense-in-depth, identity-aware protocols, and insider risk controls can give the organisation a strong shield.

The time to act is now, before AI becomes your weakest link.

How Spherion Can Help

At Spherion, we understand that securing AI is not just a technology challenge – it’s a governance, risk, and assurance imperative. Our team combines deep expertise in cybersecurity, data governance, and internal audit with practical, hands-on experience in AI risk assessment and control design.

We can help your organisation minimise prompt injection and MCP-related risks by:

AI Risk Discovery & Assessment – Mapping where AI (including shadow AI) is in use across your enterprise, identifying potential prompt injection and MCP exposure points.
Governance & Policy Development – Designing AI usage policies, control frameworks, and approval processes to ensure secure deployment from day one.
Technical Control Testing & Red Teaming – Simulating real-world prompt injection and indirect prompt injection attacks, including insider threat scenarios, to validate your defences.
MCP Security & Access Hardening – Reviewing and tightening MCP configurations, applying least-privilege access, and securing API/tool integrations.
Continuous Monitoring & Assurance – Implementing ongoing oversight, anomaly detection, and reporting processes to protect AI-enabled systems in real time.
Training & Awareness – Equipping your teams to recognise, avoid, and respond to prompt injection attempts, both external and insider-driven.