Everyone is excited by the potential of agentic AI. Give an autonomous AI agent a task, and it can reason, make a plan, take action through tools and services and adapt its approach as required until the job is done, all with minimal human intervention. Agents are starting to work in teams with other agents (and humans) to take on complex enterprise workflows.
But all this power and autonomy also brings significant security risks and governance challenges. This article explores some of the insights on this topic that Jeff Crume, IBM Distinguished Engineer and Security Expert, highlights in a couple of recent IBM videos here and here, with Josh Spurgin, an AI Customer Success Engineer.
AI agents amplify security challenges
The first thing to take note of is that AI agents come with some unique security challenges. Unlike rules-based systems which consistently perform the same actions in a pre-defined way, agentic AI is probabilistic. Decisions are based on probabilities. Even with the same set of inputs an agent can make different decisions and produce different outputs. Moreover, because agents learn and adapt based on context and feedback, it is even more difficult to test and predict what they will do.
Another factor is that agentic systems typically connect to one or more tools, databases, apps, browsers, APIs and internal systems. This opens more avenues that can be exploited for malicious reasons.
To make things even more concerning, with AI agents, AI decisions are embedded in automated workflows. In other words, autonomous agents can repeat the same bad action or poor decision repeatedly and at scale. So, the impact of compromised security can be compounded before anyone even notices.
The security risks
One of the biggest threats is 'highjacking'. This is where a bad actor takes control of an agent and makes it operate on their behalf.
This can happen through prompt injection: sophisticated prompts are used to deliberately confuse agents and trick them into doing things or making decisions they were never supported to do, like transferring funds, modifying internal records or releasing confidential information.
Another risk is model infection, when the underlying AI model behind the agent is infected by a virus that compromises it in ways that alter its behaviour (similar to viruses infecting software). In some cases, AI specific viruses attack how AI models work (for example by changing the model weights). this makes it even more important for organizations to verify that their AI models come from trusted suppliers. Other threats include:
Data poisoning: This is when the underlying data sources that a model is trained on are modified. The attackers manipulate the data used to train, fine-tune, or update the model so that it learns the wrong behaviour. In that case, the model's decisions are likely to be incorrect or biased.
AI evasion attacks: These involve bad actors altering or manipulating the inputs that go into AI systems at the inference stage when they are making decisions. Again, the aim is to confuse the agents and get them to take the wrong actions.
Extraction attacks: These occur when bad actors try to harvest IP or confidential information from the model. For example, they can repeatedly query the AI to learn how the model operates, and even to recreate it. This intelligence can then be used to manipulate the AI agents into revealing sensitive information, such as confidential internal documents or customer details.
Governance concerns
Strong security and governance around agentic AI is paramount. But its non-deterministic and adaptive nature makes it difficult to reliably predict how a system will react each time, making governance even more challenging.
Josh Spurgin gave an example of a fictitious recruitment company that was using AI agents to autonomously read candidate resumes and make job offers. On one occasion, the agentic system made an offer to the wrong candidate and was sued for discrimination. But with the underlying reasoning buried deep inside complex models, demonstrating transparency about how and why decisions like this are made can be a major challenge. How much autonomy should AI agents be given? Where should humans be stepping into the loop?
With threats and potential vulnerabilities occuring across the entire agentic AI lifecycle, a continous end-to-end approach is required. It needs to incorporate everything from having oversight and assurance over model creation, training (including updating/fine-tuning) and agent development, to monitoring and evaluating how agents operate in production (their decisions, actions and tool use).
Protecting against threats
Most enterprises will build agentic AI applications using models developed by third parties. If that's the case, it's essential to have confidence in the integrity of those models and the data used to train them.
Next, you need to build an accurate picture of where you are using AI and agentic AI throughout the organization, using specific tools to catalogue them. This is becoming increasingly important because many instances of AI can be deployed in teams and departments outside of the oversight of IT, as well as individuals sometimes bringing in unauthorized tools or shadow AI.
Deploying an AI Security Posture Management (AI-SPM) system is another important step. These tools continuously evaluate agentic AI systems' security posture, flagging security vulnerabilities and risks. How are agents accessing sensitive data? Do specific agents have excessive permissions to tools or internal systems? Are AI components complying with the organization's security policies?
AI-SPM systems also monitor and enforce policies that keep AI systems within defined boundaries, ensuring they don't inadvertently expose sensitive data or critical systems to attack.
An important aspect of this is putting in place effective identity and access management for AI agents. Each agent should be given a unique ID and credentials with specific boundaries to control the data and tools it can access and the scope of decisions and actions it can take.
It's important to apply the principles of role-based access control and least privilege and just in time access, meaning agents only access systems they strictly need to, according to their role - and only for a limited time window (and only at points in a workflow when they need to).
Of course, while you want agents operating autonomously for routine, low-risk decisions, there will be other decisions that require human approval. You will want to have a human in the loop for decisions like approving payments, accessing sensitive data, making hiring decisions, or making changes to business-critical IT systems.
An AI-specific firewall, gateway, or proxy can be implemented to inspect and filter interactions with the AI for suspicious activity, similar to a traditional firewall. The firewall can flag or block prompts or requests that it identifies as trying to confuse or manipulate the AI agent and control what is allowed to go out, such as documents and data.
Constantly shifting threat landscape
These are just some of the key security and governance considerations that organizations need to be aware of. Protecting agentic AI is a rapidly evolving challenge, with the threat landscape constantly shifting. A proactive, comprehensive approach is required from the outset.
Companies are rightly excited about the power and versatility of agentic AI, but you can't fully unlock its value without being totally confident in your ability to trust, manage and control it.
