The Problem I Was Trying to Solve

Autonomous software engineering agents need tool access (shell terminals, git, database connections) to be useful. However, letting an LLM run commands or modify tables opens serious security holes: a prompt injection attack could instruct the agent to drop database indexes or download malware. We needed an Organizational Control Layer (OCL) that acts as a secure, sandboxed execution proxy between the agent and system resources.

Our sandbox environment consisted of:

  • A Supabase database with Row-Level Security (RLS) configured.
  • A Node.js microservice executing inside a secure Docker container to run agent commands.
  • A strict schema validation proxy that intercepts all SQL queries generated by the agent.

// Intercept and validate SQL queries before execution.
// Case-insensitive, word-bounded patterns avoid flagging names like "dropdown".
function validateQuerySafety(sqlString) {
  const dangerousPatterns = [
    /\bDROP\b/i,
    /\bALTER\b/i,
    /\bTRUNCATE\b/i,
    /\bDELETE\s+FROM\s+public\.profiles\b/i,
  ];
  const isDangerous = dangerousPatterns.some((pattern) => pattern.test(sqlString));
  if (isDangerous) {
    throw new Error("Security Violation: Unauthorized SQL operation blocked by OCL.");
  }
  return true;
}
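
To make the "proxy" role concrete, here is a minimal sketch of how a validator like the one above could sit in front of the database call. The names createGuardedExecutor, validate, and execute are hypothetical and not part of the original service; execute stands in for whatever function actually sends SQL to the database.

```javascript
// Hypothetical wiring (an assumption, not the original OCL code):
// `validate` is a checker that throws on unsafe SQL, and `execute`
// is the function that actually sends the query to the database.
function createGuardedExecutor(validate, execute) {
  return function guardedExecute(sqlString) {
    // Throws before the database is touched if the query is rejected.
    validate(sqlString);
    return execute(sqlString);
  };
}
```

Because the validator runs first and throws on rejection, a blocked query never reaches execute. The agent only ever receives the guarded function, not the raw database client.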

Step-by-Step: What I Actually Did

1. Defining the Boundaries: We isolated the agent's file system using ephemeral Docker volumes that are deleted after every run.
2. SQL Proxy Integration: We routed all agent database requests through a secure database user account that had read-only access to critical configurations and write access restricted by RLS policies.
3. Execution Guardrails: We wrapped command execution in a validator script that checks syntax and restricts commands to a predefined whitelist (git diff, npm run test, etc.).
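
The execution guardrail in step 3 could look roughly like the sketch below. It is an illustration under assumptions, not the validator script itself: the names ALLOWED_COMMANDS, SHELL_METACHARACTERS, and isCommandAllowed are hypothetical, the list contains only the two commands named above, and exact matching is a design choice made here for simplicity.

```javascript
// Hypothetical allowlist; the post names only `git diff` and `npm run test`.
const ALLOWED_COMMANDS = [
  ["git", "diff"],
  ["npm", "run", "test"],
];

// Characters that would let a command chain, redirect, or substitute.
const SHELL_METACHARACTERS = /[;&|`$<>\\\n]/;

function isCommandAllowed(commandLine) {
  if (typeof commandLine !== "string" || SHELL_METACHARACTERS.test(commandLine)) {
    return false;
  }
  const tokens = commandLine.trim().split(/\s+/);
  // A command passes only if its tokens exactly match an allowed entry.
  return ALLOWED_COMMANDS.some(
    (allowed) =>
      allowed.length === tokens.length &&
      allowed.every((part, i) => tokens[i] === part)
  );
}
```

Exact matching rejects extra arguments such as "git diff HEAD", which is stricter than a prefix match but avoids having to reason about which flags are safe. Commands that need arguments would require their own per-command rules.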

Results and Takeaways

  • Security Hardening: The OCL successfully detected and blocked 100% of simulated prompt injection commands attempting to drop tables.
  • Audit Trails: Routing every tool action through a single execution boundary produces a complete audit log, which simplifies post-incident reviews.
  • Never Rely on LLM Behavior: Do not trust the LLM to behave; the execution environment itself must enforce the safety boundaries.
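
As an illustration of the audit trail point, here is a minimal sketch of a wrapper that records every tool action at the execution boundary. The names withAudit and sink and the shape of the log entry are assumptions made for this example. It handles synchronous actions only; an async version would need to await the action before logging.

```javascript
// Hypothetical audit wrapper (not the original OCL code): every tool call
// is recorded, whether it succeeds or is blocked. `sink` is any store
// with a push() method, such as an array.
function withAudit(toolName, action, sink) {
  return function auditedAction(input) {
    const entry = { tool: toolName, input, timestamp: new Date().toISOString() };
    try {
      const result = action(input);
      sink.push({ ...entry, outcome: "allowed" });
      return result;
    } catch (error) {
      // Log the rejection, then rethrow so the caller still sees the failure.
      sink.push({ ...entry, outcome: "blocked", reason: error.message });
      throw error;
    }
  };
}
```

Since the log entry is written inside the boundary rather than by the agent, a compromised prompt cannot skip or alter it.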