March 2026 update: Since this proposal was submitted, significant progress has been made. A live enforcement API is deployed on AWS, returning structured ALLOW/BLOCK/HUMAN REQUIRED decisions in milliseconds. A weighted risk scoring engine is operational. A versioned policy store is backed by DynamoDB with full audit logging. The project has been submitted to the NVIDIA Inception Program.
Website: agentsentinel.co
GitHub: https://github.com/indranimaz23-oss/agent-sentinel
AI agents are starting to operate real infrastructure. They can delete servers, change permissions, and move data across cloud environments. As these systems become more autonomous, mistakes or unexpected behavior can cause immediate operational problems.
Most current safeguards rely on prompts, internal model reasoning, or simple permission controls. These approaches fail if an agent hallucinates, is misconfigured, or is manipulated through prompt injection. They also struggle to detect situations where several individually safe actions combine into a dangerous sequence.
Agent Sentinel takes a different approach. It introduces an external safety gateway that sits between AI agents and infrastructure APIs. Instead of relying on the agent to regulate itself, every proposed action is evaluated before execution. The agent holds no direct AWS credentials. Every call proxies through Sentinel first. There is no way to bypass it.
The system converts human safety instructions into structured policies and evaluates agent actions against those policies. It uses risk scoring and sequence analysis to detect potentially dangerous patterns of behavior.
Based on this evaluation, the gateway returns one of three decisions: allow the action, block it, or require human approval.
The goal of this project is to build a working prototype of Agent Sentinel, an out-of-band safety gateway that evaluates infrastructure actions proposed by AI agents before they are executed.
Significant progress has already been made on the core components.
COMPLETED: Risk scoring engine
A weighted 0.0-1.0 risk score is assigned to every agent action based on action type, environment sensitivity, and resource criticality. Deleting a production S3 bucket scores 1.0 and is blocked instantly. Reading dev logs scores 0.05 and is allowed immediately.
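The scoring idea above can be sketched as a product of weighted factors. This is an illustrative sketch only; the factor tables, weights, and function name here are placeholders, not the deployed values:

```python
# Illustrative factor tables; the real engine's weights differ.
ACTION_WEIGHTS = {"delete": 1.0, "modify": 0.6, "read": 0.1}
ENV_WEIGHTS = {"production": 1.0, "staging": 0.5, "dev": 0.2}
RESOURCE_WEIGHTS = {"s3_bucket": 1.0, "iam_role": 0.9, "log_group": 0.3}

def risk_score(action: str, environment: str, resource: str) -> float:
    """Combine action type, environment sensitivity, and resource
    criticality into a single 0.0-1.0 score. Unknown values fall
    back to a middle-of-the-road 0.5 weight."""
    score = (
        ACTION_WEIGHTS.get(action, 0.5)
        * ENV_WEIGHTS.get(environment, 0.5)
        * RESOURCE_WEIGHTS.get(resource, 0.5)
    )
    return round(min(score, 1.0), 2)

# Deleting a production S3 bucket sits at the top of the scale;
# reading dev logs sits near the bottom.
print(risk_score("delete", "production", "s3_bucket"))  # 1.0
print(risk_score("read", "dev", "log_group"))
```

A multiplicative combination like this has the useful property that any single low-sensitivity factor (e.g. a dev environment) pulls the whole score down, while a destructive action on a critical production resource stays at the maximum.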
COMPLETED: Policy store
A versioned PolicyV1 schema is deployed with a DynamoDB backend, idempotent storage, hash verification, and a full audit trail of every decision with action ID and timestamp.
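One way the hash-verification and idempotency properties fit together is to hash a canonical serialization of each policy and use that hash as the write key. The field names and schema below are assumptions for illustration, not the actual PolicyV1 definition:

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field

@dataclass
class PolicyV1:
    # Hypothetical fields; the real schema may differ.
    policy_id: str
    version: int
    actions: list = field(default_factory=list)
    conditions: dict = field(default_factory=dict)
    effect: str = "BLOCK"  # "ALLOW" | "BLOCK" | "HUMAN_REQUIRED"

def policy_hash(policy: PolicyV1) -> str:
    """Stable SHA-256 of the canonical JSON form. The same policy
    always hashes to the same value, so re-submitting it is a no-op
    (idempotent storage) and tampering is detectable (verification)."""
    canonical = json.dumps(asdict(policy), sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

policy = PolicyV1(
    policy_id="no-iam-changes-after-hours",
    version=1,
    actions=["iam:UpdateRole"],
    conditions={"time_after": "18:00"},
    effect="BLOCK",
)
print(policy_hash(policy))
```

In a DynamoDB-backed store, a conditional put on the hash key would reject duplicate writes at the database level rather than in application code.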
COMPLETED: Enforcement API
A live FastAPI gateway is deployed on AWS. Every agent action passes through a three-layer evaluation: hard policy rules, risk scoring, and sequence context. The API returns a structured ALLOW, BLOCK, or HUMAN REQUIRED decision in milliseconds.
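The three-layer evaluation described above can be sketched as a short decision cascade. The thresholds and layer ordering here are assumptions made for illustration, not the gateway's actual logic:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "ALLOW"
    BLOCK = "BLOCK"
    HUMAN_REQUIRED = "HUMAN_REQUIRED"

def evaluate(action: str, hard_block_rules: set,
             risk: float, sequence_flagged: bool) -> Decision:
    """Three-layer cascade: hard rules short-circuit first, then
    risk thresholds, then sequence context. Cut-offs (0.9, 0.5)
    are illustrative, not the deployed values."""
    # Layer 1: hard policy rules override everything.
    if action in hard_block_rules:
        return Decision.BLOCK
    # Layer 2: risk score thresholds.
    if risk >= 0.9:
        return Decision.BLOCK
    # Layer 3: sequence context escalates borderline actions.
    if sequence_flagged or risk >= 0.5:
        return Decision.HUMAN_REQUIRED
    return Decision.ALLOW

print(evaluate("logs:GetLogEvents", set(), 0.05, False))
```

Ordering the layers this way means a hard rule can never be outvoted by a low risk score, and sequence context can only escalate, never relax, a decision.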
IN PROGRESS: LLM policy compiler
Converting natural-language safety instructions into structured enforcement policies via AWS Bedrock. For example, "Never let any agent modify IAM roles after 6pm" becomes a versioned policy with the correct actions, conditions, and effect.
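To make the compilation target concrete, the example rule might compile to a structure like the following. Every field name and value here is an illustrative assumption about the output shape, not the actual compiler output:

```python
# Hypothetical compiled form of:
#   "Never let any agent modify IAM roles after 6pm"
compiled_policy = {
    "schema": "PolicyV1",
    "version": 1,
    "description": "Never let any agent modify IAM roles after 6pm",
    "actions": [
        "iam:UpdateRole",
        "iam:AttachRolePolicy",
        "iam:PutRolePolicy",
    ],
    "conditions": {"time_after": "18:00"},
    "effect": "BLOCK",
}
print(compiled_policy["effect"])
```

The compiler's job is exactly this mapping: expanding "modify IAM roles" into the concrete API actions it covers, translating "after 6pm" into a machine-checkable condition, and choosing the right effect.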
IN PROGRESS: Sequence analysis
Multi-step chain detection across agent sessions. Individually safe actions that form a dangerous pattern (modify IAM, export data, delete logs) are detected and escalated before the sequence completes.
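A minimal version of this kind of detection is an in-order subsequence match over a session's recent actions. The chain definitions, action labels, and window size below are illustrative assumptions, not the planned detection logic:

```python
# Hypothetical dangerous chains: each is an ordered pattern of
# individually-safe actions that is risky in combination.
DANGEROUS_CHAINS = [
    ("iam:modify", "data:export", "logs:delete"),
]

def chain_detected(session_actions: list, window: int = 10) -> bool:
    """Return True if any dangerous chain appears, in order, within
    the most recent `window` actions of a session. Gaps between the
    chain's steps are allowed."""
    recent = session_actions[-window:]
    for chain in DANGEROUS_CHAINS:
        i = 0
        for act in recent:
            if act == chain[i]:
                i += 1
                if i == len(chain):
                    return True
    return False

session = ["iam:modify", "s3:read", "data:export", "logs:delete"]
print(chain_detected(session))  # True
```

Because the match fires on the final step of the chain, the gateway can block or escalate that last action before the sequence completes, which is the property the proposal describes.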
PLANNED: Human approval webhook
A Slack notification and approval workflow for HUMAN REQUIRED decisions. The agent pauses and waits for a human to approve or deny before execution resumes.
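The notification half of this workflow could look roughly like the sketch below, posting to a Slack incoming webhook. The function names and message format are assumptions; the hold/resume plumbing that actually pauses the agent is outside this sketch:

```python
import json
import urllib.request

def build_approval_payload(action_id: str, summary: str) -> dict:
    """Slack message body for a HUMAN REQUIRED decision
    (hypothetical format)."""
    return {
        "text": (
            f"Agent Sentinel: action {action_id} requires human approval.\n"
            f"{summary}"
        )
    }

def request_human_approval(webhook_url: str, action_id: str,
                           summary: str) -> None:
    """POST the notification to a Slack incoming webhook. The gateway
    would hold the agent's action until an approve/deny arrives."""
    payload = build_approval_payload(action_id, summary)
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10)
```

Slack incoming webhooks accept exactly this kind of JSON body with a `text` field; richer interactive approve/deny buttons would use Slack's Block Kit instead of plain text.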
The result of this project will be a fully working prototype and documented architecture demonstrating how external safety boundaries can make AI-driven infrastructure automation significantly safer.
The core enforcement gateway is now live and tested on AWS. This funding will accelerate the three remaining components needed to complete the prototype.
LLM policy compiler via AWS Bedrock
The current compiler uses pattern matching to convert natural-language instructions into structured policies. This funding will replace it with a real LLM-based compiler using AWS Bedrock, making it possible to express complex safety rules in plain English and have them automatically enforced.
Sequence analysis engine
Building the multi-step chain detection layer requires collecting realistic agent behavior examples, defining dangerous sequence patterns, and testing detection logic across a range of adversarial scenarios. This is the most technically challenging remaining component.
Human approval webhook
Building the Slack notification and approval workflow so that HUMAN REQUIRED decisions pause execution and route to a real human for review.
Infrastructure experiments
Simulated AI agents will propose realistic actions such as deleting resources, modifying permissions, and moving data. These experiments will validate risk scoring, test policy gaps, and document failure modes.
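A simulation harness for these experiments can be very small: generate proposed actions, run each through an evaluator, and tally the decisions to find policy gaps. The action list and stub rules below are placeholders standing in for the live gateway:

```python
import random

# Hypothetical (verb, environment) pairs a simulated agent might propose.
ACTIONS = [
    ("delete", "production"),
    ("modify", "staging"),
    ("read", "dev"),
    ("export", "production"),
]

def stub_evaluate(verb: str, env: str) -> str:
    """Placeholder rules standing in for the deployed gateway."""
    if verb == "delete" and env == "production":
        return "BLOCK"
    if env == "production":
        return "HUMAN_REQUIRED"
    return "ALLOW"

random.seed(0)  # reproducible runs make failure modes easy to document
tally = {"ALLOW": 0, "BLOCK": 0, "HUMAN_REQUIRED": 0}
for _ in range(100):
    verb, env = random.choice(ACTIONS)
    tally[stub_evaluate(verb, env)] += 1
print(tally)
```

Runs like this make validation quantitative: a surprising ALLOW count for a supposedly dangerous action class points directly at a policy gap.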
Cloud infrastructure is partially supported through AWS Activate credits. This allows grant funding to focus primarily on engineering development and prototype validation.
Indrani Mazumdar — Founder and Lead Developer
I am the founder and lead developer of Agent Sentinel. I am an AI architect and machine learning engineer with a career spent building and securing large-scale systems in operational environments.
Early in my career at Verizon, I worked in cybersecurity building autonomous threat-hunting models to catch anomalous patterns in network activity. That experience taught me that in security-critical environments you cannot rely on best-case behavior. You have to build systems that assume things will go wrong.
More recently my work has focused on the practical deployment of generative AI, specifically retrieval-augmented generation and enterprise-grade guardrails. I have seen firsthand how unpredictable AI can be when it interacts with real-world operational data.
I am building Agent Sentinel because I have spent years fixing broken infrastructure manually in enterprise environments. I know exactly how a single misconfigured script or a misunderstood command can take down a production system. As we give AI agents the keys to our cloud infrastructure we are introducing a risk profile that current tools are not ready for. I would rather build the emergency brake now than wait to see it fail in production.
The core enforcement gateway is now live on AWS. The project has been submitted to the NVIDIA Inception Program. Development is ongoing.
Kaustav Mukherjee — Researcher, Mount Sinai
Active NIH grant recipient with deep expertise in rigorous research methodology and safety evaluation. Brings clinical research standards into AI safety testing and validation, ensuring Agent Sentinel's risk models are evaluated with the same rigor applied to safety-critical systems in medicine.
One possible challenge is that accurately modeling the risk of infrastructure actions may be more difficult than anticipated. AI agents can generate a wide range of actions across many environments, and defining policies that generalize well across these situations may require more iteration and experimentation than expected.
Another potential difficulty is sequence analysis. While individual actions may appear safe, identifying risky multi-step patterns requires collecting sufficient examples of agent behavior and refining the logic used to detect dangerous sequences.
There is also a possibility that existing cloud permission systems already provide enough safeguards for some environments, reducing the perceived need for an additional safety gateway.
If the project does not fully succeed in building a robust interception system, the work will still produce useful outcomes. The prototype and documentation will help clarify where current safety mechanisms for AI agents are insufficient and what types of guardrails are most effective.
Even a partial result would contribute to the broader understanding of how autonomous systems should interact with critical infrastructure and where additional safety controls are required.
No external investment has been raised for this project. Early prototype development has been supported through cloud infrastructure credits from AWS Activate. The project has been submitted to the NVIDIA Inception Program, providing access to GPU resources, software tools, and technical support for future development.