Overview
Agent Sentinel provides native integrations with popular agent frameworks, enabling automatic tracking of:
- Agent actions and tool calls
- LLM invocations
- Chain execution
- Token usage and costs
- Errors and retries
The “Big Three” frameworks are supported:
- LangChain: Callback-based integration
- CrewAI: Wrapper-based integration
- AutoGen: Hook-based integration (NEW!)
AutoGen integration
SentinelInspector
Microsoft’s AutoGen is widely used in enterprise environments. Agent Sentinel integrates with it through AutoGen’s built-in `register_reply` hook system.
```python
from autogen import AssistantAgent, UserProxyAgent
from agent_sentinel.integrations.autogen import SentinelInspector

# Create the Sentinel inspector
sentinel = SentinelInspector(
    run_name="my_autogen_run",
    enforce_policies=True,  # Enable active blocking
    track_costs=True,       # Track LLM costs
    track_messages=True,    # Track message flow
)

# Create AutoGen agents as normal
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful AI assistant.",
    llm_config={
        "model": "gpt-4",
        "api_key": "your-api-key-here"
    }
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

# Secure the agents with one line each
sentinel.register(assistant)
sentinel.register(user_proxy)

# Run as normal - Sentinel is now monitoring everything
sentinel.start_run()
user_proxy.initiate_chat(
    assistant,
    message="What's the weather in San Francisco?"
)
sentinel.end_run()

# Get execution summary
summary = sentinel.get_run_summary()
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Messages: {summary['message_count']}")
print(f"LLM Calls: {summary['llm_call_count']}")
```
How it works
AutoGen’s architecture is different from LangChain/CrewAI:
- Agents communicate via messages: agents send messages back and forth
- Reply chain system: AutoGen uses `register_reply()` to inject hooks
- LLM config: LLMs are configured through the `llm_config` dict

Sentinel leverages this by:
- Hooking the reply chain (position 0 = highest priority)
- Wrapping LLM generation to track token costs
- Monitoring message flow for an audit trail
This is simpler than CrewAI because AutoGen has a built-in hook system designed exactly for this purpose!
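The reply-chain mechanism can be pictured with a toy model. The sketch below is illustrative only; it mimics the shape of AutoGen's `register_reply` pattern (a hook returning `(final, reply)`), not AutoGen's actual classes:

```python
# Simplified model of a reply chain with a position-0 hook.
# Illustrative only -- not AutoGen's real Agent implementation.

class ToyAgent:
    def __init__(self, name):
        self.name = name
        self._reply_hooks = []

    def register_reply(self, hook, position=0):
        # Position 0 = highest priority: the hook runs before default logic.
        self._reply_hooks.insert(position, hook)

    def generate_reply(self, message, sender):
        for hook in self._reply_hooks:
            final, reply = hook(self, message, sender)
            if final:
                return reply  # hook produced the final reply (e.g. a block)
        return f"{self.name} replies to: {message}"  # default behavior

messages_seen = []

def sentinel_hook(agent, message, sender):
    # Observe the message for the audit trail, then pass through.
    messages_seen.append((sender, agent.name, message))
    if "forbidden" in message:
        return True, "BLOCKED by policy"  # short-circuit the reply chain
    return False, None

agent = ToyAgent("assistant")
agent.register_reply(sentinel_hook, position=0)

print(agent.generate_reply("hello", sender="user_proxy"))
print(agent.generate_reply("forbidden request", sender="user_proxy"))
```

A position-0 hook sees every message first, which is why it can both audit and veto replies.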
What’s tracked
The inspector automatically tracks:
- Agent-to-agent messages:
  - Sender and recipient
  - Message content (preview)
  - Message count per agent
  - Timestamp and sequence
- LLM calls:
  - Model name
  - Token usage (prompt + completion)
  - Calculated cost
  - Duration
- Policy enforcement:
  - Authorization checks before replies
  - Budget validation
  - Rate limiting
  - Intervention recording
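The message fields above map naturally onto a small record type. This sketch uses hypothetical names (`MessageRecord`, `make_record`), not Sentinel's internal schema:

```python
from dataclasses import dataclass, field
import time

@dataclass
class MessageRecord:
    # One tracked agent-to-agent message (hypothetical schema).
    sender: str
    recipient: str
    content_preview: str  # truncated message content
    sequence: int
    timestamp: float = field(default_factory=time.time)

def make_record(sender, recipient, content, sequence, max_preview=200):
    # Store only a preview of the content, as the inspector does.
    return MessageRecord(sender, recipient, content[:max_preview], sequence)

rec = make_record("user_proxy", "assistant", "What's the weather?" * 50, sequence=1)
print(rec.sender, rec.recipient, len(rec.content_preview), rec.sequence)
```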
Run lifecycle
Mark run boundaries for accurate tracking:
```python
sentinel = SentinelInspector(run_name="conversation_1")

# Register agents
sentinel.register(assistant)
sentinel.register(user_proxy)

# Mark start
sentinel.start_run()

try:
    # Run conversation
    user_proxy.initiate_chat(assistant, message="Hello")
    # Mark successful completion
    sentinel.end_run(outcome="completed")
except Exception:
    # Mark failed completion
    sentinel.end_run(outcome="failed")
    raise
```
Run summary
Get detailed statistics after execution:
```python
summary = sentinel.get_run_summary()

print(f"Run: {summary['run_name']}")
print(f"Duration: {summary['duration_seconds']:.2f}s")
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Messages: {summary['message_count']}")
print(f"LLM Calls: {summary['llm_call_count']}")

# Per-agent breakdown
for agent_name, count in summary['agent_message_counts'].items():
    print(f"  {agent_name}: {count} messages")

# Cost breakdown
for action, cost in summary['action_costs'].items():
    print(f"  {action}: ${cost:.6f}")
```
Policy enforcement
Sentinel blocks agents that violate policies:
```python
from agent_sentinel import BudgetExceededError
from agent_sentinel.policy import PolicyEngine

# Set strict budget
PolicyEngine.configure(run_budget=0.10)  # $0.10 max

sentinel = SentinelInspector(
    run_name="limited_run",
    enforce_policies=True  # Active blocking enabled
)
sentinel.register(assistant)

sentinel.start_run()
try:
    # This will be blocked if cost exceeds $0.10
    user_proxy.initiate_chat(
        assistant,
        message="Write a very long essay..."
    )
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # The assistant was prevented from replying
```
When blocked, Sentinel:
- Returns a blocking message to the agent
- Records an intervention for dashboard visibility
- Prevents the agent from generating a reply
- Raises `BudgetExceededError` to stop execution
Convenience function
Create and secure agents in one step:
```python
from autogen import AssistantAgent, UserProxyAgent
from agent_sentinel.integrations.autogen import create_sentinel_agents

agents, sentinel = create_sentinel_agents(
    agent_configs=[
        {
            "agent_class": AssistantAgent,
            "name": "assistant",
            "llm_config": {"model": "gpt-4", "api_key": "..."}
        },
        {
            "agent_class": UserProxyAgent,
            "name": "user_proxy",
            "human_input_mode": "NEVER"
        }
    ],
    run_name="my_run"
)

# Agents are already secured
sentinel.start_run()
agents[1].initiate_chat(agents[0], message="Hello!")
sentinel.end_run()
```
LangChain integration
SentinelCallbackHandler
Use Agent Sentinel’s callback handler to track all LangChain activity:
```python
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain.tools import tool

# Create callback handler
sentinel_handler = SentinelCallbackHandler(
    agent_id="langchain-agent",
    run_id="run-123"
)

# Define tools
@tool
def search(query: str) -> str:
    """Search for information"""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Calculate a math expression"""
    return str(eval(expression))  # demo only -- eval() is unsafe on untrusted input

# Create agent with callback (`prompt` is your agent prompt template)
llm = ChatOpenAI(model="gpt-4o", callbacks=[sentinel_handler])
tools = [search, calculate]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[sentinel_handler],
    verbose=True
)

# Run agent - all activity automatically tracked
result = agent_executor.invoke({
    "input": "What is 15 * 23?"
}, config={"callbacks": [sentinel_handler]})
```
What’s tracked
The callback handler automatically tracks:
1. LLM calls:
   - Model name (normalized to the pricing database)
   - Prompt tokens
   - Completion tokens
   - Total tokens
   - Calculated cost (using the latest pricing data)
   - Duration
   - Run ID and parent run ID (for nested chains)
2. Tool/function calls:
   - Tool name
   - Input arguments (first 200 chars)
   - Output/result (first 200 chars)
   - Duration
   - Success/failure status
   - Error details (if failed)
3. Agent actions:
   - Tool selection
   - Agent reasoning
   - Observations
   - Action outcomes
4. Chain execution:
   - Chain type/name
   - Chain start/end
   - Duration
   - Nested chain relationships (parent/child)
   - Error handling
5. Policy enforcement (when `enforce_policies=True`):
   - Authorization checks before LLM calls
   - Authorization checks before tool execution
   - Budget validation
   - Intervention recording when actions are blocked
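Cost calculation from token counts reduces to a rate lookup keyed by the normalized model name. The rates below are placeholder numbers for illustration, not Sentinel's actual pricing database:

```python
# Illustrative per-1M-token rates (placeholder numbers, not real pricing).
PRICING = {
    "gpt-4o": {"prompt": 2.50, "completion": 10.00},
    "gpt-4": {"prompt": 30.00, "completion": 60.00},
}

def normalize_model(name):
    # Map dated/variant names onto a pricing key,
    # e.g. "gpt-4o-2024-08-06" -> "gpt-4o". Longest prefix wins.
    for key in sorted(PRICING, key=len, reverse=True):
        if name.startswith(key):
            return key
    raise KeyError(f"no pricing for model {name!r}")

def llm_call_cost(model, prompt_tokens, completion_tokens):
    rates = PRICING[normalize_model(model)]
    return (prompt_tokens * rates["prompt"]
            + completion_tokens * rates["completion"]) / 1_000_000

print(f"${llm_call_cost('gpt-4o-2024-08-06', 1200, 300):.6f}")
```

Checking the longest prefix first matters: without it, `"gpt-4o-..."` would match the `"gpt-4"` entry and be billed at the wrong rate.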
Policy enforcement
Sentinel actively blocks operations that violate policies:
```python
from agent_sentinel import PolicyEngine, BudgetExceededError
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI

# Set strict budget
PolicyEngine.configure(run_budget=0.10)  # $0.10 max

sentinel_handler = SentinelCallbackHandler(
    run_name="limited_run",
    enforce_policies=True  # Active blocking (default: True)
)
llm = ChatOpenAI(model="gpt-4", callbacks=[sentinel_handler])

try:
    # LLM call authorization checked BEFORE the API call.
    # If current spend + estimated cost > budget, the call is blocked.
    response = llm.invoke("Write a very long essay...")
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # LLM call was prevented
    # Intervention recorded for dashboard
```
When blocked:
- `on_llm_start()` runs the authorization check
- `PolicyEngine.check_action()` validates the budget
- If there is a violation, `BudgetExceededError` is raised
- An intervention is recorded (type, reason, cost, inputs)
- The LLM call never reaches the API
- The exception propagates to your code

The same authorization happens for tools via `on_tool_start()`.
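The pre-call budget check can be sketched in a few lines. `ToyPolicyEngine` here is a hypothetical stand-in; the real `PolicyEngine` tracks more state (rate limits, session budgets):

```python
class BudgetExceededError(Exception):
    pass

class ToyPolicyEngine:
    # Minimal budget tracker, illustrative only.
    def __init__(self, run_budget):
        self.run_budget = run_budget
        self.spent = 0.0

    def check_action(self, estimated_cost):
        # Called from on_llm_start()/on_tool_start() BEFORE the API call.
        if self.spent + estimated_cost > self.run_budget:
            raise BudgetExceededError(
                f"spent ${self.spent:.2f} + est ${estimated_cost:.2f} "
                f"exceeds budget ${self.run_budget:.2f}"
            )

    def record(self, actual_cost):
        self.spent += actual_cost

engine = ToyPolicyEngine(run_budget=0.10)
engine.check_action(0.04)  # allowed
engine.record(0.04)
engine.check_action(0.05)  # still within budget
engine.record(0.05)
try:
    engine.check_action(0.05)  # 0.09 + 0.05 > 0.10 -> blocked
except BudgetExceededError as e:
    print("BLOCKED:", e)
```

The key property is that the check uses an *estimated* cost before any tokens are sent, so a blocked call costs nothing.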
Run summary
Get a complete summary after execution:
```python
# After the agent completes
summary = sentinel_handler.get_run_summary()

print(f"Total cost: ${summary['total_cost_usd']:.4f}")
print(f"LLM calls: {summary['llm_calls']}")
print(f"Tool calls: {summary['tool_calls']}")
print(f"Total actions: {summary['total_actions']}")
print(f"Errors: {summary['errors']}")

# Cost breakdown by action type
for action_type, cost in summary['costs_by_action'].items():
    print(f"  {action_type}: ${cost:.4f}")
```
Async LangChain
The callback handler supports async chains:
```python
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt, callbacks=[sentinel_handler])

# Async execution
result = await chain.ainvoke({"input": "Hello"})
```
CrewAI integration
SentinelCrew wrapper
SentinelCrew provides automatic security injection for CrewAI - transforming it from passive tracking to active “Visa-like” control:
```python
from crewai import Agent, Task
from crewai_tools import SerperDevTool
from agent_sentinel.integrations.crewai import SentinelCrew

# Standard CrewAI setup - NO CHANGES NEEDED
search_tool = SerperDevTool()

researcher = Agent(
    role="Researcher",
    goal="Research topics thoroughly",
    backstory="Expert researcher with attention to detail",
    tools=[search_tool],  # Tools are auto-secured!
    verbose=True
)

writer = Agent(
    role="Writer",
    goal="Write compelling content based on research",
    backstory="Professional writer with 10 years of experience",
    verbose=True
)

# Define tasks
research_task = Task(
    description="Research the latest trends in AI safety for 2025",
    agent=researcher,
    expected_output="Detailed research findings with citations"
)

writing_task = Task(
    description="Write a 500-word article based on the research",
    agent=writer,
    expected_output="Publication-ready article",
    context=[research_task]
)

# SentinelCrew automatically secures everything
crew = SentinelCrew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    run_name="ai_safety_article",
    enforce_policies=True,  # Active blocking (default: True)
    max_agent_steps=50,     # Prevent runaway agents
    detect_loops=True,      # Detect infinite loops
    verbose=True
)

# Fully secured execution
result = crew.kickoff()

# Get detailed summary
summary = crew.get_run_summary()
print(f"Duration: {summary['duration_seconds']:.2f}s")
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Agents: {summary['num_agents']}")
print(f"Tasks: {summary['num_tasks']}")
```
What’s automatically secured
SentinelCrew injects security at three levels:

1. Tool Injection (The “Chip Reader”)
   - Wraps ALL agent tools automatically
   - No manual decoration required
   - Works with SerperDevTool, DuckDuckGoSearch, FileReadTool, etc.
   - Authorization checks run BEFORE tool execution
   - Failed authorization blocks the tool and records an intervention
2. LLM Monitoring (The “LLM Meter”)
   - Attaches `SentinelCallbackHandler` to all agent LLMs
   - Tracks token costs in real time
   - Enforces budget limits before expensive API calls
   - Works with OpenAI, Anthropic, and other LangChain-compatible LLMs
3. Step Monitoring (The “Safety Net”)
   - Tracks agent step counts
   - Detects runaway agents (infinite loops)
   - Enforces the `max_agent_steps` limit
   - Identifies repetition patterns
   - Records interventions when agents are stopped
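Tool injection boils down to replacing each tool's callable with a guarded version. Below is a minimal sketch of the pattern, with a made-up `authorize` policy standing in for Sentinel's real checks:

```python
import functools

class PolicyViolationError(Exception):
    pass

interventions = []

def authorize(tool_name, args):
    # Stand-in policy: block anything touching /etc. Real checks would
    # consult budgets, rate limits, and allow-lists.
    return not any("/etc" in str(a) for a in args)

def secure_tool(fn, tool_name):
    @functools.wraps(fn)
    def guarded(*args, **kwargs):
        if not authorize(tool_name, args):
            interventions.append({"tool": tool_name, "reason": "blocked path"})
            raise PolicyViolationError(f"{tool_name} blocked")
        return fn(*args, **kwargs)  # authorization passed; run the tool
    return guarded

def read_file(path):
    return f"contents of {path}"

read_file = secure_tool(read_file, "read_file")
print(read_file("notes.txt"))
try:
    read_file("/etc/passwd")
except PolicyViolationError as e:
    print("BLOCKED:", e)
```

Because the wrapper keeps the original function's signature and docstring (`functools.wraps`), the agent framework can treat the guarded tool exactly like the original.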
Run summary
Get comprehensive execution statistics:
```python
summary = crew.get_run_summary()

# Example output:
{
    "run_name": "ai_safety_article",
    "num_agents": 2,
    "num_tasks": 2,
    "duration_seconds": 45.7,
    "run_cost_usd": 0.234,
    "total_cost_usd": 0.234,
    "action_counts": {
        "tool:search_tool": 3,
        "llm_call:gpt-4o": 5
    },
    "action_costs": {
        "llm_call:gpt-4o": 0.234
    },
    "started_at": 1704390000.0,
    "completed_at": 1704390045.7
}
```
Policy enforcement example
Prevent runaway costs with active blocking:
```python
from agent_sentinel import PolicyEngine, BudgetExceededError
from agent_sentinel.integrations.crewai import SentinelCrew

# Set strict budget
PolicyEngine.configure(run_budget=0.50)  # $0.50 max

crew = SentinelCrew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    run_name="limited_run",
    enforce_policies=True  # Active blocking enabled
)

try:
    result = crew.kickoff()
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # Tool or LLM call was prevented from executing
    # Intervention recorded for dashboard visibility
```
When a tool or LLM is blocked:
- Authorization fails before execution
- An intervention is recorded (type, reason, risk level)
- `BudgetExceededError` or `PolicyViolationError` is raised
- Agent execution stops cleanly
- The block is visible in the Console → Interventions page
Runaway agent protection
Prevent infinite loops and excessive iterations:
```python
crew = SentinelCrew(
    agents=[researcher],
    tasks=[task],
    run_name="protected_run",
    max_agent_steps=50,  # Stop after 50 steps
    detect_loops=True,   # Detect repetition
)

# If the agent exceeds 50 steps or repeats the same action 5 times:
# - PolicyViolationError is raised
# - Intervention is recorded
# - Execution stops
# - Dashboard shows the issue
```
Loop detection: If an agent repeats the same action 5+ times in a row, a warning is logged and an intervention is recorded.
Step limit: If an agent exceeds max_agent_steps, a critical intervention is recorded and execution is blocked.
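Both safeguards reduce to counters over the step stream. A simplified sketch using the thresholds described above (`RunawayAgentError` is a hypothetical name; the real integration raises `PolicyViolationError`):

```python
class RunawayAgentError(Exception):
    pass

class StepMonitor:
    # Tracks steps per agent; flags loops and enforces a hard step limit.
    def __init__(self, max_steps=50, loop_threshold=5):
        self.max_steps = max_steps
        self.loop_threshold = loop_threshold
        self.steps = 0
        self.last_action = None
        self.repeats = 0

    def on_step(self, action):
        self.steps += 1
        if self.steps > self.max_steps:
            raise RunawayAgentError(f"exceeded {self.max_steps} steps")
        if action == self.last_action:
            self.repeats += 1
            if self.repeats >= self.loop_threshold:
                raise RunawayAgentError(
                    f"action {action!r} repeated {self.repeats} times"
                )
        else:
            self.last_action, self.repeats = action, 1

monitor = StepMonitor(max_steps=50, loop_threshold=5)
try:
    for _ in range(10):
        monitor.on_step("search('AI safety')")  # same action every step
except RunawayAgentError as e:
    print("STOPPED:", e)
```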
Wrapping existing crews
Retrofit existing CrewAI crews without rewriting code:
```python
from crewai import Crew, Agent, Task
from agent_sentinel.integrations.crewai import wrap_existing_crew

# Existing crew code
agent = Agent(role="Researcher", goal="Research topics", backstory="Expert")
task = Task(description="Research AI", agent=agent)
crew = Crew(agents=[agent], tasks=[task])

# Add Sentinel tracking
wrapped_crew = wrap_existing_crew(
    crew=crew,
    agent_id="legacy-crew",
    run_id="run-789"
)

# Execute - now tracked
result = wrapped_crew.kickoff()
summary = wrapped_crew.get_run_summary()
```
Individual action wrapping
For fine-grained control, wrap individual actions:
```python
from crewai import Agent, Task
from agent_sentinel.integrations.crewai import wrap_crew_action

@wrap_crew_action(name="web_search", cost_usd=0.02, tags=["search", "external"])
def search_web(query: str) -> str:
    """Search the web for information."""
    # Your search logic
    results = perform_search(query)
    return results

@wrap_crew_action(name="analyze_data", cost_usd=0.01, tags=["analysis"])
def analyze_data(data: dict) -> dict:
    """Analyze data and extract insights."""
    # Your analysis logic
    insights = extract_insights(data)
    return insights

# Use wrapped tools with agents
analyst = Agent(
    role="Data Analyst",
    goal="Analyze data and provide insights",
    backstory="Expert data analyst",
    tools=[search_web, analyze_data]  # Both tools are tracked
)

# Each tool execution is logged with:
# - Action name (web_search, analyze_data)
# - Cost ($0.02, $0.01)
# - Duration
# - Tags (search, external, analysis)
# - Success/failure status
```
The `@wrap_crew_action` decorator is a thin wrapper around `@guarded_action` that adds CrewAI-specific tags and metadata.
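The decorator pattern behind it can be sketched as a toy tracker (not the real `guarded_action` implementation):

```python
import functools
import time

action_log = []  # records appended by the toy decorator

def tracked_action(name, cost_usd=0.0, tags=()):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"name": name, "cost_usd": cost_usd, "tags": list(tags)}
            try:
                result = fn(*args, **kwargs)
                record["status"] = "success"
                return result
            except Exception:
                record["status"] = "failure"
                raise
            finally:
                # Runs on both success and failure, so every call is logged.
                record["duration_s"] = time.perf_counter() - start
                action_log.append(record)
        return wrapper
    return decorator

@tracked_action(name="web_search", cost_usd=0.02, tags=["search", "external"])
def search_web(query):
    return f"results for {query}"

search_web("AI safety")
print(action_log[-1]["name"], action_log[-1]["status"], action_log[-1]["cost_usd"])
```

The `finally` clause is what guarantees duration and status are recorded even when the wrapped tool raises.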
Custom framework integration
For frameworks not yet supported, use the low-level `@guarded_action` decorator:
```python
from agent_sentinel import guarded_action
from your_framework import Agent, Task

class TrackedAgent(Agent):
    @guarded_action(name="agent_step", cost_usd=0.0)
    def step(self, task):
        # Your agent logic
        result = super().step(task)
        return result

    @guarded_action(name="tool_call", cost_usd=0.02)
    def call_tool(self, tool_name, args):
        result = super().call_tool(tool_name, args)
        return result
```
Combining integrations
Use multiple integrations together:
```python
from agent_sentinel.integrations import instrument_openai
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI

# Instrument OpenAI globally
instrument_openai()

# Use the LangChain callback for structured tracking
handler = SentinelCallbackHandler(agent_id="hybrid-agent")
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# Both instrumentation layers work together:
# 1. OpenAI instrumentation tracks raw API calls + costs
# 2. LangChain handler tracks chain/agent structure
```
Best practices
- Use framework integrations when available: framework-specific integrations provide better structure and context than raw `@guarded_action` decorators.
- Set agent_id and run_id: always provide identifiers for filtering and analysis in the web console.
- Review run summaries: use `get_run_summary()` to understand cost breakdowns and identify expensive operations.
- Test framework compatibility: re-test integrations when upgrading LangChain or CrewAI versions, as internal APIs may change.
Troubleshooting
“LangChain events not tracked”
Ensure callbacks are passed at all levels:
```python
# ✅ Correct - callbacks at the LLM, agent, and executor levels
llm = ChatOpenAI(callbacks=[handler])
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, callbacks=[handler])
result = executor.invoke(input, config={"callbacks": [handler]})

# ❌ Wrong - missing callbacks
llm = ChatOpenAI()  # No callbacks!
executor = AgentExecutor(agent=agent, tools=tools)
result = executor.invoke(input)  # Not tracked
```
“CrewAI costs not accurate”
CrewAI cost tracking depends on:
- LLM instrumentation (e.g., `instrument_openai()`)
- Tool cost annotations via `@wrap_crew_action`

Ensure both are configured for accurate cost tracking.
“Duplicate events”
If using both LLM instrumentation and framework callbacks, you may see duplicate LLM call records. This is expected - one from the low-level instrumentation, one from the framework callback. The framework callback provides richer context (chain name, agent reasoning) while the low-level instrumentation provides precise token costs.
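If the duplicates complicate downstream analysis, one pragmatic approach is to keep the framework-level record (for context) and fold in token counts from the nearest low-level record. This is a sketch over hypothetical event dicts, not Sentinel's storage format:

```python
def merge_events(framework_events, instrumentation_events, window_s=2.0):
    # For each framework-level LLM event, attach token counts from the
    # closest low-level record for the same model within window_s seconds.
    merged = []
    for event in framework_events:
        candidates = [
            low for low in instrumentation_events
            if low["model"] == event["model"]
            and abs(low["ts"] - event["ts"]) <= window_s
        ]
        event = dict(event)  # don't mutate the input
        if candidates:
            nearest = min(candidates, key=lambda low: abs(low["ts"] - event["ts"]))
            event["total_tokens"] = nearest["total_tokens"]
        merged.append(event)
    return merged

framework = [{"model": "gpt-4o", "ts": 100.1, "chain": "AgentExecutor"}]
low_level = [{"model": "gpt-4o", "ts": 100.3, "total_tokens": 812}]
merged = merge_events(framework, low_level)
print(merged[0])
```

Timestamp-window matching is heuristic; if both layers expose a shared request ID, joining on that is more reliable.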
Example: Full stack tracking
```python
from agent_sentinel import enable_remote_sync, PolicyEngine, BudgetExceededError
from agent_sentinel.integrations import instrument_openai
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_functions_agent, AgentExecutor

# 1. Configure policies
PolicyEngine.configure(
    session_budget=5.0,
    run_budget=1.0,
    rate_limits={
        "openai_chat_completion": {
            "max_count": 50,
            "window_seconds": 60
        }
    }
)

# 2. Enable platform sync
enable_remote_sync(
    platform_url="https://platform.agentsentinel.dev",
    api_token="as_your_api_key_here",
    run_id="run-production-123"
)

# 3. Instrument the LLM provider
instrument_openai()

# 4. Create a LangChain agent with the callback
# (`tools`, `prompt`, and `user_query` are defined elsewhere in your app)
handler = SentinelCallbackHandler(
    agent_id="production-agent",
    run_id="run-production-123"
)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, callbacks=[handler])

# 5. Run the agent - fully tracked with policies enforced
try:
    result = executor.invoke({"input": user_query}, config={"callbacks": [handler]})
    summary = handler.get_run_summary()
    print(f"Success! Cost: ${summary['total_cost_usd']:.4f}")
except BudgetExceededError:
    print("Budget exceeded - agent stopped")
```
See also