Overview
Agent Sentinel provides native integrations with popular agent frameworks, enabling automatic tracking of:
- Agent actions and tool calls
- LLM invocations
- Chain execution
- Token usage and costs
- Errors and retries
The “Big Three” frameworks are supported:
- LangChain: Callback-based integration
- CrewAI: Wrapper-based integration
- AutoGen: Hook-based integration (NEW!)
AutoGen integration
SentinelInspector
Microsoft’s AutoGen is widely used in enterprise environments. Agent Sentinel integrates with it through AutoGen’s built-in `register_reply` hook system.
```python
from autogen import AssistantAgent, UserProxyAgent
from agent_sentinel.integrations.autogen import SentinelInspector

# Create the Sentinel inspector
sentinel = SentinelInspector(
    run_name="my_autogen_run",
    enforce_policies=True,  # Enable active blocking
    track_costs=True,       # Track LLM costs
    track_messages=True,    # Track message flow
)

# Create AutoGen agents as normal
assistant = AssistantAgent(
    name="assistant",
    system_message="You are a helpful AI assistant.",
    llm_config={
        "model": "gpt-4",
        "api_key": "your-api-key-here"
    }
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
)

# Secure the agents with one line each
sentinel.register(assistant)
sentinel.register(user_proxy)

# Run as normal - Sentinel is now monitoring everything
sentinel.start_run()
user_proxy.initiate_chat(
    assistant,
    message="What's the weather in San Francisco?"
)
sentinel.end_run()

# Get execution summary
summary = sentinel.get_run_summary()
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Messages: {summary['message_count']}")
print(f"LLM Calls: {summary['llm_call_count']}")
```
How it works
AutoGen’s architecture is different from LangChain/CrewAI:
- Agents communicate via messages: agents send messages back and forth
- Reply chain system: AutoGen uses `register_reply()` to inject hooks
- LLM config: LLMs are configured through the `llm_config` dict

Sentinel leverages this by:
- Hooking the reply chain (position 0 = highest priority)
- Wrapping LLM generation to track token costs
- Monitoring message flow for an audit trail
This is simpler than CrewAI because AutoGen has a built-in hook system designed exactly for this purpose!
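The reply-chain mechanism can be pictured with a toy model. The sketch below is illustrative only; it mimics the shape of AutoGen's `register_reply` pattern (a hook returning `(final, reply)`), not AutoGen's actual classes:

```python
# Simplified model of a reply chain with a position-0 hook.
# Illustrative only -- not AutoGen's real Agent implementation.

class ToyAgent:
    def __init__(self, name):
        self.name = name
        self._reply_hooks = []

    def register_reply(self, hook, position=0):
        # Position 0 = highest priority: the hook runs before default logic.
        self._reply_hooks.insert(position, hook)

    def generate_reply(self, message, sender):
        for hook in self._reply_hooks:
            final, reply = hook(self, message, sender)
            if final:
                return reply  # hook produced the final reply (e.g. a block)
        return f"{self.name} replies to: {message}"  # default behavior

messages_seen = []

def sentinel_hook(agent, message, sender):
    # Observe the message for the audit trail, then pass through.
    messages_seen.append((sender, agent.name, message))
    if "forbidden" in message:
        return True, "BLOCKED by policy"  # short-circuit the reply chain
    return False, None

agent = ToyAgent("assistant")
agent.register_reply(sentinel_hook, position=0)

print(agent.generate_reply("hello", sender="user_proxy"))
print(agent.generate_reply("forbidden request", sender="user_proxy"))
```

A position-0 hook sees every message first, which is why it can both audit and veto replies.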
What’s tracked
The inspector automatically tracks:
- Agent-to-agent messages:
  - Sender and recipient
  - Message content (preview)
  - Message count per agent
  - Timestamp and sequence
- LLM calls:
  - Model name
  - Token usage (prompt + completion)
  - Calculated cost
  - Duration
- Policy enforcement:
  - Authorization checks before replies
  - Budget validation
  - Rate limiting
  - Intervention recording
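The message fields above map naturally onto a small record type. This sketch uses hypothetical names (`MessageRecord`, `make_record`), not Sentinel's internal schema:

```python
from dataclasses import dataclass, field
import time

@dataclass
class MessageRecord:
    # One tracked agent-to-agent message (hypothetical schema).
    sender: str
    recipient: str
    content_preview: str  # truncated message content
    sequence: int
    timestamp: float = field(default_factory=time.time)

def make_record(sender, recipient, content, sequence, max_preview=200):
    # Store only a preview of the content, as the inspector does.
    return MessageRecord(sender, recipient, content[:max_preview], sequence)

rec = make_record("user_proxy", "assistant", "What's the weather?" * 50, sequence=1)
print(rec.sender, rec.recipient, len(rec.content_preview), rec.sequence)
```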
Run lifecycle
Mark run boundaries for accurate tracking:
```python
sentinel = SentinelInspector(run_name="conversation_1")

# Register agents
sentinel.register(assistant)
sentinel.register(user_proxy)

# Mark start
sentinel.start_run()

try:
    # Run conversation
    user_proxy.initiate_chat(assistant, message="Hello")
    # Mark successful completion
    sentinel.end_run(outcome="completed")
except Exception:
    # Mark failed completion
    sentinel.end_run(outcome="failed")
    raise
```
Run summary
Get detailed statistics after execution:
```python
summary = sentinel.get_run_summary()

print(f"Run: {summary['run_name']}")
print(f"Duration: {summary['duration_seconds']:.2f}s")
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Messages: {summary['message_count']}")
print(f"LLM Calls: {summary['llm_call_count']}")

# Per-agent breakdown
for agent_name, count in summary['agent_message_counts'].items():
    print(f"  {agent_name}: {count} messages")

# Cost breakdown
for action, cost in summary['action_costs'].items():
    print(f"  {action}: ${cost:.6f}")
```
Policy enforcement
Sentinel blocks agents that violate policies:
```python
from agent_sentinel import BudgetExceededError
from agent_sentinel.policy import PolicyEngine

# Set strict budget
PolicyEngine.configure(run_budget=0.10)  # $0.10 max

sentinel = SentinelInspector(
    run_name="limited_run",
    enforce_policies=True  # Active blocking enabled
)
sentinel.register(assistant)

sentinel.start_run()
try:
    # This will be blocked if cost exceeds $0.10
    user_proxy.initiate_chat(
        assistant,
        message="Write a very long essay..."
    )
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # The assistant was prevented from replying
```
When blocked, Sentinel:
- Returns a blocking message to the agent
- Records an intervention for dashboard visibility
- Prevents the agent from generating a reply
- Raises `BudgetExceededError` to stop execution
Convenience function
Create and secure agents in one step:
```python
from autogen import AssistantAgent, UserProxyAgent
from agent_sentinel.integrations.autogen import create_sentinel_agents

agents, sentinel = create_sentinel_agents(
    agent_configs=[
        {
            "agent_class": AssistantAgent,
            "name": "assistant",
            "llm_config": {"model": "gpt-4", "api_key": "..."}
        },
        {
            "agent_class": UserProxyAgent,
            "name": "user_proxy",
            "human_input_mode": "NEVER"
        }
    ],
    run_name="my_run"
)

# Agents are already secured
sentinel.start_run()
agents[1].initiate_chat(agents[0], message="Hello!")
sentinel.end_run()
```
LangChain integration
SentinelCallbackHandler
Use Agent Sentinel’s callback handler to track all LangChain activity:
```python
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain.tools import tool

# Create callback handler
sentinel_handler = SentinelCallbackHandler(
    agent_id="langchain-agent",
    run_id="run-123"
)

# Define tools
@tool
def search(query: str) -> str:
    """Search for information"""
    return f"Results for: {query}"

@tool
def calculate(expression: str) -> str:
    """Calculate a math expression"""
    return str(eval(expression))  # demo only -- eval() is unsafe on untrusted input

# Create agent with callback (`prompt` is your agent prompt template)
llm = ChatOpenAI(model="gpt-4o", callbacks=[sentinel_handler])
tools = [search, calculate]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    callbacks=[sentinel_handler],
    verbose=True
)

# Run agent - all activity automatically tracked
result = agent_executor.invoke({
    "input": "What is 15 * 23?"
}, config={"callbacks": [sentinel_handler]})
```
What’s tracked
The callback handler automatically tracks:
1. LLM calls:
   - Model name (normalized to the pricing database)
   - Prompt tokens
   - Completion tokens
   - Total tokens
   - Calculated cost (using the latest pricing data)
   - Duration
   - Run ID and parent run ID (for nested chains)
2. Tool/function calls:
   - Tool name
   - Input arguments (first 200 chars)
   - Output/result (first 200 chars)
   - Duration
   - Success/failure status
   - Error details (if failed)
3. Agent actions:
   - Tool selection
   - Agent reasoning
   - Observations
   - Action outcomes
4. Chain execution:
   - Chain type/name
   - Chain start/end
   - Duration
   - Nested chain relationships (parent/child)
   - Error handling
5. Policy enforcement (when `enforce_policies=True`):
   - Authorization checks before LLM calls
   - Authorization checks before tool execution
   - Budget validation
   - Intervention recording when actions are blocked
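Cost calculation from token counts reduces to a rate lookup keyed by the normalized model name. The rates below are placeholder numbers for illustration, not Sentinel's actual pricing database:

```python
# Illustrative per-1M-token rates (placeholder numbers, not real pricing).
PRICING = {
    "gpt-4o": {"prompt": 2.50, "completion": 10.00},
    "gpt-4": {"prompt": 30.00, "completion": 60.00},
}

def normalize_model(name):
    # Map dated/variant names onto a pricing key,
    # e.g. "gpt-4o-2024-08-06" -> "gpt-4o". Longest prefix wins.
    for key in sorted(PRICING, key=len, reverse=True):
        if name.startswith(key):
            return key
    raise KeyError(f"no pricing for model {name!r}")

def llm_call_cost(model, prompt_tokens, completion_tokens):
    rates = PRICING[normalize_model(model)]
    return (prompt_tokens * rates["prompt"]
            + completion_tokens * rates["completion"]) / 1_000_000

print(f"${llm_call_cost('gpt-4o-2024-08-06', 1200, 300):.6f}")
```

Checking the longest prefix first matters: without it, `"gpt-4o-..."` would match the `"gpt-4"` entry and be billed at the wrong rate.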
Policy enforcement
Sentinel actively blocks operations that violate policies:
```python
from agent_sentinel import PolicyEngine, BudgetExceededError
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI

# Set strict budget
PolicyEngine.configure(run_budget=0.10)  # $0.10 max

sentinel_handler = SentinelCallbackHandler(
    run_name="limited_run",
    enforce_policies=True  # Active blocking (default: True)
)
llm = ChatOpenAI(model="gpt-4", callbacks=[sentinel_handler])

try:
    # LLM call authorization checked BEFORE the API call.
    # If current spend + estimated cost > budget, the call is blocked.
    response = llm.invoke("Write a very long essay...")
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # LLM call was prevented
    # Intervention recorded for dashboard
```
When blocked:
- `on_llm_start()` runs the authorization check
- `PolicyEngine.check_action()` validates the budget
- If there is a violation, `BudgetExceededError` is raised
- An intervention is recorded (type, reason, cost, inputs)
- The LLM call never reaches the API
- The exception propagates to your code

The same authorization happens for tools via `on_tool_start()`.
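The pre-call budget check can be sketched in a few lines. `ToyPolicyEngine` here is a hypothetical stand-in; the real `PolicyEngine` tracks more state (rate limits, session budgets):

```python
class BudgetExceededError(Exception):
    pass

class ToyPolicyEngine:
    # Minimal budget tracker, illustrative only.
    def __init__(self, run_budget):
        self.run_budget = run_budget
        self.spent = 0.0

    def check_action(self, estimated_cost):
        # Called from on_llm_start()/on_tool_start() BEFORE the API call.
        if self.spent + estimated_cost > self.run_budget:
            raise BudgetExceededError(
                f"spent ${self.spent:.2f} + est ${estimated_cost:.2f} "
                f"exceeds budget ${self.run_budget:.2f}"
            )

    def record(self, actual_cost):
        self.spent += actual_cost

engine = ToyPolicyEngine(run_budget=0.10)
engine.check_action(0.04)  # allowed
engine.record(0.04)
engine.check_action(0.05)  # still within budget
engine.record(0.05)
try:
    engine.check_action(0.05)  # 0.09 + 0.05 > 0.10 -> blocked
except BudgetExceededError as e:
    print("BLOCKED:", e)
```

The key property is that the check uses an *estimated* cost before any tokens are sent, so a blocked call costs nothing.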
Run summary
Get a complete summary after execution:
```python
# After the agent completes
summary = sentinel_handler.get_run_summary()

print(f"Total cost: ${summary['total_cost_usd']:.4f}")
print(f"LLM calls: {summary['llm_calls']}")
print(f"Tool calls: {summary['tool_calls']}")
print(f"Total actions: {summary['total_actions']}")
print(f"Errors: {summary['errors']}")

# Cost breakdown by action type
for action_type, cost in summary['costs_by_action'].items():
    print(f"  {action_type}: ${cost:.4f}")
```
Async LangChain
The callback handler supports async chains:
```python
from langchain.chains import LLMChain

chain = LLMChain(llm=llm, prompt=prompt, callbacks=[sentinel_handler])

# Async execution
result = await chain.ainvoke({"input": "Hello"})
```
CrewAI integration
SentinelCrew wrapper
SentinelCrew provides automatic security injection for CrewAI - transforming it from passive tracking to active “Visa-like” control:
```python
from crewai import Agent, Task
from crewai_tools import SerperDevTool
from agent_sentinel.integrations.crewai import SentinelCrew

# Standard CrewAI setup - NO CHANGES NEEDED
search_tool = SerperDevTool()

researcher = Agent(
    role="Researcher",
    goal="Research topics thoroughly",
    backstory="Expert researcher with attention to detail",
    tools=[search_tool],  # Tools are auto-secured!
    verbose=True
)

writer = Agent(
    role="Writer",
    goal="Write compelling content based on research",
    backstory="Professional writer with 10 years of experience",
    verbose=True
)

# Define tasks
research_task = Task(
    description="Research the latest trends in AI safety for 2025",
    agent=researcher,
    expected_output="Detailed research findings with citations"
)

writing_task = Task(
    description="Write a 500-word article based on the research",
    agent=writer,
    expected_output="Publication-ready article",
    context=[research_task]
)

# SentinelCrew automatically secures everything
crew = SentinelCrew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    run_name="ai_safety_article",
    enforce_policies=True,  # Active blocking (default: True)
    max_agent_steps=50,     # Prevent runaway agents
    detect_loops=True,      # Detect infinite loops
    verbose=True
)

# Fully secured execution
result = crew.kickoff()

# Get detailed summary
summary = crew.get_run_summary()
print(f"Duration: {summary['duration_seconds']:.2f}s")
print(f"Cost: ${summary['run_cost_usd']:.6f}")
print(f"Agents: {summary['num_agents']}")
print(f"Tasks: {summary['num_tasks']}")
```
What’s automatically secured
SentinelCrew injects security at three levels:

1. Tool Injection (The “Chip Reader”)
   - Wraps ALL agent tools automatically
   - No manual decoration required
   - Works with SerperDevTool, DuckDuckGoSearch, FileReadTool, etc.
   - Authorization checks run BEFORE tool execution
   - Failed authorization blocks the tool and records an intervention
2. LLM Monitoring (The “LLM Meter”)
   - Attaches `SentinelCallbackHandler` to all agent LLMs
   - Tracks token costs in real time
   - Enforces budget limits before expensive API calls
   - Works with OpenAI, Anthropic, and other LangChain-compatible LLMs
3. Step Monitoring (The “Safety Net”)
   - Tracks agent step counts
   - Detects runaway agents (infinite loops)
   - Enforces the `max_agent_steps` limit
   - Identifies repetition patterns
   - Records interventions when agents are stopped
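Tool injection boils down to replacing each tool's callable with a guarded version. Below is a minimal sketch of the pattern, with a made-up `authorize` policy standing in for Sentinel's real checks:

```python
import functools

class PolicyViolationError(Exception):
    pass

interventions = []

def authorize(tool_name, args):
    # Stand-in policy: block anything touching /etc. Real checks would
    # consult budgets, rate limits, and allow-lists.
    return not any("/etc" in str(a) for a in args)

def secure_tool(fn, tool_name):
    @functools.wraps(fn)
    def guarded(*args, **kwargs):
        if not authorize(tool_name, args):
            interventions.append({"tool": tool_name, "reason": "blocked path"})
            raise PolicyViolationError(f"{tool_name} blocked")
        return fn(*args, **kwargs)  # authorization passed; run the tool
    return guarded

def read_file(path):
    return f"contents of {path}"

read_file = secure_tool(read_file, "read_file")
print(read_file("notes.txt"))
try:
    read_file("/etc/passwd")
except PolicyViolationError as e:
    print("BLOCKED:", e)
```

Because the wrapper keeps the original function's signature and docstring (`functools.wraps`), the agent framework can treat the guarded tool exactly like the original.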
Run summary
Get comprehensive execution statistics:
```python
summary = crew.get_run_summary()

# Example output:
{
    "run_name": "ai_safety_article",
    "num_agents": 2,
    "num_tasks": 2,
    "duration_seconds": 45.7,
    "run_cost_usd": 0.234,
    "total_cost_usd": 0.234,
    "action_counts": {
        "tool:search_tool": 3,
        "llm_call:gpt-4o": 5
    },
    "action_costs": {
        "llm_call:gpt-4o": 0.234
    },
    "started_at": 1704390000.0,
    "completed_at": 1704390045.7
}
```
Policy enforcement example
Prevent runaway costs with active blocking:
```python
from agent_sentinel import PolicyEngine, BudgetExceededError
from agent_sentinel.integrations.crewai import SentinelCrew

# Set strict budget
PolicyEngine.configure(run_budget=0.50)  # $0.50 max

crew = SentinelCrew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    run_name="limited_run",
    enforce_policies=True  # Active blocking enabled
)

try:
    result = crew.kickoff()
except BudgetExceededError as e:
    print(f"BLOCKED: {e}")
    # Tool or LLM call was prevented from executing
    # Intervention recorded for dashboard visibility
```
When a tool or LLM is blocked:
- Authorization fails before execution
- An intervention is recorded (type, reason, risk level)
- `BudgetExceededError` or `PolicyViolationError` is raised
- Agent execution stops cleanly
- The block is visible in the Console → Interventions page
Runaway agent protection
Prevent infinite loops and excessive iterations:
```python
crew = SentinelCrew(
    agents=[researcher],
    tasks=[task],
    run_name="protected_run",
    max_agent_steps=50,  # Stop after 50 steps
    detect_loops=True,   # Detect repetition
)

# If the agent exceeds 50 steps or repeats the same action 5 times:
# - PolicyViolationError is raised
# - Intervention is recorded
# - Execution stops
# - Dashboard shows the issue
```
Loop detection: If an agent repeats the same action 5+ times in a row, a warning is logged and an intervention is recorded.
Step limit: If an agent exceeds max_agent_steps, a critical intervention is recorded and execution is blocked.
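Both safeguards reduce to counters over the step stream. A simplified sketch using the thresholds described above (`RunawayAgentError` is a hypothetical name; the real integration raises `PolicyViolationError`):

```python
class RunawayAgentError(Exception):
    pass

class StepMonitor:
    # Tracks steps per agent; flags loops and enforces a hard step limit.
    def __init__(self, max_steps=50, loop_threshold=5):
        self.max_steps = max_steps
        self.loop_threshold = loop_threshold
        self.steps = 0
        self.last_action = None
        self.repeats = 0

    def on_step(self, action):
        self.steps += 1
        if self.steps > self.max_steps:
            raise RunawayAgentError(f"exceeded {self.max_steps} steps")
        if action == self.last_action:
            self.repeats += 1
            if self.repeats >= self.loop_threshold:
                raise RunawayAgentError(
                    f"action {action!r} repeated {self.repeats} times"
                )
        else:
            self.last_action, self.repeats = action, 1

monitor = StepMonitor(max_steps=50, loop_threshold=5)
try:
    for _ in range(10):
        monitor.on_step("search('AI safety')")  # same action every step
except RunawayAgentError as e:
    print("STOPPED:", e)
```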
Wrapping existing crews
Retrofit existing CrewAI crews without rewriting code:
```python
from crewai import Crew, Agent, Task
from agent_sentinel.integrations.crewai import wrap_existing_crew

# Existing crew code
agent = Agent(role="Researcher", goal="Research topics", backstory="Expert")
task = Task(description="Research AI", agent=agent)
crew = Crew(agents=[agent], tasks=[task])

# Add Sentinel tracking
wrapped_crew = wrap_existing_crew(
    crew=crew,
    agent_id="legacy-crew",
    run_id="run-789"
)

# Execute - now tracked
result = wrapped_crew.kickoff()
summary = wrapped_crew.get_run_summary()
```
Individual action wrapping
For fine-grained control, wrap individual actions:
```python
from crewai import Agent, Task
from agent_sentinel.integrations.crewai import wrap_crew_action

@wrap_crew_action(name="web_search", cost_usd=0.02, tags=["search", "external"])
def search_web(query: str) -> str:
    """Search the web for information."""
    # Your search logic
    results = perform_search(query)
    return results

@wrap_crew_action(name="analyze_data", cost_usd=0.01, tags=["analysis"])
def analyze_data(data: dict) -> dict:
    """Analyze data and extract insights."""
    # Your analysis logic
    insights = extract_insights(data)
    return insights

# Use wrapped tools with agents
analyst = Agent(
    role="Data Analyst",
    goal="Analyze data and provide insights",
    backstory="Expert data analyst",
    tools=[search_web, analyze_data]  # Both tools are tracked
)

# Each tool execution is logged with:
# - Action name (web_search, analyze_data)
# - Cost ($0.02, $0.01)
# - Duration
# - Tags (search, external, analysis)
# - Success/failure status
```
The `@wrap_crew_action` decorator is a thin wrapper around `@guarded_action` that adds CrewAI-specific tags and metadata.
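The decorator pattern behind it can be sketched as a toy tracker (not the real `guarded_action` implementation):

```python
import functools
import time

action_log = []  # records appended by the toy decorator

def tracked_action(name, cost_usd=0.0, tags=()):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"name": name, "cost_usd": cost_usd, "tags": list(tags)}
            try:
                result = fn(*args, **kwargs)
                record["status"] = "success"
                return result
            except Exception:
                record["status"] = "failure"
                raise
            finally:
                # Runs on both success and failure, so every call is logged.
                record["duration_s"] = time.perf_counter() - start
                action_log.append(record)
        return wrapper
    return decorator

@tracked_action(name="web_search", cost_usd=0.02, tags=["search", "external"])
def search_web(query):
    return f"results for {query}"

search_web("AI safety")
print(action_log[-1]["name"], action_log[-1]["status"], action_log[-1]["cost_usd"])
```

The `finally` clause is what guarantees duration and status are recorded even when the wrapped tool raises.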
Custom framework integration
For frameworks not yet supported, use the low-level `@guarded_action` decorator:
```python
from agent_sentinel import guarded_action
from your_framework import Agent, Task

class TrackedAgent(Agent):
    @guarded_action(name="agent_step", cost_usd=0.0)
    def step(self, task):
        # Your agent logic
        result = super().step(task)
        return result

    @guarded_action(name="tool_call", cost_usd=0.02)
    def call_tool(self, tool_name, args):
        result = super().call_tool(tool_name, args)
        return result
```
Combining integrations
Use multiple integrations together:
```python
from agent_sentinel.integrations import instrument_openai
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI

# Instrument OpenAI globally
instrument_openai()

# Use the LangChain callback for structured tracking
handler = SentinelCallbackHandler(agent_id="hybrid-agent")
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])

# Both instrumentation layers work together:
# 1. OpenAI instrumentation tracks raw API calls + costs
# 2. LangChain handler tracks chain/agent structure
```
Best practices
- Use framework integrations when available: framework-specific integrations provide better structure and context than raw `@guarded_action` decorators.
- Set agent_id and run_id: always provide identifiers for filtering and analysis in the web console.
- Review run summaries: use `get_run_summary()` to understand cost breakdowns and identify expensive operations.
- Test framework compatibility: re-test integrations when upgrading LangChain or CrewAI versions, as internal APIs may change.
Troubleshooting
“LangChain events not tracked”
Ensure callbacks are passed at all levels:
```python
# ✅ Correct - callbacks at the LLM, agent, and executor levels
llm = ChatOpenAI(callbacks=[handler])
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, callbacks=[handler])
result = executor.invoke(input, config={"callbacks": [handler]})

# ❌ Wrong - missing callbacks
llm = ChatOpenAI()  # No callbacks!
executor = AgentExecutor(agent=agent, tools=tools)
result = executor.invoke(input)  # Not tracked
```
“CrewAI costs not accurate”
CrewAI cost tracking depends on:
- LLM instrumentation (e.g., `instrument_openai()`)
- Tool cost annotations via `@wrap_crew_action`

Ensure both are configured for accurate cost tracking.
“Duplicate events”
If using both LLM instrumentation and framework callbacks, you may see duplicate LLM call records. This is expected - one from the low-level instrumentation, one from the framework callback. The framework callback provides richer context (chain name, agent reasoning) while the low-level instrumentation provides precise token costs.
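If the duplicates complicate downstream analysis, one pragmatic approach is to keep the framework-level record (for context) and fold in token counts from the nearest low-level record. This is a sketch over hypothetical event dicts, not Sentinel's storage format:

```python
def merge_events(framework_events, instrumentation_events, window_s=2.0):
    # For each framework-level LLM event, attach token counts from the
    # closest low-level record for the same model within window_s seconds.
    merged = []
    for event in framework_events:
        candidates = [
            low for low in instrumentation_events
            if low["model"] == event["model"]
            and abs(low["ts"] - event["ts"]) <= window_s
        ]
        event = dict(event)  # don't mutate the input
        if candidates:
            nearest = min(candidates, key=lambda low: abs(low["ts"] - event["ts"]))
            event["total_tokens"] = nearest["total_tokens"]
        merged.append(event)
    return merged

framework = [{"model": "gpt-4o", "ts": 100.1, "chain": "AgentExecutor"}]
low_level = [{"model": "gpt-4o", "ts": 100.3, "total_tokens": 812}]
merged = merge_events(framework, low_level)
print(merged[0])
```

Timestamp-window matching is heuristic; if both layers expose a shared request ID, joining on that is more reliable.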
Example: Full stack tracking
```python
from agent_sentinel import enable_remote_sync, PolicyEngine, BudgetExceededError
from agent_sentinel.integrations import instrument_openai
from agent_sentinel.integrations.langchain import SentinelCallbackHandler
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_functions_agent, AgentExecutor

# 1. Configure policies
PolicyEngine.configure(
    session_budget=5.0,
    run_budget=1.0,
    rate_limits={
        "openai_chat_completion": {
            "max_count": 50,
            "window_seconds": 60
        }
    }
)

# 2. Enable platform sync
enable_remote_sync(
    platform_url="https://platform.agentsentinel.dev",
    api_token="as_your_api_key_here",
    run_id="run-production-123"
)

# 3. Instrument the LLM provider
instrument_openai()

# 4. Create a LangChain agent with the callback
# (`tools`, `prompt`, and `user_query` are defined elsewhere in your app)
handler = SentinelCallbackHandler(
    agent_id="production-agent",
    run_id="run-production-123"
)
llm = ChatOpenAI(model="gpt-4o", callbacks=[handler])
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, callbacks=[handler])

# 5. Run the agent - fully tracked with policies enforced
try:
    result = executor.invoke({"input": user_query}, config={"callbacks": [handler]})
    summary = handler.get_run_summary()
    print(f"Success! Cost: ${summary['total_cost_usd']:.4f}")
except BudgetExceededError:
    print("Budget exceeded - agent stopped")
```
See also