Governance, Not Guardrails: Circuit Breakers for 25 AI Agents
Guardrails are filters. Governance is an architecture. Here's the 5-circuit-breaker system and 3-tier action classification we designed first and built BUCC around, not bolted on afterward.

If you've watched the AI industry in the last 18 months, you've noticed a pattern: everyone talks about safety. Everyone wants governance. But when you ask them what that actually means, you get vague answers. Input validation. Output filtering. "Alignment." Guardrails.
These are necessary but not sufficient.
Real governance for autonomous systems is about something deeper: controlling behavior at scale, making informed decisions about risk, maintaining visibility into what's happening, and being able to stop everything if something goes wrong.
This is an architecture problem, not a filtering problem.
Why Governance Matters (And Why It's Usually Wrong)
Most teams approach AI governance like they approach regular software safety: write some rules, enforce them at the boundary.
Input validation → block malicious prompts
Output filtering → block bad completions
API key restriction → only let agents access certain services
Rate limiting → slow down agents that are acting weird
These are all good. But they're not governance.
Real governance is about autonomy. Once you've deployed an agent into production, it's going to make decisions you didn't anticipate. It's going to encounter situations you didn't plan for. It's going to try to take actions that are technically allowed but maybe not aligned with your intentions.
In a traditional software system, this is a bug. You fix the code and redeploy. But with an autonomous agent, the agent is supposed to make decisions. The code is the decision-making logic, not the business logic.
So the question becomes: how do you govern a system that's supposed to think?
The answer is: you create a framework that lets the system think, but controls what actions it can take based on risk. You make some decisions human-in-the-loop. You make other decisions agent-only. And you have a way to know what's happening and stop everything if needed.
The 3-Tier Action Classification System
We classify every action an agent tries to take into one of three categories based on risk.
Tier 1: Auto-Execute
Low-risk, high-frequency operations that we're confident about.
Examples:
- Status updates and information retrieval (checking agent health, reading database records)
- Routine financial transactions below a threshold (e.g., any transaction under $100)
- Internal communications (agent-to-agent messages, logging)
- Approved API calls within scope (queries to DeepSearch for research)
What happens: The agent executes the action immediately. It's logged in the audit trail (full transparency). But there's no blocking, no approval queue, no pause. Why? Because if you block every read operation, your agents become paralyzed.
Frequency: In a typical day, the large majority of agent actions (roughly 75-85%, given the T2 and T3 shares below) are T1.
Tier 2: Notify
Medium-risk operations that we want to be aware of but don't need to block.
Examples:
- Financial transactions above $100 (but below $5,000)
- External communications (messages to PulseChat, ChatBridge, CipherMail)
- New tool integrations (agent is requesting access to a tool)
- Scheduling of future actions (agent is scheduling a task for later)
What happens: The agent executes the action immediately. But a notification appears on the CEO Dashboard showing exactly what happened. The human sees it, understands the reasoning, and can:
- Approve (great, no action needed)
- Watch (I'm concerned but not blocking)
- Escalate (this is weird, let's investigate)
If something looks wrong, the human can escalate to governance enforcement (more on that later), which will prevent the next T2 action from executing.
Frequency: Maybe 10-20% of agent actions are T2.
Tier 3: Block-Until-Approved
High-risk operations that require explicit human approval before execution.
Examples:
- Financial transactions above $5,000
- Policy modifications
- New agent provisioning
- Tool integration requests that involve sensitive data
- Communications in sensitive domains
- Decisions that affect organizational structure or governance rules
What happens: The action blocks. An approval request appears on the CEO Dashboard showing:
- What is the agent trying to do?
- Why does it think this should happen?
- What would change if we approve?
- Is this aligned with current policy?
- Any red flags?
The human reviews, makes a decision (approve, reject, modify), and the agent is notified. The decision is logged with full context.
Frequency: Maybe 5% of actions are T3. They're important but not frequent.
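The classification above can be sketched as a small function. Everything here (the `Action` shape, the `classify` helper, the specific action kinds) is illustrative, not BUCC's actual code; the dollar thresholds are the defaults described in this post:

```python
from dataclasses import dataclass
from enum import IntEnum

class Tier(IntEnum):
    T1 = 1  # auto-execute, logged only
    T2 = 2  # execute, then notify the dashboard
    T3 = 3  # block until a human approves

@dataclass
class Action:
    kind: str            # e.g. "read", "financial", "external_comms"
    amount: float = 0.0  # dollar amount, if financial

def classify(action: Action) -> Tier:
    """Map an action to a tier using the default thresholds from the post."""
    if action.kind == "financial":
        if action.amount < 100:
            return Tier.T1
        if action.amount < 5_000:
            return Tier.T2
        return Tier.T3
    if action.kind in ("read", "status", "internal_comms"):
        return Tier.T1
    if action.kind in ("external_comms", "schedule", "tool_request"):
        return Tier.T2
    return Tier.T3  # unknown action kinds default to the strictest tier
```

Defaulting unrecognized actions to T3 is the fail-closed choice: an action you haven't classified is an action you haven't risk-assessed.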
Configurable Tiers, Not Fixed Rules
A critical insight: T1, T2, and T3 aren't universal. They're configurable based on context.
Configurable by Agent
You trust different agents differently. Maybe the research agent has broader T1 permissions because you've validated its decision-making. Maybe the communications agent has stricter T1 permissions because communications carry organizational risk.
Example configuration:
- the research agent: Financial transactions up to $500 are T1, $500-5,000 are T2, above $5,000 are T3
- the communications agent: All external communications are T2 (no T1 communications to external parties)
- the finance agent: Any transaction above $1,000 is T3 (lower threshold for financial risk)
Configurable by Time
After business hours, you might want stricter governance. Maybe T2 becomes T3 after 6pm (all decisions require human review when the full leadership team isn't available).
Example configuration:
- Business hours (9am-5pm): T2 is acceptable for routine decisions
- After hours (5pm-9am): T2 becomes T3 (all decisions require approval, SLA extended to 30 minutes)
- Weekends: All T2 becomes T3
Configurable by Operational State
In normal operations, T2 is fine. But during an incident, you might want everything to be T3. During a security lockdown, you might want everything to be T1 max (no external calls).
Example configuration:
- Normal: T1 and T2 auto-execute, T3 requires approval
- Incident mode: All T2 becomes T3, approval SLA drops to 10 minutes
- Security lockdown: Only T1 operations allowed, nothing touches the outside world
Configurable by Risk Level
Risk thresholds can be defined dynamically. Financial transactions scale with amount: $0-100 is T1, $100-5,000 is T2, $5,000+ is T3. But you can adjust the thresholds based on operational state or agent performance.
Example configuration:
- Normal circumstances: $0-100 T1, $100-5000 T2, $5000+ T3
- High spending quarter: $0-50 T1, $50-1000 T2, $1000+ T3 (lower thresholds)
- Exceptional circumstances: All transactions T3 (everything requires approval)
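One way to encode these overrides is a layered config lookup: operational-state overrides beat per-agent overrides, which beat global defaults. The structure, dict names, and merge order below are assumptions, sketched around the example thresholds above:

```python
# Global default financial thresholds (upper bounds for T1 and T2).
DEFAULTS = {"t1_max": 100, "t2_max": 5_000}

# Per-agent overrides, as in the per-agent examples above.
AGENT_OVERRIDES = {
    "research": {"t1_max": 500, "t2_max": 5_000},
    "finance":  {"t1_max": 100, "t2_max": 1_000},
}

# Operational-state overrides win over everything else.
STATE_OVERRIDES = {
    "normal":      {},
    "high_spend":  {"t1_max": 50, "t2_max": 1_000},
    "exceptional": {"t1_max": 0, "t2_max": 0},  # everything becomes T3
}

def financial_tier(agent: str, state: str, amount: float) -> int:
    """Resolve the tier for a financial action under the layered config."""
    cfg = {**DEFAULTS,
           **AGENT_OVERRIDES.get(agent, {}),
           **STATE_OVERRIDES.get(state, {})}
    if amount < cfg["t1_max"]:
        return 1
    if amount < cfg["t2_max"]:
        return 2
    return 3
```

Because dict merges are last-writer-wins, tightening governance is just adding an entry to `STATE_OVERRIDES`; nothing about the agents changes.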
The 5 Circuit Breakers: Graduated Governance Control
The 3-tier system handles normal operations. Circuit breakers handle abnormal situations. They provide graduated control for when you need to tighten governance quickly.
Think of circuit breakers as governance modes. Each mode restricts what agents can do, with varying degrees of severity.
Circuit Breaker 1: Hard Stop (Emergency Override)
What it does: Everything stops. Instantly. All agents pause. All work stops. All external API calls stop. No new tasks start.
When to use it:
- Something catastrophically bad is happening
- You need to investigate an incident immediately
- You've detected a security breach
- An agent is behaving completely erratically
How it works:
- One button on the CEO Dashboard
- Takes effect in milliseconds
- All agents return to DORMANT state
- All pending tasks are paused (can be resumed later)
- All scheduled actions are cancelled
- All external calls are halted
Reversibility: Yes. You can restart agents and resume tasks.
Example: You detect that an agent has somehow accessed financial systems it shouldn't. Hard stop triggered. All agents stop. You investigate. You figure out what went wrong. You re-enable agents once you understand the issue.
Circuit Breaker 2: Governance Enforcement
What it does: Restrict all agents to T1-only operations. No external communications. No tool calls beyond the approved list. No new integrations.
When to use it:
- You've detected shadow AI (unauthorized LLM usage)
- You've had a policy breach and need to do damage assessment
- You suspect an agent is misbehaving but aren't sure
- You're in security audit mode
- An agent has made decisions you disagree with
How it works:
- All T2 and T3 actions are blocked until manually approved
- T1 operations execute normally
- Agents can read, think, and process information
- Agents cannot make external calls or modify state outside their sandbox
Reversibility: Yes. You can disable governance enforcement and return to normal tier levels.
Example: You notice an agent made a financial decision that surprised you. You trigger governance enforcement. Now every action the agent tries requires your explicit approval. You watch what it does next. If you're satisfied it's operating correctly, you disable governance enforcement. If you're not, you investigate further.
Circuit Breaker 3: Financial Pause
What it does: All financial transactions are blocked until approval, regardless of tier.
When to use it:
- You've detected unusual spending patterns
- You're investigating a financial anomaly
- You've hit your monthly budget and need to review additional spending
- You're in financial audit mode
How it works:
- All financial operations (Vaultline transactions, fund transfers, etc.) are T3 (require approval)
- Non-financial operations continue normally
- The approvals queue shows all pending financial transactions with full context
Reversibility: Yes. You can disable financial pause and return to normal thresholds.
Example: You notice your monthly AI inference costs are 2x normal. Financial pause triggered. Now every inference payment requires your approval. You review the first few. You understand what's driving the cost. You approve a batch of them. Then you disable financial pause once you've caught up.
Circuit Breaker 4: Communications Pause
What it does: All external communications are blocked until approval, regardless of tier.
When to use it:
- You need to control organizational messaging (big announcement coming)
- You've detected unauthorized communications from agents
- You're in crisis management mode and need to control the narrative
- You're preparing a sensitive communication
How it works:
- All external communications (PulseChat, ChatBridge, CipherMail) become T3
- Agents can prepare communications but cannot send
- The approvals queue shows all pending communications
- Once approved, communications are sent
Reversibility: Yes. You can disable communications pause.
Example: You're about to announce a major strategic change. You trigger communications pause. Now no agent can send external messages without approval. You review all pending communications. You coordinate messaging. Then you disable communications pause once everything is aligned.
Circuit Breaker 5: Soft Alert (Yellow Alert Mode)
What it does: No hard restrictions. But intensified monitoring and tighter approval thresholds.
When to use it:
- You're concerned something might be wrong but aren't sure
- You want closer visibility into agent behavior
- You're running a fire drill and want to test your governance system
- You're in a sensitive period and want stricter oversight
How it works:
- All T2 actions require notification (normal)
- All T3 actions get expedited review (SLA drops from 1 hour to 10 minutes)
- Dashboard alerts are more sensitive (low-level anomalies trigger notifications)
- Monitoring is more aggressive
Reversibility: Yes. You can disable soft alert.
Example: You're running a fire drill. You trigger soft alert mode. Now you're watching everything more carefully. You measure: how fast do approvals happen? How many do we approve vs. reject? Do we catch anomalies? After the drill, you disable soft alert.
How Circuit Breakers Cascade
Circuit breakers don't exist in isolation. They cascade and combine:
- Hard Stop overrides everything. If CB-1 is active, nothing runs, period.
- Governance Enforcement overrides tier permissions. If CB-2 is active, T2 and T3 are blocked (only T1 allowed).
- Financial Pause affects only financial. If CB-3 is active, financial operations are T3, everything else continues normally.
- Communications Pause affects only communications. If CB-4 is active, external communications are T3, everything else continues normally.
- Soft Alert changes monitoring and thresholds. If CB-5 is active, monitoring is tighter and approvals are faster.
Example combination: You trigger governance enforcement (CB-2) and financial pause (CB-3). Now:
- All T2 and T3 operations are blocked (governance enforcement)
- All financial operations are T3 (financial pause overrides this further)
- Agents can only execute T1 operations that don't touch financial systems
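The cascade rules above can be expressed as a small resolver that takes the set of active breakers plus a classified action and decides what happens. The enum, function name, and return values are a hedged sketch of the logic, not BUCC's implementation:

```python
from enum import Enum, auto

class Breaker(Enum):
    HARD_STOP = auto()        # CB-1
    GOVERNANCE = auto()       # CB-2
    FINANCIAL_PAUSE = auto()  # CB-3
    COMMS_PAUSE = auto()      # CB-4
    SOFT_ALERT = auto()       # CB-5

def resolve(active: set, tier: int, is_financial: bool, is_comms: bool) -> str:
    """Return 'execute', 'queue' (needs approval), or 'halt'."""
    if Breaker.HARD_STOP in active:
        return "halt"                    # CB-1 overrides everything
    if Breaker.FINANCIAL_PAUSE in active and is_financial:
        return "queue"                   # CB-3: all financial ops become T3
    if Breaker.COMMS_PAUSE in active and is_comms:
        return "queue"                   # CB-4: all external comms become T3
    if Breaker.GOVERNANCE in active and tier >= 2:
        return "queue"                   # CB-2: only T1 auto-executes
    # CB-5 (soft alert) tightens monitoring and SLAs but doesn't change routing.
    return "execute" if tier <= 2 else "queue"  # normal: T1/T2 run, T3 queues
```

Ordering the checks by severity is what makes the cascade deterministic: CB-1 is checked first, so combinations like CB-2 + CB-3 can never loosen anything, only tighten it.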
The Approval Queue: Human Decision-Making
When an action is classified as T3, or when a circuit breaker makes it T3, the action goes into the approval queue.
What's in an Approval Request?
Each approval request contains:
- The action: What is the agent trying to do?
- Reasoning: Full trace of the agent's decision logic
- Impact analysis: What changes if we approve?
- Policy alignment: Does this match governance rules?
- Context: Related decisions, recent history, relevant facts
- Risk assessment: What could go wrong?
- Recommendation: What does the system recommend?
- SLA: When does this decision need to happen?
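The request payload above maps naturally onto a record type. The field names and the one-hour default are illustrative (the SLA default matches the first-SLA figure later in this post):

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class ApprovalRequest:
    action: str               # what the agent is trying to do
    reasoning: str            # full trace of the agent's decision logic
    impact: str               # what changes if approved
    policy_aligned: bool      # does this match governance rules?
    context: list = field(default_factory=list)  # related decisions, history
    risks: list = field(default_factory=list)    # what could go wrong
    recommendation: str = "review"               # what the system suggests
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    sla: timedelta = timedelta(hours=1)          # first-SLA default

    def deadline(self) -> datetime:
        """When this decision needs to happen."""
        return self.created_at + self.sla
```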
The Approval Workflow
Step 1: Review
A human reviewer sees the request on the CEO Dashboard and reviews the full context.
Step 2: Decide
The human can:
- Approve: Let the agent proceed as planned
- Approve with modifications: "Go ahead, but with these constraints"
- Reject: Block this action
- Request clarification: Ask the agent to explain further
- Escalate: "This needs CFO review" or "This needs security team review"
- Delegate: "I can't review this, assign to someone else"
Step 3: Execute
Once approved, the action executes. The decision is logged with context.
SLA Enforcement
Approvals can't sit in a queue forever. SLAs are enforced:
- First SLA: 1 hour. If no decision in 1 hour, escalate to backup approver.
- Second SLA: 2 hours. If still no decision, escalate to entire leadership team.
- Critical SLA: For emergencies, maybe 10 minutes to first decision.
This prevents approval queues from becoming decision paralysis.
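The escalation ladder can be automated with a simple age check run on a timer. The thresholds are the ones above; the function and target names are a sketch:

```python
from datetime import datetime, timedelta
from typing import Optional

# Escalation ladder from the post: 1 hour to backup approver,
# 2 hours to the full leadership team.
LADDER = [
    (timedelta(hours=1), "backup_approver"),
    (timedelta(hours=2), "leadership_team"),
]

def escalation_target(created_at: datetime, now: datetime) -> Optional[str]:
    """Return who to escalate to, or None if still within the first SLA."""
    age = now - created_at
    target = None
    for threshold, who in LADDER:
        if age >= threshold:
            target = who  # keep climbing the ladder as the request ages
    return target
```

Run over the whole queue every few minutes, this turns "remember to check the queue" into a push notification, which is the point of automating escalation.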
YOLO Mode: Controlled Trust
Not every T3 action needs a human review forever. BUCC includes YOLO Mode, a configurable set of rules that allow specific T3 actions to auto-approve under defined conditions. Think of it as graduated trust: once an agent has demonstrated reliable behavior in a domain, you can create a YOLO rule that says "approve financial reads under $100 from this agent automatically." The rule is logged, auditable, and revocable at any time. It's not "turn off governance." It's "encode your trust decisions into policy."
Approval Metrics
You measure:
- Approval velocity: How long between request and decision?
- Approval rate: What percentage do we approve vs. reject?
- Decision quality: Did we approve something we shouldn't have? Did we reject something we should have?
- Escalation rate: How many decisions are escalated vs. resolved by primary approver?
These metrics tell you about your decision-making process. If you're rejecting 50% of financial decisions, maybe your T3 threshold is wrong. If approvals are taking 2 hours on average, maybe you need more approvers.
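Three of the four metrics reduce to simple aggregates over the decision log (decision quality needs post-hoc review labels, so it's left out here). The log field names are assumptions:

```python
from statistics import mean

def approval_metrics(decisions):
    """Compute queue metrics from a decision log.

    Each decision is a dict with 'latency_s' (request-to-decision seconds),
    'outcome' ('approve' or 'reject'), and 'escalated' (bool)."""
    n = len(decisions)
    return {
        "velocity_s": mean(d["latency_s"] for d in decisions),
        "approval_rate": sum(d["outcome"] == "approve" for d in decisions) / n,
        "escalation_rate": sum(d["escalated"] for d in decisions) / n,
    }
```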
Shadow AI Detection: The Silent Threat
Here's a problem most organizations don't think about until it's too late: unauthorized AI usage on your infrastructure.
It could be:
- An employee training a private ML model on company GPU time
- An agent calling OpenAI without authorization (maybe due to a bug or misconfiguration)
- A contractor or third party using your infrastructure for their own LLM work
- Someone building their own LLM system in the background
You don't want to discover this in a quarterly audit. You want to know it's happening in real-time.
How It Works
We scan infrastructure for unauthorized LLM API calls. We look for:
- OpenAI API calls from machines that shouldn't have OpenAI access
- Anthropic API calls from unauthorized sources
- Calls to other LLM providers (Mistral, etc.) from unexpected sources
- Unusual patterns (sudden spike in API calls, calls from unusual IP addresses, etc.)
When we detect something, we:
- Log the call (what model, who called it, what the call was for)
- Alert the security team immediately
- Begin investigation (is this authorized? Did we know about this?)
- Document the finding
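A minimal detector over egress logs might flag LLM-provider traffic from hosts that aren't on an allowlist. The provider hostnames are real API endpoints, but the log shape and allowlist are assumptions:

```python
# Known LLM API hostnames to watch for in egress logs.
LLM_HOSTS = {"api.openai.com", "api.anthropic.com", "api.mistral.ai"}

# Hosts authorized to make LLM calls (illustrative).
AUTHORIZED = {"agent-runner-01", "agent-runner-02"}

def shadow_ai_hits(egress_log):
    """Return unauthorized LLM calls for the security team to triage.

    egress_log: iterable of (source_host, dest_host) pairs."""
    return [
        (src, dst)
        for src, dst in egress_log
        if dst in LLM_HOSTS and src not in AUTHORIZED
    ]
```

In practice you'd also watch for the pattern signals mentioned above (call-volume spikes, unusual source IPs), but host-based allowlisting catches the blunt cases first.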
Real Examples
We caught:
- Private ML training: A developer was using company GPU clusters to train a private model for their personal ML project. We detected unusual NVIDIA GPU usage patterns and unauthorized PyTorch API calls. Investigation revealed what was happening. The developer was told to stop. The incident was logged.
- Misconfigured agent: An agent was trying to call OpenAI due to a misconfiguration. We caught it before the first call succeeded (our monitoring detected the attempt). We fixed the configuration. Crisis averted.
- Contractor overreach: A contractor was using our infrastructure to test LLM approaches for their consulting work. We detected their API calls. We had a conversation about scope and authorization. The behavior stopped.
Fire Drills: Testing Your Governance
Governance controls only work if you've tested them. We run quarterly fire drills.
What Is a Fire Drill?
A simulation of a governance crisis. We simulate (not actually trigger):
- A circuit breaker activation (what would happen if CB-1 triggered?)
- An approval queue backup (what if 100 approvals showed up in 10 minutes?)
- An agent misbehavior (what if an agent started making weird decisions?)
- A shadow AI detection (what if we found unauthorized LLM usage?)
- A financial anomaly (what if spending spiked 10x normal?)
How It Works
Planning Phase: We decide what scenario to test. Maybe: "What if an agent starts making large financial transactions without warning?" We prepare a scenario. We notify stakeholders (but not the full team).
Simulation Phase: We create fake approval requests that simulate the scenario. We don't actually execute anything. We just put events in the queue and watch how the organization responds.
Observation Phase: We observe:
- How fast did people notice something was wrong?
- How many people got involved in decision-making?
- How fast were decisions made?
- Did we follow procedures?
- Did communication work?
- What bottlenecks appeared?
Debrief Phase: We analyze what happened. We identify problems. We fix them.
What We've Learned
First fire drill: We discovered that the sole decision-maker was traveling and unreachable. Approvals backed up. Lesson: we need a backup approver and clear escalation procedures.
Second fire drill: We discovered that the "escalate to leadership" procedure wasn't clear. Who exactly should be notified? What's their phone number? Do we have contact info? Lesson: document escalation procedures and verify that they work.
Third fire drill: We discovered that our monitoring system was too noisy. There were so many alerts that people were ignoring them. Lesson: tune alert thresholds and reduce false positives.
Fire drills are uncomfortable. You're basically asking "what would happen if we failed?" But that's the point. You want to find problems in the drill, not in the actual crisis.
AIMS Compliance Alignment
AIMS (AI Management System, the discipline formalized in ISO/IEC 42001) is a framework for governance of AI systems. BUCC's governance framework aligns with AIMS principles:
AIMS Principle 1: Transparency
All decisions logged, full audit trail, context preserved. Check.
AIMS Principle 2: Accountability
Every action attributed to an agent, every approval attributed to a human, every decision documented. Check.
AIMS Principle 3: Oversight
CEO Dashboard, approval queues, human-in-the-loop for high-risk decisions. Check.
AIMS Principle 4: Testing
Fire drills, shadow AI detection, monitoring. Check.
AIMS Principle 5: Adaptability
Configurable tiers, circuit breakers, dynamic risk thresholds. Check.
If your organization is subject to AIMS compliance, BUCC's governance framework should help you meet those requirements.
Control Debt Scoring: Quantifying Your Governance Gap
We use a concept called "control debt" to measure how much governance risk we're carrying.
Think of it like technical debt, but for governance. If you have a T2 action that really should be T3, you're carrying governance debt. If you have a fire drill that's 6 months overdue, you're carrying testing debt.
We score control debt on a scale of 0-100:
- 0-20: Healthy. Your governance is tight, up-to-date, tested.
- 20-40: Manageable. You have some debt, but nothing urgent.
- 40-60: Concerning. Your governance has gaps. You should address them soon.
- 60-80: Dangerous. Your governance is weak. You have significant risk.
- 80-100: Critical. Your governance is broken. You need immediate action.
What increases control debt?
- Fire drills that are more than 3 months old (testing debt)
- T3 decisions that have exceeded their SLA (approval debt)
- Known shadow AI detections that haven't been investigated (security debt)
- Circuit breakers that haven't been tested (reliability debt)
- Approval metrics that are degrading (decision debt)
What reduces control debt?
- Running a fire drill (testing complete)
- Completing all pending approvals (approval queue cleared)
- Investigating shadow AI detections (security investigated)
- Testing circuit breakers (reliability verified)
- Improving approval velocity (decision-making improved)
We aim to keep control debt below 30. If it creeps above 40, we schedule a governance review.
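Control debt can be computed as a weighted sum over the outstanding items. The weights and item names below are illustrative, not our actual scoring formula:

```python
# Hypothetical per-item weights for each debt category.
WEIGHTS = {
    "overdue_fire_drills": 15,       # >3 months since last drill
    "sla_breached_approvals": 5,     # T3 decisions past their SLA
    "uninvestigated_shadow_ai": 10,  # detections without follow-up
    "untested_breakers": 8,          # circuit breakers never exercised
}

def control_debt(counts: dict) -> int:
    """Weighted sum of outstanding governance debt, capped at 100."""
    raw = sum(WEIGHTS[k] * counts.get(k, 0) for k in WEIGHTS)
    return min(raw, 100)

def debt_band(score: int) -> str:
    """Map a score to the bands described above."""
    for ceiling, label in [(20, "healthy"), (40, "manageable"),
                           (60, "concerning"), (80, "dangerous")]:
        if score <= ceiling:
            return label
    return "critical"
```

The useful property of a formula like this is that every debt item has an obvious remediation: run the drill, clear the queue, investigate the detection, and the score drops.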
Implementation Notes
If you're building governance for your own multi-agent system, here are key implementation considerations:
1. Start with T1/T2/T3, not five tiers. More tiers means more complexity and slower decision-making. Three tiers is the minimum viable governance model.
2. Make tiers reconfigurable. Don't hard-code them. Build a configuration system that lets you adjust by agent, time, state, and risk.
3. Log everything. Your audit trail is your governance. Comprehensive logging makes everything else possible.
4. Automate escalation. SLAs don't work if you have to remember to check them. Automate escalation.
5. Test your safety systems. Fire drills are uncomfortable but essential. Monthly is better than quarterly. Quarterly is better than never.
6. Make emergency stop accessible. One click. Always. Test it at least quarterly.
7. Design for transparency. The CEO Dashboard exists because humans should understand what's happening. Make visibility a first-class feature, not an afterthought.
Conclusion: Governance Is Architecture
The most important insight: governance isn't a feature you add. It's an architecture question.
You can't bolt good governance onto a system that wasn't designed for it. You have to design governance in from the start. You have to decide: which decisions are agent-only? Which decisions require human oversight? What does the escalation path look like?
Answer these questions early. Build the infrastructure to support them. Then deploy agents into that framework.
The agents might think faster than humans. The agents might make better decisions on average. But the human oversight is what makes the system trustworthy. It's what lets you scale to 25+ agents without losing control.
Next: Day 3, Memory Architecture
Further reading & standards
The choices in this post map directly onto published frameworks and regulations. If you're building against the same constraints, these are the primary sources:
- NIST AI RMF, GOVERN function. Concrete guidance on documenting accountability, roles, and risk management processes for AI. (nist.gov/itl/ai-risk-management-framework)
- EU AI Act, Article 9 (risk management system). High-risk AI systems must run a continuous iterative risk process. (artificialintelligenceact.eu)
- EU AI Act, Article 14 (human oversight). High-risk AI must be designed so humans can effectively prevent or minimise risks. (artificialintelligenceact.eu)
- OWASP LLM08, Excessive Agency. The canonical name for agents doing more than they should. (owasp.org/www-project-top-10-for-large-language-model-applications)
Read the rest of the series
- Day 1: Running 25 AI agents in production
- Day 2: Governance, not guardrails (you are here)
- Day 3: Persistent agent memory
- Day 4: The Data Sanitization Proxy
- Day 5: The agent provisioning pipeline
- Day 6: Three-layer LLM routing
- Day 7: Catching AI hallucinations
- Bonus: Agent ACL framework
- Bonus: Agent wallets & DAO governance
- Bonus: BlackOffice video pipeline
- Bonus: Control Debt Scoring