CrewAI Integration Guide

A deep, copy/paste manual for adding hard enforcement, device limits, and remote kill switches to CrewAI agents.

Companion to Implementation Guide and Operational Guarantees
Goal: “one page” integration reference (human + LLM)
Focus: exact boundaries (kickoff, LLM calls, tool calls, side effects)
Control: dashboard revoke + org-wide disable
What this is

This page is intentionally extensive. It is designed so that a human or an LLM can use it as a single, authoritative reference for integrating MachineID hard enforcement into CrewAI step by step, with minimal guesswork.

What you get here
  • Two canonical integration paths (Template and Python SDK)
  • Exact enforcement boundary strategy for CrewAI (kickoff, LLM calls, tool calls, side effects)
  • Device modeling examples across multiple tiers
  • Remote shutdown patterns via Dashboard controls
What this is not
  • A “soft guardrail” tutorial
  • A metering or analytics system
  • A best-effort policy guide
MachineID is an enforcement control plane: when validation fails, work does not begin.
Core invariant

Everything reduces to one invariant:

Register. Validate. Work.

If validation fails, work does not begin.

In practice, this means you choose explicit boundaries in your CrewAI runtime where validation must occur. Those boundaries become your “stop points” for revocation, device limits, and org-wide disable.

Simplest integration (fast path)

This is the fastest path to a known-good integration: run the official CrewAI template, then confirm you can stop it remotely.

Step 1 — Generate your org key
Go to machineid.io and click Generate free org key. Copy the key (starts with org_).
Step 2 — Use the official CrewAI template
Repo: machineid-io/crewai-machineid-template
# Python 3.11 is recommended for CrewAI installs
git clone https://github.com/machineid-io/crewai-machineid-template.git
cd crewai-machineid-template

python3.11 -m venv venv311
source venv311/bin/activate
pip install -r requirements.txt

export MACHINEID_ORG_KEY=org_your_key_here
export OPENAI_API_KEY=sk-your_openai_key_here

# Optional: pick a deterministic device id
export MACHINEID_DEVICE_ID=crewai:dev:runner:01

python crewai_agent.py
Step 3 — Prove remote stop works
Open machineid.io/dashboard. Revoke the device you’re running (e.g. crewai:dev:runner:01). The next validation boundary you configured will stop execution.
The template demonstrates the canonical register + validate pattern: it registers a device, validates before starting, and exits immediately if allowed is false. (This is intentional.)
What a “device” is (in CrewAI terms)

In MachineID, a “device” is an identity representing an execution surface. In CrewAI, that maps cleanly to things like:

  • One agent runner process (the Python process that calls crew.kickoff())
  • One worker replica (multiple containers/instances running the same CrewAI code)
  • One scheduled job instance (e.g. “nightly-run-01”)
  • One event-consumer instance that triggers CrewAI execution
Important: Avoid modeling an entire cluster as one “device”. You lose surgical control. Prefer identities that map to a single process/replica/runner.

Device identity is not a secret. Your org key (org_...) is the credential. Device IDs are stable labels for audit and revoke.
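One way to get one identity per process or replica is to resolve the device ID once at startup. The sketch below prefers an explicit MACHINEID_DEVICE_ID and otherwise derives an ID from the hostname. The helper name resolve_device_id and the hostname fallback are assumptions made for illustration; they are not part of the SDK or the template.

```python
import os
import socket


def resolve_device_id(env="dev", role="runner"):
    """Return the explicit device ID if set, else derive one from the hostname."""
    explicit = os.getenv("MACHINEID_DEVICE_ID")
    if explicit:
        return explicit
    # Assumption: the hostname is unique per replica (for example a container name).
    # Colons are replaced so the ID keeps exactly four ':'-separated segments.
    host = socket.gethostname().replace(":", "-")
    return f"crewai:{env}:{role}:{host}"
```

If hostnames change on every restart, each restart would produce a new device ID, which may count against your device cap. In that case an explicitly assigned ID is likely the safer choice.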

Where to validate in CrewAI (the exact boundaries)

CrewAI provides natural execution boundaries. Use them to make enforcement predictable.

Core boundaries (recommended)
  • Startup boundary: validate when the process boots (after registering)
  • Kickoff boundary: validate immediately before crew.kickoff()
  • LLM-call boundary: validate before every LLM call using CrewAI LLM hooks
  • Tool-call boundary: validate before every tool call using CrewAI tool hooks
  • Side-effect boundary: validate immediately before irreversible actions (writes, sends, purchases)
If your agent can create spend, write production data, or trigger external actions, validate at every side-effect boundary.
Fail-closed policy (critical)

Validation must be treated as safety-critical. Use a short timeout and fail closed. If you cannot confirm permission, do not execute.

Fail-closed means:
  • Timeout / network error is treated as allowed: false
  • Worker exits or blocks execution
  • No degraded mode (“continue anyway”)
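The fail-closed rules above can be sketched as a thin wrapper around whatever validate callable you use (for example m.validate from the SDK). This is a minimal sketch, not the SDK's own behavior: the wrapper name validate_fail_closed, the thread-based timeout, and the VALIDATION_UNAVAILABLE code are all assumptions introduced here.

```python
import concurrent.futures


def validate_fail_closed(validate_fn, device_id, timeout_s=2.0):
    """Call validate_fn(device_id); treat any error, timeout, or odd reply as a denial."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(validate_fn, device_id)
        decision = future.result(timeout=timeout_s)
    except Exception as exc:  # includes timeouts and network errors
        # Hypothetical code value, used only to label the local denial.
        return {"allowed": False, "code": "VALIDATION_UNAVAILABLE", "error": repr(exc)}
    finally:
        # Do not wait for a slow call to finish before returning the denial.
        executor.shutdown(wait=False)

    # Only an explicit allowed == True counts as permission.
    if not isinstance(decision, dict) or decision.get("allowed") is not True:
        code = decision.get("code") if isinstance(decision, dict) else None
        return {"allowed": False, "code": code}
    return decision
```

A timed-out call keeps running in its background thread until it returns, so if your HTTP client or SDK exposes its own timeout setting, configuring that as well is probably preferable. The caller still has to stop on allowed: false; the wrapper only guarantees that uncertainty never turns into permission.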
Path A: Use the CrewAI starter template

If you want a known-good baseline, start here: machineid-io/crewai-machineid-template.

The template:
  • Reads MACHINEID_ORG_KEY and a deterministic deviceId
  • Calls register and validate before running
  • Hard-gates execution: if allowed == false, it stops immediately
Use the template as your “golden reference” for behavior, then add stronger boundaries (hooks) as your workflows become more autonomous.
Path B: Use the Python SDK (recommended for real projects)

For most CrewAI projects, the Python SDK is the cleanest integration surface: machineid-io/python-sdk.

Install:
pip install machineid-io
Minimum hard-enforcement gate (copy/paste):
import os
from machineid import MachineID

m = MachineID.from_env()

device_id = os.getenv("MACHINEID_DEVICE_ID", "crewai:dev:runner:01")

# Register device (idempotent)
m.register(device_id)

# HARD GATE — MUST stop execution if denied
decision = m.validate(device_id)

if not decision["allowed"]:
    print("Execution denied:", decision.get("code"), decision.get("request_id"))
    raise SystemExit(1)

print("Execution allowed")
Common SDK operations you may use in workflows and runbooks:
  • register(device_id), validate(device_id), list_devices()
  • revoke(device_id), unrevoke(device_id), remove(device_id)
  • usage() (useful for dashboards and reporting)
Device ID patterns (CrewAI-friendly naming)

Device IDs should be stable enough to audit, but specific enough to revoke. A common pattern is:

{framework}:{env}:{role}:{instance}

Examples that map cleanly to CrewAI execution surfaces:

  • crewai:dev:researcher:01
  • crewai:prod:planner:02
  • crewai:prod:tool-runner:05
  • crewai:prod:cron-nightly:01
  • crewai:prod:event-consumer:07
Do not embed secrets in a device ID. Treat it as an identifier, not a credential.
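To keep IDs consistent with the {framework}:{env}:{role}:{instance} pattern, you could build them through one small helper instead of formatting strings by hand. This is a sketch: build_device_id is a hypothetical helper, and the allowed character set (lowercase letters, digits, hyphens) is an assumption chosen to match the examples above, not a documented MachineID rule.

```python
import re

# Assumption: segments use lowercase letters, digits, and hyphens only.
_SEGMENT = re.compile(r"[a-z0-9][a-z0-9-]*")


def build_device_id(env, role, instance, framework="crewai"):
    """Build '{framework}:{env}:{role}:{instance}', zero-padding integer instances."""
    instance_str = f"{instance:02d}" if isinstance(instance, int) else str(instance)
    segments = [framework, env, role, instance_str]
    for segment in segments:
        if not _SEGMENT.fullmatch(segment):
            raise ValueError(f"invalid device ID segment: {segment!r}")
    return ":".join(segments)


print(build_device_id("prod", "planner", 2))  # crewai:prod:planner:02
```

Rejecting colons inside a segment keeps every ID at exactly four parts, which makes IDs easier to filter and audit.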
Remote controls (revoke, unrevoke, remove, org-wide disable) become effective at the next validation boundary. Design your boundaries accordingly.
Kickoff gate (validate before crew.kickoff)

This is the minimum CrewAI boundary. Always validate immediately before kickoff.

from machineid import MachineID

m = MachineID.from_env()
device_id = "crewai:prod:runner:01"

m.register(device_id)

# Gate BEFORE kickoff
decision = m.validate(device_id)
if not decision["allowed"]:
    raise SystemExit(1)

# Now kickoff
result = crew.kickoff(inputs={...})
Kickoff-only validation is sufficient for simple demos. For autonomous systems it is not enough, because execution can continue inside long loops after the kickoff check has passed.
LLM call hooks gate (validate before each LLM call)

CrewAI supports “before LLM call” hooks. This is a strong enforcement boundary because every reasoning step depends on LLM calls.

You can use an LLM hook to block execution by returning False.
import os
from machineid import MachineID
from crewai.hooks import before_llm_call

m = MachineID.from_env()
DEVICE_ID = os.getenv("MACHINEID_DEVICE_ID", "crewai:prod:runner:01")

m.register(DEVICE_ID)

@before_llm_call
def machineid_gate_llm(context):
    decision = m.validate(DEVICE_ID)
    if not decision["allowed"]:
        print("MachineID denied LLM call:", decision.get("code"), decision.get("request_id"))
        return False
    return None  # allow
For systems with “plan/reflect/loop” behavior, an LLM-call gate turns revocation into a predictable stop point.
Tool call hooks gate (validate before each tool call)

CrewAI supports “before tool call” hooks. This is one of the highest-leverage boundaries because tools are where side effects begin.

Tool hooks can block execution by returning False. They also expose tool name and inputs, which lets you apply stricter policies to high-risk tools.
import os
from machineid import MachineID
from crewai.hooks import before_tool_call

m = MachineID.from_env()
DEVICE_ID = os.getenv("MACHINEID_DEVICE_ID", "crewai:prod:runner:01")

m.register(DEVICE_ID)

HIGH_RISK_TOOLS = {
    "send_email",
    "charge_card",
    "place_order",
    "write_to_production_db",
}

@before_tool_call
def machineid_gate_tool(context):
    decision = m.validate(DEVICE_ID)
    if not decision["allowed"]:
        print("MachineID denied tool call:", decision.get("code"), decision.get("request_id"))
        return False

    if context.tool_name in HIGH_RISK_TOOLS:
        # Optional: additional validate immediately before risk
        decision2 = m.validate(DEVICE_ID)
        if not decision2["allowed"]:
            print("MachineID denied high-risk tool:", decision2.get("code"), decision2.get("request_id"))
            return False

    return None
If you only add one “inside the run” boundary beyond kickoff, make it a tool-call gate.
Side-effect gate (validate before irreversible actions)

Tool hooks cover many cases, but you still need explicit enforcement around irreversible actions. This is especially important if your tools call other services that can cause spend or state change.

def commit_side_effect(m, device_id):
    decision = m.validate(device_id)
    if not decision["allowed"]:
        raise SystemExit(1)

    perform_irreversible_action()
Rule of thumb: if an action can cost money or change external state, validate immediately before it.
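If many functions commit side effects, the same gate can be applied as a decorator so the check cannot be forgotten at individual call sites. The sketch below is one possible shape: gated and ExecutionDenied are hypothetical names, fake_validate is a stub standing in for m.validate, and the OK / DEVICE_REVOKED codes are made up for the example.

```python
import functools


class ExecutionDenied(RuntimeError):
    """Raised when validation does not allow the call."""


def gated(validate_fn, device_id):
    """Decorator: validate immediately before each call to the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            decision = validate_fn(device_id)
            if not decision.get("allowed"):
                raise ExecutionDenied(f"{func.__name__} blocked: {decision.get('code')}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Stub standing in for m.validate; flip the flag to simulate a dashboard revoke.
state = {"allowed": True}

def fake_validate(device_id):
    allowed = state["allowed"]
    return {"allowed": allowed, "code": "OK" if allowed else "DEVICE_REVOKED"}

sent = []

@gated(fake_validate, "crewai:dev:tool-runner:01")
def send_email(to):
    sent.append(to)  # the side effect only happens after a successful validate
```

Raising an exception instead of calling SystemExit lets the caller clean up, but the caller should still stop the run rather than catch the error and continue.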
Example: up to 3 devices

This tier is enough to model a real “stop boundary” in a small system and to validate your enforcement placement.

Device model (3)
  • crewai:dev:planner:01 — planning runner
  • crewai:dev:researcher:01 — research runner
  • crewai:dev:tool-runner:01 — tool-heavy runner
How to test remote stop
  • Run one runner
  • Open Dashboard and revoke that device
  • Observe: next validate returns allowed=false and the process exits
The practical key is validation frequency: revoke/disable takes effect at the next validation boundary.
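To see why frequency matters, here is a small offline simulation. It assumes nothing about the real service: fake_validate and the revoked set stand in for m.validate and the dashboard state, and run_steps is a hypothetical loop that validates before each step.

```python
def run_steps(steps, validate_fn, device_id):
    """Run callables in order, validating before each; stop at the first denial."""
    completed = []
    for step in steps:
        if not validate_fn(device_id).get("allowed"):
            break
        completed.append(step())
    return completed


DEVICE_ID = "crewai:dev:planner:01"
revoked = set()  # stand-in for dashboard state

def fake_validate(device_id):
    return {"allowed": device_id not in revoked}

def revoke_during_step():
    # Simulates an operator revoking the device while step 2 is running.
    revoked.add(DEVICE_ID)
    return "step-2"

steps = [lambda: "step-1", revoke_during_step, lambda: "step-3"]
completed = run_steps(steps, fake_validate, DEVICE_ID)
print(completed)  # ['step-1', 'step-2']
```

The step that was already running when the revoke happened completes; the stop occurs at the next boundary. The more boundaries you place, the less work can run after a revoke.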
Example: up to 25 devices

This is where distributed execution becomes the default: multiple runners, scheduled jobs, event consumers, and tool-heavy workers.

Device model (example)
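  • crewai:prod:runner:01 through crewai:prod:runner:10 (10 replicas)
  • crewai:prod:cron-nightly:01 through crewai:prod:cron-nightly:03 (3 scheduled jobs)
  • crewai:prod:event-consumer:01 through crewai:prod:event-consumer:06 (6 consumers)
  • crewai:prod:tool-runner:01 through crewai:prod:tool-runner:06 (6 tool-heavy workers)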
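Reading the counts in parentheses as the number of instances per role, the full fleet can be enumerated deterministically. This is an illustrative sketch; FLEET and fleet_device_ids are hypothetical names, and the counts are taken from the list above.

```python
# Role -> number of instances, taken from the device model above.
FLEET = {
    "runner": 10,
    "cron-nightly": 3,
    "event-consumer": 6,
    "tool-runner": 6,
}


def fleet_device_ids(fleet, env="prod"):
    """Expand role counts into one device ID per instance, numbered from 01."""
    return [
        f"crewai:{env}:{role}:{n:02d}"
        for role, count in fleet.items()
        for n in range(1, count + 1)
    ]


ids = fleet_device_ids(FLEET)
print(len(ids))          # 25
print(ids[0], ids[-1])   # crewai:prod:runner:01 crewai:prod:tool-runner:06
```

Generating the list from a single table makes it easy to confirm that the planned topology fits within the device cap before deploying.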
Boundary plan
  • Startup: register + validate
  • Kickoff: validate
  • Before every LLM call: validate
  • Before every tool call: validate
  • Before irreversible side effects: validate
This makes remote revoke behave like an operational stop mechanism, even though a single Python call cannot always be interrupted mid-stack.
Example: up to 250 devices

At this scale, the risk shifts from “one run lasts too long” to “many surfaces spawn and fan out under load.” Identity strategy and boundary placement become the control system.

Typical shapes that drive device count
  • Auto-scaling worker replicas running CrewAI
  • Per-tenant or per-workflow runners
  • Event-driven triggers (per message)
  • Tool chains that enqueue follow-on runs
Treat “before tool call” and “before side effect” as mandatory boundaries for high-leverage control.
Example: up to 1000 devices

At the upper end of standard caps, the dominant failure mode is systemic multiplication: retries, event storms, recursive task creation, and many simultaneous execution surfaces.

Scale guidance
  • Use per-replica identity (not per-cluster)
  • Keep device IDs deterministic and auditable
  • Validate at LLM/tool boundaries and before irreversible actions
  • Avoid internal fallback authority paths (no degraded enforcement)
If your system can multiply execution and you cannot reliably stop it, you do not have control. MachineID is designed for exactly this control-plane problem.
Custom device limits

If your system requires device counts beyond standard tiers, MachineID supports custom device limits. The implementation pattern remains the same: identity per execution surface, and explicit validation boundaries.

Design stays constant
  • Define execution surfaces (what is a “device” in your topology)
  • Assign stable IDs
  • Place boundaries where work begins and where side effects occur
Dashboard control (revoke, restore, bulk operations)

MachineID includes a console at machineid.io/dashboard. This is intentionally separate from your agent runtime.

What you can do from the dashboard
  • Revoke / restore devices (including bulk)
  • Remove devices
  • Register a device
  • Rotate keys
  • Org disable (explained next)
Dashboard changes take effect at the next validation boundary. Validation placement matters more than the UI.
Org-wide disable (emergency stop)

In addition to revoking individual devices, you can disable the entire org (a hard stop mechanism). While the org is disabled, validate returns allowed: false for every device, so execution is blocked across all devices at their next validate call.

Important behavior
  • Org disable does not change device revoked/restored state
  • It changes what validate returns (allowed becomes false)
  • Its effect is realized at your next validation boundary
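The behavior listed above can be pictured with a toy in-memory model. This is not the real service: OrgModel, the code strings, and the order of the checks are assumptions used only to illustrate that org disable overrides the decision without touching per-device state.

```python
from dataclasses import dataclass, field


@dataclass
class OrgModel:
    """Toy model of org-wide disable; illustrative only."""
    org_disabled: bool = False
    revoked: set = field(default_factory=set)

    def validate(self, device_id):
        # Assumption: the org-level switch is checked before device state.
        if self.org_disabled:
            return {"allowed": False, "code": "ORG_DISABLED"}
        if device_id in self.revoked:
            return {"allowed": False, "code": "DEVICE_REVOKED"}
        return {"allowed": True, "code": "OK"}


org = OrgModel(revoked={"crewai:prod:runner:01"})

org.org_disabled = True   # emergency stop: every device is denied
denied_all = org.validate("crewai:prod:runner:02")["allowed"]

org.org_disabled = False  # re-enable: per-device state is exactly as it was
runner_02 = org.validate("crewai:prod:runner:02")["allowed"]
runner_01 = org.validate("crewai:prod:runner:01")["allowed"]
print(denied_all, runner_02, runner_01)  # False True False
```

In this model, lifting the org-wide disable does not restore a device that was individually revoked, which follows from the revoked state being left unchanged.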
Mobile operations (control from outside the workflow)

External control is designed for intervention from anywhere: a different laptop, a phone, or a secure ops environment that does not have access to the runtime.

Practical runbook
  • Keep device IDs readable so you can find the correct runner quickly
  • Validate frequently so revokes take effect quickly
  • Use org-wide disable when you need a full stop across every device (it takes effect at each device's next validation boundary)
What not to do

These patterns defeat the entire purpose of external enforcement:

  • “Proceed anyway but log a warning”
  • “Continue for N minutes while validation is down”
  • “Fallback to an internal env flag if validation fails”
  • “Validate once at boot and never again in long-running loops”
If you introduce a second internal authority path, you no longer have an external control plane. You have best-effort policy.
Common failures (and how to debug fast)
Validate denied
  • Check the decision code and request_id from validate
  • Confirm the device is not revoked in the dashboard
  • Confirm org-wide disable is not enabled
  • Confirm you have not exceeded your device cap (registering new unique device IDs)
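Debugging a denial is faster when every boundary logs the same fields. The helper below is a small sketch using the standard logging module; log_denial, the logger name, and the boundary labels are hypothetical, and the code and request_id keys are the ones read from the decision in the examples on this page.

```python
import logging

logger = logging.getLogger("machineid.gate")


def log_denial(boundary, device_id, decision):
    """Log one line per denial with the fields needed to trace it, and return that line."""
    message = (
        f"denied boundary={boundary} device={device_id} "
        f"code={decision.get('code')} request_id={decision.get('request_id')}"
    )
    logger.error(message)
    return message
```

Calling log_denial("tool_call", DEVICE_ID, decision) just before returning False from a hook records which boundary stopped the run, which helps separate a revoked device from an org-wide disable or a device cap issue.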
Revocation doesn’t stop immediately
  • This usually means your validation boundaries are too far apart: execution keeps running until it reaches the next one
  • Add validation at LLM/tool boundaries and/or before side effects
  • If a single tool call runs for minutes, add a boundary before it begins
Network timeouts
  • Use short timeouts (1–3s) and fail closed
  • Treat inability to validate as “not allowed”
  • Surface denial via logs and exit the worker loop
Why common controls eventually fail (and what MachineID changes)

In early prototypes, it can look like “kill switches” and internal flags are enough. As systems become more agentic and more distributed, the failure modes change: execution surfaces multiply, and control becomes a topology problem, not a single-process problem.

Common escalation patterns
  • Replica multiplication: autoscaling runners and workers increase execution surfaces under load
  • Event storms: one upstream event becomes many downstream agent runs
  • Retries + replay: transient failures create repeated execution and repeated side effects
  • Recursive workflows: agents schedule or spawn follow-on work (fan-out over time)
  • Tool fan-out: one decision triggers many external calls (cost and side effects scale fast)
Why internal mechanisms degrade at scale
  • They require cooperation: an in-process flag only works when every surface reads and obeys it
  • They drift: multiple services, versions, and teams create inconsistent enforcement points
  • They miss boundaries: long-running loops keep going if validation is only at startup
  • They don’t travel: a “stop button” in one runtime doesn’t reliably stop all runtimes
MachineID does not replace orchestration. It replaces “best-effort permission” with an external enforcement boundary: if validate fails, work does not begin. Your control improves as your validation boundaries become more frequent and more explicit.
Copy/paste prompts (for LLM-assisted implementation)

These prompts are designed so a user can paste them into an LLM and get an exact, step-by-step implementation plan. Replace the bracketed placeholders.

Prompt 1 — Add MachineID to my CrewAI project (SDK path)
I have a Python CrewAI project. I want hard enforcement using MachineID.io

Constraints:
- I will use the MachineID Python SDK (pip install machineid-io).
- My org key is: [PASTE ORG KEY]
- My device id strategy should be: crewai:[env]:[role]:[instance]
- I want fail-closed behavior with a short timeout.
- I want validation at: (1) startup, (2) before crew.kickoff(), (3) before every tool call, (4) before every LLM call.

Please give me:
1) The exact files/locations to change
2) Copy/paste code blocks
3) Where to put environment variables
4) A test plan: revoke device + org disable from dashboard, and how the process should behave
Prompt 2 — Use the official template and show me exactly what to change
I want to start from the official CrewAI template:
https://github.com/machineid-io/crewai-machineid-template

Goal:
- Add hard enforcement boundaries (startup + kickoff + tool-call hook + LLM-call hook)
- Keep device IDs deterministic and auditable

Please provide:
1) Exact terminal commands to run the template
2) The exact files to edit and the full copy/paste code changes
3) A test plan using the dashboard to revoke/unrevoke and org-disable
Highest-leverage instruction: validate at the boundaries where work begins or commits side effects.
References