This is a deep integration manual for integrating MachineID into LangChain systems. It is intentionally extensive so it can serve as a single reference for: (1) designing enforceable validation boundaries, (2) modeling devices correctly, and (3) applying remote stop control across distributed execution surfaces.
- Hard gating: startup / invoke / tool-call / side-effect boundaries
- Device identity schemes that scale cleanly across replicas and workflows
- Remote stop: device revoke/restore, bulk controls, and org-wide disable
- Operational semantics: fail-closed, short timeouts, predictable stop points
- You want enforcement, not “best effort”
- You are willing to define explicit boundaries inside the runtime
- You want authority to live outside the process
Everything reduces to a single invariant:
Register. Validate. Work.
If validation fails, work does not begin.
In LangChain, “work begins” at recognizable edges: starting a worker, invoking an agent, calling a tool, and committing side effects. Your job is to choose these boundaries and enforce them consistently.
If your goal is to verify end-to-end enforcement quickly, use the official LangChain starter template. It demonstrates the canonical register/validate flow and hard-gate semantics.
github.com/machineid-io/langchain-machineid-template
- Clone the repo and create a venv
- Get a free org key (supports up to 3 devices): machineid.io
- Set MACHINEID_ORG_KEY and your LLM provider key, then run
- Open Dashboard and revoke the device
- Observe: the next validate returns allowed=false and execution stops
A “device” is an identity representing an execution surface. It is not a machine fingerprint and it is not a secret. It is a stable label you assign so you can enforce limits and revoke execution surgically.
- One runner process that invokes chains/agents
- One worker replica (containers/instances running the same code)
- One event-consumer instance that triggers LangChain work
- One scheduled job instance (nightly summarizer, re-indexer, etc.)
- One tool-heavy execution surface you want to stop independently
Enforcement without observability is operationally painful. A good device ID should be: stable enough to audit, specific enough to revoke, and readable enough to operate under pressure.
- Framework: langchain
- Environment: dev / staging / prod
- Role: runner / worker / consumer / cron / tool-runner
- Instance: 01, 02, 03 …
Validation belongs at execution boundaries — points where work begins or commits side effects. LangChain systems naturally contain these boundaries, but they are rarely treated as enforcement points.
- Startup boundary: before a runner/worker starts consuming work
- Invoke boundary: before chain.invoke() / agent.invoke()
- Tool boundary: before a tool call (external API, browse, DB query, queue publish)
- Side-effect boundary: before irreversible actions (writes, sends, payments, deletes)
- High-cost boundary: before entering fan-out, recursion, or batch loops
Revocation and org-wide disable become effective at the next validate. Validation frequency determines how quickly you can stop work.
- Always: validate at startup
- Always: validate before invoke
- Autonomous systems: validate before each tool call
- High-risk actions: validate before side effects (even if you already gated tools)
- Low risk: startup + invoke boundaries
- Medium risk: startup + invoke + per unit of work
- High risk: startup + tool-call + side-effect boundaries
Treat validation like a safety-critical call. Use short timeouts and fail closed. If permission cannot be confirmed, work should not proceed.
- Client timeout: short (for example, 1–3 seconds)
- Timeout/network failure: treat as allowed:false
- Stop the run/worker loop and surface it via logs/alerts
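The fail-closed rule above can be sketched as a small helper that converts any validation failure into a denial. This is a minimal sketch, not the SDK's behavior: `validate_fail_closed` and the local code `VALIDATION_UNAVAILABLE` are hypothetical names, and `validate` is assumed to be any callable (for example the SDK's `m.validate`) that returns a dict with an `allowed` key. The short timeout itself has to be configured on whatever client performs the call.

```python
from typing import Any, Callable, Dict


def validate_fail_closed(
    validate: Callable[[str], Dict[str, Any]], device_id: str
) -> Dict[str, Any]:
    """Return the validate decision, or a synthetic denial if it cannot be obtained."""
    try:
        decision = validate(device_id)
    except Exception as exc:  # timeout, network failure, malformed response, ...
        # Hypothetical local code; it is not a MachineID response code.
        return {"allowed": False, "code": "VALIDATION_UNAVAILABLE", "error": repr(exc)}

    # Only an explicit boolean True counts as permission.
    if not isinstance(decision, dict) or decision.get("allowed") is not True:
        denied = dict(decision) if isinstance(decision, dict) else {}
        denied["allowed"] = False
        return denied
    return decision
```

Requiring `allowed is True` means a missing key, a string such as "yes", or an empty response is treated as a denial rather than as permission.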
The Python SDK is the simplest and most maintainable integration surface: machineid-io/python-sdk.
pip install machineid-io
import os

from machineid import MachineID

# Client configured from environment variables.
m = MachineID.from_env()
device_id = os.getenv("MACHINEID_DEVICE_ID", "langchain:dev:runner:01")

# Startup boundary: register, then validate before any work begins.
m.register(device_id)
decision = m.validate(device_id)
if not decision["allowed"]:
    print("Execution denied:", decision.get("code"), decision.get("request_id"))
    raise SystemExit(1)
Add additional validation boundaries before tool calls and side effects (examples below).
If you prefer a minimal dependency footprint, you can call the canonical endpoints directly.
MachineID’s canonical entry points are POST register and POST validate using the x-org-key header.
POST https://machineid.io/api/v1/devices/register
Headers:
x-org-key: org_...
Body:
{"deviceId":"langchain:prod:runner:01"}
POST https://machineid.io/api/v1/devices/validate
Headers:
x-org-key: org_...
Body:
{"deviceId":"langchain:prod:runner:01"}
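A direct call to the endpoints above could look like the following standard-library sketch. The URLs, the `x-org-key` header, and the `deviceId` body come from the text; the `Content-Type` header, the JSON response shape, the 2-second default timeout, and the helper names `build_request` and `call` are assumptions made for illustration.

```python
import json
import urllib.request

BASE_URL = "https://machineid.io/api/v1/devices"


def build_request(action: str, org_key: str, device_id: str) -> urllib.request.Request:
    """Build a POST request for the 'register' or 'validate' endpoint."""
    if action not in ("register", "validate"):
        raise ValueError(f"unsupported action: {action}")
    body = json.dumps({"deviceId": device_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{action}",
        data=body,
        # Content-Type is an assumption; the source only specifies x-org-key.
        headers={"x-org-key": org_key, "Content-Type": "application/json"},
        method="POST",
    )


def call(action: str, org_key: str, device_id: str, timeout: float = 2.0) -> dict:
    """Send the request; network, HTTP, and decoding errors become a denial."""
    request = build_request(action, org_key, device_id)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # URLError, HTTPError, and socket timeouts are OSError subclasses;
        # JSON decoding errors are ValueError subclasses.
        return {"allowed": False, "error": repr(exc)}
```

Separating request construction from sending keeps the header and body logic checkable without network access.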
Wrappers let you add enforcement without refactoring the rest of your system. The goal is always the same: validate immediately before the boundary where work begins or side effects commit.
def must_be_allowed(m, device_id, boundary):
    # Validate and exit the process if the device is not allowed.
    d = m.validate(device_id)
    if not d["allowed"]:
        print(f"Denied at {boundary}:", d.get("code"), d.get("request_id"))
        raise SystemExit(1)
    return d

# Invoke boundary
must_be_allowed(m, device_id, "invoke")
result = agent.invoke({"input": user_prompt})

# Tool boundary
def tool_call_with_gate(tool_fn, *args, **kwargs):
    must_be_allowed(m, device_id, "before_tool_call")
    return tool_fn(*args, **kwargs)

# Side-effect boundary
def commit_side_effect():
    must_be_allowed(m, device_id, "before_side_effect")
    perform_irreversible_action()
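The same gate can also be expressed as a decorator, so the check travels with the function instead of relying on each call site to remember it. This is one possible variation on the wrappers above, not part of the SDK: `gated` and `ExecutionDenied` are hypothetical names, and `validate` is assumed to be a callable such as `m.validate` returning a dict with `allowed`, `code`, and `request_id`.

```python
import functools
from typing import Any, Callable, Dict


class ExecutionDenied(RuntimeError):
    """Raised when a validation gate refuses to let work proceed."""


def gated(validate: Callable[[str], Dict[str, Any]], device_id: str, boundary: str):
    """Decorator factory: validate immediately before each call to the wrapped function."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # If validate() itself raises, the wrapped function never runs,
            # so the gate still fails closed.
            decision = validate(device_id)
            if decision.get("allowed") is not True:
                raise ExecutionDenied(
                    f"denied at {boundary}: code={decision.get('code')} "
                    f"request_id={decision.get('request_id')}"
                )
            return fn(*args, **kwargs)

        return wrapper

    return decorator
```

With the SDK client from earlier, a tool function could be declared under `@gated(m.validate, device_id, "before_tool_call")`. Raising an exception instead of exiting lets the surrounding worker loop decide how to stop and log; that is a design choice of this sketch.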
Device IDs should be stable enough to audit, but specific enough to revoke. A practical pattern:
langchain:{env}:{role}:{instance}
Examples:
- langchain:dev:runner:01
- langchain:prod:worker:07
- langchain:prod:event-consumer:04
- langchain:prod:cron-nightly:01
- langchain:prod:tool-runner:12
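Building IDs through one helper keeps the pattern consistent across services. The sketch below assumes a local convention (lowercase letters, digits, and hyphens per segment, two-digit instance numbers); that restriction is not a documented MachineID rule, and `build_device_id` is a hypothetical name.

```python
import re

# Local convention only: lowercase letters, digits, and hyphens per segment.
_SEGMENT = re.compile(r"[a-z0-9-]+")


def build_device_id(env: str, role: str, instance: int, framework: str = "langchain") -> str:
    """Return an ID of the form framework:env:role:NN."""
    for name, value in (("framework", framework), ("env", env), ("role", role)):
        if not _SEGMENT.fullmatch(value):
            raise ValueError(f"invalid {name} segment: {value!r}")
    if instance < 1:
        raise ValueError("instance must be a positive integer")
    return f"{framework}:{env}:{role}:{instance:02d}"
```

Rejecting colons inside a segment keeps the four-part structure unambiguous when an operator reads or filters IDs.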
The free tier is enough to model real control boundaries. The objective is to prove: (1) identities are stable, and (2) revoke/disable stops execution at predictable checkpoints.
- langchain:dev:runner:01 (interactive runner for local experiments)
- langchain:dev:tool-runner:01 (tool-heavy runner: web, DB, APIs)
- langchain:dev:cron-nightly:01 (scheduled job identity)
- Generate free org key: machineid.io
- Run a tool-heavy sequence (or a loop with multiple tool calls)
- Revoke langchain:dev:tool-runner:01 from Dashboard
- Observe: stop occurs at the next validation boundary (tool-call gate recommended)
This tier supports multiple concurrent execution surfaces (replicas, consumers, and scheduled jobs) while keeping device identity manageable and readable.
- langchain:prod:runner:01 … :08 (8 replicas)
- langchain:prod:tool-runner:01 … :06 (6 tool-heavy workers)
- langchain:prod:event-consumer:01 … :06 (6 consumers)
- langchain:prod:cron-nightly:01 … :03 (3 scheduled jobs)
- langchain:staging:runner:01 … :02 (2 staging replicas)
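The fleet listed above can be expanded from a short plan, which makes the total easy to check against a device cap before rollout. `FLEET_PLAN` and `expand_fleet` are illustrative names; the counts are the ones listed above and sum to 25 IDs.

```python
# (env, role, replica count) taken from the example fleet above.
FLEET_PLAN = [
    ("prod", "runner", 8),
    ("prod", "tool-runner", 6),
    ("prod", "event-consumer", 6),
    ("prod", "cron-nightly", 3),
    ("staging", "runner", 2),
]


def expand_fleet(plan):
    """Expand (env, role, count) rows into deterministic device IDs."""
    return [
        f"langchain:{env}:{role}:{index:02d}"
        for env, role, count in plan
        for index in range(1, count + 1)
    ]


device_ids = expand_fleet(FLEET_PLAN)
print(len(device_ids), "device IDs")
```

Because the IDs are derived from the plan, a restarted replica reuses its existing identity instead of consuming a new slot under the cap.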
- Startup: register + validate
- Invoke: validate before each run
- Tools: validate before each tool call
- Side effects: validate immediately before irreversible actions
At this scale, systems typically include autoscaling, per-tenant workflows, event-driven triggers, and fan-out patterns. The dominant requirement is consistent enforcement across many replicas and across time.
- Autoscaling worker pools executing LangChain runs
- Per-tenant or per-workflow runner identities
- Event-driven triggers that multiply execution surfaces under load
- Tool fan-out (one decision triggers many external calls)
At the upper end of standard caps, the dominant failure mode is execution multiplication: retries, event storms, recursive workflows, and large numbers of simultaneous surfaces.
- Prefer per-replica identities (avoid “one device for the whole fleet”)
- Keep device IDs deterministic and auditable
- Validate frequently enough that stop control is operationally useful
- Avoid internal fallback authority paths (no degraded enforcement)
If you need device limits beyond standard tiers, MachineID supports custom device limits. This does not require changes to agent code — the identity model and enforcement boundaries remain the same.
- Define execution surfaces (what counts as a “device” in your topology)
- Assign stable IDs
- Validate at explicit boundaries
- Use revoke/disable to stop execution at predictable checkpoints
MachineID provides a console at machineid.io/dashboard. The console exists outside your runtime so control does not depend on the process cooperating.
- Revoke / restore devices (including bulk)
- Remove devices
- Register devices
- Rotate keys
- Org-wide disable (hard stop across devices)
In addition to revoking individual devices, MachineID supports an org-wide disable control. This is a deliberate “stop everything” mechanism that changes validate outcomes across the org.
- Org-wide disable does not change device revoked/restored state
- It affects validate decisions across the org (allowed becomes false)
- It takes effect at the next validation boundary you defined
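The relationship between the two controls can be illustrated with a toy in-memory model. This is not MachineID's implementation, only a sketch of the semantics listed above: `OrgModel` is a hypothetical class, and unregistered devices and device caps are ignored for simplicity.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class OrgModel:
    """Toy model: org-wide disable overrides, but does not modify, device state."""

    disabled: bool = False
    revoked: Set[str] = field(default_factory=set)

    def validate(self, device_id: str) -> dict:
        # A device is allowed only if the org is enabled and the device is not revoked.
        allowed = (not self.disabled) and (device_id not in self.revoked)
        return {"allowed": allowed}


org = OrgModel(revoked={"langchain:prod:worker:07"})
org.disabled = True   # full stop: every validate now returns allowed=False
org.disabled = False  # per-device state is unchanged: worker:07 is still revoked
```

In this model, lifting the org-wide disable returns each device to whatever revoked or restored state it had before.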
External control is specifically designed to work from outside the runtime environment: a different laptop, a phone, or a secure ops environment that has no access to the worker process.
- Use readable device IDs so you can find the correct surface quickly
- Validate frequently enough that stop control has low latency
- Revoke specific devices for surgical control; use org-wide disable for full stop
Plans are enforced externally, and upgrades and downgrades take effect immediately. Device caps are enforced based on unique device IDs registered to the org.
- Free tier is intended to remain available and supports up to 3 devices
- Paid-to-free transition retains the org and caps devices back to 3
- Cancellation at end of billing cycle is enforced at the end of that cycle
- Plan changes do not require agent code changes
These patterns defeat the purpose of external enforcement:
- Proceed anyway on validation timeout or error
- Continue for a fixed grace window while enforcement is unavailable
- Fall back to internal flags as an alternate authority path
- Validate only at startup for long-running, tool-heavy runs
- This usually means validate boundaries are too far apart
- Add validate before tool calls and before side effects
- If one tool call runs for minutes, add a boundary before it begins
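For long-running loops, tightening the boundaries usually means validating once per unit of work. The sketch below assumes `validate` is a callable such as the SDK's `m.validate` and that `handle` processes one item; `run_until_denied` is a hypothetical helper name.

```python
from typing import Any, Callable, Iterable, List


def run_until_denied(
    work_items: Iterable[Any],
    validate: Callable[[str], dict],
    device_id: str,
    handle: Callable[[Any], Any],
) -> List[Any]:
    """Process items in order; validate before each one and stop at the first denial."""
    completed = []
    for item in work_items:
        try:
            allowed = validate(device_id).get("allowed") is True
        except Exception:
            # Fail closed: inability to validate is treated as a denial.
            allowed = False
        if not allowed:
            break
        completed.append(handle(item))
    return completed
```

With this shape, a revoke takes effect after at most one in-flight item, so the worst-case stop latency is the duration of a single unit of work.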
- Check code and request_id for the decision
- Confirm the device is not revoked in the dashboard
- Confirm org-wide disable is not enabled
- Confirm you have not exceeded your device cap (new unique IDs)
- Use short client timeouts (1–3s) and fail closed
- Treat inability to validate as not allowed and stop
- Surface denial via logs and stop the worker loop
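Surfacing a denial is easier to act on when it is logged as a structured event. The following is a minimal sketch using the standard `logging` module; the logger name, the event field names, and the helpers `denial_event` and `log_denial` are hypothetical, while `code` and `request_id` are the decision fields used earlier in this guide.

```python
import json
import logging

logger = logging.getLogger("machineid.gate")  # hypothetical logger name


def denial_event(device_id: str, boundary: str, decision: dict) -> dict:
    """Build a structured record describing a denied validation."""
    return {
        "event": "execution_denied",
        "device_id": device_id,
        "boundary": boundary,
        "code": decision.get("code"),
        "request_id": decision.get("request_id"),
    }


def log_denial(device_id: str, boundary: str, decision: dict) -> dict:
    """Log the denial as one JSON line and return the record."""
    event = denial_event(device_id, boundary, decision)
    logger.error(json.dumps(event, sort_keys=True))
    return event
```

Including `request_id` in every record gives operators a value to correlate with the decision when investigating why a worker stopped.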
Early LangChain deployments often run in a single process with a single agent. In that shape, internal flags and “stop buttons” can appear sufficient.
As systems become more agentic and distributed, execution surfaces multiply: more replicas, more consumers, more scheduled jobs, more tool-driven fan-out, and more long-running loops operating across time.
- Replica multiplication: autoscaling increases concurrent execution surfaces under load
- Event storms: one upstream condition triggers many downstream agent runs
- Retry amplification: transient failures cause repeated runs and repeated side effects
- Recursive workflows: agents schedule follow-on work (fan-out over time)
- Tool fan-out: a single decision triggers many external calls (cost and effects scale quickly)
Internal flags and in-process kill switches rely on cooperation. As systems scale, cooperation becomes inconsistent: multiple services, multiple versions, multiple teams, and multiple execution surfaces.
- They require every surface to obey: one missed boundary becomes an escape hatch
- They drift over time: enforcement points vary across services and versions
- They fail under multiplication: new replicas and consumers must all inherit the same controls
- They are not external authority: the runtime can still decide to proceed
The prompts below are designed to produce practical integration plans with minimal guesswork. Replace bracketed placeholders and paste into your LLM of choice.
I have a Python LangChain project and I want hard enforcement using MachineID.io
Context:
- My org key: [PASTE ORG KEY]
- My device ID pattern: langchain:{env}:{role}:{instance}
- Fail-closed policy, short timeout (1–3s)
- Validation boundaries required:
1) Startup (register + validate)
2) Before invoke (chain.invoke / agent.invoke)
3) Before tool calls
4) Before irreversible side effects
Please provide:
1) Exact files/locations to change
2) Copy/paste code blocks
3) Environment variables to add
4) A test plan:
- revoke one device from dashboard
- restore it
- use org-wide disable
- verify stops occur at the next validate
Help me model MachineID devices for my LangChain system.
Inputs:
- Execution surfaces: [describe: runners, workers, consumers, cron jobs, regions]
- Expected scale: [3 / 25 / 250 / 1000]
- I need readable device IDs and surgical revoke control.
Output:
- A proposed device ID scheme
- A list of device IDs for the target tier
- Where to validate (startup / invoke / tools / side effects)
- A minimal runbook for revoke + org-wide disable
I have a long-running LangChain agent loop that may run for hours.
Goal:
- Add MachineID validate boundaries inside the loop so I can stop it remotely.
- Fail closed on validation failure or timeout.
Please provide:
- Exactly where to place validation calls
- A wrapper pattern that is hard to forget
- A test plan using dashboard revoke and org-wide disable
- Stable device identity per execution surface
- Startup gating (register + validate, fail closed)
- At least one in-run stop point (tool-call or side-effect boundary)
- Short timeout and consistent failure policy
- Denials logged as operational events (include request_id)
- A runbook to revoke/restore and use org-wide disable
- LangChain starter template: github.com/machineid-io/langchain-machineid-template
- Python SDK: github.com/machineid-io/python-sdk
- MachineID GitHub org: github.com/machineid-io
- Dashboard: machineid.io/dashboard
- Core enforcement guidance: Implementation Guide and Operational Guarantees
- Control plane rationale: External Identity Control Plane