Human-in-the-Loop with LangGraph Interrupts
Overview
This implementation demonstrates LangGraph's interrupt-based Human-in-the-Loop pattern - one of the most powerful features of LangGraph for building reliable AI agents that require human oversight.
How It Works
The LangGraph HITL Pattern
Unlike pre-processing approval (which waits before starting), LangGraph's interrupt pattern:
- Starts processing immediately - The agent begins workflow execution
- Pauses at critical decision points - Uses `interrupt_before` to halt at specific nodes
- Persists state - Saves workflow state in a checkpointer (MemorySaver)
- Awaits human input - Returns control to the API while the workflow is "frozen" mid-execution
- Resumes from checkpoint - When a human approves, the workflow continues from the exact interruption point
Workflow Architecture
┌─────────────┐
│   Submit    │
│  Incident   │
└──────┬──────┘
       │
       ▼
┌──────────────┐
│   Classify   │ ◄─── Step 1: Classify incident
│  (LangGraph) │
└──────┬───────┘
       │
       ▼
┌────────────────┐
│  Human Review  │ ◄─── Step 2: Determine if approval needed
│    (P1/P2?)    │
└────────┬───────┘
         │
         ├─────► Low Priority (P3/P4) → Auto-approve → Continue
         │
         └─────► High Priority (P1/P2) → 🛑 INTERRUPT
                          │
                          ▼
                 ┌──────────────────┐
                 │    Checkpoint    │
                 │   State Saved    │
                 └────────┬─────────┘
                          │
                  [Workflow Paused]
                    [API Returns]
                   [Human Reviews]
                          │
                  ┌───────▼────────┐
                  │ Human Decides  │
                  └───────┬────────┘
                          │
         ┌────────────────┼────────────────┐
         │                                 │
    ✅ Approve                         ❌ Reject
         │                                 │
         ▼                                 ▼
┌────────────────┐                ┌─────────────────┐
│  Resume from   │                │    Workflow     │
│   Checkpoint   │                │   Terminated    │
└───────┬────────┘                └─────────────────┘
        │
        ▼
┌────────────────┐
│    Spatial     │ ◄─── Step 3: Spatial analysis
│    Analysis    │
└───────┬────────┘
        │
        ▼
┌────────────────┐
│    Guidance    │ ◄─── Step 4: Search guidance
│     Search     │
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Notifications  │ ◄─── Step 5: Send notifications
└───────┬────────┘
        │
        ▼
┌────────────────┐
│    Finalize    │ ◄─── Step 6: Complete
└────────────────┘
Key Implementation Details
1. Checkpointer Setup
from langgraph.checkpoint.memory import MemorySaver

class HITLIncidentAgent:
    def __init__(self):
        # Create checkpointer for state persistence
        self.checkpointer = MemorySaver()
        self.workflow = self._create_workflow()
2. Interrupt Configuration
# Requires: from langgraph.graph import StateGraph
def _create_workflow(self):
    workflow = StateGraph(HITLIncidentState)
    # Add all nodes
    workflow.add_node("classify", self._classify_incident)
    workflow.add_node("human_review", self._request_human_review)
    workflow.add_node("spatial", self._check_spatial_context)
    # ... more nodes
    # Compile with interrupt BEFORE spatial analysis
    return workflow.compile(
        checkpointer=self.checkpointer,
        # 🛑 Static breakpoint: the graph pauses whenever "spatial" is about to run,
        # so auto-approved incidents must be resumed by the caller without review
        interrupt_before=["spatial"]
    )
3. State Management
from typing import TypedDict

class HITLIncidentState(TypedDict):
    # Standard fields
    incident_id: str
    description: str
    # ... more fields
    # HITL-specific fields
    human_approval_required: bool   # Set during classification
    human_approved: bool | None     # None = awaiting, True/False = decided
    human_feedback: str | None      # Feedback from human
    modified_classification: dict   # Human can modify classification
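As a rough illustration of how the HITL fields might start out, the sketch below builds an initial state and derives `human_approval_required` from a priority label. The `needs_approval` and `new_state` helpers, the `priority` argument, and the reduced field set are assumptions for this example. The real agent sets the flag during classification, so deriving it up front is a simplification. The P1/P2 rule comes from the workflow diagram.

```python
from typing import Optional, TypedDict


class HITLIncidentState(TypedDict):
    incident_id: str
    description: str
    human_approval_required: bool
    human_approved: Optional[bool]  # None = awaiting a decision
    human_feedback: Optional[str]
    modified_classification: dict


def needs_approval(priority: str) -> bool:
    # High-priority incidents (P1/P2) require a human gate.
    return priority.upper() in {"P1", "P2"}


def new_state(incident_id: str, description: str, priority: str) -> HITLIncidentState:
    required = needs_approval(priority)
    return {
        "incident_id": incident_id,
        "description": description,
        "human_approval_required": required,
        # Low-priority incidents are auto-approved; others wait for a reviewer.
        "human_approved": None if required else True,
        "human_feedback": None,
        "modified_classification": {},
    }


high = new_state("INC-1", "Chemical spill near river", "p1")
low = new_state("INC-2", "Litter report", "P4")
```

Keeping `human_approved` as a three-valued field lets downstream code tell "not yet reviewed" apart from "rejected".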
4. Workflow Invocation
def process_incident(self, incident_id, ...):
    # Create initial state
    initial_state = {...}
    # Run workflow with thread_id for checkpoint tracking
    config = {"configurable": {"thread_id": incident_id}}
    result = self.workflow.invoke(initial_state, config)
    # Check if paused at interrupt
    if result["awaiting_human"]:
        return {"awaiting_human": True, ...}
    else:
        return {"success": True, ...}
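Another way to detect a pause, independent of any custom state field, is to inspect the checkpoint. The snapshot returned by `get_state` exposes `next`, a tuple of node names still waiting to run, which is empty once the graph has finished. The `run_status` helper and its return shape are assumptions for illustration. It accepts any object with `next` and `values` attributes, so it can be exercised here without a graph.

```python
from types import SimpleNamespace
from typing import Any


def run_status(snapshot: Any) -> dict:
    """Summarise a LangGraph state snapshot for an API response."""
    pending = tuple(snapshot.next)
    if pending:
        # The graph stopped before these nodes, so a human decision is needed.
        return {"awaiting_human": True, "paused_before": list(pending)}
    return {"success": True, "result": dict(snapshot.values)}


# Stand-ins for self.workflow.get_state(config) in the paused and finished cases.
paused = SimpleNamespace(next=("spatial",), values={"incident_id": "INC-1"})
finished = SimpleNamespace(next=(), values={"incident_id": "INC-2"})

paused_status = run_status(paused)
finished_status = run_status(finished)
```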
5. Resume After Approval
def continue_after_approval(self, incident_id, approved, feedback):
    config = {"configurable": {"thread_id": incident_id}}
    # Record the human decision in the checkpoint.
    # Pass only the changed keys; update_state merges them into the saved state.
    self.workflow.update_state(
        config,
        {"human_approved": approved, "human_feedback": feedback},
    )
    if not approved:
        return {"success": False, "rejected": True}
    # Resume workflow from interrupt point! 🚀
    # A None input continues the existing thread instead of starting a new run.
    result = self.workflow.invoke(None, config)
    return result
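Why This Is Powerful
1. State Persistence
- Workflow state is saved at the exact interruption point
- No data loss across server restarts, provided a persistent checkpointer (such as SqliteSaver or PostgresSaver) replaces the in-memory MemorySaver used here
- Can review later - no time pressure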
2. Clean Separation of Concerns
- Agent logic stays pure - no approval code mixed in
- Human approval is a separate concern
- Easy to test and maintain
3. Flexible Decision Points
- Can interrupt at multiple points
- Can have conditional interrupts
- Can modify state before resuming
4. Auditability
- Full history of workflow execution
- Can see exactly when and why it paused
- Human decisions are logged in state
5. Scalability
- Multiple incidents can be paused simultaneously
- Each has its own checkpoint thread
- Can prioritize which to review first
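One way to decide which paused incident to review first is to sort the pending queue by priority and then by how long each has waited. The `PendingApproval` record and its fields are hypothetical. How the real service tracks paused threads is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PendingApproval:
    thread_id: str       # same value used as the checkpoint thread_id
    priority: str        # "P1" .. "P4"
    paused_at: datetime


def review_order(pending: list[PendingApproval]) -> list[PendingApproval]:
    # "P1" sorts before "P2" as a string, so string order matches urgency;
    # ties go to the incident that has waited longest.
    return sorted(pending, key=lambda p: (p.priority, p.paused_at))


queue = [
    PendingApproval("INC-7", "P2", datetime(2024, 1, 1, 9, 0)),
    PendingApproval("INC-9", "P1", datetime(2024, 1, 1, 9, 30)),
    PendingApproval("INC-3", "P2", datetime(2024, 1, 1, 8, 0)),
]
ordered_ids = [p.thread_id for p in review_order(queue)]
```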
API Endpoints
Submit Incident (with HITL)
POST /api/v1/incidents/submit
# High priority (P1/P2) → Pauses at interrupt
# Low priority (P3/P4) → Auto-approves and completes
List Pending Approvals
Approve and Resume
POST /api/v1/incidents/{incident_id}/approve
{
  "approved_by": "Sarah Manager"
}
# Resumes LangGraph workflow from checkpoint!
Reject and Terminate
POST /api/v1/incidents/{incident_id}/reject
{
  "approved_by": "Sarah Manager",
  "reason": "Duplicate report"
}
# Terminates workflow - will not continue
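A sketch of how the approve and reject endpoints might map onto the agent's `continue_after_approval(incident_id, approved, feedback)` method. The `handle_decision` function, the convention of storing the reviewer and reason as feedback, and the stub agent are all assumptions. The actual web framework and handlers are not shown in this document.

```python
from typing import Any, Protocol


class ApprovalAgent(Protocol):
    def continue_after_approval(
        self, incident_id: str, approved: bool, feedback: str
    ) -> dict: ...


def handle_decision(
    agent: ApprovalAgent, incident_id: str, action: str, body: dict[str, Any]
) -> dict:
    """Translate an approve/reject request body into a call on the agent."""
    if action not in ("approve", "reject"):
        raise ValueError(f"unknown action: {action}")
    reviewer = body.get("approved_by")
    if not reviewer:
        raise ValueError("approved_by is required")
    # Assumed convention: store the reviewer and optional reason as feedback.
    feedback = f"{reviewer}: {body.get('reason', '')}".rstrip(": ")
    return agent.continue_after_approval(incident_id, action == "approve", feedback)


class StubAgent:
    """Records calls instead of resuming a real workflow."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, bool, str]] = []

    def continue_after_approval(self, incident_id, approved, feedback):
        self.calls.append((incident_id, approved, feedback))
        return {"success": approved, "rejected": not approved}


stub = StubAgent()
rejected = handle_decision(
    stub, "INC-1", "reject",
    {"approved_by": "Sarah Manager", "reason": "Duplicate report"},
)
```

Routing both endpoints through one function keeps the approve and reject paths symmetric, with the only difference being the boolean passed to the agent.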
Testing the Pattern
- Submit a high-priority incident. Response: Workflow paused at interrupt, awaiting review
- Check dashboard: Visit http://localhost:8502 → "Approval Queue" tab
- Approve: Click "Approve & Process" → Workflow resumes from checkpoint!
Comparison: Pre-processing vs Interrupt-based HITL
| Aspect | Pre-processing | LangGraph Interrupts |
|---|---|---|
| When starts | After approval | Immediately |
| State management | Manual DB queries | Automatic checkpointing |
| Resume capability | Starts from scratch | Resumes from exact point |
| Flexibility | One gate | Multiple interrupt points |
| Code complexity | Higher (manual state) | Lower (LangGraph handles it) |
| Agent purity | Mixed with approval logic | Clean separation |
| Auditability | Manual logging | Built-in state history |
| Scalability | Limited | Excellent |
Advanced Patterns
Multi-level Approval
# Interrupt at multiple decision points
return workflow.compile(
    checkpointer=self.checkpointer,
    interrupt_before=["spatial", "notify"]  # Two gates
)
Conditional Interrupts
def should_interrupt(state):
    return state["severity"] == "critical" and state["impact_area"] > 10

# Add conditional routing instead of fixed interrupt
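One possible shape for that conditional routing is a router function that sends only qualifying incidents through a dedicated gate node. The node name `approval_gate` and the wiring in the trailing comment are assumptions for illustration. `add_conditional_edges` is the LangGraph method for choosing the next node from a function's return value.

```python
def should_interrupt(state: dict) -> bool:
    return state["severity"] == "critical" and state["impact_area"] > 10


def route_after_review(state: dict) -> str:
    # Return the name of the next node: the gate for risky incidents,
    # straight to analysis for everything else.
    return "approval_gate" if should_interrupt(state) else "spatial"


# Hypothetical wiring (needs a StateGraph named `workflow` with these nodes):
#   workflow.add_conditional_edges(
#       "human_review",
#       route_after_review,
#       {"approval_gate": "approval_gate", "spatial": "spatial"},
#   )
#   workflow.compile(checkpointer=..., interrupt_before=["approval_gate"])

critical_route = route_after_review({"severity": "critical", "impact_area": 25})
small_route = route_after_review({"severity": "critical", "impact_area": 3})
minor_route = route_after_review({"severity": "low", "impact_area": 25})
```

With this layout the breakpoint sits on a node that only risky incidents ever reach, so everything else flows through without pausing.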
Human Modification
# Human can modify classification before resuming
# Pass only the changed key; update_state merges it into the checkpoint
self.workflow.update_state(config, {
    "modified_classification": {
        "severity": "high",  # Downgrade from critical
        "priority": "P2"
    }
})
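Nodes that run after the resume then need to prefer the reviewer's values over the original ones. A minimal sketch, assuming the agent's own result lives in a `classification` field (a name not shown in this document) and that human overrides win key by key:

```python
def effective_classification(state: dict) -> dict:
    original = state.get("classification", {})
    overrides = state.get("modified_classification") or {}
    # Later keys win, so reviewer edits replace the agent's values.
    return {**original, **overrides}


state = {
    "classification": {"severity": "critical", "priority": "P1", "type": "spill"},
    "modified_classification": {"severity": "high", "priority": "P2"},
}
merged = effective_classification(state)
untouched = effective_classification({"classification": {"severity": "low"}})
```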
Conclusion
This implementation showcases the true strength of LangGraph's HITL pattern: seamless interruption and resumption of complex workflows with full state persistence. This is far more powerful than simple pre-processing approval and demonstrates why LangGraph is the leading framework for building production-grade AI agents with human oversight.
The pattern is:
- ✅ Production-ready - State persists across restarts when backed by a persistent checkpointer
- ✅ Scalable - Handle multiple paused workflows
- ✅ Auditable - Full state history
- ✅ Flexible - Interrupt anywhere, modify state
- ✅ Clean - Separation of agent logic and approval
This is the foundation for building reliable, trustworthy AI agents in critical domains like environmental incident management.