
Human-in-the-Loop with LangGraph Interrupts

Overview

This implementation demonstrates LangGraph's interrupt-based Human-in-the-Loop pattern - one of the most powerful features of LangGraph for building reliable AI agents that require human oversight.

How It Works

The LangGraph HITL Pattern

Unlike pre-processing approval (which waits before starting), LangGraph's interrupt pattern:

  1. Starts processing immediately - The agent begins workflow execution
  2. Pauses at critical decision points - Uses interrupt_before to halt at specific nodes
  3. Persists state - Saves workflow state in a checkpointer (MemorySaver)
  4. Awaits human input - Returns control to API, workflow is "frozen" mid-execution
  5. Resumes from checkpoint - When human approves, workflow continues from exact interruption point

Workflow Architecture

┌─────────────┐
│   Submit    │
│  Incident   │
└──────┬──────┘
┌──────────────┐
│  Classify    │  ◄─── Step 1: Classify incident
│  (LangGraph) │
└──────┬───────┘
┌────────────────┐
│ Human Review   │  ◄─── Step 2: Determine if approval needed
│  (P1/P2?)      │
└────────┬───────┘
         ├─────► Low Priority (P3/P4) → Auto-approve → Continue
         └─────► High Priority (P1/P2) → 🛑 INTERRUPT
                                   ┌──────────────────┐
                                   │   Checkpoint     │
                                   │  State Saved     │
                                   └────────┬─────────┘
                                   [Workflow Paused]
                                   [API Returns]      
                                   [Human Reviews]
                                    ┌───────▼────────┐
                                    │  Human Decides │
                                    └───────┬────────┘
                           ┌────────────────┼────────────────┐
                           │                                 │
                      ✅ Approve                        ❌ Reject
                           │                                 │
                           ▼                                 ▼
                   ┌────────────────┐             ┌─────────────────┐
                   │ Resume from    │             │  Workflow       │
                   │  Checkpoint    │             │  Terminated     │
                   └───────┬────────┘             └─────────────────┘
                   ┌────────────────┐
                   │    Spatial     │  ◄─── Step 3: Spatial analysis
                   │   Analysis     │
                   └───────┬────────┘
                   ┌────────────────┐
                   │   Guidance     │  ◄─── Step 4: Search guidance
                   │    Search      │
                   └───────┬────────┘
                   ┌────────────────┐
                   │  Notifications │  ◄─── Step 5: Send notifications
                   └───────┬────────┘
                   ┌────────────────┐
                   │   Finalize     │  ◄─── Step 6: Complete
                   └────────────────┘

Key Implementation Details

1. Checkpointer Setup

from langgraph.checkpoint.memory import MemorySaver

class HITLIncidentAgent:
    def __init__(self):
        # Create checkpointer for state persistence
        self.checkpointer = MemorySaver()
        self.workflow = self._create_workflow()

2. Interrupt Configuration

def _create_workflow(self):
    workflow = StateGraph(HITLIncidentState)

    # Add all nodes
    workflow.add_node("classify", self._classify_incident)
    workflow.add_node("human_review", self._request_human_review)
    workflow.add_node("spatial", self._check_spatial_context)
    # ... more nodes

    # Compile with interrupt BEFORE spatial analysis.
    # Note: interrupt_before pauses EVERY run that reaches this node;
    # conditional behavior (such as P3/P4 auto-approve) must come from
    # routing around the node or resuming low-priority runs immediately
    return workflow.compile(
        checkpointer=self.checkpointer,
        interrupt_before=["spatial"]  # 🛑 Pause here
    )

3. State Management

class HITLIncidentState(TypedDict):
    # Standard fields
    incident_id: str
    description: str
    # ... more fields

    # HITL-specific fields
    human_approval_required: bool  # Set during classification
    human_approved: bool | None    # None = awaiting, True/False = decided
    human_feedback: str | None     # Feedback from human
    modified_classification: dict  # Human can modify classification

4. Workflow Invocation

def process_incident(self, incident_id, ...):
    # Create initial state
    initial_state = {...}

    # Run workflow with thread_id for checkpoint tracking
    config = {"configurable": {"thread_id": incident_id}}
    result = self.workflow.invoke(initial_state, config)

    # Check if paused at the interrupt. Inspecting
    # self.workflow.get_state(config).next is a robust alternative:
    # a non-empty tuple means the run is paused before that node
    if result["awaiting_human"]:
        return {"awaiting_human": True, ...}
    else:
        return {"success": True, ...}

5. Resume After Approval

def continue_after_approval(self, incident_id, approved, feedback):
    config = {"configurable": {"thread_id": incident_id}}

    # Record the human decision in the checkpoint. update_state merges
    # this dict into the saved state; there is no need to fetch and
    # mutate the snapshot returned by get_state()
    self.workflow.update_state(config, {
        "human_approved": approved,
        "human_feedback": feedback,
    })

    if not approved:
        return {"success": False, "rejected": True}

    # Resume workflow from interrupt point! 🚀
    result = self.workflow.invoke(None, config)
    return result

Why This Is Powerful

1. State Persistence

  • Workflow state is saved at the exact interruption point
  • No data loss across server restarts, provided a durable checkpointer (e.g., SQLite or Postgres) is used - the MemorySaver shown here keeps checkpoints in process memory and is best suited to demos and tests
  • Can review later - no time pressure

2. Clean Separation of Concerns

  • Agent logic stays pure - no approval code mixed in
  • Human approval is a separate concern
  • Easy to test and maintain

3. Flexible Decision Points

  • Can interrupt at multiple points
  • Can have conditional interrupts
  • Can modify state before resuming

4. Auditability

  • Full history of workflow execution
  • Can see exactly when and why it paused
  • Human decisions are logged in state

5. Scalability

  • Multiple incidents can be paused simultaneously
  • Each has its own checkpoint thread
  • Can prioritize which to review first

API Endpoints

Submit Incident (with HITL)

POST /api/v1/incidents/submit

# High priority (P1/P2) → Pauses at interrupt
# Low priority (P3/P4) → Auto-approves and completes

List Pending Approvals

GET /api/v1/incidents/pending-approval

# Returns incidents paused at LangGraph interrupt

Approve and Resume

POST /api/v1/incidents/{incident_id}/approve
{
  "approved_by": "Sarah Manager"
}

# Resumes LangGraph workflow from checkpoint!

Reject and Terminate

POST /api/v1/incidents/{incident_id}/reject
{
  "approved_by": "Sarah Manager",
  "reason": "Duplicate report"
}

# Terminates workflow - will not continue

Testing the Pattern

  1. Submit high-priority incident:
    curl -X POST http://localhost:8000/api/v1/incidents/submit \
      -H "Content-Type: application/json" \
      -d '{
        "incident_type": "chemical_spill",
        "location": "Industrial Park",
        "description": "Major hazardous chemical leak",
        "urgency": "critical"
      }'
    

Response: Workflow paused at interrupt, awaiting review

  2. Check dashboard: Visit http://localhost:8502 → "Approval Queue" tab

  3. Approve: Click "Approve & Process" → Workflow resumes from checkpoint!

Comparison: Pre-processing vs Interrupt-based HITL

Aspect             | Pre-processing             | LangGraph Interrupts
-------------------|----------------------------|------------------------------
When starts        | After approval             | Immediately
State management   | Manual DB queries          | Automatic checkpointing
Resume capability  | Starts from scratch        | Resumes from exact point
Flexibility        | One gate                   | Multiple interrupt points
Code complexity    | Higher (manual state)      | Lower (LangGraph handles it)
Agent purity       | Mixed with approval logic  | Clean separation
Auditability       | Manual logging             | Built-in state history
Scalability        | Limited                    | Excellent

Advanced Patterns

Multi-level Approval

# Interrupt at multiple decision points
return workflow.compile(
    checkpointer=self.checkpointer,
    interrupt_before=["spatial", "notify"]  # Two gates
)

Conditional Interrupts

def should_interrupt(state):
    return state["severity"] == "critical" and state["impact_area"] > 10

# Add conditional routing instead of fixed interrupt

Human Modification

# Human can modify the classification before resuming; update_state
# merges this dict into the checkpointed state directly
self.workflow.update_state(config, {
    "modified_classification": {
        "severity": "high",  # Downgrade from critical
        "priority": "P2"
    }
})

Conclusion

This implementation showcases the core strength of LangGraph's HITL pattern: seamless interruption and resumption of complex workflows with full state persistence. This is far more capable than simple pre-processing approval and shows why LangGraph is well suited to building production-grade AI agents with human oversight.

The pattern is:

  • ✅ Production-ready - State persists across restarts (with a durable checkpointer)
  • ✅ Scalable - Handle multiple paused workflows
  • ✅ Auditable - Full state history
  • ✅ Flexible - Interrupt anywhere, modify state
  • ✅ Clean - Separation of agent logic and approval

This is the foundation for building reliable, trustworthy AI agents in critical domains like environmental incident management.