Standard LLM calls are stateless request-response cycles. Agents are different: they run in loops, use tools, accumulate state, and make decisions at each step. LangGraph gives you the primitives to build production-grade agents without fighting against a framework that wants to hide the complexity from you.
This guide starts from first principles — what a graph is, how nodes and edges work, conditional routing — and builds up to checkpointing, human-in-the-loop patterns, and multi-agent coordination.
What Is LangGraph?
LangGraph is a library for building stateful, multi-actor applications using a directed graph model. Think of it as an explicit state machine for LLM workflows:
- State — a typed dict/TypedDict that flows through the graph; nodes read and update it
- Nodes — functions (sync or async) that receive state and return state updates
- Edges — define execution order; can be static or conditional (dynamic routing)
- Checkpointer — persists state between steps for long-running or paused workflows
Your First LangGraph Agent
```python
from typing import TypedDict, Annotated
import operator

from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from langchain_anthropic import ChatAnthropic
from langchain_core.tools import tool

# 1. Define state
class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # list that accumulates across steps
    tool_calls_made: int

# 2. Define tools
@tool
def web_search(query: str) -> str:
    """Search the web for current information."""
    # Your real search implementation here
    return f"Search results for: {query}"

@tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression."""
    return str(eval(expression))  # use a safe expression evaluator in production

tools = [web_search, calculator]

# 3. Set up the model with tools bound
model = ChatAnthropic(model="claude-3-5-sonnet-20241022").bind_tools(tools)

# 4. Define nodes
def call_model(state: AgentState) -> dict:
    response = model.invoke(state["messages"])
    made_tool_call = bool(getattr(response, "tool_calls", None))
    return {
        "messages": [response],
        # Increment the counter whenever the model requests a tool;
        # otherwise the circuit breaker below can never trip
        "tool_calls_made": state["tool_calls_made"] + (1 if made_tool_call else 0),
    }

def should_continue(state: AgentState) -> str:
    """Conditional edge: route to tools or end."""
    last = state["messages"][-1]
    if hasattr(last, "tool_calls") and last.tool_calls:
        if state["tool_calls_made"] >= 5:
            return "end"  # Circuit breaker
        return "tools"
    return "end"

# 5. Build the graph
tool_node = ToolNode(tools)
graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", tool_node)
graph.add_edge(START, "agent")
graph.add_conditional_edges("agent", should_continue, {"tools": "tools", "end": END})
graph.add_edge("tools", "agent")  # Loop back after tool execution

app = graph.compile()

# 6. Run
result = app.invoke({
    "messages": [{"role": "user", "content": "What's the current Bitcoin price in EUR?"}],
    "tool_calls_made": 0,
})
print(result["messages"][-1].content)
```
Checkpointing for Long-Running Workflows
Checkpointers persist state between graph runs. This enables resumable workflows, multi-turn conversations that span hours/days, and human-in-the-loop review at arbitrary points.
```python
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.postgres import PostgresSaver
import psycopg

# In-memory checkpointer (dev/testing)
memory = MemorySaver()
app = graph.compile(checkpointer=memory)

# Postgres checkpointer (production)
conn = psycopg.connect("postgresql://user:pass@localhost/langgraph")
checkpointer = PostgresSaver(conn)
checkpointer.setup()  # Creates the checkpoint tables on first use
app = graph.compile(checkpointer=checkpointer)

# Run with a thread_id; the same ID continues existing state
config = {"configurable": {"thread_id": "user-session-abc123"}}

# First run
result = app.invoke(
    {"messages": [{"role": "user", "content": "Research quantum computing for me"}]},
    config=config,
)

# Resume later: state is automatically loaded from the checkpoint
result2 = app.invoke(
    {"messages": [{"role": "user", "content": "Now summarise what you found in 3 bullets"}]},
    config=config,  # Same thread_id restores previous state
)
```
Human-in-the-Loop
Use `interrupt_before` to pause the graph before a specific node executes, so a human can approve the action before it runs.
```python
# Pause before the "execute_code" node for human review.
# Assumes the graph has a node named "execute_code" and `initial_input` is defined.
app = graph.compile(
    checkpointer=memory,
    interrupt_before=["execute_code"],
)

config = {"configurable": {"thread_id": "review-thread-1"}}

# Agent plans, then pauses waiting for approval
state = app.invoke(initial_input, config=config)
print("Agent wants to execute:", state["messages"][-1].content)
print("Current graph state:", app.get_state(config))

# Human reviews and approves; continue the graph
human_decision = input("Approve? (y/n): ")
if human_decision == "y":
    final = app.invoke(None, config=config)  # None resumes from the checkpoint
else:
    app.update_state(
        config,
        {"messages": [{"role": "user", "content": "Do NOT execute that code. Explain instead."}]},
    )
    final = app.invoke(None, config=config)
```
Multi-Agent Patterns
Complex tasks benefit from specialised agents. A supervisor agent routes work to sub-agents; a swarm pattern has agents hand off to each other based on expertise.
```python
from langgraph.graph import StateGraph, MessagesState, START, END

# Subgraph 1: Research agent
research_graph = StateGraph(MessagesState)
# ... add research-specific tools and nodes

# Subgraph 2: Code agent
code_graph = StateGraph(MessagesState)
# ... add code execution tools and nodes

research_app = research_graph.compile()
code_app = code_graph.compile()

# Supervisor node
def supervisor(state: MessagesState) -> dict:
    """Decide which subgraph handles the task."""
    task = state["messages"][-1].content
    if "code" in task or "implement" in task:
        result = code_app.invoke(state)
    else:
        result = research_app.invoke(state)
    return {"messages": result["messages"]}

parent = StateGraph(MessagesState)
parent.add_node("supervisor", supervisor)
parent.add_edge(START, "supervisor")
parent.add_edge("supervisor", END)
orchestrator = parent.compile()
```
For production multi-agent systems, use LangSmith tracing to visualise agent decision trees and debug tool call sequences across nested graphs.
Streaming Agentic Output
```python
# Stream each step as it happens (run inside an async function)
async for event in app.astream_events(initial_input, config=config, version="v2"):
    kind = event["event"]
    if kind == "on_chat_model_stream":
        # Stream LLM tokens to the UI
        chunk = event["data"]["chunk"]
        print(chunk.content, end="", flush=True)
    elif kind == "on_tool_start":
        print(f"\n[Tool] {event['name']} called with {event['data']['input']}")
    elif kind == "on_tool_end":
        print(f"[Tool] Result: {str(event['data']['output'])[:100]}...")
```
Summary
- LangGraph = directed graph of stateful nodes — explicit and debuggable vs. opaque chains
- Conditional edges enable dynamic routing (tool loop, circuit breakers, fallbacks)
- Checkpointers make long-running workflows resumable across process restarts
- `interrupt_before` enables human-in-the-loop review at any node boundary
- Multi-agent systems: supervisor routes to specialised sub-graphs
- LangSmith tracing is essential for debugging nested agent calls in production