TL;DR — Pydantic AI brings the "if it type-checks, it works" philosophy to AI agents. From the team behind Pydantic, it gives you a clean Agent class with typed dependencies, typed results, typed tools, and a test mode that replaces real model calls with deterministic fakes. Model-agnostic (OpenAI, Anthropic, Gemini, Ollama). Designed for production Python codebases where you want IDE support, validation, and testability.
What it is
Pydantic AI is a Python agent framework that treats LLM interactions as typed function calls. You define an agent with a result type, dependency type, system prompt, and tools — all using Python type hints and Pydantic models. The framework validates everything at the boundary: tool arguments in, structured results out.
Why it exists
LLM frameworks tend to be stringly-typed: prompts as strings, tool schemas as dicts, results as untyped text. This makes bugs invisible until runtime. Pydantic AI applies the same discipline that Pydantic brought to data validation: define the shape, let the framework enforce it, and get IDE autocomplete and error-checking for free.
Install & setup
pip install pydantic-ai
export OPENAI_API_KEY=sk-...
Basic agent
from pydantic_ai import Agent

agent = Agent(
    "openai:gpt-4o",
    system_prompt="You are a helpful assistant.",
)
result = agent.run_sync("What is Kubernetes?")
print(result.output)  # plain string response
Structured output
from pydantic import BaseModel
from pydantic_ai import Agent

class CityInfo(BaseModel):
    name: str
    country: str
    population: int

agent = Agent("openai:gpt-4o", output_type=CityInfo)
result = agent.run_sync("Tell me about Tokyo")
print(result.output.name)  # e.g. Tokyo
print(result.output.population)  # e.g. 13960000 (the exact figure depends on the model)
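To make the validation step concrete, here is a minimal sketch using plain Pydantic, with no model call involved. It assumes the same `CityInfo` shape as above; the two JSON payloads are made up for illustration and stand in for what a model might return.

```python
from pydantic import BaseModel, ValidationError


class CityInfo(BaseModel):
    name: str
    country: str
    population: int


# Hypothetical payloads standing in for model responses
good = '{"name": "Tokyo", "country": "Japan", "population": 13960000}'
bad = '{"name": "Tokyo", "country": "Japan", "population": "a lot"}'

city = CityInfo.model_validate_json(good)
print(city.population + 1)  # population is a real int, so arithmetic works

rejected = False
failed_fields = []
try:
    CityInfo.model_validate_json(bad)
except ValidationError as exc:
    rejected = True
    # Each error entry names the field that failed validation
    failed_fields = [err["loc"][0] for err in exc.errors()]

print(rejected, failed_fields)
```

This is the kind of check the framework runs on structured output for you. How a failure is handled at the agent level (for example, whether the model is asked to try again) is not shown in this sketch.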
Dependencies — typed context
Pass runtime context (database connections, API clients, user info) to tools via typed dependencies:
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext

@dataclass
class Deps:
    user_id: str
    db: DatabaseConnection  # placeholder: your application's own database client type

agent = Agent("openai:gpt-4o", deps_type=Deps)

@agent.tool
def get_user_orders(ctx: RunContext[Deps], limit: int = 5) -> str:
    """Get recent orders for the current user."""
    # The model only chooses `limit`; user_id comes from the trusted deps
    orders = ctx.deps.db.query(
        "SELECT * FROM orders WHERE user_id = ? LIMIT ?",
        ctx.deps.user_id, limit
    )
    return str(orders)

# `db` is a DatabaseConnection instance created elsewhere in your app
result = agent.run_sync("Show my recent orders", deps=Deps(user_id="42", db=db))
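The example above leaves `DatabaseConnection` and `db` undefined. One possible way to fill them in for local experiments is a thin wrapper over the standard library's `sqlite3`. Everything here is an assumption for illustration: the wrapper class, its `execute` helper, the `orders` table layout, and the `recent_orders` function, which simply mirrors the tool body so the query can be exercised without an agent or a model.

```python
import sqlite3
from dataclasses import dataclass


class DatabaseConnection:
    """Hypothetical wrapper exposing the query(sql, *params) call used above."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)

    def query(self, sql: str, *params: object) -> list[tuple]:
        # sqlite3 uses "?" placeholders, matching the SQL in the tool
        return self._conn.execute(sql, params).fetchall()

    def execute(self, sql: str, *params: object) -> None:
        self._conn.execute(sql, params)
        self._conn.commit()


@dataclass
class Deps:
    user_id: str
    db: DatabaseConnection


def recent_orders(deps: Deps, limit: int = 5) -> list[tuple]:
    # Same SQL as the tool body, callable without an agent
    return deps.db.query(
        "SELECT * FROM orders WHERE user_id = ? LIMIT ?",
        deps.user_id, limit,
    )


db = DatabaseConnection()
db.execute("CREATE TABLE orders (id INTEGER, user_id TEXT, item TEXT)")
for order_id, user, item in [(1, "42", "book"), (2, "42", "lamp"), (3, "7", "desk")]:
    db.execute("INSERT INTO orders VALUES (?, ?, ?)", order_id, user, item)

orders = recent_orders(Deps(user_id="42", db=db))
print(orders)
```

Because the dependency is just a typed field on a dataclass, a test can pass an in-memory database like this one while production code passes the real connection.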
Tools
@agent.tool
def search_web(ctx: RunContext[Deps], query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"  # stub: call a real search API here

@agent.tool_plain  # no context needed
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    # WARNING: eval() on model-supplied text can run arbitrary code; demo only
    return str(eval(expression))
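Since the argument to `calculate` is written by the model, calling `eval` on it is risky outside a demo. A possible replacement body, sketched here with only the standard library, walks the parsed expression and allows nothing but numeric literals and basic arithmetic. The name `safe_calculate` and the chosen operator whitelist are assumptions for illustration, not part of Pydantic AI.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> str:
    """Evaluate +, -, *, / over int and float literals only."""

    def ev(node: ast.AST) -> float:
        # type(...) check (not isinstance) so that booleans are excluded
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](ev(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return str(ev(tree.body))


print(safe_calculate("2 + 3 * 4"))
```

Names, function calls, and attribute access all raise `ValueError`, and malformed input raises `SyntaxError` from `ast.parse`, so a tool built on this would still need to decide how to report those failures back to the model.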
Testing — the killer feature
from pydantic_ai.models.test import TestModel

# Replace the real model with a deterministic fake
with agent.override(model=TestModel()):
    result = agent.run_sync("What is Kubernetes?")
    # TestModel returns a predictable response
# Your tests don't hit the API, don't cost money, and are deterministic
Streaming
import asyncio

async def main():
    async with agent.run_stream("Explain RAG") as response:
        async for text in response.stream_text(delta=True):  # only the new text per chunk
            print(text, end="", flush=True)

asyncio.run(main())
When to use, when to skip
Use it when you want type safety, testability, and IDE support in your agent code. Ideal for production Python backends where agents are part of a larger typed codebase. The dependency injection pattern makes it natural for web services.
Skip it if you need complex multi-agent orchestration (LangGraph), rich middleware (LangChain v1), or a huge integration ecosystem. It's deliberately minimal.
vs the alternatives
| Tool | Best for | Trade-off |
|---|---|---|
| Pydantic AI | Type-safe, testable, Pythonic agents | Smaller ecosystem, newer |
| LangChain | Rich integrations, middleware | Heavier, less typed |
| Agno | Lightweight, fast agents | Less type-safety focus |
| Instructor | Structured output only (no agents) | Not an agent framework |
Verified against Pydantic AI docs (ai.pydantic.dev), May 2026.