Agent is a stateless executor; Session adds the statefulness and resilience you need for interactive, multi-turn conversations. It owns the message history and handles compaction, retry, persistence, and lifecycle hooks automatically.
Creating a Session
session.send() yields all the same AgentEvent types as agent.run(), plus additional session lifecycle events.
Configuration
| Option | Default | Description |
|---|---|---|
| agent | (required) | The Agent to use for execution |
| contextWindow | (none) | Model context window size in tokens. Required for auto-compaction |
| reservedTokens | min(20_000, agent.maxTokens ?? 20_000) | Tokens reserved for output |
| autoCompact | true when contextWindow is set | Enable auto-compaction |
| shouldCompact | (none) | Custom overflow detection function |
| compactionStrategy | DefaultCompactionStrategy() | Custom compaction strategy |
| retry | (none) | Retry config for transient API errors |
| hooks | (none) | Lifecycle hooks |
| sessionStore | (none) | Pluggable persistence backend |
| sessionId | auto-generated UUID | Session identifier |
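The reservedTokens default in the table can be worked through in a few lines. This is a minimal sketch, not the SDK's code: the helper names defaultReservedTokens and inputBudget are hypothetical, and the idea that the history may use whatever the context window has left after the reservation is an assumption about how the two options interact.

```typescript
// Hypothetical helper names; only the formula itself comes from the table above.
const DEFAULT_RESERVED_TOKENS = 20_000;

/** Default for reservedTokens: min(20_000, agent.maxTokens ?? 20_000). */
function defaultReservedTokens(agentMaxTokens?: number): number {
  return Math.min(DEFAULT_RESERVED_TOKENS, agentMaxTokens ?? DEFAULT_RESERVED_TOKENS);
}

/**
 * Assumption: the message history may use whatever the context window has
 * left after the output reservation. The SDK's real overflow check may differ.
 */
function inputBudget(contextWindow: number, reservedTokens: number): number {
  return Math.max(0, contextWindow - reservedTokens);
}

console.log(defaultReservedTokens());       // 20000 (agent.maxTokens unset)
console.log(defaultReservedTokens(4_096));  // 4096  (small output limit wins)
console.log(defaultReservedTokens(64_000)); // 20000 (capped at 20_000)
console.log(inputBudget(200_000, defaultReservedTokens(64_000))); // 180000
```

Under that assumption, a 200K-token context window with the default reservation leaves roughly 180K tokens for history before compaction becomes necessary.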
Session Events
In addition to all AgentEvent types, session.send() yields:
| Event | Description |
|---|---|
| turn.start | A new turn is starting |
| turn.done | Turn completed (includes token usage) |
| compaction.start | Compaction triggered (includes reason and token count) |
| compaction.pruned | Tool results pruned (phase 1) |
| compaction.summary | Conversation summarized (phase 2) |
| compaction.done | Compaction finished (includes before/after token counts) |
| retry | Retrying after a transient error (includes attempt count and delay) |
Compaction
When a conversation approaches the context window limit, the session automatically compacts the message history. The default strategy works in two phases:
- Pruning: replaces tool result content in older messages with "[pruned]", preserving the most recent ~40K tokens of context. No LLM call needed.
- Summarization: when pruning isn’t enough, calls the model to generate a structured summary and replaces the entire history with it.
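The two phases above can be sketched roughly as follows. Everything here is a hypothetical stand-in rather than the SDK's DefaultCompactionStrategy: the Message type, the 4-characters-per-token estimate, the summarize callback (which takes the place of the model call), and the choice to store the summary as a single user message are all assumptions made for illustration.

```typescript
// All types and helpers here are hypothetical stand-ins, not the SDK's own.
type Role = "user" | "assistant" | "tool";
interface Message { role: Role; content: string; }

// Crude estimate (about 4 characters per token); a real tokenizer would differ.
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);
const totalTokens = (h: Message[]): number =>
  h.reduce((sum, m) => sum + estimateTokens(m), 0);

/** Phase 1: replace tool results older than the protected recent window. */
function pruneToolResults(history: Message[], keepRecentTokens = 40_000): Message[] {
  // Count how many trailing messages fit inside the protected budget.
  let protectedCount = 0;
  let used = 0;
  for (const msg of [...history].reverse()) {
    used += estimateTokens(msg);
    if (used > keepRecentTokens) break;
    protectedCount++;
  }
  const cutoff = history.length - protectedCount;
  // Return a new array; the input history is not mutated.
  return history.map((msg, i) =>
    i < cutoff && msg.role === "tool" ? { ...msg, content: "[pruned]" } : msg
  );
}

/** Phase 2 runs only if the pruned history is still over the limit. */
function compact(
  history: Message[],
  tokenLimit: number,
  summarize: (h: Message[]) => string, // stands in for the model call
  keepRecentTokens = 40_000,
): Message[] {
  const pruned = pruneToolResults(history, keepRecentTokens);
  if (totalTokens(pruned) <= tokenLimit) return pruned;
  // Assumption: the summary replaces the whole history as one user message.
  return [{ role: "user", content: summarize(pruned) }];
}
```

Pruning is attempted first because it is cheap and keeps the conversation structure intact; the summarizer is only invoked when that is not enough.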
Retry
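As a rough illustration of the retry policy this section describes, the sketch below retries only retryable status codes, waits with exponential backoff plus jitter, and refuses to retry once any output has been streamed. All names (isRetryableStatus, backoffDelayMs, withRetry, AttemptError) are hypothetical, and the base delay, cap, attempt limit, and the "full jitter" variant are assumptions; the SDK's retry config may use different values and shapes.

```typescript
// Hypothetical names throughout; only the status list and the overall policy
// come from this section.
const RETRYABLE_STATUS = new Set([429, 500, 502, 503, 504, 529]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUS.has(status);
}

/** Full jitter (an assumed variant): random delay in [0, min(cap, base * 2^attempt)). */
function backoffDelayMs(
  attempt: number,
  baseMs = 1_000,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Assumed error shape: an HTTP status plus whether any output was streamed.
interface AttemptError { status: number; streamedAny: boolean; }

async function withRetry<T>(
  attemptFn: () => Promise<T>,
  sleep: (ms: number) => Promise<void>, // injected so the sketch stays testable
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await attemptFn();
    } catch (err) {
      const e = (err ?? {}) as Partial<AttemptError>;
      // Retry only transient failures that happened before any streaming.
      const retryable =
        typeof e.status === "number" && isRetryableStatus(e.status) && !e.streamedAny;
      if (!retryable || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing sleep in as a parameter keeps the sketch free of timers; in an application it would typically wrap setTimeout.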
Transient API errors (429, 500, 502, 503, 504, 529, rate limits, timeouts) are retried automatically with exponential backoff and jitter. Retries only happen before any content has been streamed; once the model starts producing output, the session commits to that attempt.
Hooks
Hooks let you intercept and customize the session lifecycle.
Persistence
Plug in any storage backend by implementing the SessionStore interface.
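The SDK's actual SessionStore definition is not reproduced in this section, so the following is only a sketch of what a pluggable backend might look like. The interface name SessionStoreSketch, its load and save methods, and the StoredMessage shape are assumptions chosen for illustration; check the real interface before implementing a store.

```typescript
// Hypothetical shapes: the SDK's actual SessionStore interface is not shown
// in this section, so the method names and types below are assumptions.
interface StoredMessage { role: string; content: string; }

interface SessionStoreSketch {
  load(sessionId: string): Promise<StoredMessage[] | undefined>;
  save(sessionId: string, messages: StoredMessage[]): Promise<void>;
}

/** In-memory backend: useful for tests, loses everything on restart. */
class InMemorySessionStore implements SessionStoreSketch {
  private readonly sessions = new Map<string, StoredMessage[]>();

  async load(sessionId: string): Promise<StoredMessage[] | undefined> {
    const found = this.sessions.get(sessionId);
    // Copy on the way out so callers cannot mutate stored state.
    return found ? found.map((m) => ({ ...m })) : undefined;
  }

  async save(sessionId: string, messages: StoredMessage[]): Promise<void> {
    // Copy on the way in for the same reason.
    this.sessions.set(sessionId, messages.map((m) => ({ ...m })));
  }
}
```

A database- or file-backed store would follow the same pattern, swapping the Map for the storage of your choice and keying records by sessionId.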