ChatCompletions Client
stirrup.clients.chat_completions_client
OpenAI SDK-based LLM client for chat completions.
This client uses the official OpenAI Python SDK directly, supporting both OpenAI's
API and any OpenAI-compatible endpoint via the base_url parameter (e.g., vLLM,
Ollama, Azure OpenAI, local models).
This is the default client for Stirrup.
ChatMessage
ChatMessage = Annotated[
    SystemMessage
    | UserMessage
    | AssistantMessage
    | ToolMessage,
    Field(discriminator="role"),
]
Discriminated union of all message types, automatically parsed based on role field.
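To illustrate how a role-discriminated union resolves to a concrete message class, here is a minimal sketch assuming pydantic v2; the stand-in `SystemMessage`/`UserMessage` shapes are simplified and not the library's real models:

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter


# Trimmed-down stand-ins for the real message models (shapes assumed).
class SystemMessage(BaseModel):
    role: Literal["system"] = "system"
    content: str


class UserMessage(BaseModel):
    role: Literal["user"] = "user"
    content: str


# The discriminator tells pydantic to pick the variant by the "role" field,
# without trying each union member in turn.
ChatMessage = Annotated[
    Union[SystemMessage, UserMessage],
    Field(discriminator="role"),
]

adapter = TypeAdapter(ChatMessage)
msg = adapter.validate_python({"role": "user", "content": "hi"})
print(type(msg).__name__)  # → UserMessage
```

Because the discriminator is an exact field lookup, an unknown `role` fails fast with a clear validation error instead of a confusing best-match union error.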
ContextOverflowError
Bases: Exception
Raised when LLM context window is exceeded (max_tokens or length finish_reason).
AssistantMessage
Bases: BaseModel
LLM response message with optional tool calls and token usage tracking.
LLMClient
Bases: Protocol
Protocol defining the interface for LLM client implementations.
Any LLM client must implement this protocol to work with the Agent class. Provides text generation with tool support and model capability inspection.
Reasoning
Bases: BaseModel
Extended thinking/reasoning content from models that support chain-of-thought reasoning.
TokenUsage
Bases: BaseModel
Token counts for LLM usage (input, output, reasoning tokens).
__add__
__add__(other: TokenUsage) -> TokenUsage
Add two TokenUsage objects together, summing each field independently.
Source code in src/stirrup/core/models.py
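The field-wise addition described above can be sketched with a plain dataclass stand-in (field names are assumed from the summary; the real `TokenUsage` is a pydantic model):

```python
from dataclasses import dataclass


# Minimal stand-in for TokenUsage (field names assumed).
@dataclass
class TokenUsage:
    input_tokens: int = 0
    output_tokens: int = 0
    reasoning_tokens: int = 0

    def __add__(self, other: "TokenUsage") -> "TokenUsage":
        # Sum each field independently, as the docstring describes.
        return TokenUsage(
            self.input_tokens + other.input_tokens,
            self.output_tokens + other.output_tokens,
            self.reasoning_tokens + other.reasoning_tokens,
        )


# Aggregating usage across two LLM calls:
total = TokenUsage(100, 20, 5) + TokenUsage(50, 10, 0)
print(total)  # → TokenUsage(input_tokens=150, output_tokens=30, reasoning_tokens=5)
```

Supporting `__add__` lets callers fold usage over a whole conversation with `sum(..., start=TokenUsage())`-style aggregation.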
Tool
Bases: BaseModel
Tool definition with name, description, parameter schema, and executor function.
Generic over:

- P: Parameter model type (must be a Pydantic BaseModel, or None for parameterless tools)
- M: Metadata type (should implement Addable for aggregation; use None for tools without metadata)
Tools are simple, stateless callables. For tools requiring lifecycle management (setup/teardown, resource pooling), use a ToolProvider instead.
Example with parameters:

class CalcParams(BaseModel):
    expression: str

calc_tool = Tool[CalcParams, None](...)

Example without parameters:

time_tool = Tool[None, None](...)
ToolCall
ChatCompletionsClient
ChatCompletionsClient(
model: str,
max_tokens: int = 64000,
*,
base_url: str | None = None,
api_key: str | None = None,
supports_audio_input: bool = False,
reasoning_effort: str | None = None,
timeout: float | None = None,
max_retries: int = 2,
kwargs: dict[str, Any] | None = None,
)
Bases: LLMClient
OpenAI SDK-based client supporting OpenAI and OpenAI-compatible APIs.
Uses the official OpenAI Python SDK directly for chat completions. Supports custom base_url for OpenAI-compatible providers (vLLM, Ollama, Azure OpenAI, local models, etc.).
Includes automatic retries for transient failures and token usage tracking.
Example

Standard OpenAI usage:

client = ChatCompletionsClient(model="gpt-4o", max_tokens=128_000)

Custom OpenAI-compatible endpoint:

client = ChatCompletionsClient(
    model="llama-3.1-70b",
    base_url="http://localhost:8000/v1",
    api_key="your-api-key",
)
Initialize OpenAI SDK client with model configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `str` | Model identifier (e.g., 'gpt-5', 'gpt-4o', 'o1-preview'). | *required* |
| `max_tokens` | `int` | Maximum context window size in tokens. Defaults to 64,000. | `64000` |
| `base_url` | `str \| None` | API base URL. If None, uses OpenAI's standard URL. Use for OpenAI-compatible providers (e.g., 'http://localhost:8000/v1'). | `None` |
| `api_key` | `str \| None` | API key for authentication. If None, reads from the OPENROUTER_API_KEY environment variable. | `None` |
| `supports_audio_input` | `bool` | Whether the model supports audio inputs. Defaults to False. | `False` |
| `reasoning_effort` | `str \| None` | Reasoning effort level for extended thinking models (e.g., 'low', 'medium', 'high'). Only used with o1/o3-style models. | `None` |
| `timeout` | `float \| None` | Request timeout in seconds. If None, uses the OpenAI SDK default. | `None` |
| `max_retries` | `int` | Number of retries for transient errors. Defaults to 2. The OpenAI SDK handles retries internally with exponential backoff. | `2` |
| `kwargs` | `dict[str, Any] \| None` | Additional arguments passed to chat.completions.create(). | `None` |
Source code in src/stirrup/clients/chat_completions_client.py
generate
async generate(
    messages: list[ChatMessage], tools: dict[str, Tool]
) -> AssistantMessage
Generate assistant response with optional tool calls.
Retries up to 3 times on transient errors (connection, timeout, rate limit, internal server errors) with exponential backoff.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `messages` | `list[ChatMessage]` | List of conversation messages. | *required* |
| `tools` | `dict[str, Tool]` | Dictionary mapping tool names to Tool objects. | *required* |
Returns:

| Type | Description |
|---|---|
| `AssistantMessage` | AssistantMessage containing the model's response, any tool calls, and token usage statistics. |
Raises:

| Type | Description |
|---|---|
| `ContextOverflowError` | If the context window is exceeded. |
Source code in src/stirrup/clients/chat_completions_client.py
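The retry behavior described for `generate` (transient errors retried with exponential backoff, context overflow raised immediately) can be sketched as follows; `generate_with_retry` and the flaky stub are illustrative helpers, not part of the library:

```python
import asyncio
import random


class ContextOverflowError(Exception):
    """Stand-in for the error raised on context-window overflow."""


async def generate_with_retry(call, max_attempts: int = 3, base_delay: float = 1.0):
    """Retry a zero-arg coroutine factory on transient errors with backoff.

    Sketch of the pattern only: context overflow is never retried, while
    connection/timeout errors back off exponentially between attempts.
    """
    for attempt in range(max_attempts):
        try:
            return await call()
        except ContextOverflowError:
            raise  # not transient: surface immediately
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with a little jitter: ~1s, ~2s, ~4s, ...
            await asyncio.sleep(base_delay * 2**attempt + random.random() * 0.1)


# Demo with a flaky stub that fails once, then succeeds.
attempts = {"n": 0}


async def flaky():
    attempts["n"] += 1
    if attempts["n"] < 2:
        raise ConnectionError("transient")
    return "ok"


result = asyncio.run(generate_with_retry(flaky, base_delay=0.01))
print(result)  # → ok
```

Raising `ContextOverflowError` through (rather than retrying) matters: an over-long prompt will fail identically on every attempt, so the caller needs to trim context instead.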
to_openai_messages
Convert ChatMessage list to OpenAI-compatible message dictionaries.
Handles all message types: SystemMessage, UserMessage, AssistantMessage, and ToolMessage. Preserves reasoning content and tool calls for assistant messages.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `msgs` | `list[ChatMessage]` | List of ChatMessage objects (System, User, Assistant, or Tool messages). | *required* |
Returns:

| Type | Description |
|---|---|
| `list[dict[str, Any]]` | List of message dictionaries ready for the OpenAI API. |
Raises:

| Type | Description |
|---|---|
| `NotImplementedError` | If an unsupported message type is encountered. |
Source code in src/stirrup/clients/utils.py
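The role-dispatch conversion described above can be sketched with simplified stand-in message classes (shapes assumed; the real models carry more fields, e.g. reasoning content and tool calls):

```python
from dataclasses import dataclass


# Minimal stand-ins for the message models (shapes assumed).
@dataclass
class SystemMessage:
    content: str


@dataclass
class UserMessage:
    content: str


@dataclass
class ToolMessage:
    tool_call_id: str
    content: str


def to_openai_messages(msgs):
    """Sketch: map each message type to its OpenAI chat-completions dict."""
    out = []
    for m in msgs:
        if isinstance(m, SystemMessage):
            out.append({"role": "system", "content": m.content})
        elif isinstance(m, UserMessage):
            out.append({"role": "user", "content": m.content})
        elif isinstance(m, ToolMessage):
            out.append({
                "role": "tool",
                "tool_call_id": m.tool_call_id,
                "content": m.content,
            })
        else:
            raise NotImplementedError(f"Unsupported message type: {type(m)!r}")
    return out


converted = to_openai_messages([SystemMessage("You are helpful."), UserMessage("hi")])
print(converted)
# → [{'role': 'system', 'content': 'You are helpful.'},
#    {'role': 'user', 'content': 'hi'}]
```

The explicit `NotImplementedError` fallback mirrors the documented behavior: a new message type must be handled here deliberately rather than silently dropped.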
to_openai_tools
Convert Tool objects to OpenAI function calling format.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `tools` | `dict[str, Tool]` | Dictionary mapping tool names to Tool objects. | *required* |
Returns:

| Type | Description |
|---|---|
| `list[dict[str, Any]]` | List of tool definitions in OpenAI's function calling format. |
Example

tools = {"calculator": calculator_tool}
openai_tools = to_openai_tools(tools)
# Returns: [{"type": "function", "function": {"name": "calculator", ...}}]
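To make the target envelope concrete, here is a sketch that builds OpenAI's function-calling format from plain dicts; the `description`/`parameters` keys on the tool stand-in are assumptions, not the real `Tool` attributes:

```python
def to_openai_tools(tools):
    """Sketch: wrap each tool's name, description, and parameter schema in
    OpenAI's function-calling envelope."""
    return [
        {
            "type": "function",
            "function": {
                "name": name,
                "description": tool["description"],
                "parameters": tool["parameters"],
            },
        }
        for name, tool in tools.items()
    ]


# A tool stand-in with a JSON-schema parameter description.
calculator_tool = {
    "description": "Evaluate an arithmetic expression.",
    "parameters": {
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
}

openai_tools = to_openai_tools({"calculator": calculator_tool})
print(openai_tools[0]["function"]["name"])  # → calculator
```

In the real client, the `parameters` schema would come from the tool's Pydantic parameter model (e.g. `model_json_schema()` in pydantic v2).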