OpenResponses Client

The OpenResponsesClient uses OpenAI's Responses API (POST /v1/responses) instead of the Chat Completions API. This client is useful for providers that implement the newer Responses API format.

Key Differences from ChatCompletionsClient

| Feature | ChatCompletionsClient | OpenResponsesClient |
| --- | --- | --- |
| API endpoint | chat.completions.create() | responses.create() |
| System messages | Included in messages array | Passed as instructions parameter |
| Message format | {"role": "user", "content": [...]} | {"role": "user", "content": [{"type": "input_text", ...}]} |
| Tool call IDs | tool_call_id | call_id |
| Reasoning config | reasoning_effort param | reasoning: {"effort": ...} object |
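
The message-format difference is easiest to see side by side. A minimal sketch of the same user turn in each wire format (plain dicts, no client involved):

# Chat Completions API: generic "text" content blocks
chat_completions_message = {
    "role": "user",
    "content": [{"type": "text", "text": "Hello"}],
}

# Responses API: direction-specific "input_text" blocks
responses_input_item = {
    "role": "user",
    "content": [{"type": "input_text", "text": "Hello"}],
}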

Usage

For models that support extended thinking (like o1/o3), you can configure the reasoning effort:

import asyncio

from stirrup import Agent
from stirrup.clients import OpenResponsesClient


async def main() -> None:
    """Run an agent using the OpenResponses API with a reasoning model."""

    # Create client using OpenResponsesClient
    # Uses the OpenAI Responses API (responses.create)
    # For reasoning models, you can set reasoning_effort
    client = OpenResponsesClient(
        model="gpt-5.2",
        reasoning_effort="medium",
    )

    agent = Agent(client=client, name="reasoning-agent", max_turns=19)

    async with agent.session(output_dir="output/open_responses_example") as session:
        _finish_params, _history, _metadata = await session.run(
            "Plan a software release with these tasks: Design (5 days), Backend (10 days, needs Design), "
            "Frontend (8 days, needs Design), Testing (4 days, needs Backend and Frontend), "
            "Documentation (3 days, can start after Backend). Two developers are available. "
            "What's the minimum time to complete? Output an Excel Gantt chart with the schedule."
        )


if __name__ == "__main__":
    asyncio.run(main())

Constructor Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model | str | required | Model identifier (e.g., "gpt-4o", "o1") |
| max_tokens | int | 64_000 | Maximum output tokens |
| base_url | str \| None | None | Custom API base URL |
| api_key | str \| None | None | API key (falls back to OPENAI_API_KEY env var) |
| reasoning_effort | str \| None | None | Reasoning effort for o1/o3 models: "low", "medium", "high" |
| timeout | float \| None | None | Request timeout in seconds |
| max_retries | int | 2 | Number of retries for transient errors |
| instructions | str \| None | None | Default system instructions |
| kwargs | dict \| None | None | Additional arguments passed to responses.create() |
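
Anything the constructor does not model directly can be forwarded through kwargs, which is merged into every responses.create() call. A hedged sketch (temperature is an assumed passthrough parameter, not one this client validates):

from stirrup.clients import OpenResponsesClient

client = OpenResponsesClient(
    model="o3",
    reasoning_effort="high",
    kwargs={"temperature": 0.2},  # forwarded verbatim to responses.create()
)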

API Reference

stirrup.clients.open_responses_client

OpenAI SDK-based LLM client for the Responses API.

This client uses the official OpenAI Python SDK's responses.create() method, supporting both OpenAI's API and any OpenAI-compatible endpoint that implements the Responses API via the base_url parameter.

__all__ module-attribute

__all__ = ['OpenResponsesClient']

LOGGER module-attribute

LOGGER = getLogger(__name__)

ChatMessage

ChatMessage = Annotated[
    SystemMessage
    | UserMessage
    | AssistantMessage
    | ToolMessage,
    Field(discriminator="role"),
]

Discriminated union of all message types, automatically parsed based on the role field.
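
Because the union is discriminated, a raw dict can be validated into the right concrete model in one step. A minimal sketch using Pydantic's TypeAdapter (the stirrup.core.models import path is assumed from the source listings on this page):

from pydantic import TypeAdapter

from stirrup.core.models import ChatMessage  # import path assumed

adapter = TypeAdapter(ChatMessage)

# The "role" discriminator selects UserMessage here without trying the other models.
msg = adapter.validate_python({"role": "user", "content": "Hello"})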

Content

Content = list[ContentBlock] | str

Message content: either a plain string or list of mixed content blocks.

ContextOverflowError

Bases: Exception

Raised when LLM context window is exceeded (max_tokens or length finish_reason).
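
A typical response is to shrink the conversation and retry. A hedged sketch (the trimming policy here is illustrative, not part of stirrup):

from stirrup.core.models import ContextOverflowError  # import path assumed


async def generate_with_trimming(client, messages, tools):
    """Retry once with a shortened history when the context window is exceeded."""
    try:
        return await client.generate(messages, tools)
    except ContextOverflowError:
        # Keep the opening message plus the most recent turns and try again.
        return await client.generate(messages[:1] + messages[-4:], tools)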

AssistantMessage

Bases: BaseModel

LLM response message with optional tool calls and token usage tracking.

AudioContentBlock

Bases: BinaryContentBlock

Audio content supporting MPEG, WAV, AAC, and other common audio formats.

to_base64_url

to_base64_url(bitrate: str = '192k') -> str

Transcode to MP3 and return base64 data URL.

Source code in src/stirrup/core/models.py
def to_base64_url(self, bitrate: str = "192k") -> str:
    """Transcode to MP3 and return base64 data URL."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=UserWarning, module="moviepy.*")
        with NamedTemporaryFile(suffix=".bin") as fin, NamedTemporaryFile(suffix=".mp3") as fout:
            fin.write(self.data)
            fin.flush()
            clip = AudioFileClip(fin.name)
            clip.write_audiofile(fout.name, codec="libmp3lame", bitrate=bitrate, logger=None)
            clip.close()
            return f"data:audio/mpeg;base64,{b64encode(fout.read()).decode()}"

EmptyParams

Bases: BaseModel

Empty parameter model for tools that don't require parameters.

ImageContentBlock

Bases: BinaryContentBlock

Image content supporting PNG, JPEG, WebP, PSD formats with automatic downscaling.

to_base64_url

to_base64_url(
    max_pixels: int | None = RESOLUTION_1MP,
) -> str

Convert image to base64 data URL, optionally resizing to max pixel count.

Source code in src/stirrup/core/models.py
def to_base64_url(self, max_pixels: int | None = RESOLUTION_1MP) -> str:
    """Convert image to base64 data URL, optionally resizing to max pixel count."""
    img: Image.Image = Image.open(BytesIO(self.data))
    if max_pixels is not None and img.width * img.height > max_pixels:
        tw, th = downscale_image(img.width, img.height, max_pixels)
        img.thumbnail((tw, th), Image.Resampling.LANCZOS)
    if img.mode != "RGB":
        img = img.convert("RGB")
    buf = BytesIO()
    img.save(buf, format="PNG")
    return f"data:image/png;base64,{b64encode(buf.getvalue()).decode()}"

LLMClient

Bases: Protocol

Protocol defining the interface for LLM client implementations.

Any LLM client must implement this protocol to work with the Agent class. Provides text generation with tool support and model capability inspection.
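
A minimal conforming implementation, assuming the protocol's surface is the generate() method plus the max_tokens and model_slug properties documented for OpenResponsesClient below (constructor defaults here are illustrative):

from stirrup.core.models import (  # import path assumed
    AssistantMessage,
    ChatMessage,
    TokenUsage,
    Tool,
)


class EchoClient:
    """Toy LLMClient that echoes the last message instead of calling an API."""

    def __init__(self, model: str = "echo", max_tokens: int = 1024) -> None:
        self._model = model
        self._max_tokens = max_tokens

    @property
    def max_tokens(self) -> int:
        return self._max_tokens

    @property
    def model_slug(self) -> str:
        return self._model

    async def generate(
        self, messages: list[ChatMessage], tools: dict[str, Tool]
    ) -> AssistantMessage:
        # Echo the last message's text; a real client would call an API here.
        last = messages[-1].content
        text = last if isinstance(last, str) else "(non-text content)"
        return AssistantMessage(
            content=text,
            tool_calls=[],
            token_usage=TokenUsage(input=0, output=0, reasoning=0),
        )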

Reasoning

Bases: BaseModel

Extended thinking/reasoning content from models that support chain-of-thought reasoning.

SystemMessage

Bases: BaseModel

System-level instructions and context for the LLM.

TokenUsage

Bases: BaseModel

Token counts for LLM usage (input, output, reasoning tokens).

total property

total: int

Total token count across input, output, and reasoning.

__add__

__add__(other: TokenUsage) -> TokenUsage

Add two TokenUsage objects together, summing each field independently.

Source code in src/stirrup/core/models.py
def __add__(self, other: "TokenUsage") -> "TokenUsage":
    """Add two TokenUsage objects together, summing each field independently."""
    return TokenUsage(
        input=self.input + other.input,
        output=self.output + other.output,
        reasoning=self.reasoning + other.reasoning,
    )
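
Per-turn usage can therefore be accumulated across a session, and total sums all three fields:

from stirrup.core.models import TokenUsage  # import path assumed

turn_1 = TokenUsage(input=1_200, output=300, reasoning=0)
turn_2 = TokenUsage(input=1_500, output=450, reasoning=800)

session_usage = turn_1 + turn_2
print(session_usage.total)  # 4250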

Tool

Bases: BaseModel

Tool definition with name, description, parameter schema, and executor function.

Generic over:

- P: Parameter model type (Pydantic BaseModel subclass, or EmptyParams for parameterless tools)
- M: Metadata type (should implement Addable for aggregation; use None for tools without metadata)

Tools are simple, stateless callables. For tools requiring lifecycle management (setup/teardown, resource pooling), use a ToolProvider instead.

Example with parameters:
class CalcParams(BaseModel):
    expression: str

calc_tool = Tool[CalcParams, None](
    name="calc",
    description="Evaluate math",
    parameters=CalcParams,
    executor=lambda p: ToolResult(content=str(eval(p.expression))),
)

Example without parameters (uses EmptyParams by default):

time_tool = Tool[EmptyParams, None](
    name="time",
    description="Get current time",
    executor=lambda _: ToolResult(content=datetime.now().isoformat()),
)

ToolCall

Bases: BaseModel

Represents a tool invocation request from the LLM.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| name | str | Name of the tool to invoke |
| arguments | str | JSON string containing tool parameters |
| tool_call_id | str \| None | Unique identifier for tracking this tool call and its result |

ToolMessage

Bases: BaseModel

Tool execution result returned to the LLM.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| role | Literal['tool'] | Always "tool" |
| content | Content | The tool result content |
| tool_call_id | str \| None | ID linking this result to the corresponding tool call |
| name | str \| None | Name of the tool that was called |
| args_was_valid | bool | Whether the tool arguments were valid |
| success | bool | Whether the tool executed successfully (used by finish tool to control termination) |
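
The tool_call_id is what ties a result back to the request that produced it. A small sketch pairing the two models (field defaults assumed where not documented):

from stirrup.core.models import ToolCall, ToolMessage  # import path assumed

call = ToolCall(tool_call_id="call_123", name="calc", arguments='{"expression": "2+2"}')

result = ToolMessage(
    role="tool",
    content="4",
    tool_call_id=call.tool_call_id,  # links the result to the call above
    name=call.name,
    args_was_valid=True,
    success=True,
)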

UserMessage

Bases: BaseModel

User input message to the LLM.

VideoContentBlock

Bases: BinaryContentBlock

MP4 video content with automatic transcoding and resolution downscaling.

to_base64_url

to_base64_url(
    max_pixels: int | None = RESOLUTION_480P,
    fps: int | None = None,
) -> str

Transcode to MP4 and return base64 data URL.

Source code in src/stirrup/core/models.py
def to_base64_url(self, max_pixels: int | None = RESOLUTION_480P, fps: int | None = None) -> str:
    """Transcode to MP4 and return base64 data URL."""
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=UserWarning, module="moviepy.*")
        with NamedTemporaryFile(suffix=".mp4") as fin, NamedTemporaryFile(suffix=".mp4") as fout:
            fin.write(self.data)
            fin.flush()
            clip = VideoFileClip(fin.name)
            tw, th = downscale_image(int(clip.w), int(clip.h), max_pixels)
            clip = clip.with_effects([Resize(new_size=(tw, th))])

            clip.write_videofile(
                fout.name,
                codec="libx264",
                fps=fps,
                audio=clip.audio is not None,
                audio_codec="aac",
                preset="veryfast",
                logger=None,
            )
            clip.close()
            return f"data:video/mp4;base64,{b64encode(fout.read()).decode()}"

OpenResponsesClient

OpenResponsesClient(
    model: str,
    max_tokens: int = 64000,
    *,
    base_url: str | None = None,
    api_key: str | None = None,
    reasoning_effort: str | None = None,
    timeout: float | None = None,
    max_retries: int = 2,
    instructions: str | None = None,
    kwargs: dict[str, Any] | None = None,
)

Bases: LLMClient

OpenAI SDK-based client using the Responses API.

Uses the official OpenAI Python SDK's responses.create() method. Supports custom base_url for OpenAI-compatible providers that implement the Responses API.

Includes automatic retries for transient failures and token usage tracking.

Example

Standard OpenAI usage

client = OpenResponsesClient(model="gpt-4o", max_tokens=128_000)

Custom OpenAI-compatible endpoint

client = OpenResponsesClient(
    model="gpt-4o",
    base_url="http://localhost:8000/v1",
    api_key="your-api-key",
)

Initialize OpenAI SDK client with model configuration for Responses API.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| model | str | Model identifier (e.g., 'gpt-4o', 'o1-preview'). | required |
| max_tokens | int | Maximum output tokens. Defaults to 64,000. | 64000 |
| base_url | str \| None | API base URL. If None, uses OpenAI's standard URL. Use for OpenAI-compatible providers. | None |
| api_key | str \| None | API key for authentication. If None, reads from the OPENAI_API_KEY environment variable. | None |
| reasoning_effort | str \| None | Reasoning effort level for extended thinking models (e.g., 'low', 'medium', 'high'). Only used with o1/o3 style models. | None |
| timeout | float \| None | Request timeout in seconds. If None, uses OpenAI SDK default. | None |
| max_retries | int | Number of retries for transient errors. Defaults to 2. | 2 |
| instructions | str \| None | Default system-level instructions. Can be overridden by SystemMessage in the messages list. | None |
| kwargs | dict[str, Any] \| None | Additional arguments passed to responses.create(). | None |

Source code in src/stirrup/clients/open_responses_client.py
def __init__(
    self,
    model: str,
    max_tokens: int = 64_000,
    *,
    base_url: str | None = None,
    api_key: str | None = None,
    reasoning_effort: str | None = None,
    timeout: float | None = None,
    max_retries: int = 2,
    instructions: str | None = None,
    kwargs: dict[str, Any] | None = None,
) -> None:
    """Initialize OpenAI SDK client with model configuration for Responses API.

    Args:
        model: Model identifier (e.g., 'gpt-4o', 'o1-preview').
        max_tokens: Maximum output tokens. Defaults to 64,000.
        base_url: API base URL. If None, uses OpenAI's standard URL.
            Use for OpenAI-compatible providers.
        api_key: API key for authentication. If None, reads from the OPENAI_API_KEY
            environment variable.
        reasoning_effort: Reasoning effort level for extended thinking models
            (e.g., 'low', 'medium', 'high'). Only used with o1/o3 style models.
        timeout: Request timeout in seconds. If None, uses OpenAI SDK default.
        max_retries: Number of retries for transient errors. Defaults to 2.
        instructions: Default system-level instructions. Can be overridden by
            SystemMessage in the messages list.
        kwargs: Additional arguments passed to responses.create().
    """
    self._model = model
    self._max_tokens = max_tokens
    self._reasoning_effort = reasoning_effort
    self._default_instructions = instructions
    self._kwargs = kwargs or {}

    # Initialize AsyncOpenAI client
    resolved_api_key = api_key or os.environ.get("OPENAI_API_KEY")

    # Strip /responses suffix if present - SDK appends it automatically
    resolved_base_url = base_url
    if resolved_base_url and resolved_base_url.rstrip("/").endswith("/responses"):
        resolved_base_url = resolved_base_url.rstrip("/").removesuffix("/responses")

    self._client = AsyncOpenAI(
        api_key=resolved_api_key,
        base_url=resolved_base_url,
        timeout=timeout,
        max_retries=max_retries,
    )

max_tokens property

max_tokens: int

Maximum output tokens.

model_slug property

model_slug: str

Model identifier.

generate async

generate(
    messages: list[ChatMessage], tools: dict[str, Tool]
) -> AssistantMessage

Generate assistant response with optional tool calls using Responses API.

Retries up to 3 times on transient errors (connection, timeout, rate limit, internal server errors) with exponential backoff.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| messages | list[ChatMessage] | List of conversation messages. | required |
| tools | dict[str, Tool] | Dictionary mapping tool names to Tool objects. | required |

Returns:

| Type | Description |
| --- | --- |
| AssistantMessage | AssistantMessage containing the model's response, any tool calls, and token usage statistics. |

Raises:

| Type | Description |
| --- | --- |
| ContextOverflowError | If the response is incomplete due to token limits. |

Source code in src/stirrup/clients/open_responses_client.py
@retry(
    retry=retry_if_exception_type(
        (
            APIConnectionError,
            APITimeoutError,
            RateLimitError,
            InternalServerError,
        )
    ),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
)
async def generate(
    self,
    messages: list[ChatMessage],
    tools: dict[str, Tool],
) -> AssistantMessage:
    """Generate assistant response with optional tool calls using Responses API.

    Retries up to 3 times on transient errors (connection, timeout, rate limit,
    internal server errors) with exponential backoff.

    Args:
        messages: List of conversation messages.
        tools: Dictionary mapping tool names to Tool objects.

    Returns:
        AssistantMessage containing the model's response, any tool calls,
        and token usage statistics.

    Raises:
        ContextOverflowError: If the response is incomplete due to token limits.
    """
    # Convert messages to OpenResponses format
    instructions, input_items = _to_open_responses_input(messages)

    # Use provided instructions or fall back to default
    final_instructions = instructions or self._default_instructions

    # Build request kwargs
    request_kwargs: dict[str, Any] = {
        "model": self._model,
        "input": input_items,
        "max_output_tokens": self._max_tokens,
        **self._kwargs,
    }

    # Add instructions if present
    if final_instructions:
        request_kwargs["instructions"] = final_instructions

    # Add tools if provided
    if tools:
        request_kwargs["tools"] = _to_open_responses_tools(tools)
        request_kwargs["tool_choice"] = "auto"

    # Add reasoning effort if configured (for o1/o3 models)
    if self._reasoning_effort:
        request_kwargs["reasoning"] = {"effort": self._reasoning_effort}

    # Make API call
    response = await self._client.responses.create(**request_kwargs)

    # Check for incomplete response (context overflow)
    if response.status == "incomplete":
        stop_reason = getattr(response, "incomplete_details", None)
        raise ContextOverflowError(
            f"Response incomplete for model {self.model_slug}: {stop_reason}. "
            "Reduce max_tokens or message length and try again."
        )

    # Parse response output
    content, tool_calls, reasoning = _parse_response_output(response.output)

    # Parse token usage
    usage = response.usage
    input_tokens = usage.input_tokens if usage else 0
    output_tokens = usage.output_tokens if usage else 0

    # Handle reasoning tokens if available
    reasoning_tokens = 0
    if usage and hasattr(usage, "output_tokens_details") and usage.output_tokens_details:
        reasoning_tokens = getattr(usage.output_tokens_details, "reasoning_tokens", 0) or 0
        output_tokens = output_tokens - reasoning_tokens

    return AssistantMessage(
        reasoning=reasoning,
        content=content,
        tool_calls=tool_calls,
        token_usage=TokenUsage(
            input=input_tokens,
            output=output_tokens,
            reasoning=reasoning_tokens,
        ),
    )
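
A minimal end-to-end call, without tools (the UserMessage import path is assumed; OPENAI_API_KEY must be set):

import asyncio

from stirrup.clients import OpenResponsesClient
from stirrup.core.models import UserMessage  # import path assumed


async def main() -> None:
    client = OpenResponsesClient(model="gpt-4o")
    reply = await client.generate(
        [UserMessage(role="user", content="Say hi.")],
        tools={},  # no tools registered, so the model can only respond with text
    )
    print(reply.content, reply.token_usage.total)


asyncio.run(main())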

_content_to_open_responses_input

_content_to_open_responses_input(
    content: Content,
) -> list[dict[str, Any]]

Convert Content blocks to OpenResponses input content format.

Uses input_text for text content (vs output_text for responses).

Source code in src/stirrup/clients/open_responses_client.py
def _content_to_open_responses_input(content: Content) -> list[dict[str, Any]]:
    """Convert Content blocks to OpenResponses input content format.

    Uses input_text for text content (vs output_text for responses).
    """
    if isinstance(content, str):
        return [{"type": "input_text", "text": content}]

    out: list[dict[str, Any]] = []
    for block in content:
        if isinstance(block, str):
            out.append({"type": "input_text", "text": block})
        elif isinstance(block, ImageContentBlock):
            out.append({"type": "input_image", "image_url": block.to_base64_url()})
        elif isinstance(block, AudioContentBlock):
            out.append(
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": block.to_base64_url().split(",")[1],
                        "format": block.extension,
                    },
                }
            )
        elif isinstance(block, VideoContentBlock):
            out.append({"type": "input_file", "file_data": block.to_base64_url()})
        else:
            raise NotImplementedError(f"Unsupported content block: {type(block)}")
    return out
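
For instance, the plain-string case produces a single input_text block:

_content_to_open_responses_input("What is in this image?")
# -> [{"type": "input_text", "text": "What is in this image?"}]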

_content_to_open_responses_output

_content_to_open_responses_output(
    content: Content,
) -> list[dict[str, Any]]

Convert Content blocks to OpenResponses output content format.

Uses output_text for assistant message content.

Source code in src/stirrup/clients/open_responses_client.py
def _content_to_open_responses_output(content: Content) -> list[dict[str, Any]]:
    """Convert Content blocks to OpenResponses output content format.

    Uses output_text for assistant message content.
    """
    if isinstance(content, str):
        return [{"type": "output_text", "text": content}]

    out: list[dict[str, Any]] = []
    for block in content:
        if isinstance(block, str):
            out.append({"type": "output_text", "text": block})
        else:
            raise NotImplementedError(f"Unsupported output content block: {type(block)}")
    return out

_to_open_responses_tools

_to_open_responses_tools(
    tools: dict[str, Tool],
) -> list[dict[str, Any]]

Convert Tool objects to OpenResponses function format.

OpenResponses API expects tools with name/description/parameters at top level, not nested under a 'function' key like Chat Completions API.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| tools | dict[str, Tool] | Dictionary mapping tool names to Tool objects. | required |

Returns:

| Type | Description |
| --- | --- |
| list[dict[str, Any]] | List of tool definitions in OpenResponses format. |

Source code in src/stirrup/clients/open_responses_client.py
def _to_open_responses_tools(tools: dict[str, Tool]) -> list[dict[str, Any]]:
    """Convert Tool objects to OpenResponses function format.

    OpenResponses API expects tools with name/description/parameters at top level,
    not nested under a 'function' key like Chat Completions API.

    Args:
        tools: Dictionary mapping tool names to Tool objects.

    Returns:
        List of tool definitions in OpenResponses format.
    """
    out: list[dict[str, Any]] = []
    for t in tools.values():
        tool_def: dict[str, Any] = {
            "type": "function",
            "name": t.name,
            "description": t.description,
        }
        if t.parameters is not EmptyParams:
            tool_def["parameters"] = t.parameters.model_json_schema()
        out.append(tool_def)
    return out
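
Applied to the calc_tool from the Tool example above, this yields a flat definition rather than a Chat Completions-style nested one:

_to_open_responses_tools({"calc": calc_tool})
# -> [{
#     "type": "function",
#     "name": "calc",
#     "description": "Evaluate math",
#     "parameters": {...},  # CalcParams.model_json_schema()
# }]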

_to_open_responses_input

_to_open_responses_input(
    msgs: list[ChatMessage],
) -> tuple[str | None, list[dict[str, Any]]]

Convert ChatMessage list to OpenResponses (instructions, input) tuple.

SystemMessage content is extracted as the instructions parameter. Other messages are converted to input items.

Returns:

| Type | Description |
| --- | --- |
| tuple[str \| None, list[dict[str, Any]]] | Tuple of (instructions, input_items) where instructions is the system message content (or None) and input_items is the list of input items. |

Source code in src/stirrup/clients/open_responses_client.py
def _to_open_responses_input(
    msgs: list[ChatMessage],
) -> tuple[str | None, list[dict[str, Any]]]:
    """Convert ChatMessage list to OpenResponses (instructions, input) tuple.

    SystemMessage content is extracted as the instructions parameter.
    Other messages are converted to input items.

    Returns:
        Tuple of (instructions, input_items) where instructions is the system
        message content (or None) and input_items is the list of input items.
    """
    instructions: str | None = None
    input_items: list[dict[str, Any]] = []

    for m in msgs:
        if isinstance(m, SystemMessage):
            # Extract system message as instructions
            if isinstance(m.content, str):
                instructions = m.content
            else:
                # Join text content blocks for instructions
                instructions = "\n".join(block if isinstance(block, str) else "" for block in m.content)
        elif isinstance(m, UserMessage):
            input_items.append(
                {
                    "role": "user",
                    "content": _content_to_open_responses_input(m.content),
                }
            )
        elif isinstance(m, AssistantMessage):
            # For assistant messages, we need to add them as response output items
            # First add any text content as a message item
            content_str = (
                m.content
                if isinstance(m.content, str)
                else "\n".join(block if isinstance(block, str) else "" for block in m.content)
            )
            if content_str:
                input_items.append(
                    {
                        "type": "message",
                        "role": "assistant",
                        "content": [{"type": "output_text", "text": content_str}],
                    }
                )

            # Add tool calls as separate function_call items
            input_items.extend(
                {
                    "type": "function_call",
                    "call_id": tc.tool_call_id,
                    "name": tc.name,
                    "arguments": tc.arguments,
                }
                for tc in m.tool_calls
            )
        elif isinstance(m, ToolMessage):
            # Tool results are function_call_output items
            content_str = m.content if isinstance(m.content, str) else str(m.content)
            input_items.append(
                {
                    "type": "function_call_output",
                    "call_id": m.tool_call_id,
                    "output": content_str,
                }
            )
        else:
            raise NotImplementedError(f"Unsupported message type: {type(m)}")

    return instructions, input_items
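
A small illustration of the split (message constructors assumed from the models documented above):

from stirrup.core.models import SystemMessage, UserMessage  # import path assumed

instructions, items = _to_open_responses_input(
    [
        SystemMessage(role="system", content="You are terse."),
        UserMessage(role="user", content="Hi"),
    ]
)
# instructions == "You are terse."
# items == [{"role": "user", "content": [{"type": "input_text", "text": "Hi"}]}]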

_get_attr

_get_attr(obj: Any, name: str, default: Any = None) -> Any

Get attribute from object or dict, with fallback default.

Source code in src/stirrup/clients/open_responses_client.py
def _get_attr(obj: Any, name: str, default: Any = None) -> Any:  # noqa: ANN401
    """Get attribute from object or dict, with fallback default."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)

_parse_response_output

_parse_response_output(
    output: list[Any],
) -> tuple[str, list[ToolCall], Reasoning | None]

Parse response output items into content, tool_calls, and reasoning.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| output | list[Any] | List of output items from the response. | required |

Returns:

| Type | Description |
| --- | --- |
| tuple[str, list[ToolCall], Reasoning \| None] | Tuple of (content_text, tool_calls, reasoning). |

Source code in src/stirrup/clients/open_responses_client.py
def _parse_response_output(
    output: list[Any],
) -> tuple[str, list[ToolCall], Reasoning | None]:
    """Parse response output items into content, tool_calls, and reasoning.

    Args:
        output: List of output items from the response.

    Returns:
        Tuple of (content_text, tool_calls, reasoning).
    """
    content_parts: list[str] = []
    tool_calls: list[ToolCall] = []
    reasoning: Reasoning | None = None

    for item in output:
        item_type = _get_attr(item, "type")

        if item_type == "message":
            # Extract text content from message
            msg_content = _get_attr(item, "content", [])
            for content_item in msg_content:
                content_type = _get_attr(content_item, "type")
                if content_type == "output_text":
                    text = _get_attr(content_item, "text", "")
                    content_parts.append(text)

        elif item_type == "function_call":
            call_id = _get_attr(item, "call_id")
            name = _get_attr(item, "name")
            arguments = _get_attr(item, "arguments", "")
            tool_calls.append(
                ToolCall(
                    tool_call_id=call_id,
                    name=name,
                    arguments=arguments,
                )
            )

        elif item_type == "reasoning":
            # Extract reasoning/thinking content - try multiple possible attribute names
            # summary can be a list of Summary objects with .text attribute
            summary = _get_attr(item, "summary")
            if summary:
                if isinstance(summary, list):
                    # Extract text from Summary objects
                    thinking = "\n".join(_get_attr(s, "text", "") for s in summary if _get_attr(s, "text"))
                else:
                    thinking = str(summary)
            else:
                thinking = _get_attr(item, "thinking") or ""

            if thinking:
                reasoning = Reasoning(content=thinking)

    return "\n".join(content_parts), tool_calls, reasoning