LiteLLM Client

Optional Dependency

LiteLLM is an optional dependency. Install with:

pip install stirrup[litellm]  # or: uv add stirrup[litellm]

stirrup.clients.litellm_client

LiteLLM-based LLM client for multi-provider support.

This client uses LiteLLM to provide a unified interface to multiple LLM providers (OpenAI, Anthropic, Google, etc.) with automatic retries for transient failures.

Requires the litellm extra: pip install stirrup[litellm]
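A minimal usage sketch follows. The message import path is an assumption (placed alongside TokenUsage in stirrup.core.models), and the provider API key is read from the environment per LiteLLM's usual conventions:

import asyncio

from stirrup.clients.litellm_client import LiteLLMClient
from stirrup.core.models import UserMessage  # assumed location, same module as TokenUsage

client = LiteLLMClient(
    model_slug="anthropic/claude-3-5-sonnet-20241022",
    max_tokens=8192,
)


async def main() -> None:
    # No tools registered here, so pass an empty dict.
    reply = await client.generate([UserMessage(content="Say hello.")], tools={})
    print(reply.content)


asyncio.run(main())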

__all__ module-attribute

__all__ = ['LiteLLMClient']

LOGGER module-attribute

LOGGER = getLogger(__name__)

ChatMessage

ChatMessage = Annotated[
    SystemMessage
    | UserMessage
    | AssistantMessage
    | ToolMessage,
    Field(discriminator="role"),
]

Discriminated union of all message types, automatically parsed based on role field.
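Because the union discriminates on role, a plain dict parses straight to the right message class. A small sketch using pydantic's TypeAdapter:

from pydantic import TypeAdapter

adapter = TypeAdapter(ChatMessage)

# The "role" field selects the concrete model at validation time.
msg = adapter.validate_python({"role": "user", "content": "What is 2 + 2?"})
# msg is now a UserMessage instance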

ContextOverflowError

Bases: Exception

Raised when LLM context window is exceeded (max_tokens or length finish_reason).
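One way to handle it, sketched here for an async agent loop: catch the error, trim the oldest non-system messages, and retry with a smaller context.

try:
    reply = await client.generate(messages, tools)
except ContextOverflowError:
    # Keep the system prompt, drop older turns, and retry.
    messages = messages[:1] + messages[-4:]
    reply = await client.generate(messages, tools)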

AssistantMessage

Bases: BaseModel

LLM response message with optional tool calls and token usage tracking.

LLMClient

Bases: Protocol

Protocol defining the interface for LLM client implementations.

Any LLM client must implement this protocol to work with the Agent class. Provides text generation with tool support and model capability inspection.
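A sketch of a conforming client, based only on the members documented on this page (generate, model_slug, max_tokens); the real protocol may require more:

from stirrup.core.models import AssistantMessage, TokenUsage  # assumed location


class EchoClient:
    """Toy client exposing the documented LLMClient surface."""

    @property
    def model_slug(self) -> str:
        return "echo/echo-1"

    @property
    def max_tokens(self) -> int:
        return 4096

    async def generate(self, messages, tools):
        # Echo the last message back; a real client would call a provider here.
        return AssistantMessage(
            content=f"echo: {messages[-1].content}",
            tool_calls=[],
            token_usage=TokenUsage(input=0, output=0, reasoning=0),
        )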

Reasoning

Bases: BaseModel

Extended thinking/reasoning content from models that support chain-of-thought reasoning.

TokenUsage

Bases: BaseModel

Token counts for LLM usage (input, output, reasoning tokens).

total property

total: int

Total token count across input, output, and reasoning.

__add__

__add__(other: TokenUsage) -> TokenUsage

Add two TokenUsage objects together, summing each field independently.

Source code in src/stirrup/core/models.py
def __add__(self, other: "TokenUsage") -> "TokenUsage":
    """Add two TokenUsage objects together, summing each field independently."""
    return TokenUsage(
        input=self.input + other.input,
        output=self.output + other.output,
        reasoning=self.reasoning + other.reasoning,
    )
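Because addition is field-wise, per-turn usage accumulates cleanly over a run:

from stirrup.core.models import TokenUsage

replies = []  # AssistantMessage objects collected from successive generate() calls
total = TokenUsage(input=0, output=0, reasoning=0)
for reply in replies:
    total = total + reply.token_usage
print(total.total)  # input + output + reasoning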

Tool

Bases: BaseModel

Tool definition with name, description, parameter schema, and executor function.

Generic over

P: Parameter model type (must be a Pydantic BaseModel, or None for parameterless tools)
M: Metadata type (should implement Addable for aggregation; use None for tools without metadata)

Tools are simple, stateless callables. For tools requiring lifecycle management (setup/teardown, resource pooling), use a ToolProvider instead.

Example with parameters

class CalcParams(BaseModel):
    expression: str

calc_tool = Tool[CalcParams, None](...)

Example without parameters

time_tool = Tool[None, None](...)
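A fuller sketch of the calculator tool. The constructor field names follow the summary above (name, description, parameter schema, executor); the executor's exact signature and return shape are assumptions:

from pydantic import BaseModel

from stirrup.core.models import Tool  # assumed location


class CalcParams(BaseModel):
    expression: str


async def run_calc(params: CalcParams):
    # eval() is for illustration only; the (result, metadata) return shape is an assumption.
    return str(eval(params.expression)), None


calc_tool = Tool[CalcParams, None](
    name="calculator",
    description="Evaluate a Python arithmetic expression.",
    parameters=CalcParams,
    executor=run_calc,
)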

ToolCall

Bases: BaseModel

Represents a tool invocation request from the LLM.

Attributes:

name (str): Name of the tool to invoke

arguments (str): JSON string containing tool parameters

tool_call_id (str | None): Unique identifier for tracking this tool call and its result
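Since arguments is a JSON string, executing a call typically means validating it against the tool's Pydantic parameter model (reusing CalcParams from the Tool example above; import path assumed):

from stirrup.core.models import ToolCall  # assumed location

call = ToolCall(
    tool_call_id="call_1",
    name="calculator",
    arguments='{"expression": "2 + 2"}',
)

# Validate the raw JSON against the tool's parameter model.
params = CalcParams.model_validate_json(call.arguments)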

LiteLLMClient

LiteLLMClient(
    model_slug: str,
    max_tokens: int,
    supports_audio_input: bool = False,
    reasoning_effort: str | None = None,
    kwargs: dict[str, Any] | None = None,
)

Bases: LLMClient

LiteLLM-based client supporting multiple LLM providers with unified interface.

Includes automatic retries for transient failures and token usage tracking.

Initialize LiteLLM client with model configuration and capabilities.

Parameters:

model_slug (str, required): Model identifier for LiteLLM (e.g., 'anthropic/claude-3-5-sonnet-20241022')

max_tokens (int, required): Maximum context window size in tokens

supports_audio_input (bool, default False): Whether the model supports audio inputs

reasoning_effort (str | None, default None): Reasoning effort level for extended thinking models (e.g., 'medium', 'high')

kwargs (dict[str, Any] | None, default None): Additional arguments to pass to LiteLLM completion calls
Source code in src/stirrup/clients/litellm_client.py
def __init__(
    self,
    model_slug: str,
    max_tokens: int,
    supports_audio_input: bool = False,
    reasoning_effort: str | None = None,
    kwargs: dict[str, Any] | None = None,
) -> None:
    """Initialize LiteLLM client with model configuration and capabilities.

    Args:
        model_slug: Model identifier for LiteLLM (e.g., 'anthropic/claude-3-5-sonnet-20241022')
        max_tokens: Maximum context window size in tokens
        supports_audio_input: Whether the model supports audio inputs
        reasoning_effort: Reasoning effort level for extended thinking models (e.g., 'medium', 'high')
        kwargs: Additional arguments to pass to LiteLLM completion calls
    """
    self._model_slug = model_slug
    self._supports_video_input = False
    self._supports_audio_input = supports_audio_input
    self._max_tokens = max_tokens
    self._reasoning_effort = reasoning_effort
    self._kwargs = kwargs or {}
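The kwargs dict is stored once and forwarded to every completion call, so provider-specific options ride along; for example:

client = LiteLLMClient(
    model_slug="openai/gpt-4o",
    max_tokens=16384,
    kwargs={"temperature": 0.2},  # forwarded verbatim to acompletion()
)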

max_tokens property

max_tokens: int

Maximum context window size in tokens.

model_slug property

model_slug: str

Model identifier used by LiteLLM.

generate async

generate(
    messages: list[ChatMessage], tools: dict[str, Tool]
) -> AssistantMessage

Generate assistant response with optional tool calls. Retries up to 3 times on timeout, connection, and rate-limit errors.

Source code in src/stirrup/clients/litellm_client.py
@retry(
    retry=retry_if_exception_type((Timeout, APIConnectionError, RateLimitError)),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
)
async def generate(self, messages: list[ChatMessage], tools: dict[str, Tool]) -> AssistantMessage:
    """Generate assistant response with optional tool calls. Retries up to 3 times on timeout/connection errors."""
    r = await acompletion(
        model=self.model_slug,
        messages=to_openai_messages(messages),
        tools=to_openai_tools(tools) if tools else None,
        tool_choice="auto" if tools else None,
        max_tokens=self._max_tokens,
        **self._kwargs,
    )

    choice = r["choices"][0]

    if choice.finish_reason in ["max_tokens", "length"]:
        raise ContextOverflowError(
            f"Maximal context window tokens reached for model {self.model_slug}, resulting in finish reason: {choice.finish_reason}. Reduce agent.max_tokens and try again."
        )

    msg = choice["message"]

    reasoning: Reasoning | None = None
    if getattr(msg, "reasoning_content", None) is not None:
        reasoning = Reasoning(content=msg.reasoning_content)
    if getattr(msg, "thinking_blocks", None) is not None and len(msg.thinking_blocks) > 0:
        reasoning = Reasoning(
            signature=msg.thinking_blocks[0]["signature"], content=msg.thinking_blocks[0]["content"]
        )

    usage = r["usage"]

    calls = [
        ToolCall(
            tool_call_id=tc.get("id"),
            name=tc["function"]["name"],
            arguments=tc["function"].get("arguments", "") or "",
        )
        for tc in (msg.get("tool_calls") or [])
    ]

    input_tokens = usage.prompt_tokens
    reasoning_tokens = 0
    if usage.completion_tokens_details:
        reasoning_tokens = usage.completion_tokens_details.reasoning_tokens or 0
    output_tokens = usage.completion_tokens - reasoning_tokens

    return AssistantMessage(
        reasoning=reasoning,
        content=msg.get("content") or "",
        tool_calls=calls,
        token_usage=TokenUsage(
            input=input_tokens,
            output=output_tokens,
            reasoning=reasoning_tokens,
        ),
    )
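A sketch of the surrounding loop: dispatch each returned ToolCall, answer with a ToolMessage carrying the same tool_call_id, and generate again. The execute_tool helper and the message constructors are assumptions:

from stirrup.core.models import ToolMessage  # assumed location


async def run_turn(client, messages, tools):
    reply = await client.generate(messages, tools)
    messages.append(reply)

    for call in reply.tool_calls:
        result = await execute_tool(tools[call.name], call.arguments)  # hypothetical helper
        messages.append(
            ToolMessage(
                tool_call_id=call.tool_call_id,  # ties the result to the request
                name=call.name,
                content=result,
            )
        )

    # If tools ran, let the model continue with their results in context.
    if reply.tool_calls:
        return await client.generate(messages, tools)
    return reply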

to_openai_messages

to_openai_messages(
    msgs: list[ChatMessage],
) -> list[dict[str, Any]]

Convert ChatMessage list to OpenAI-compatible message dictionaries.

Handles all message types: SystemMessage, UserMessage, AssistantMessage, and ToolMessage. Preserves reasoning content and tool calls for assistant messages.

Parameters:

msgs (list[ChatMessage], required): List of ChatMessage objects (System, User, Assistant, or Tool messages).

Returns:

list[dict[str, Any]]: List of message dictionaries ready for the OpenAI API.

Raises:

NotImplementedError: If an unsupported message type is encountered.

Source code in src/stirrup/clients/utils.py
def to_openai_messages(msgs: list[ChatMessage]) -> list[dict[str, Any]]:
    """Convert ChatMessage list to OpenAI-compatible message dictionaries.

    Handles all message types: SystemMessage, UserMessage, AssistantMessage,
    and ToolMessage. Preserves reasoning content and tool calls for assistant
    messages.

    Args:
        msgs: List of ChatMessage objects (System, User, Assistant, or Tool messages).

    Returns:
        List of message dictionaries ready for the OpenAI API.

    Raises:
        NotImplementedError: If an unsupported message type is encountered.
    """
    out: list[dict[str, Any]] = []
    for m in msgs:
        if isinstance(m, SystemMessage):
            out.append({"role": "system", "content": content_to_openai(m.content)})
        elif isinstance(m, UserMessage):
            out.append({"role": "user", "content": content_to_openai(m.content)})
        elif isinstance(m, AssistantMessage):
            msg: dict[str, Any] = {"role": "assistant", "content": content_to_openai(m.content)}

            if m.reasoning:
                if m.reasoning.content:
                    msg["reasoning_content"] = m.reasoning.content

                if m.reasoning.signature:
                    msg["thinking_blocks"] = [
                        {"type": "thinking", "signature": m.reasoning.signature, "thinking": m.reasoning.content}
                    ]

            if m.tool_calls:
                msg["tool_calls"] = []
                for tool in m.tool_calls:
                    tool_dict = tool.model_dump()
                    tool_dict["id"] = tool.tool_call_id
                    tool_dict["type"] = "function"
                    tool_dict["function"] = {
                        "name": tool.name,
                        "arguments": tool.arguments,
                    }
                    msg["tool_calls"].append(tool_dict)

            out.append(msg)
        elif isinstance(m, ToolMessage):
            out.append(
                {
                    "role": "tool",
                    "content": content_to_openai(m.content),
                    "tool_call_id": m.tool_call_id,
                    "name": m.name,
                }
            )
        else:
            raise NotImplementedError(f"Unsupported message type: {type(m)}")

    return out
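Round-tripping a short conversation shows the output shape (message constructors as assumed earlier):

from stirrup.core.models import SystemMessage, UserMessage  # assumed location

wire = to_openai_messages(
    [
        SystemMessage(content="You are terse."),
        UserMessage(content="Hi"),
    ]
)
# [{'role': 'system', 'content': ...}, {'role': 'user', 'content': ...}]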

to_openai_tools

to_openai_tools(
    tools: dict[str, Tool],
) -> list[dict[str, Any]]

Convert Tool objects to OpenAI function calling format.

Parameters:

tools (dict[str, Tool], required): Dictionary mapping tool names to Tool objects.

Returns:

list[dict[str, Any]]: List of tool definitions in OpenAI's function calling format.

Example

tools = {"calculator": calculator_tool} openai_tools = to_openai_tools(tools)

Returns: [{"type": "function", "function": {"name": "calculator", ...}}]

Source code in src/stirrup/clients/utils.py
def to_openai_tools(tools: dict[str, Tool]) -> list[dict[str, Any]]:
    """Convert Tool objects to OpenAI function calling format.

    Args:
        tools: Dictionary mapping tool names to Tool objects.

    Returns:
        List of tool definitions in OpenAI's function calling format.

    Example:
        >>> tools = {"calculator": calculator_tool}
        >>> openai_tools = to_openai_tools(tools)
        >>> # Returns: [{"type": "function", "function": {"name": "calculator", ...}}]
    """
    out: list[dict[str, Any]] = []
    for t in tools.values():
        function: dict[str, Any] = {
            "name": t.name,
            "description": t.description,
        }
        if t.parameters is not None:
            function["parameters"] = t.parameters.model_json_schema()
        tool_payload: dict[str, Any] = {
            "type": "function",
            "function": function,
        }
        out.append(tool_payload)
    return out
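Because parameters comes from model_json_schema(), the OpenAI payload embeds the Pydantic model's JSON Schema directly. A sketch, with Tool constructor fields as assumed in the Tool example above and a hypothetical get_weather executor:

from pydantic import BaseModel

from stirrup.core.models import Tool  # assumed location


class WeatherParams(BaseModel):
    city: str


weather_tool = Tool[WeatherParams, None](
    name="weather",
    description="Look up current weather for a city.",
    parameters=WeatherParams,
    executor=get_weather,  # hypothetical executor
)

payload = to_openai_tools({"weather": weather_tool})
# payload[0]["function"]["parameters"] == WeatherParams.model_json_schema()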