LiteLLM Client

Optional Dependency

LiteLLM is an optional dependency. Install with:

pip install stirrup[litellm]  # or: uv add stirrup[litellm]

stirrup.clients.litellm_client

LiteLLM-based LLM client for multi-provider support.

This client uses LiteLLM to provide a unified interface to multiple LLM providers (OpenAI, Anthropic, Google, etc.) with automatic retries for transient failures.

Requires the litellm extra: pip install stirrup[litellm]
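A minimal usage sketch follows. The message import path is an assumption (placed alongside TokenUsage in stirrup.core.models), and the provider API key is read from the environment per LiteLLM's usual conventions:

import asyncio

from stirrup.clients.litellm_client import LiteLLMClient
from stirrup.core.models import UserMessage  # assumed location, same module as TokenUsage

client = LiteLLMClient(
    model_slug="anthropic/claude-3-5-sonnet-20241022",
    max_tokens=8192,
)


async def main() -> None:
    # No tools registered here, so pass an empty dict.
    reply = await client.generate([UserMessage(content="Say hello.")], tools={})
    print(reply.content)


asyncio.run(main())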

__all__ module-attribute

__all__ = ['LiteLLMClient']

LOGGER module-attribute

LOGGER = getLogger(__name__)

ChatMessage

ChatMessage = Annotated[
    SystemMessage
    | UserMessage
    | AssistantMessage
    | ToolMessage,
    Field(discriminator="role"),
]

Discriminated union of all message types, automatically parsed based on role field.
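Because the union discriminates on role, a plain dict parses straight to the right message class. A small sketch using pydantic's TypeAdapter:

from pydantic import TypeAdapter

adapter = TypeAdapter(ChatMessage)

# The "role" field selects the concrete model at validation time.
msg = adapter.validate_python({"role": "user", "content": "What is 2 + 2?"})
# msg is now a UserMessage instance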

ContextOverflowError

Bases: Exception

Raised when LLM context window is exceeded (max_tokens or length finish_reason).
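One way to handle it, sketched here for an async agent loop: catch the error, trim the oldest non-system messages, and retry with a smaller context.

try:
    reply = await client.generate(messages, tools)
except ContextOverflowError:
    # Keep the system prompt, drop older turns, and retry.
    messages = messages[:1] + messages[-4:]
    reply = await client.generate(messages, tools)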

AssistantMessage

Bases: BaseModel

LLM response message with optional tool calls and token usage tracking.

LLMClient

Bases: Protocol

Protocol defining the interface for LLM client implementations.

Any LLM client must implement this protocol to work with the Agent class. Provides text generation with tool support and model capability inspection.
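A sketch of a conforming client, based only on the members documented on this page (generate, model_slug, max_tokens); the real protocol may require more:

from stirrup.core.models import AssistantMessage, TokenUsage  # assumed location


class EchoClient:
    """Toy client exposing the documented LLMClient surface."""

    @property
    def model_slug(self) -> str:
        return "echo/echo-1"

    @property
    def max_tokens(self) -> int:
        return 4096

    async def generate(self, messages, tools):
        # Echo the last message back; a real client would call a provider here.
        return AssistantMessage(
            content=f"echo: {messages[-1].content}",
            tool_calls=[],
            token_usage=TokenUsage(input=0, output=0, reasoning=0),
        )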

Reasoning

Bases: BaseModel

Extended thinking/reasoning content from models that support chain-of-thought reasoning.

TokenUsage

Bases: BaseModel

Token counts for LLM usage (input, output, reasoning tokens).

total property

total: int

Total token count across input, output, and reasoning.

__add__

__add__(other: TokenUsage) -> TokenUsage

Add two TokenUsage objects together, summing each field independently.

Source code in src/stirrup/core/models.py
def __add__(self, other: "TokenUsage") -> "TokenUsage":
    """Add two TokenUsage objects together, summing each field independently."""
    return TokenUsage(
        input=self.input + other.input,
        output=self.output + other.output,
        reasoning=self.reasoning + other.reasoning,
    )
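Because addition is field-wise, per-turn usage accumulates cleanly over a run:

from stirrup.core.models import TokenUsage

replies = []  # AssistantMessage objects collected from successive generate() calls
total = TokenUsage(input=0, output=0, reasoning=0)
for reply in replies:
    total = total + reply.token_usage
print(total.total)  # input + output + reasoning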

Tool

Bases: BaseModel

Tool definition with name, description, parameter schema, and executor function.

Generic over

P: Parameter model type (must be a Pydantic BaseModel, or None for parameterless tools)
M: Metadata type (should implement Addable for aggregation; use None for tools without metadata)

Tools are simple, stateless callables. For tools requiring lifecycle management (setup/teardown, resource pooling), use a ToolProvider instead.

Example with parameters

class CalcParams(BaseModel):
    expression: str

calc_tool = Tool[CalcParams, None](...)

Example without parameters

time_tool = Tool[None, None](...)
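A fuller sketch of the calculator tool. The constructor field names follow the summary above (name, description, parameter schema, executor); the executor's exact signature and return shape are assumptions:

from pydantic import BaseModel

from stirrup.core.models import Tool  # assumed location


class CalcParams(BaseModel):
    expression: str


async def run_calc(params: CalcParams):
    # eval() is for illustration only; the (result, metadata) return shape is an assumption.
    return str(eval(params.expression)), None


calc_tool = Tool[CalcParams, None](
    name="calculator",
    description="Evaluate a Python arithmetic expression.",
    parameters=CalcParams,
    executor=run_calc,
)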

ToolCall

Bases: BaseModel

Represents a tool invocation request from the LLM.

Attributes:

name (str): Name of the tool to invoke

arguments (str): JSON string containing tool parameters

tool_call_id (str | None): Unique identifier for tracking this tool call and its result
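Since arguments is a JSON string, executing a call typically means validating it against the tool's Pydantic parameter model (reusing CalcParams from the Tool example above; import path assumed):

from stirrup.core.models import ToolCall  # assumed location

call = ToolCall(
    tool_call_id="call_1",
    name="calculator",
    arguments='{"expression": "2 + 2"}',
)

# Validate the raw JSON against the tool's parameter model.
params = CalcParams.model_validate_json(call.arguments)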

LiteLLMClient

LiteLLMClient(
    model_slug: str,
    max_tokens: int,
    supports_audio_input: bool = False,
    reasoning_effort: str | None = None,
    kwargs: dict[str, Any] | None = None,
)

Bases: LLMClient

LiteLLM-based client supporting multiple LLM providers with unified interface.

Includes automatic retries for transient failures and token usage tracking.

Initialize LiteLLM client with model configuration and capabilities.

Parameters:

model_slug (str, required): Model identifier for LiteLLM (e.g., 'anthropic/claude-3-5-sonnet-20241022')

max_tokens (int, required): Maximum context window size in tokens

supports_audio_input (bool, default False): Whether the model supports audio inputs

reasoning_effort (str | None, default None): Reasoning effort level for extended thinking models (e.g., 'medium', 'high')

kwargs (dict[str, Any] | None, default None): Additional arguments to pass to LiteLLM completion calls
Source code in src/stirrup/clients/litellm_client.py
def __init__(
    self,
    model_slug: str,
    max_tokens: int,
    supports_audio_input: bool = False,
    reasoning_effort: str | None = None,
    kwargs: dict[str, Any] | None = None,
) -> None:
    """Initialize LiteLLM client with model configuration and capabilities.

    Args:
        model_slug: Model identifier for LiteLLM (e.g., 'anthropic/claude-3-5-sonnet-20241022')
        max_tokens: Maximum context window size in tokens
        supports_audio_input: Whether the model supports audio inputs
        reasoning_effort: Reasoning effort level for extended thinking models (e.g., 'medium', 'high')
        kwargs: Additional arguments to pass to LiteLLM completion calls
    """
    self._model_slug = model_slug
    self._supports_video_input = False
    self._supports_audio_input = supports_audio_input
    self._max_tokens = max_tokens
    self._reasoning_effort = reasoning_effort
    self._kwargs = kwargs or {}
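The kwargs dict is stored once and forwarded to every completion call, so provider-specific options ride along; for example:

client = LiteLLMClient(
    model_slug="openai/gpt-4o",
    max_tokens=16384,
    kwargs={"temperature": 0.2},  # forwarded verbatim to acompletion()
)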

max_tokens property

max_tokens: int

Maximum context window size in tokens.

model_slug property

model_slug: str

Model identifier used by LiteLLM.

generate async

generate(
    messages: list[ChatMessage], tools: dict[str, Tool]
) -> AssistantMessage

Generate assistant response with optional tool calls. Retries up to 3 times on timeout, connection, and rate-limit errors.

Source code in src/stirrup/clients/litellm_client.py
@retry(
    retry=retry_if_exception_type((Timeout, APIConnectionError, RateLimitError)),
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
)
async def generate(self, messages: list[ChatMessage], tools: dict[str, Tool]) -> AssistantMessage:
    """Generate assistant response with optional tool calls. Retries up to 3 times on timeout/connection errors."""
    r = await acompletion(
        model=self.model_slug,
        messages=to_openai_messages(messages),
        tools=to_openai_tools(tools) if tools else None,
        tool_choice="auto" if tools else None,
        max_tokens=self._max_tokens,
        **self._kwargs,
    )

    choice = r["choices"][0]

    if choice.finish_reason in ["max_tokens", "length"]:
        raise ContextOverflowError(
            f"Maximal context window tokens reached for model {self.model_slug}, resulting in finish reason: {choice.finish_reason}. Reduce agent.max_tokens and try again."
        )

    msg = choice["message"]

    reasoning: Reasoning | None = None
    if getattr(msg, "reasoning_content", None) is not None:
        reasoning = Reasoning(content=msg.reasoning_content)
    if getattr(msg, "thinking_blocks", None) is not None and len(msg.thinking_blocks) > 0:
        reasoning = Reasoning(
            signature=msg.thinking_blocks[0]["signature"], content=msg.thinking_blocks[0]["content"]
        )

    usage = r["usage"]

    calls = [
        ToolCall(
            tool_call_id=tc.get("id"),
            name=tc["function"]["name"],
            arguments=tc["function"].get("arguments", "") or "",
        )
        for tc in (msg.get("tool_calls") or [])
    ]

    input_tokens = usage.prompt_tokens
    reasoning_tokens = 0
    if usage.completion_tokens_details:
        reasoning_tokens = usage.completion_tokens_details.reasoning_tokens or 0
    output_tokens = usage.completion_tokens - reasoning_tokens

    return AssistantMessage(
        reasoning=reasoning,
        content=msg.get("content") or "",
        tool_calls=calls,
        token_usage=TokenUsage(
            input=input_tokens,
            output=output_tokens,
            reasoning=reasoning_tokens,
        ),
    )
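A sketch of the surrounding loop: dispatch each returned ToolCall, answer with a ToolMessage carrying the same tool_call_id, and generate again. The execute_tool helper and the message constructors are assumptions:

from stirrup.core.models import ToolMessage  # assumed location


async def run_turn(client, messages, tools):
    reply = await client.generate(messages, tools)
    messages.append(reply)

    for call in reply.tool_calls:
        result = await execute_tool(tools[call.name], call.arguments)  # hypothetical helper
        messages.append(
            ToolMessage(
                tool_call_id=call.tool_call_id,  # ties the result to the request
                name=call.name,
                content=result,
            )
        )

    # If tools ran, let the model continue with their results in context.
    if reply.tool_calls:
        return await client.generate(messages, tools)
    return reply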

to_openai_messages

to_openai_messages(
    msgs: list[ChatMessage],
) -> list[dict[str, Any]]

Convert ChatMessage list to OpenAI-compatible message dictionaries.

Handles all message types: SystemMessage, UserMessage, AssistantMessage, and ToolMessage. Preserves reasoning content and tool calls for assistant messages.

Parameters:

msgs (list[ChatMessage], required): List of ChatMessage objects (System, User, Assistant, or Tool messages).

Returns:

list[dict[str, Any]]: List of message dictionaries ready for the OpenAI API.

Raises:

NotImplementedError: If an unsupported message type is encountered.

Source code in src/stirrup/clients/utils.py
def to_openai_messages(msgs: list[ChatMessage]) -> list[dict[str, Any]]:
    """Convert ChatMessage list to OpenAI-compatible message dictionaries.

    Handles all message types: SystemMessage, UserMessage, AssistantMessage,
    and ToolMessage. Preserves reasoning content and tool calls for assistant
    messages.

    Args:
        msgs: List of ChatMessage objects (System, User, Assistant, or Tool messages).

    Returns:
        List of message dictionaries ready for the OpenAI API.

    Raises:
        NotImplementedError: If an unsupported message type is encountered.
    """
    out: list[dict[str, Any]] = []
    for m in msgs:
        if isinstance(m, SystemMessage):
            out.append({"role": "system", "content": content_to_openai(m.content)})
        elif isinstance(m, UserMessage):
            out.append({"role": "user", "content": content_to_openai(m.content)})
        elif isinstance(m, AssistantMessage):
            msg: dict[str, Any] = {"role": "assistant", "content": content_to_openai(m.content)}

            if m.reasoning:
                if m.reasoning.content:
                    msg["reasoning_content"] = m.reasoning.content

                if m.reasoning.signature:
                    msg["thinking_blocks"] = [
                        {"type": "thinking", "signature": m.reasoning.signature, "thinking": m.reasoning.content}
                    ]

            if m.tool_calls:
                msg["tool_calls"] = []
                for tool in m.tool_calls:
                    tool_dict = tool.model_dump()
                    tool_dict["id"] = tool.tool_call_id
                    tool_dict["type"] = "function"
                    tool_dict["function"] = {
                        "name": tool.name,
                        "arguments": tool.arguments,
                    }
                    msg["tool_calls"].append(tool_dict)

            out.append(msg)
        elif isinstance(m, ToolMessage):
            out.append(
                {
                    "role": "tool",
                    "content": content_to_openai(m.content),
                    "tool_call_id": m.tool_call_id,
                    "name": m.name,
                }
            )
        else:
            raise NotImplementedError(f"Unsupported message type: {type(m)}")

    return out
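Round-tripping a short conversation shows the output shape (message constructors as assumed earlier):

from stirrup.core.models import SystemMessage, UserMessage  # assumed location

wire = to_openai_messages(
    [
        SystemMessage(content="You are terse."),
        UserMessage(content="Hi"),
    ]
)
# [{'role': 'system', 'content': ...}, {'role': 'user', 'content': ...}]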

to_openai_tools

to_openai_tools(
    tools: dict[str, Tool],
) -> list[dict[str, Any]]

Convert Tool objects to OpenAI function calling format.

Parameters:

tools (dict[str, Tool], required): Dictionary mapping tool names to Tool objects.

Returns:

list[dict[str, Any]]: List of tool definitions in OpenAI's function calling format.

Example

tools = {"calculator": calculator_tool} openai_tools = to_openai_tools(tools)

Returns: [{"type": "function", "function": {"name": "calculator", ...}}]

Source code in src/stirrup/clients/utils.py
def to_openai_tools(tools: dict[str, Tool]) -> list[dict[str, Any]]:
    """Convert Tool objects to OpenAI function calling format.

    Args:
        tools: Dictionary mapping tool names to Tool objects.

    Returns:
        List of tool definitions in OpenAI's function calling format.

    Example:
        >>> tools = {"calculator": calculator_tool}
        >>> openai_tools = to_openai_tools(tools)
        >>> # Returns: [{"type": "function", "function": {"name": "calculator", ...}}]
    """
    out: list[dict[str, Any]] = []
    for t in tools.values():
        function: dict[str, Any] = {
            "name": t.name,
            "description": t.description,
        }
        if t.parameters is not None:
            function["parameters"] = t.parameters.model_json_schema()
        tool_payload: dict[str, Any] = {
            "type": "function",
            "function": function,
        }
        out.append(tool_payload)
    return out
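Because parameters comes from model_json_schema(), the OpenAI payload embeds the Pydantic model's JSON Schema directly. A sketch, with Tool constructor fields as assumed in the Tool example above and a hypothetical get_weather executor:

from pydantic import BaseModel

from stirrup.core.models import Tool  # assumed location


class WeatherParams(BaseModel):
    city: str


weather_tool = Tool[WeatherParams, None](
    name="weather",
    description="Look up current weather for a city.",
    parameters=WeatherParams,
    executor=get_weather,  # hypothetical executor
)

payload = to_openai_tools({"weather": weather_tool})
# payload[0]["function"]["parameters"] == WeatherParams.model_json_schema()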