OpenAI Adapters

Agent adapters for the OpenAI SDK that wrap OpenAI clients as Agents.

Note

Requires the openai extra: pip install "russo[openai]"

openai

Provides two adapters:

  • OpenAIAgent: standard Chat Completions with audio input (gpt-4o-audio-preview and similar models)
  • OpenAIRealtimeAgent: Realtime API over WebSocket (gpt-4o-realtime-preview)

Requires the openai package: pip install "russo[openai]"

OpenAIAgent

OpenAIAgent(*, client: Any, model: str = 'gpt-4o-audio-preview', tools: list[Any] | None = None, system_prompt: str | None = None, extra_create_kwargs: dict[str, Any] | None = None)

Agent adapter that wraps an AsyncOpenAI client for Chat Completions.

Sends audio via the Chat Completions API and auto-parses tool-call responses with OpenAIResponseParser.

Usage:

from openai import AsyncOpenAI

client = AsyncOpenAI()
agent = OpenAIAgent(
    client=client,
    model="gpt-4o-audio-preview",
    tools=[{
        "type": "function",
        "function": {
            "name": "book_flight",
            "parameters": {"type": "object", "properties": {...}},
        },
    }],
)
response = await agent.run(audio)
PARAMETER DESCRIPTION
client

An openai.AsyncOpenAI instance.

TYPE: Any

model

Model name supporting audio input.

TYPE: str DEFAULT: 'gpt-4o-audio-preview'

tools

OpenAI tool definitions (function-calling format).

TYPE: list[Any] | None DEFAULT: None

system_prompt

Optional system message prepended to the conversation.

TYPE: str | None DEFAULT: None

extra_create_kwargs

Additional kwargs forwarded to client.chat.completions.create().

TYPE: dict[str, Any] | None DEFAULT: None

Source code in src/russo/adapters/openai.py
def __init__(
    self,
    *,
    client: Any,
    model: str = "gpt-4o-audio-preview",
    tools: list[Any] | None = None,
    system_prompt: str | None = None,
    extra_create_kwargs: dict[str, Any] | None = None,
) -> None:
    """
    Args:
        client: An ``openai.AsyncOpenAI`` instance.
        model: Model name supporting audio input.
        tools: OpenAI tool definitions (function-calling format).
        system_prompt: Optional system message prepended to the conversation.
        extra_create_kwargs: Additional kwargs forwarded to
            ``client.chat.completions.create()``.
    """
    self.client = client
    self.model = model
    self.tools = tools
    self.system_prompt = system_prompt
    self.extra_create_kwargs = extra_create_kwargs or {}
    self._parser = OpenAIResponseParser()

run async

run(audio: Audio) -> AgentResponse

Send audio via Chat Completions and parse the tool-call response.

Source code in src/russo/adapters/openai.py
async def run(self, audio: Audio) -> AgentResponse:
    """Send audio via Chat Completions and parse the tool-call response."""
    audio_b64 = base64.b64encode(audio.data).decode("ascii")

    messages: list[dict[str, Any]] = []
    if self.system_prompt:
        messages.append({"role": "system", "content": self.system_prompt})

    messages.append(
        {
            "role": "user",
            "content": [
                {
                    "type": "input_audio",
                    "input_audio": {"data": audio_b64, "format": audio.format},
                }
            ],
        }
    )

    kwargs: dict[str, Any] = {
        "model": self.model,
        "messages": messages,
        **self.extra_create_kwargs,
    }
    if self.tools:
        kwargs["tools"] = self.tools

    logger.debug(
        "Sending %d bytes of %s audio to %s",
        len(audio.data),
        audio.format,
        self.model,
    )

    response = await self.client.chat.completions.create(**kwargs)
    return self._parser.parse(response)

OpenAIRealtimeAgent

OpenAIRealtimeAgent(*, client: Any | None = None, connection: Any | None = None, model: str = 'gpt-4o-realtime-preview', tools: list[Any] | None = None, response_timeout: float = 30.0)

Agent adapter for OpenAI's Realtime API.

Connects via the SDK's client.beta.realtime.connect() interface, sends audio, and collects response.function_call_arguments.done events.

Accepts either:

  • An AsyncOpenAI client (creates a new connection per run())
  • A pre-existing realtime connection (reuses it, no session config sent)

Usage with client:

from openai import AsyncOpenAI

client = AsyncOpenAI()
agent = OpenAIRealtimeAgent(
    client=client,
    model="gpt-4o-realtime-preview",
    tools=[{
        "type": "function",
        "name": "book_flight",
        "description": "Book a flight",
        "parameters": {"type": "object", "properties": {...}},
    }],
)
response = await agent.run(audio)

Usage with pre-existing connection:

async with client.beta.realtime.connect(model="gpt-4o-realtime-preview") as conn:
    agent = OpenAIRealtimeAgent(connection=conn)
    response = await agent.run(audio)
Note

The Realtime API expects pcm16 audio at 24 kHz mono by default. If you pass WAV audio, the adapter automatically strips the WAV header to extract raw PCM frames.
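
The adapter's own header-stripping code is not shown on this page. As a rough illustration of what extracting raw PCM frames from WAV data involves, here is a minimal sketch using the standard-library wave module. The helper name wav_to_pcm16 is hypothetical, and the sketch does not resample: it assumes the input is already 16-bit PCM.

```python
import io
import wave


def wav_to_pcm16(wav_bytes: bytes) -> bytes:
    """Return the raw PCM frames of a 16-bit WAV file, without the header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit (pcm16) samples")
        return wav.readframes(wav.getnframes())


# Build a tiny 24 kHz mono WAV in memory to exercise the helper.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)      # mono
    out.setsampwidth(2)      # 16-bit samples
    out.setframerate(24000)  # 24 kHz
    out.writeframes(b"\x01\x00\x02\x00")  # two samples

pcm = wav_to_pcm16(buf.getvalue())
```

Reading frames through wave rather than slicing off a fixed number of bytes avoids assumptions about header length, since WAV files may carry extra chunks before the audio data.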

PARAMETER DESCRIPTION
client

An openai.AsyncOpenAI instance. Either client or connection must be provided; if both are given, connection is used.

TYPE: Any | None DEFAULT: None

connection

A pre-existing realtime connection (from client.beta.realtime.connect()).

TYPE: Any | None DEFAULT: None

model

Realtime model name.

TYPE: str DEFAULT: 'gpt-4o-realtime-preview'

tools

Tool definitions sent during session configuration.

TYPE: list[Any] | None DEFAULT: None

response_timeout

Max seconds to wait for a complete response.

TYPE: float DEFAULT: 30.0
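
How response_timeout is enforced internally is not shown on this page. One plausible pattern, sketched here purely as an assumption, is to wrap the event-collection loop in asyncio.wait_for. The function collect_function_calls and the plain-dict events are hypothetical; the SDK yields typed event objects rather than dicts.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any


async def collect_function_calls(
    events: AsyncIterator[dict[str, Any]], timeout: float
) -> list[dict[str, Any]]:
    """Gather function-call events until the response finishes or time runs out."""

    async def _collect() -> list[dict[str, Any]]:
        calls: list[dict[str, Any]] = []
        async for event in events:
            if event["type"] == "response.function_call_arguments.done":
                calls.append(event)
            elif event["type"] == "response.done":
                break  # the response is complete
        return calls

    # Raises TimeoutError if the stream stalls for longer than `timeout` seconds.
    return await asyncio.wait_for(_collect(), timeout=timeout)


async def fake_events() -> AsyncIterator[dict[str, Any]]:
    """Simulated stream: one function call, then completion."""
    yield {
        "type": "response.function_call_arguments.done",
        "name": "book_flight",
        "arguments": "{}",
    }
    yield {"type": "response.done"}


async def stalled_events() -> AsyncIterator[dict[str, Any]]:
    """Simulated stream that never delivers an event in time."""
    await asyncio.sleep(10)
    yield {"type": "response.done"}


calls = asyncio.run(collect_function_calls(fake_events(), timeout=1.0))

try:
    asyncio.run(collect_function_calls(stalled_events(), timeout=0.01))
    timed_out = False
except asyncio.TimeoutError:
    timed_out = True
```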

Source code in src/russo/adapters/openai.py
def __init__(
    self,
    *,
    client: Any | None = None,
    connection: Any | None = None,
    model: str = "gpt-4o-realtime-preview",
    tools: list[Any] | None = None,
    response_timeout: float = 30.0,
) -> None:
    """
    Args:
        client: An ``openai.AsyncOpenAI`` instance. Either *client* or *connection* is required; *connection* is used if both are given.
        connection: A pre-existing realtime connection (from ``client.beta.realtime.connect()``).
        model: Realtime model name.
        tools: Tool definitions sent during session configuration.
        response_timeout: Max seconds to wait for a complete response.
    """
    if client is None and connection is None:
        msg = "Provide either 'client' (AsyncOpenAI) or 'connection' (realtime connection)"
        raise ValueError(msg)
    self.client = client
    self.connection = connection
    self.model = model
    self.tools = tools
    self.response_timeout = response_timeout

run async

run(audio: Audio) -> AgentResponse

Send audio via the Realtime API and collect function calls.

Source code in src/russo/adapters/openai.py
async def run(self, audio: Audio) -> AgentResponse:
    """Send audio via the Realtime API and collect function calls."""
    if self.connection is not None:
        return await self._run_on(self.connection, audio, configure=False)

    async with self.client.beta.realtime.connect(model=self.model) as conn:
        return await self._run_on(conn, audio, configure=True)
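
The dispatch in run() can be illustrated with stubs. In this sketch FakeRealtime is a hypothetical stand-in for client.beta.realtime, and the standalone run function only mirrors the branching shown above. It returns the connection used together with the configure flag, so the difference between the two modes is visible: a supplied connection is reused without session configuration, while a client opens a fresh connection on every call.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeRealtime:
    """Hypothetical stub that mimics ``client.beta.realtime.connect()``."""

    def __init__(self) -> None:
        self.opened = 0

    @asynccontextmanager
    async def connect(self, model: str):
        self.opened += 1  # count how many connections were opened
        yield f"connection-{self.opened}"


async def run(realtime, connection=None, model="gpt-4o-realtime-preview"):
    """Mirror the adapter's branching; returns (connection, configure)."""
    if connection is not None:
        # Pre-existing connection: reuse it and skip session configuration.
        return connection, False
    # Client mode: open a new connection for this call and configure it.
    async with realtime.connect(model=model) as conn:
        return conn, True


realtime = FakeRealtime()
first = asyncio.run(run(realtime))
second = asyncio.run(run(realtime))
reused = asyncio.run(run(realtime, connection="shared"))
```

Since tools are described as being sent during session configuration, a caller who passes a pre-existing connection is presumably responsible for configuring tools on that session beforehand.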