russo
Top-level module exports.
russo
russo — testing framework for LLM tool-call accuracy.
Audio
Bases: BaseModel
Audio data with format metadata.
save
Save audio to a file. Wraps raw PCM in a WAV container if needed.
Usage
audio.save("output.wav")
Source code in src/russo/_types.py
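To illustrate what wrapping raw PCM in a WAV container involves, here is a minimal sketch using only the standard library. It is not russo's implementation: the helper name `pcm_to_wav_bytes` and the default sample rate, channel count, and sample width are assumptions chosen for the example.

```python
import io
import wave


def pcm_to_wav_bytes(
    pcm: bytes,
    *,
    sample_rate: int = 24000,  # assumed rate; use whatever your audio actually has
    channels: int = 1,
    sample_width: int = 2,     # bytes per sample (2 = 16-bit)
) -> bytes:
    """Wrap raw PCM samples in a WAV (RIFF) container and return the bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# One second of 16-bit mono silence at the assumed 24 kHz sample rate.
silence = b"\x00\x00" * 24000
wav_bytes = pcm_to_wav_bytes(silence)
```

The result starts with a RIFF/WAVE header followed by the unchanged PCM payload, so any WAV-aware player or library can read it back.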
ToolCall
Bases: BaseModel
A normalized tool/function call representation.
Provider-agnostic — parsers convert provider-specific formats into this.
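The idea of a provider-agnostic representation can be sketched as follows. `NormalizedCall` is a hypothetical stand-in for ToolCall (its `name` and `arguments` fields are assumed from the `tool_call(name, **arguments)` helper), and both payload shapes are invented for illustration rather than taken from any specific provider.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class NormalizedCall:
    """Hypothetical stand-in for russo's ToolCall: a name plus arguments."""

    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def parse_json_string_style(raw: dict[str, Any]) -> NormalizedCall:
    """Parse a payload whose arguments arrive as a JSON-encoded string."""
    return NormalizedCall(name=raw["name"], arguments=json.loads(raw["arguments"]))


def parse_dict_style(raw: dict[str, Any]) -> NormalizedCall:
    """Parse a payload whose arguments arrive as a plain dict under 'args'."""
    return NormalizedCall(name=raw["name"], arguments=dict(raw["args"]))


# Two differently shaped (made-up) provider payloads describing the same call.
a = parse_json_string_style(
    {"name": "book_flight", "arguments": '{"from_city": "NYC", "to_city": "LA"}'}
)
b = parse_dict_style(
    {"name": "book_flight", "args": {"from_city": "NYC", "to_city": "LA"}}
)
```

Once both payloads are normalized, `a == b` holds, which is what lets an evaluator compare expected and actual calls without knowing which provider produced them.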
AgentResponse
Bases: BaseModel
Normalized response from an agent, containing extracted tool calls.
raw
class-attribute
instance-attribute
The raw, unparsed response from the provider (for debugging).
EvalResult
Bases: BaseModel
Full evaluation result for a test scenario.
summary
Human-readable summary of the evaluation.
Source code in src/russo/_types.py
ToolCallMatch
Bases: BaseModel
Result of comparing a single expected tool call against actuals.
AudioCache
File-system cache for synthesized audio.
Each entry is a pair of files
Usage
cache = AudioCache()                  # .russo_cache/
cache = AudioCache(Path("my_cache"))  # custom dir
cache.get("abc123")                   # Audio | None
cache.put("abc123", audio)
cache.clear()
Source code in src/russo/_cache.py
cache_key
Deterministic key from prompt text + optional extra metadata.
Extra kwargs (e.g. voice, model) are included so a change in synthesizer config invalidates the cache automatically.
Source code in src/russo/_cache.py
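One common way to build such a deterministic key is to hash a canonical serialization of the prompt and the extra metadata. The sketch below assumes SHA-256 over sorted-key JSON; the function name `sketch_cache_key` is hypothetical, and the actual hashing scheme used by russo may differ.

```python
import hashlib
import json
from typing import Any


def sketch_cache_key(prompt: str, **extra: Any) -> str:
    """Hash the prompt plus extra metadata into a stable hex digest."""
    # sort_keys=True makes the serialization independent of kwarg order.
    payload = json.dumps({"prompt": prompt, "extra": extra}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_kore = sketch_cache_key("Book a flight to LA", voice="Kore", model="tts-1")
key_same = sketch_cache_key("Book a flight to LA", model="tts-1", voice="Kore")
key_other_voice = sketch_cache_key("Book a flight to LA", voice="Puck", model="tts-1")
```

Here `key_kore` and `key_same` are identical because only the keyword order differs, while `key_other_voice` is different, so changing the voice would miss the cache and trigger a fresh synthesis.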
get
get(key: str) -> Audio | None
Load cached audio, or None if not cached.
Source code in src/russo/_cache.py
put
put(key: str, audio: Audio, *, prompt: str = '') -> None
Write audio + metadata to cache.
Source code in src/russo/_cache.py
clear
Remove all cached entries.
Source code in src/russo/_cache.py
CachedSynthesizer
CachedSynthesizer(synthesizer: Synthesizer, *, cache: AudioCache | None = None, enabled: bool = True, cache_key_extra: dict[str, Any] | None = None)
Wraps any Synthesizer with local audio caching.
Satisfies the Synthesizer protocol — drop-in replacement.
Usage
synth = CachedSynthesizer(GoogleSynthesizer(...))

# Disable caching at runtime
synth = CachedSynthesizer(GoogleSynthesizer(...), enabled=False)

# Custom cache directory
synth = CachedSynthesizer(
    GoogleSynthesizer(...),
    cache=AudioCache(Path("/tmp/my_cache")),
)

# Include synthesizer config in cache key (invalidates on config change)
synth = CachedSynthesizer(
    GoogleSynthesizer(voice="Kore", model="gemini-2.5-flash-preview-tts"),
    cache_key_extra={"voice": "Kore", "model": "gemini-2.5-flash-preview-tts"},
)

# Clear cache
synth.cache.clear()
Source code in src/russo/_cache.py
synthesize
async
synthesize(text: str) -> Audio
Synthesize with cache lookup/store.
Source code in src/russo/_cache.py
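The lookup/store behaviour can be pictured as a cache-aside wrapper that exposes the same `synthesize()` method as the object it wraps. The sketch below is an assumption-laden illustration, not the library's code: `CountingSynth` and `CachingWrapper` are hypothetical names, audio is represented as plain bytes, and an in-memory dict keyed by the prompt text stands in for AudioCache.

```python
import asyncio


class CountingSynth:
    """Hypothetical synthesizer that counts how often it is really called."""

    def __init__(self) -> None:
        self.calls = 0

    async def synthesize(self, text: str) -> bytes:
        self.calls += 1
        return text.encode("utf-8")  # stand-in for real audio bytes


class CachingWrapper:
    """Sketch of a cache-aside wrapper with the same synthesize() method."""

    def __init__(self, inner, *, enabled: bool = True) -> None:
        self.inner = inner
        self.enabled = enabled
        self._store: dict[str, bytes] = {}  # in-memory stand-in for AudioCache

    async def synthesize(self, text: str) -> bytes:
        if self.enabled and text in self._store:
            return self._store[text]  # cache hit: skip the inner synthesizer
        audio = await self.inner.synthesize(text)
        if self.enabled:
            self._store[text] = audio  # cache miss: store for next time
        return audio


async def demo() -> int:
    synth = CountingSynth()
    cached = CachingWrapper(synth)
    await cached.synthesize("book a flight")
    await cached.synthesize("book a flight")  # served from the cache
    return synth.calls


calls = asyncio.run(demo())
```

With caching enabled the inner synthesizer runs once for the two identical prompts; with `enabled=False` every call goes through to it.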
ToolCallAssertionError
ToolCallAssertionError(result: EvalResult, message: str = '')
Bases: AssertionError
Rich assertion error with detailed tool call diff.
Source code in src/russo/_assertions.py
tool_call
tool_call(name: str, **arguments: Any) -> ToolCall
Shorthand for creating a ToolCall.
Usage
russo.tool_call("book_flight", from_city="NYC", to_city="LA")
agent
agent(fn: Callable[[Audio], Coroutine[Any, Any, AgentResponse]]) -> _CallableAgent
Decorator to turn an async function into an Agent.
Usage
@russo.agent
async def my_agent(audio: russo.Audio) -> russo.AgentResponse:
    result = await call_my_api(audio.data)
    return russo.AgentResponse(tool_calls=[...])
Source code in src/russo/_helpers.py
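For readers curious how a decorator can turn a bare async function into an agent object, the following is a rough sketch of the pattern. The class `CallableAgentSketch`, the decorator `agent_sketch`, and especially the method name `run` are assumptions; the real `_CallableAgent` and Agent protocol may look different.

```python
import asyncio
from typing import Any, Awaitable, Callable


class CallableAgentSketch:
    """Hypothetical wrapper object; the real _CallableAgent may differ."""

    def __init__(self, fn: Callable[[Any], Awaitable[Any]]) -> None:
        self._fn = fn
        self.__name__ = getattr(fn, "__name__", "agent")

    async def run(self, audio: Any) -> Any:
        # The method name 'run' is an assumption made for this sketch.
        return await self._fn(audio)


def agent_sketch(fn: Callable[[Any], Awaitable[Any]]) -> CallableAgentSketch:
    """Decorator that wraps an async function in an agent-like object."""
    return CallableAgentSketch(fn)


@agent_sketch
async def echo_agent(audio: bytes) -> dict:
    # A real agent would send the audio to a model and parse its tool calls.
    return {"tool_calls": [], "audio_len": len(audio)}


response = asyncio.run(echo_agent.run(b"\x00\x01"))
```

After decoration, `echo_agent` is no longer a function but an object that a pipeline can call through a uniform method, which is the main benefit of the decorator form.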
run
async
run(*, prompt: str, synthesizer: Synthesizer, agent: Agent, evaluator: Evaluator, expect: list[ToolCall]) -> EvalResult
Run the full russo pipeline.
- Synthesize audio from the text prompt.
- Pass audio to the agent under test.
- Evaluate the agent's tool calls against expectations.
| PARAMETER | DESCRIPTION |
|---|---|
| prompt | The text prompt to synthesize into audio. TYPE: str |
| synthesizer | Converts text to audio. TYPE: Synthesizer |
| agent | The agent under test. TYPE: Agent |
| evaluator | Compares expected vs actual tool calls. TYPE: Evaluator |
| expect | The expected tool calls. TYPE: list[ToolCall] |

| RETURNS | DESCRIPTION |
|---|---|
| EvalResult | EvalResult with pass/fail and per-call match details. |
Source code in src/russo/_pipeline.py
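The three pipeline steps listed above can be sketched with stand-in components. Everything here is hypothetical: `Call`, `Result`, `FakeSynth`, `FakeAgent`, `ExactEvaluator`, and `run_sketch` are illustration-only names, and the method names `run` and `evaluate` are assumptions rather than russo's actual protocols.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Call:
    name: str
    arguments: dict


@dataclass
class Result:
    passed: bool
    matches: list  # one bool per expected call


class FakeSynth:
    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")  # stand-in for synthesized audio


class FakeAgent:
    async def run(self, audio: bytes) -> list:
        # Pretend the model heard the prompt and chose this tool call.
        return [Call("book_flight", {"from_city": "NYC", "to_city": "LA"})]


class ExactEvaluator:
    def evaluate(self, expected: list, actual: list) -> Result:
        matches = [call in actual for call in expected]
        return Result(passed=all(matches), matches=matches)


async def run_sketch(*, prompt, synthesizer, agent, evaluator, expect) -> Result:
    audio = await synthesizer.synthesize(prompt)  # 1. text -> audio
    actual = await agent.run(audio)               # 2. audio -> tool calls
    return evaluator.evaluate(expect, actual)     # 3. compare with expectations


result = asyncio.run(
    run_sketch(
        prompt="Book a flight from NYC to LA",
        synthesizer=FakeSynth(),
        agent=FakeAgent(),
        evaluator=ExactEvaluator(),
        expect=[Call("book_flight", {"from_city": "NYC", "to_city": "LA"})],
    )
)
```

Keeping the synthesizer, agent, and evaluator as separate keyword-only arguments means each can be swapped independently, for example replacing the synthesizer with a cached one without touching the agent.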
assert_tool_calls
assert_tool_calls(result: EvalResult, *, message: str = '') -> None
Assert that an EvalResult passed.
Raises a ToolCallAssertionError with a rich diff if it didn't.
Usage
result = await russo.run(...)
russo.assert_tool_calls(result)

# Or with a custom message
russo.assert_tool_calls(result, message="Flight booking should work")
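The assertion pattern described here, raising an AssertionError subclass that carries the evaluation result, might look roughly like the sketch below. `ResultSketch`, `ToolCallErrorSketch`, and `assert_sketch` are hypothetical stand-ins, and the message layout is an assumption; the real error produces a richer diff.

```python
from dataclasses import dataclass


@dataclass
class ResultSketch:
    """Hypothetical stand-in for EvalResult."""

    passed: bool
    detail: str = ""

    def summary(self) -> str:
        status = "PASSED" if self.passed else "FAILED"
        return f"{status}: {self.detail}" if self.detail else status


class ToolCallErrorSketch(AssertionError):
    """Assertion error that keeps the full result for later inspection."""

    def __init__(self, result: ResultSketch, message: str = "") -> None:
        self.result = result
        text = result.summary()
        super().__init__(f"{message}\n{text}" if message else text)


def assert_sketch(result: ResultSketch, *, message: str = "") -> None:
    """Raise if the result did not pass; otherwise do nothing."""
    if not result.passed:
        raise ToolCallErrorSketch(result, message)


failed = ResultSketch(passed=False, detail="expected book_flight, got no calls")
try:
    assert_sketch(failed, message="Flight booking should work")
except AssertionError as err:
    caught = err
```

Because the error subclasses AssertionError, test runners such as pytest report it like any other failed assertion, while the attached result remains available on the exception for debugging.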