Exact Evaluator
Exact-match evaluator for tool calls.
exact
ExactEvaluator
ExactEvaluator(*, match_order: bool = False, ignore_extra_args: bool = False, ignore_extra_calls: bool = True)
Evaluates tool calls by exact match on name and arguments.
Supports optional config for relaxed matching:
- match_order: If True, tool calls must appear in the same order.
- ignore_extra_args: If True, actual calls may contain extra arguments.
- ignore_extra_calls: If True, extra actual calls don't cause failure.
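The way the three options interact can be illustrated with a small standalone sketch. It is not the library's implementation: the `ToolCall` dataclass below (with `name` and `arguments` fields) and the `exact_match` function are hypothetical stand-ins, and returning a plain `bool` instead of an `EvalResult` is a simplification.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    """Hypothetical stand-in for the library's ToolCall type."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def call_matches(expected: ToolCall, actual: ToolCall, ignore_extra_args: bool) -> bool:
    """Return True when `actual` satisfies `expected`."""
    if expected.name != actual.name:
        return False
    if ignore_extra_args:
        # Every expected argument must be present with an equal value;
        # additional keys in the actual call are tolerated.
        return all(
            key in actual.arguments and actual.arguments[key] == value
            for key, value in expected.arguments.items()
        )
    return expected.arguments == actual.arguments


def exact_match(
    expected: list[ToolCall],
    actual: list[ToolCall],
    *,
    match_order: bool = False,
    ignore_extra_args: bool = False,
    ignore_extra_calls: bool = True,
) -> bool:
    """Assumed semantics of the three flags; a sketch, not the library's code."""
    if not ignore_extra_calls and len(actual) != len(expected):
        return False
    if match_order:
        # Expected calls must appear in the same relative order.
        # Unmatched actual calls in between are skipped as extras.
        position = 0
        for exp in expected:
            while position < len(actual) and not call_matches(
                exp, actual[position], ignore_extra_args
            ):
                position += 1
            if position == len(actual):
                return False
            position += 1
        return True
    # Order-insensitive: each expected call consumes one distinct actual call.
    # This greedy pairing is a simplification and can miss a valid pairing
    # when ignore_extra_args lets one actual call satisfy several expected ones.
    remaining = list(actual)
    for exp in expected:
        for index, act in enumerate(remaining):
            if call_matches(exp, act, ignore_extra_args):
                del remaining[index]
                break
        else:
            return False
    return True
```

Under these assumptions, `exact_match([a, b], [b, a])` passes with the defaults but fails with `match_order=True`, and an actual call carrying an extra argument passes only with `ignore_extra_args=True`.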
Usage
evaluator = ExactEvaluator()
result = evaluator.evaluate(expected=[...], actual=[...])
Source code in src/russo/evaluators/exact.py
evaluate
evaluate(expected: list[ToolCall], actual: list[ToolCall]) -> EvalResult
Compare expected tool calls against actual ones.