DynamicComparisonRunEvaluator

class langsmith.evaluation.evaluator.DynamicComparisonRunEvaluator(func: Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]], afunc: Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None = None)

Compare predictions (as traces) from 2 or more runs.

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

Parameters:
  • func (Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]]) – A function that takes a sequence of Runs and an optional Example as arguments and returns a ComparisonEvaluationResult or a dict.

  • afunc (Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None) – An optional asynchronous function with the same signature that returns an awaitable ComparisonEvaluationResult or dict.
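
A minimal usage sketch (assumptions: Run and Example are importable from langsmith.schemas, and returning a plain dict with "key" and "scores" entries is an accepted comparison result shape; the length-based preference heuristic is purely illustrative):

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import DynamicComparisonRunEvaluator
    from langsmith.schemas import Example, Run


    def prefer_shorter_output(
        runs: Sequence[Run], example: Optional[Example] = None
    ) -> dict:
        # Hypothetical heuristic: shorter outputs score higher, mapped into (0, 1].
        scores = {}
        for run in runs:
            text = str((run.outputs or {}).get("output", ""))
            scores[str(run.id)] = 1.0 / (1.0 + len(text))
        return {"key": "prefer_shorter", "scores": scores}


    evaluator = DynamicComparisonRunEvaluator(prefer_shorter_output)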

Attributes

is_async

Check if the evaluator function is asynchronous.

Methods

__init__(func[, afunc])

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

acompare_runs(runs[, example])

Compare runs asynchronously using the wrapped async function.

compare_runs(runs[, example])

Compare runs to score preferences.

__init__(func: Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]], afunc: Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None = None)

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

Parameters:
  • func (Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]]) – A function that takes a sequence of Runs and an optional Example as arguments and returns a ComparisonEvaluationResult or a dict.

  • afunc (Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None) – An optional asynchronous function with the same signature that returns an awaitable ComparisonEvaluationResult or dict.
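
A sketch of passing both a synchronous function and an async counterpart via afunc; the neutral scoring logic and the dict result shape are illustrative assumptions:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import DynamicComparisonRunEvaluator
    from langsmith.schemas import Example, Run


    def score_runs(runs: Sequence[Run], example: Optional[Example] = None) -> dict:
        # Hypothetical neutral score keyed by run id.
        return {"key": "neutral", "scores": {str(run.id): 0.5 for run in runs}}


    async def ascore_runs(runs: Sequence[Run], example: Optional[Example] = None) -> dict:
        # Async variant; a real judge might await an LLM call here.
        return score_runs(runs, example)


    evaluator = DynamicComparisonRunEvaluator(score_runs, afunc=ascore_runs)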

async acompare_runs(runs: Sequence[Run], example: Example | None = None) → ComparisonEvaluationResult

Compare runs asynchronously using the wrapped async function.

This method directly invokes the wrapped async function with the provided arguments.

Parameters:
  • runs (Sequence[Run]) – The runs to be compared.

  • example (Example | None) – An optional example to be used in the evaluation.

Returns:

The result of the evaluation.

Return type:

ComparisonEvaluationResult
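
A sketch of awaiting acompare_runs from async code; the runs and example values are assumed to be supplied by the caller (for example, from an experiment), and importing ComparisonEvaluationResult from this module is assumed:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import (
        ComparisonEvaluationResult,
        DynamicComparisonRunEvaluator,
    )
    from langsmith.schemas import Example, Run


    async def judge(
        evaluator: DynamicComparisonRunEvaluator,
        runs: Sequence[Run],
        example: Optional[Example] = None,
    ) -> ComparisonEvaluationResult:
        # Awaits the wrapped async comparison function on the candidate runs.
        return await evaluator.acompare_runs(runs, example=example)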

compare_runs(runs: Sequence[Run], example: Example | None = None) → ComparisonEvaluationResult

Compare runs to score preferences.

Parameters:
  • runs (Sequence[Run]) – A list of runs to compare.

  • example (Example | None) – An optional example to be used in the evaluation.

Return type:

ComparisonEvaluationResult
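
A synchronous counterpart using compare_runs, under the same assumptions about where the runs and example come from:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import (
        ComparisonEvaluationResult,
        DynamicComparisonRunEvaluator,
    )
    from langsmith.schemas import Example, Run


    def judge_sync(
        evaluator: DynamicComparisonRunEvaluator,
        runs: Sequence[Run],
        example: Optional[Example] = None,
    ) -> ComparisonEvaluationResult:
        # Invokes the wrapped synchronous comparison function on the candidate runs.
        return evaluator.compare_runs(runs, example=example)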