DynamicComparisonRunEvaluator

class langsmith.evaluation.evaluator.DynamicComparisonRunEvaluator(func: Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]], afunc: Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None = None)

Compare predictions (as traces) from 2 or more runs.

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

Parameters:
  • func (Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]]) – A function that takes a sequence of Runs and an optional Example as arguments and returns a ComparisonEvaluationResult or a dict.

  • afunc (Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None) – An optional asynchronous function with the same signature that returns an awaitable ComparisonEvaluationResult or dict.
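
A minimal usage sketch (assumptions: Run and Example are importable from langsmith.schemas, and returning a plain dict with "key" and "scores" entries is an accepted comparison result shape; the length-based preference heuristic is purely illustrative):

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import DynamicComparisonRunEvaluator
    from langsmith.schemas import Example, Run


    def prefer_shorter_output(
        runs: Sequence[Run], example: Optional[Example] = None
    ) -> dict:
        # Hypothetical heuristic: shorter outputs score higher, mapped into (0, 1].
        scores = {}
        for run in runs:
            text = str((run.outputs or {}).get("output", ""))
            scores[str(run.id)] = 1.0 / (1.0 + len(text))
        return {"key": "prefer_shorter", "scores": scores}


    evaluator = DynamicComparisonRunEvaluator(prefer_shorter_output)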

Attributes

is_async

Check if the evaluator function is asynchronous.

Methods

__init__(func[, afunc])

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

acompare_runs(runs[, example])

Compare runs asynchronously using the wrapped async function.

compare_runs(runs[, example])

Compare runs to score preferences.

__init__(func: Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]], afunc: Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None = None)

Initialize the DynamicComparisonRunEvaluator with a given comparison function.

Parameters:
  • func (Callable[[Sequence[Run], Example | None], ComparisonEvaluationResult | dict | Awaitable[ComparisonEvaluationResult | dict]]) – A function that takes a sequence of Runs and an optional Example as arguments and returns a ComparisonEvaluationResult or a dict.

  • afunc (Callable[[Sequence[Run], Example | None], Awaitable[ComparisonEvaluationResult | dict]] | None) – An optional asynchronous function with the same signature that returns an awaitable ComparisonEvaluationResult or dict.
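
A sketch of passing both a synchronous function and an async counterpart via afunc; the neutral scoring logic and the dict result shape are illustrative assumptions:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import DynamicComparisonRunEvaluator
    from langsmith.schemas import Example, Run


    def score_runs(runs: Sequence[Run], example: Optional[Example] = None) -> dict:
        # Hypothetical neutral score keyed by run id.
        return {"key": "neutral", "scores": {str(run.id): 0.5 for run in runs}}


    async def ascore_runs(runs: Sequence[Run], example: Optional[Example] = None) -> dict:
        # Async variant; a real judge might await an LLM call here.
        return score_runs(runs, example)


    evaluator = DynamicComparisonRunEvaluator(score_runs, afunc=ascore_runs)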

async acompare_runs(runs: Sequence[Run], example: Example | None = None) → ComparisonEvaluationResult

Compare runs asynchronously using the wrapped async function.

This method directly invokes the wrapped async function with the provided arguments.

Parameters:
  • runs (Sequence[Run]) – The runs to be compared.

  • example (Example | None) – An optional example to be used in the evaluation.

Returns:

The result of the evaluation.

Return type:

ComparisonEvaluationResult
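
A sketch of awaiting acompare_runs from async code; the runs and example values are assumed to be supplied by the caller (for example, from an experiment), and importing ComparisonEvaluationResult from this module is assumed:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import (
        ComparisonEvaluationResult,
        DynamicComparisonRunEvaluator,
    )
    from langsmith.schemas import Example, Run


    async def judge(
        evaluator: DynamicComparisonRunEvaluator,
        runs: Sequence[Run],
        example: Optional[Example] = None,
    ) -> ComparisonEvaluationResult:
        # Awaits the wrapped async comparison function on the candidate runs.
        return await evaluator.acompare_runs(runs, example=example)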

compare_runs(runs: Sequence[Run], example: Example | None = None) → ComparisonEvaluationResult

Compare runs to score preferences.

Parameters:
  • runs (Sequence[Run]) – A list of runs to compare.

  • example (Example | None) – An optional example to be used in the evaluation.

Return type:

ComparisonEvaluationResult
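
A synchronous counterpart using compare_runs, under the same assumptions about where the runs and example come from:

    from typing import Optional, Sequence

    from langsmith.evaluation.evaluator import (
        ComparisonEvaluationResult,
        DynamicComparisonRunEvaluator,
    )
    from langsmith.schemas import Example, Run


    def judge_sync(
        evaluator: DynamicComparisonRunEvaluator,
        runs: Sequence[Run],
        example: Optional[Example] = None,
    ) -> ComparisonEvaluationResult:
        # Invokes the wrapped synchronous comparison function on the candidate runs.
        return evaluator.compare_runs(runs, example=example)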