ComparisonEvaluationResult

class langsmith.evaluation.evaluator.ComparisonEvaluationResult

Bases: BaseModel

Feedback scores for the results of comparative evaluations.

These are generated by functions that compare two or more runs, returning a ranking or other feedback.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.
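
For illustration, a minimal sketch constructing a result directly. The run IDs here are hypothetical placeholders for the IDs of the runs being compared, and the key and comments are example values, not prescribed names:

    from uuid import uuid4

    from langsmith.evaluation.evaluator import ComparisonEvaluationResult

    # Hypothetical IDs standing in for the two runs being compared.
    run_a, run_b = uuid4(), uuid4()

    result = ComparisonEvaluationResult(
        key="preference",
        scores={run_a: 1, run_b: 0},  # run_a is ranked above run_b
        comment={
            run_a: "More concise and grounded answer.",
            run_b: "Verbose and partially off-topic.",
        },
    )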

param comment: str | Dict[UUID | str, str] | None = None

Comment for the scores. If a string, it’s shared across all target runs. If a dict, it maps run IDs to individual comments.

param key: str [Required]

The aspect, metric name, or label for this evaluation.

param scores: Dict[UUID | str, StrictBool | StrictInt | StrictFloat | None] [Required]

The score for each run in the comparison, keyed by run ID.

param source_run_id: UUID | str | None = None

The ID of the evaluator's own trace, i.e., the run generated by executing the evaluator.
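
In practice, objects of this type are typically returned by a comparative evaluator function rather than constructed by hand. The sketch below assumes the evaluator receives the candidate runs and the dataset example, as when passed to langsmith.evaluation.evaluate_comparative; the prefer_shorter heuristic is a hypothetical stand-in for real judging logic:

    from langsmith.evaluation.evaluator import ComparisonEvaluationResult
    from langsmith.schemas import Example, Run

    def prefer_shorter(runs: list[Run], example: Example) -> ComparisonEvaluationResult:
        # Rank candidate runs by output length, shortest first.
        ranked = sorted(runs, key=lambda r: len(str(r.outputs or "")))
        # Assign the highest score to the top-ranked run, descending from there.
        scores = {run.id: float(len(ranked) - position) for position, run in enumerate(ranked)}
        return ComparisonEvaluationResult(key="prefer_shorter", scores=scores)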