ComparisonEvaluationResult
- class langsmith.evaluation.evaluator.ComparisonEvaluationResult[source]
Bases: BaseModel
Feedback scores for the results of comparative evaluations.
These are generated by functions that compare two or more runs, returning a ranking or other feedback.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param comment: str | Dict[UUID | str, str] | None = None
Comment for the scores. If a string, it’s shared across all target runs. If a dict, it maps run IDs to individual comments.
- param key: str [Required]
The aspect, metric name, or label for this evaluation.
- param scores: Dict[UUID | str, StrictBool | StrictInt | StrictFloat | None] [Required]
The scores for each run in the comparison.
- param source_run_id: UUID | str | None = None
The run ID of the evaluator's own trace.
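A minimal construction sketch, assuming only the keyword-argument constructor and fields documented above; the run IDs here are hypothetical placeholders generated for illustration:

```python
from uuid import uuid4

from langsmith.evaluation.evaluator import ComparisonEvaluationResult

# Hypothetical IDs for the two runs being compared.
run_a, run_b = uuid4(), uuid4()

result = ComparisonEvaluationResult(
    key="preference",             # the metric/label for this evaluation
    scores={run_a: 1, run_b: 0},  # run_a is ranked above run_b
    comment={                     # per-run comments, keyed by run ID
        run_a: "More concise and directly answers the question.",
        run_b: "Verbose and partially off-topic.",
    },
)
```

Passing a plain string as `comment` instead applies the same comment to every run in `scores`; omitting a required field such as `key` or `scores` raises a ValidationError.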