Optional actual: The actual response generated by Genie.
Optional assessment: Assessment of the evaluation result: good, bad, or needs review.
Optional assessment: Reasons for the assessment score.
Assessment reasons describe why a Genie response was scored as BAD.
Deterministic values (compared against the ground truth result):
LLM judge ratings explain the factors driving BAD results:
Deprecated LLM judge values (kept for backward compatibility, do not use):
Optional benchmark: The ID of the benchmark question that was evaluated.
Optional eval: Current status of the evaluation run.
Optional expected: The expected responses from the benchmark.
Optional manual: Whether this evaluation was manually assessed.
Optional result: The unique identifier for the evaluation result.
Optional space: The ID of the space the evaluation result belongs to.
Shows detailed information for an evaluation result.
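
As a rough illustration of how a client might work with the fields described above, the following sketch models an evaluation result as a plain Python dataclass and selects the results that still need a human look. It is not the SDK's actual type: the class name `EvalResultSketch`, every field name, the helper `needs_attention`, and the string values `GOOD`, `BAD`, and `NEEDS_REVIEW` are assumptions made for the example, since the real identifiers are not shown in full here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed string values for the assessment field; the real enum values
# are not given in this reference.
GOOD = "GOOD"
BAD = "BAD"
NEEDS_REVIEW = "NEEDS_REVIEW"


@dataclass
class EvalResultSketch:
    """Hypothetical stand-in for an evaluation result; all fields are optional."""

    result_id: Optional[str] = None              # unique identifier of the result
    space_id: Optional[str] = None               # space the result belongs to
    benchmark_question_id: Optional[str] = None  # benchmark question evaluated
    assessment: Optional[str] = None             # good, bad, or needs review
    assessment_reasons: List[str] = field(default_factory=list)
    manual_assessment: Optional[bool] = None     # True if assessed by a person


def needs_attention(results: List[EvalResultSketch]) -> List[EvalResultSketch]:
    """Return results rated BAD or NEEDS_REVIEW that were not manually assessed."""
    return [
        r
        for r in results
        if r.assessment in (BAD, NEEDS_REVIEW) and not r.manual_assessment
    ]
```

Because every field is optional, the helper treats a missing assessment as "nothing to review" and a missing manual flag as "not manually assessed"; a real client may prefer to handle those cases differently.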