base
ipw.evaluation.base
EvaluationHandler
Bases: ABC
Base class for per-dataset evaluation strategies.
Source code in intelligence-per-watt/src/ipw/evaluation/base.py
evaluate(*, problem, reference, model_answer, metadata)
abstractmethod
Evaluate a single model answer.
Returns:
| Type | Description |
|---|---|
| `Tuple[Optional[bool], Dict[str, object]]` | `(is_correct, metadata)`: an `Optional[bool]` and a `Dict[str, object]` |
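
As an illustration of the interface described above, here is a minimal sketch of a concrete handler. It does not import `ipw`; instead it defines a local stand-in for `EvaluationHandler` that mirrors the documented keyword-only signature and return type. Several things here are assumptions rather than facts from the source: that `evaluate` is an instance method, the parameter type hints, the hypothetical `ExactMatchHandler` class and its matching rule, the `"strategy"` and `"reason"` metadata keys, and the choice to return `None` for an empty answer.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class EvaluationHandler(ABC):
    """Local stand-in mirroring the documented interface (not the ipw source)."""

    @abstractmethod
    def evaluate(
        self,
        *,
        problem: str,                 # assumed type
        reference: str,               # assumed type
        model_answer: str,            # assumed type
        metadata: Dict[str, object],  # assumed type
    ) -> Tuple[Optional[bool], Dict[str, object]]:
        """Evaluate a single model answer."""


class ExactMatchHandler(EvaluationHandler):
    """Hypothetical handler: case-insensitive, whitespace-trimmed exact match."""

    def evaluate(self, *, problem, reference, model_answer, metadata):
        # Copy so the caller's metadata dict is not mutated.
        result_meta = dict(metadata)

        # Assumption: None signals "could not be judged" (here, an empty answer).
        if not model_answer.strip():
            result_meta["reason"] = "empty answer"
            return None, result_meta

        is_correct = model_answer.strip().lower() == reference.strip().lower()
        result_meta["strategy"] = "exact_match"
        return is_correct, result_meta


handler = ExactMatchHandler()
is_correct, meta = handler.evaluate(
    problem="What is 2 + 2?",
    reference="4",
    model_answer=" 4 ",
    metadata={"dataset": "toy"},
)
```

Because `evaluate` is declared with `abstractmethod`, instantiating the base class directly raises `TypeError`; only subclasses that implement `evaluate` can be constructed. The keyword-only marker (`*`) means every argument must be passed by name.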