Draft: Add YAML config support for defining Duo Chat docs evaluators
What does this merge request do and why?
Adds a generic YAML config system for defining Duo Chat docs
evaluators with custom LLMs (only Anthropic models are supported for now).
How to set up and validate locally
- The previous logic still works as expected:
  poetry run eli5 duo-chat evaluate docs --limit=10
- With YAML configuration:
  # We use an example file that defines two LLM judges (Sonnet and Haiku). Feel free to use any other YAML file with the same structure.
  poetry run eli5 duo-chat evaluate docs --limit=10 --config-file=eli5/duochat/evaluation/config/.examples/docs_multiple_llms_judges.yaml
Example output with two resulting metrics: context-qa-accuracy_claude-3-haiku-20240307 and context-qa-accuracy_claude-3-sonnet-20240229.
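To illustrate how the two metric names relate to a config with two judges, here is a minimal sketch, not the actual eli5 implementation. The dictionary layout, the `evaluator` and `judges` keys, and the `metric_names` helper are all hypothetical; only the evaluator name `context-qa-accuracy` and the two model identifiers come from the example output. It assumes each metric is named `<evaluator>_<model>`.

```python
# Hypothetical structure: what a parsed YAML config with two LLM judges
# might look like. The real schema is defined by the example file
# docs_multiple_llms_judges.yaml and may differ.
config = {
    "evaluator": "context-qa-accuracy",
    "judges": [
        {"provider": "anthropic", "model": "claude-3-haiku-20240307"},
        {"provider": "anthropic", "model": "claude-3-sonnet-20240229"},
    ],
}


def metric_names(cfg: dict) -> list[str]:
    """Build one metric name per judge, assuming the `<evaluator>_<model>` pattern."""
    return [f"{cfg['evaluator']}_{judge['model']}" for judge in cfg["judges"]]


names = metric_names(config)
for name in names:
    print(name)
```

Under this assumption, adding a third judge to the list would produce a third metric without any change to the evaluation command.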
Merge request checklist
- Tests added for new functionality. If not, please raise an issue to follow up.
- Documentation added/updated, if needed.
Edited by Alexander Chueshev