Draft: Add YAML config support for defining Duo Chat docs evaluators

What does this merge request do and why?

Add a generic YAML config system for defining Duo Chat docs evaluators with custom LLMs (currently, only Anthropic models are supported).

How to set up and validate locally

  • The previous logic (without a config file) still works as expected:

    poetry run eli5 duo-chat evaluate docs --limit=10
  • With YAML configuration:

    # We use an example file that defines two LLM judges (Sonnet and Haiku). Feel free to use any other YAML file with the same structure.
    poetry run eli5 duo-chat evaluate docs --limit=10 --config-file=eli5/duochat/evaluation/config/.examples/docs_multiple_llms_judges.yaml

    The example run reports two resulting metrics, one per LLM judge: context-qa-accuracy_claude-3-haiku-20240307 and context-qa-accuracy_claude-3-sonnet-20240229.
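
    To illustrate how one metric per judge could arise from such a config, here is a minimal sketch. It is not the code from this merge request: the config keys (evaluators, metric, llm, provider, model) and the helper metric_names are hypothetical, and the real schema of docs_multiple_llms_judges.yaml may differ. The dict stands in for an already parsed YAML file.

```python
# Hypothetical shape of a parsed YAML config. The real schema used by
# docs_multiple_llms_judges.yaml is not shown in this MR description and may differ.
config = {
    "evaluators": [
        {
            "metric": "context-qa-accuracy",
            "llm": {"provider": "anthropic", "model": "claude-3-haiku-20240307"},
        },
        {
            "metric": "context-qa-accuracy",
            "llm": {"provider": "anthropic", "model": "claude-3-sonnet-20240229"},
        },
    ]
}

# Only Anthropic is supported according to the MR description.
SUPPORTED_PROVIDERS = {"anthropic"}


def metric_names(config: dict) -> list[str]:
    """Build one metric name per evaluator as '<metric>_<model>' (assumed naming)."""
    names = []
    for evaluator in config["evaluators"]:
        llm = evaluator["llm"]
        if llm["provider"] not in SUPPORTED_PROVIDERS:
            raise ValueError(f"Unsupported LLM provider: {llm['provider']}")
        names.append(f"{evaluator['metric']}_{llm['model']}")
    return names


print(metric_names(config))
```

    Suffixing the metric name with the model name keeps the results of several judges distinguishable when they share the same base metric.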

Merge request checklist

  • Tests added for new functionality. If not, please raise an issue to follow up.
  • Documentation added/updated, if needed.
Edited by Alexander Chueshev
