Iteration 2: Collecting prompts and user satisfaction with responses
Metrics are collected here.

Proposal
Create a component (e.g. an API or something else) that allows us to receive and collect the following data:
Timestamp (day only, no time) | Use case identifier | Language model | Prompt version | Prompt | Prompt location | Reference to content passed along with the prompt | User satisfaction
---|---|---|---|---|---|---|---
2 April 2023 | Create:Source_Code:Explain_Code | text-davinci-003 | 1.0 | is a code that | before content | https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/feature.rb#L8-17 | 
10 April 2023 | Create:Source_Code:Explain_Code | text-davinci-003 | 1.1 | Could you explain this code: | after content | https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/feature.rb#L8-17 | 
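The record above could be modeled as a simple value object. A minimal sketch, assuming hypothetical names (`PromptMetric` and its fields are illustrations, not an existing class):

```ruby
require 'date'

# Sketch of the proposed metrics record (all names hypothetical).
# No user data is stored; the timestamp is truncated to the day.
PromptMetric = Struct.new(
  :day,               # Date only, no time-of-day
  :use_case,          # e.g. 'Create:Source_Code:Explain_Code'
  :language_model,    # e.g. 'text-davinci-003'
  :prompt_version,    # e.g. '1.1'
  :prompt,            # the prompt text itself
  :prompt_location,   # 'before content' or 'after content'
  :content_reference, # URL of the content passed along with the prompt
  :user_satisfaction, # e.g. 'helpful', 'unhelpful', 'wrong', or nil
  keyword_init: true
)

metric = PromptMetric.new(
  day: Date.new(2023, 4, 10),
  use_case: 'Create:Source_Code:Explain_Code',
  language_model: 'text-davinci-003',
  prompt_version: '1.1',
  prompt: 'Could you explain this code:',
  prompt_location: 'after content',
  content_reference: 'https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/feature.rb#L8-17',
  user_satisfaction: nil
)
```

Using `Date` rather than `Time` makes the day-only granularity a property of the type, not a convention.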
Do not collect any user data! The timestamp may help us debug changes in the AI services we use. It is deliberately day-granular so that we cannot infer which users made a given query.
Create a dashboard (e.g. in Sisense or Grafana) that can be filtered by the different fields and is accessible to internal developers, so they can use it to understand user satisfaction with the AI's responses.
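The kind of aggregation such a dashboard would run can be sketched in plain Ruby (the rows and field names below are hypothetical illustrations, not real collected data):

```ruby
# Hypothetical collected rows, reduced to the fields the dashboard filters on.
rows = [
  { use_case: 'Create:Source_Code:Explain_Code', prompt_version: '1.0', user_satisfaction: 'unhelpful' },
  { use_case: 'Create:Source_Code:Explain_Code', prompt_version: '1.1', user_satisfaction: 'helpful' },
  { use_case: 'Create:Source_Code:Explain_Code', prompt_version: '1.1', user_satisfaction: 'helpful' }
]

# Satisfaction breakdown per prompt version, e.g. to compare 1.0 against 1.1.
breakdown = rows
  .group_by { |r| r[:prompt_version] }
  .transform_values { |rs| rs.map { |r| r[:user_satisfaction] }.tally }

# breakdown => { '1.0' => { 'unhelpful' => 1 }, '1.1' => { 'helpful' => 2 } }
```

The same grouping could be applied to any of the other fields (use case, language model, prompt location) depending on which filter the developer selects.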
The feedback could be recorded as a tracking event with the following attributes:

```
event_action: 'explain_code_blob_viewer',
event_label: 'response_feedback',
event_property: 'helpful',
extra: {
  prompt_location: 'before content',
}
```
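A minimal sketch of how that event payload could be assembled from the user's choice, assuming a hypothetical helper (`feedback_event` is illustrative, not an existing method or tracking API):

```ruby
# Hypothetical helper: builds the tracking-event attributes for one
# feedback selection before handing them to the tracking layer.
def feedback_event(choice, prompt_location)
  {
    event_action: 'explain_code_blob_viewer',
    event_label: 'response_feedback',
    event_property: choice, # 'helpful', 'unhelpful' or 'wrong'
    extra: { prompt_location: prompt_location }
  }
end

event = feedback_event('helpful', 'before content')
# event[:event_property]          => 'helpful'
# event[:extra][:prompt_location] => 'before content'
```

Keeping `prompt_location` in `extra` keeps the event schema stable while still letting the dashboard correlate satisfaction with prompt placement.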
Design
After selection, we will display the text "Feedback saved as {helpful/unhelpful/wrong}".
This behaviour of the buttons changing to text came out of this discussion: !118071 (comment 1362238825)
Related discussion: https://gitlab.com/gitlab-org/gitlab/-/issues/403728/designs/20220405-feedback.png