Iteration 3: Experiment with different prompts and compare how they perform on user satisfaction with responses
Goal of this issue
- Before releasing Explain Code as GA, we should reach a state where user feedback is `helpful` in at least x% of cases and `wrong` in less than y% of cases. The values for x and y remain to be determined (a sketch of this release gate follows below).
- We are likely going to switch to a different AI vendor. We should compare how the initial vendor performs against the new vendor, to inform the business decision regarding the consequences for user satisfaction when switching vendors.
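For illustration, a minimal sketch of how this release gate could be evaluated once the feedback counts proposed below are collected. Everything here is an assumption: the `FeedbackCounts` shape and the `meetsGaCriteria` helper are hypothetical, not existing GitLab code, and x/y are placeholders until the thresholds are decided.

```typescript
// Hypothetical sketch (not existing GitLab code): evaluating the GA release gate
// from aggregated feedback counts. Field and function names are illustrative.
interface FeedbackCounts {
  helpful: number;
  unhelpful: number;
  wrong: number;
}

function meetsGaCriteria(counts: FeedbackCounts, x: number, y: number): boolean {
  const total = counts.helpful + counts.unhelpful + counts.wrong;
  if (total === 0) return false; // no feedback yet, nothing to decide on

  const helpfulRate = counts.helpful / total; // share of `helpful` ratings
  const wrongRate = counts.wrong / total; // share of `wrong` ratings

  // GA gate: helpful in at least x% of cases AND wrong in less than y% of cases.
  return helpfulRate >= x && wrongRate < y;
}

// Example with placeholder thresholds (x = 80%, y = 5%):
// meetsGaCriteria({ helpful: 850, unhelpful: 110, wrong: 40 }, 0.8, 0.05) === true
```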
Proposal
Enhance the metrics for collecting user satisfaction with (a sketch of a possible event payload follows this list):
- `helpful`, `unhelpful`, `wrong` (already available as a result of Iteration 2: Collecting prompt and user satisfa... (#404272 - closed))
- number of lines (or characters or tokens) selected for explanation
  - this will help us understand whether satisfaction is a function of the length of the code selected
- number of characters (or tokens) of the answer
  - this will help us understand whether satisfaction is a function of the length of the answer
- language of the code selected
  - this will help us understand whether satisfaction is a function of the code language
- the prompt used, and whether the selected code was placed before or after the prompt
  - this will help us understand how different prompt designs perform
- do not collect the code itself or the answer from the AI, to prevent collecting customer or user data.
- allow users to add a free-text message to explain their sentiment about the response or the feature itself.
- count the total number of times that:
  - users have received an AI answer vs.
  - the times they also choose to provide feedback
  - the times they asked a follow-up question
    - and did not give feedback
    - did give feedback
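To make the proposed fields concrete, here is a hedged sketch of what the enhanced feedback event and usage counters could look like. All type and field names are assumptions for illustration, not an existing GitLab schema; note that the payload deliberately carries only metadata, never the selected code or the AI answer.

```typescript
// Illustrative sketch of an enhanced feedback event; all names are hypothetical.
// Deliberately excludes the selected code and the AI answer (no customer/user data).
type Sentiment = 'helpful' | 'unhelpful' | 'wrong';

interface ExplainCodeFeedbackEvent {
  sentiment: Sentiment;          // already collected since Iteration 2
  selectedLineCount: number;     // lines (or characters/tokens) selected for explanation
  answerCharCount: number;       // length of the AI answer in characters (or tokens)
  codeLanguage: string;          // language of the selected code, e.g. 'ruby'
  promptVariantId: string;       // which engineered prompt was used
  codePlacement: 'before_prompt' | 'after_prompt'; // where the selected code sat relative to the prompt
  userComment?: string;          // optional free-text explanation of the sentiment
}

// Counters tracked independently of individual feedback events:
interface ExplainCodeUsageCounters {
  answersServed: number;           // users received an AI answer
  feedbackGiven: number;           // ...and also chose to provide feedback
  followUpWithFeedback: number;    // asked a follow-up question and gave feedback
  followUpWithoutFeedback: number; // asked a follow-up question, no feedback
}
```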
Play with different prompts
- Use guidance like https://www.promptingguide.ai/ to engineer a handful of prompts.
- Randomly use the different prompts and different providers (see the sketch after this list).
- Present the results in Sisense.
  - we intend to keep measuring user satisfaction beyond GA as well, to be able to adjust prompts when needed
- Use the best-performing prompt going forward.
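A minimal sketch of the random prompt/provider assignment mentioned above. The variant list, provider names, and helper functions are assumptions for illustration, not existing code.

```typescript
// Hypothetical sketch: assign a random prompt variant and AI provider per request,
// so satisfaction metrics can be compared across both dimensions.
interface PromptVariant {
  id: string;       // logged as promptVariantId in the feedback event
  template: string; // prompt text with a placeholder for the selected code
  codePlacement: 'before_prompt' | 'after_prompt';
}

const PROMPT_VARIANTS: PromptVariant[] = [
  { id: 'v1-plain', template: 'Explain the following code:\n{{code}}', codePlacement: 'after_prompt' },
  { id: 'v2-role', template: '{{code}}\nYou are a senior engineer. Explain the code above.', codePlacement: 'before_prompt' },
];

const PROVIDERS = ['vendor_a', 'vendor_b'] as const;

function pickRandom<T>(items: readonly T[]): T {
  return items[Math.floor(Math.random() * items.length)];
}

// Each Explain Code request gets a uniformly random (prompt, provider) pair;
// both choices are recorded with the feedback event so results can be grouped in Sisense.
function assignExperimentArm() {
  return { prompt: pickRandom(PROMPT_VARIANTS), provider: pickRandom(PROVIDERS) };
}
```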