Test of Humanity debrief

Detection accuracy

  • How many of your intentional problems did the AI reviewer catch?

    • All but one of the problems were caught
  • How many did it miss? Do the misses share a pattern (e.g., all subtle, all in a specific file type)?

    • It missed a problem in a file that had been moved and also contained an incorrect change (screenshot: image copy.png)
  • Were there any false positives - comments about things that weren't actually problems?

    • No
  • Did the severity levels match how serious the problems actually were?

    • Yes
  • How actionable were the comments? Could you fix the issue based on the comment alone?

    • Yes. It would be good if clicking a file in the review redirected the user to the Editor to fix the issue

Verification accuracy

  • Did the AI correctly identify which issues you fixed?
    • Yes
  • Did it incorrectly resolve any issues you left broken?
    • No
  • Did it miss any fixes you made?
    • No
  • How confident are you in the verification after this test?
    • Very confident

Analytics page

  • Did the numbers on the analytics page match your experience?
    • Yes
  • What would you add or change about the page?
    • Add a refresh button, or preserve the page state across a refresh (perhaps by deep linking to the selected project?)

Overall experience

  • Was it clear whether the AI review was in progress or complete? - Yes
  • How long did the AI review take? - ~8 minutes (maybe because we moved a lot of files, though with minimal changes)
  • Did the summary comment posted by the AI reviewer make sense? - Yes
  • If you added custom AI rules, were they accounted for in the review? - We didn't add custom AI rules
  • Would you trust this AI reviewer on real PRs after this test? - Yes, with caution (as with others)
  • What made you smile?
    • The reviewer was aware that we were testing it :D (screenshot: image.png)
  • What felt frustrating or broken?
    • When the AI reviewer flags issues, you can't navigate directly to the affected files in the editor (as you can with the link checker, for example)
  • On a scale of 1 to 10, how would you rate the overall quality of the AI reviewer?
    • 9/10
  • Are there any additional notes you would like to share?
    • N/A