Use Case Utility

  • XAI and LLMs are often tools for accomplishing some other goal.
  • Very limited work has explored the utility of LLMs in use-case–specific user studies; however, a user study on Microsoft/GitHub's Copilot [1], an LLM-based code generation tool, found that it "did not necessarily improve the task completion time or success rate" [52].
  • LLM outputs often sound very confident, even when their content is hallucinated [50].
  • When a user questions an incorrect output, LLMs also have a documented tendency to argue that the user is wrong and that the response is correct. In fact, some have called LLMs "mansplaining as a service" [34].
  • This can make it more difficult for humans to implement cognitive checks on LLM outputs.
  • While some recent work has outlined categories of LLM failure modes based on the types of cognitive biases they exhibit [29], we call for further work in this area.