Beyond Accuracy: How AI Metacognitive Sensitivity improves AI-assisted Decision Making
In settings where human decision-making relies on AI input, both the predictive accuracy
of the AI system and the reliability of its confidence estimates influence decision
quality. We highlight the role of AI metacognitive sensitivity (its ability to assign
confidence scores that accurately distinguish correct from incorrect predictions) and
introduce a theoretical framework for assessing the joint impact of an AI's predictive
accuracy and metacognitive sensitivity in hybrid decision-making settings. Our research
identifies conditions under which an AI with lower predictive accuracy but higher
metacognitive sensitivity can enhance the overall accuracy of human decision making.
Finally, a behavioral experiment confirms that greater AI metacognitive sensitivity
improves human decision performance. Together, these findings underscore the importance
of evaluating AI assistance not only by accuracy but also by metacognitive sensitivity,
and of optimizing both to achieve superior decision outcomes.
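
To make the notion of metacognitive sensitivity concrete, here is a minimal sketch in Python (not the authors' code; all data and parameters are invented for illustration). It measures metacognitive sensitivity as the area under the type-2 ROC curve, i.e., the probability that the AI assigns higher confidence to a correct prediction than to an incorrect one, and compares a hypothetical higher-accuracy AI whose confidence barely tracks correctness with a lower-accuracy AI whose confidence tracks correctness well.

# A minimal sketch (not the authors' code): metacognitive sensitivity measured as the
# area under the type-2 ROC curve, i.e., how well an AI's confidence scores separate
# its correct predictions from its errors. All data and parameters below are invented.
import numpy as np

def metacognitive_sensitivity(confidence, correct):
    """Type-2 AUROC: P(confidence on a correct prediction > confidence on an error)."""
    conf_hits = confidence[correct]        # confidence assigned to correct predictions
    conf_misses = confidence[~correct]     # confidence assigned to incorrect predictions
    diffs = conf_hits[:, None] - conf_misses[None, :]   # all pairwise comparisons
    return np.mean(diffs > 0) + 0.5 * np.mean(diffs == 0)

rng = np.random.default_rng(0)
n = 2000

# Hypothetical AI "A": higher accuracy (80%), but its confidence barely tracks correctness.
correct_a = rng.random(n) < 0.80
conf_a = np.clip(0.8 + rng.normal(0.0, 0.15, n), 0.0, 1.0)

# Hypothetical AI "B": lower accuracy (70%), but its confidence tracks correctness well.
correct_b = rng.random(n) < 0.70
conf_b = np.clip(np.where(correct_b, 0.85, 0.45) + rng.normal(0.0, 0.10, n), 0.0, 1.0)

for name, conf, correct in [("A", conf_a, correct_a), ("B", conf_b, correct_b)]:
    print(f"AI {name}: accuracy = {correct.mean():.2f}, "
          f"metacognitive sensitivity = {metacognitive_sensitivity(conf, correct):.2f}")

The point of the comparison is that accuracy and metacognitive sensitivity can dissociate; which AI better supports a human decision maker then depends on how much weight the human places on the stated confidence.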
Walking the Line: Balancing AI Advice and Human Annoyance for Human-AI Complementarity
Human-AI complementarity requires a human-AI team to perform better than either agent
alone. One practical hurdle to achieving human-AI complementarity is that people can get
annoyed with AI assistance, leading them to turn it off. What should an AI assistant do
if it knows that a less-than-enthusiastically received suggestion might prompt the human
to shut it off for good? To get a handle on this question, we designed an experiment in
which people have access to a deliberately annoying AI assistant, and we find that people
are not very good at judging when they ought (or ought not) to use our AI helper.
This reveals a fundamental challenge: if people
struggle to assess whether AI assistance could benefit them at any given time, then
it might be better for the assistant to offer help rather than wait for the human
to request it. But then again, an assistant that speaks up unprompted is annoying and so
might not get to help for long. To solve this tricky balancing act, we propose a partially
observable Markov decision process (POMDP) framework that integrates a cognitive model of
humans' annoyance-driven decisions to figure out when speaking up is worth the risk. In
principle, this method could address both human
over- and under-reliance on AI.
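
To illustrate the kind of trade-off such a framework has to resolve, here is a minimal sketch in Python (not the authors' model; the annoyance scale, transition probabilities, reward values, and horizon are all invented assumptions). The assistant maintains a belief over the human's hidden annoyance level, treats "assistant switched off" as an absorbing state, and uses brute-force lookahead to decide whether the immediate benefit of an unsolicited suggestion outweighs the risk of losing all future opportunities to help, including the chance that the human asks for help unprompted.

# A minimal sketch (not the authors' POMDP): a toy belief-state lookahead for an assistant
# choosing between speaking up and staying silent. Every quantity below is invented.
import numpy as np

N_LEVELS = 5            # hidden annoyance levels 0..4; level 4 means the assistant is switched off
OFF = N_LEVELS - 1
HELP_VALUE = 1.0        # benefit of a helpful suggestion, provided the assistant is still on
ANNOY_PROB = 0.35       # chance an unsolicited suggestion raises annoyance by one level
ASK_PROB = 0.25         # chance the human asks for help themselves (no annoyance risk)
HORIZON = 12            # remaining opportunities to help

def speak_update(belief):
    """Belief transition after an unsolicited suggestion (this sketch ignores observations)."""
    new = np.zeros_like(belief)
    new[:-1] = belief[:-1] * (1 - ANNOY_PROB)   # annoyance stays where it was
    new[1:] += belief[:-1] * ANNOY_PROB         # annoyance rises by one level
    new[OFF] += belief[OFF]                     # "switched off" is absorbing
    return new

def value(belief, steps_left):
    """Expected value of acting optimally from here on, plus the best unsolicited action now."""
    if steps_left == 0:
        return 0.0, "n/a"
    p_on = 1.0 - belief[OFF]                    # probability the assistant has not been switched off
    v_wait, _ = value(belief, steps_left - 1)   # future value if nothing changes this step
    # The human asks: help is welcome, so no annoyance risk (asking is also evidence the
    # assistant is still on, but this sketch skips that belief update for brevity).
    v_asked = HELP_VALUE * p_on + v_wait
    # The human does not ask: the assistant chooses between speaking up and staying silent.
    v_speak = HELP_VALUE * p_on + value(speak_update(belief), steps_left - 1)[0]
    v_silent = v_wait
    best = "speak up" if v_speak >= v_silent else "stay silent"
    return ASK_PROB * v_asked + (1 - ASK_PROB) * max(v_speak, v_silent), best

calm = np.array([0.7, 0.2, 0.1, 0.0, 0.0])        # belief: the human is probably not annoyed
irritated = np.array([0.0, 0.0, 0.2, 0.8, 0.0])   # belief: one more nudge may end the collaboration
for label, belief in [("calm", calm), ("irritated", irritated)]:
    v, action = value(belief, HORIZON)
    print(f"{label}: best unsolicited action = {action}, expected value = {v:.2f}")

Under beliefs that place most of their probability on high annoyance, this kind of lookahead can favor staying silent in order to preserve the stream of future solicited help; under calmer beliefs, speaking up tends to dominate.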
-----

