Optimising User Trust in Decision Aids
Investigating how to calibrate appropriate trust in AI-powered clinical decision support tools — tested with medical students comparing LLM suggestions against experienced clinicians.
- Role: Researcher & UX Analyst
- Year: 2019
- Organisation: Aalborg University
Outcome
Identified key interface and framing factors that significantly affect trust calibration. Found that presentation order and confidence signalling had larger effects on trust than diagnostic accuracy alone.
As AI-powered decision support tools enter clinical settings, the question isn’t just whether they’re accurate — it’s whether clinicians trust them at the right level. Over-trust leads to automation bias; under-trust means the tool gets ignored. This project explored how interface design and framing affect that calibration.
The study
We tested medical students’ trust responses to patient symptom sets paired with suggested diagnoses from two sources: a large language model and an experienced senior doctor. Participants were not always told which source was which.
The core methodology was pairwise comparison analysis, which let us build a preference/trust model without asking participants to rate suggestions directly on scales (direct ratings invite social desirability bias). We ran the statistical analysis in MATLAB after discovering errors in the available R packages for this method.
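Pairwise preference data of this kind is typically fitted with a Bradley-Terry model: each option i gets a strength p_i, and the probability that i is preferred over j is p_i / (p_i + p_j). Below is a minimal Python sketch of that fitting step using the standard MM update; our analysis ran in MATLAB, so the function name and toy counts here are illustrative assumptions, not the study's code.

```python
import numpy as np

def fit_bradley_terry(wins, n_iter=500, tol=1e-9):
    """Estimate Bradley-Terry strengths from pairwise preference counts.

    wins[i][j] = how often option i was preferred over option j.
    Returns normalised strengths; higher = more preferred/trusted.
    Uses Hunter's MM update: p_i <- W_i / sum_j N_ij / (p_i + p_j).
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    totals = wins + wins.T          # total comparisons per pair
    p = np.ones(n)                  # initial strengths
    for _ in range(n_iter):
        p_new = np.empty(n)
        for i in range(n):
            denom = sum(totals[i, j] / (p[i] + p[j]) for j in range(n) if j != i)
            p_new[i] = wins[i].sum() / denom
        p_new /= p_new.sum()        # fix the scale (identifiability)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    return p

# Toy counts: three presentation conditions compared head-to-head (invented numbers)
wins = [[0, 12, 8],
        [4,  0, 9],
        [6,  5, 0]]
print(fit_bradley_terry(wins))
```

The appeal over Likert ratings is that each judgment is a forced choice between two concrete alternatives, which sidesteps scale-use habits and social desirability effects.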
What we examined
- Does knowing a suggestion comes from an AI change trust independent of accuracy?
- How does confidence framing (“I am 87% certain” vs. “This is likely”) affect trust? (A small sketch of this manipulation follows the list.)
- Are there interface patterns that produce better-calibrated trust — neither blind acceptance nor dismissal?
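To make the confidence-framing manipulation concrete, here is a minimal sketch (hypothetical function name, thresholds, and wording; not the exact stimuli from the study) of rendering the same underlying confidence numerically versus qualitatively:

```python
def frame_confidence(p: float, style: str) -> str:
    """Render a confidence value p in [0, 1] under two framings.

    style="numeric"     -> a percentage statement
    style="qualitative" -> a verbal hedge (bands are illustrative)
    """
    if style == "numeric":
        return f"I am {round(p * 100)}% certain"
    if p >= 0.9:
        return "This is almost certainly correct"
    if p >= 0.7:
        return "This is likely"
    if p >= 0.5:
        return "This is possible"
    return "This is uncertain"

print(frame_confidence(0.87, "numeric"))      # I am 87% certain
print(frame_confidence(0.87, "qualitative"))  # This is likely
```

Identical model output, two surface framings: this is the kind of minimal pair the study compared.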
Key findings
Presentation and framing mattered more than we expected. Students’ trust was significantly influenced by:
- Source labelling — “AI” vs. “Doctor” framing shifted trust even when outputs were identical
- Confidence language — numerical confidence scores produced different trust responses than qualitative language
- Ordering effects — which option appeared first in a comparison influenced preference
Why this matters now
This work feels increasingly relevant. As LLMs move into professional decision-making contexts — medicine, law, engineering — designing interfaces that produce calibrated trust (rather than maximum trust or maximum scepticism) is a genuinely hard UX problem. This project gave me early fluency in that space.