Optimising User Trust in Decision Aids
Investigating how to calibrate appropriate trust in AI-powered clinical decision support tools — tested with medical students comparing LLM suggestions against experienced clinicians.
- Role: Researcher & UX Analyst
- Year: 2019
- Organisation: Aalborg University
Outcome
Identified key interface and framing factors that significantly affect trust calibration. Found that presentation order and confidence signalling had larger effects on trust than diagnostic accuracy alone.
As AI-powered decision support tools enter clinical settings, the question isn’t just whether they’re accurate — it’s whether clinicians trust them at the right level. Over-trust leads to automation bias; under-trust means the tool gets ignored. This project explored how interface design and framing affect that calibration.
The study
We tested medical students’ trust responses to patient symptom sets paired with suggested diagnoses from two sources: a large language model and an experienced senior doctor. Participants were not always told which source was which.
The core methodology was pairwise comparison analysis, which let us build a preference/trust model without asking participants to rate suggestions directly on scales (direct ratings invite social desirability bias). We ran the statistical analysis in MATLAB after discovering errors in the available R packages for this method.
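Pairwise preference data of this kind is typically fitted with a Bradley-Terry model: each option i gets a strength p_i, and the probability that i is preferred over j is p_i / (p_i + p_j). Below is a minimal Python sketch of that fitting step using the standard MM update; our analysis ran in MATLAB, so the function name and toy counts here are illustrative assumptions, not the study's code.

```python
import numpy as np

def fit_bradley_terry(wins, n_iter=500, tol=1e-9):
    """Estimate Bradley-Terry strengths from pairwise preference counts.

    wins[i][j] = how often option i was preferred over option j.
    Returns normalised strengths; higher = more preferred/trusted.
    Uses Hunter's MM update: p_i <- W_i / sum_j N_ij / (p_i + p_j).
    """
    wins = np.asarray(wins, dtype=float)
    n = wins.shape[0]
    totals = wins + wins.T          # total comparisons per pair
    p = np.ones(n)                  # initial strengths
    for _ in range(n_iter):
        p_new = np.empty(n)
        for i in range(n):
            denom = sum(totals[i, j] / (p[i] + p[j]) for j in range(n) if j != i)
            p_new[i] = wins[i].sum() / denom
        p_new /= p_new.sum()        # fix the scale (identifiability)
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    return p

# Toy counts: three presentation conditions compared head-to-head (invented numbers)
wins = [[0, 12, 8],
        [4,  0, 9],
        [6,  5, 0]]
print(fit_bradley_terry(wins))
```

The appeal over Likert ratings is that each judgment is a forced choice between two concrete alternatives, which sidesteps scale-use habits and social desirability effects.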
What we examined
- Does knowing a suggestion comes from an AI change trust independent of accuracy?
- How does confidence framing (“I am 87% certain” vs. “This is likely”) affect trust? (A small sketch of this manipulation follows the list.)
- Are there interface patterns that produce better-calibrated trust — neither blind acceptance nor dismissal?
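To make the confidence-framing manipulation concrete, here is a minimal sketch (hypothetical function name, thresholds, and wording; not the exact stimuli from the study) of rendering the same underlying confidence numerically versus qualitatively:

```python
def frame_confidence(p: float, style: str) -> str:
    """Render a confidence value p in [0, 1] under two framings.

    style="numeric"     -> a percentage statement
    style="qualitative" -> a verbal hedge (bands are illustrative)
    """
    if style == "numeric":
        return f"I am {round(p * 100)}% certain"
    if p >= 0.9:
        return "This is almost certainly correct"
    if p >= 0.7:
        return "This is likely"
    if p >= 0.5:
        return "This is possible"
    return "This is uncertain"

print(frame_confidence(0.87, "numeric"))      # I am 87% certain
print(frame_confidence(0.87, "qualitative"))  # This is likely
```

Identical model output, two surface framings: this is the kind of minimal pair the study compared.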
Key findings
Presentation and framing mattered more than we expected. Students’ trust was significantly influenced by:
- Source labelling — “AI” vs. “Doctor” framing shifted trust even when outputs were identical
- Confidence language — numerical confidence scores produced different trust responses than qualitative language
- Ordering effects — which option appeared first in a comparison influenced preference
Why this matters now
This work feels increasingly relevant. As LLMs move into professional decision-making contexts — medicine, law, engineering — designing interfaces that produce calibrated trust (rather than maximum trust or maximum scepticism) is a genuinely hard UX problem. This project gave me early fluency in that space.