Presentation

Holistic Evaluation of Explainable Artificial Intelligence for Human-Autonomy Teaming
Description

Human-autonomy teaming with an Explainable AI (XAI) system is being proposed in a variety of high-consequence, dynamic settings such as military aviation and space exploration. Rigorous, human-centered evaluation of XAI systems remains a nascent field, and existing evaluations often omit the full range of human factors measures. We evaluated the effect of explanation type for AI route planning on team performance, mental workload, trust, situation awareness (SA), and user preference in a realistic space exploration simulator. Participants (N=16, 10M/6F) drove a simulated Mars rover while simultaneously supervising and directing a highly autonomous unmanned rover. Participants received various forms of explanation of AI-generated routes, including global goal, contrastive, or deductive explanations. Linear mixed-effects models were used to account for trial number and condition across all measured outcomes, with a baseline condition of “AI agent available, no explanation provided”. Performance on the manual driving task was better with global explanations (p=0.02). Performance on the autonomy supervision task was best when all explanations were provided (p=0.01). Performance on the combined task was best with global and contrastive explanations (p=0.016) or all explanations (p=0.021), but worse when no AI agent was available (p=0.02). Explanation type also led to changes in workload ratings (p=0.018), explainability ratings (p<0.001), trust ratings (p<0.001), and usability ratings (p=0.005). SAGAT performance did not change with explanation type (p=0.24). These results begin to establish a systematic means of assessing explanation type through a human-centered evaluation of XAI systems.
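
As a rough illustration of the kind of linear mixed-effects model described above, one plausible specification is sketched below. The abstract does not state the random-effects structure, so the per-participant random intercept, the symbol names, and the condition indexing are all assumptions made here for illustration, not the authors' reported model.

```latex
% Hypothetical specification: outcome y for participant i on trial j.
% t_{ij} is the trial number; c_{ij} is the condition, with k = 0 denoting
% the baseline "AI agent available, no explanation provided".
% The random intercept u_i is an assumption, not stated in the abstract.
\begin{equation}
  y_{ij} = \beta_0
         + \beta_{\mathrm{trial}}\, t_{ij}
         + \sum_{k=1}^{K} \beta_k \,\mathbf{1}[c_{ij} = k]
         + u_i + \varepsilon_{ij},
  \qquad
  u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{equation}
```

Under this sketch, each coefficient \beta_k is the estimated difference between condition k and the baseline after adjusting for trial number, so a reported p-value for a condition would correspond to a test of \beta_k = 0.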