
Benchmark for “explainable” artificial intelligence established


Artificial intelligence (AI, e.g., deep learning) is increasingly used to assist high-stakes decisions in areas such as finance, medicine, and autonomous driving. Upcoming regulations will require that the principles by which such algorithms arrive at their predictions be transparent. However, while numerous “explainable” AI (XAI) methods have been proposed, the field lacks formal definitions of what these methods are supposed to deliver and how their results should be interpreted. In a new study, we used synthetic ground-truth data to enable a quantitative assessment of the “explanation performance” of common XAI techniques.

We have previously demonstrated that so-called suppressor variables can lead to misinterpretations that could have severe implications in practice (such as attributing the presence of a disease to a variable that is statistically independent of it). Our simulation study allowed us to identify those XAI approaches whose performance is systematically impaired by the presence of suppressors. The study will appear in the Machine Learning Journal (Springer) and be presented at the European Conference on Machine Learning (ECML) this year. Code and reference data are made publicly available.
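The suppressor effect described above can be illustrated with a minimal sketch (this is an illustrative example, not the benchmark used in the study): a feature that is statistically independent of the target can still receive a large weight in an optimal linear model, because the model uses it to cancel noise in another feature.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

y = rng.standard_normal(n)   # target signal (ground truth)
d = rng.standard_normal(n)   # distractor noise, independent of y
x1 = y + d                   # informative feature: signal plus noise
x2 = d                       # suppressor: carries only the noise, no signal

X = np.column_stack([x1, x2])

# Ordinary least squares regression of y on both features
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# The suppressor x2 is uncorrelated with y ...
print(np.corrcoef(x2, y)[0, 1])   # close to 0

# ... yet the optimal model assigns it a large (negative) weight,
# because subtracting x2 removes the noise from x1: y = x1 - x2.
print(w)                          # close to [1, -1]
```

A naive interpretation of the weight vector would conclude that `x2` is "important" for the outcome, even though it is independent of it — the kind of misattribution the study quantifies for common XAI methods.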

Related publication: Wilming, R., Budding, C., Müller, K. R., & Haufe, S. (2021). Scrutinizing XAI using linear ground-truth data with suppressor variables. Machine Learning Journal. In press. DOI: 10.1007/s10994-022-06167-y


Experts: Rick Wilming (TU Berlin), Stefan Haufe (TU Berlin, PTB 8.44)