Actuarial Studies seminar - Dr Fei Huang - UNSW Business School

A seminar by Dr Fei Huang from UNSW

Title: Why You Should Not Trust Interpretations in Machine Learning: Adversarial Attacks on Partial Dependence Plots

Abstract: The adoption of artificial intelligence (AI) across industries, including insurance, has led to the widespread use of complex black-box models such as gradient-boosting machines and neural networks. Although these models offer enhanced efficiency and accuracy, their lack of transparency has raised concerns among regulators and consumers. To address this, interpretation methods from the growing field of interpretable machine learning have gained attention as tools for understanding the relationships between model inputs and outputs. However, while stakeholders may understand some of the limitations of these explanations, they are often unaware that the interpretation methods themselves are inherently vulnerable to manipulation.

This paper proposes an adversarial framework to uncover the vulnerability of permutation-based interpretation methods for machine learning tasks, with a particular focus on partial dependence (PD) plots. The framework modifies the original black-box model so that its predictions change for instances in the extrapolation domain. The result is deceptive PD plots that can hide discriminatory behaviour while maintaining the prediction accuracy of the original model. A single modified model can manipulate multiple PD plots at once. Using real-world datasets, including an auto insurance claims dataset and the COMPAS dataset, our results show that it is possible to intentionally hide the discriminatory behaviour of a predictor and make the black-box model appear neutral through interpretation tools like PD plots, while retaining almost all the predictions of the original black-box model. We will provide managerial insights for regulators and industry practitioners based on the findings.
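
To make the idea concrete, the following is a minimal Python sketch of how a PD curve is computed and why it can be steered. It is a toy illustration under stated assumptions, not the framework presented in the talk: the data, the two models, the 0.1 threshold and the names `partial_dependence`, `original_model` and `modified_model` are all hypothetical. The point it tries to show is that a PD curve averages predictions over artificial rows, many of which fall outside the observed data when features are correlated, so a model that changes its output only on such rows can flatten the curve while leaving predictions on the real data essentially unchanged.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction when column `feature` is forced to each grid value."""
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        pd_values.append(predict(X_mod).mean())
    return np.array(pd_values)

# Toy data (assumed): two strongly correlated features, so most rows
# created by overwriting feature 0 lie far from any observed point.
rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 1.0, size=500)
x1 = x0 + rng.normal(0.0, 0.02, size=500)
X = np.column_stack([x0, x1])

def original_model(X):
    # Hypothetical black box whose output depends strongly on feature 0.
    return 2.0 * X[:, 0]

def modified_model(X):
    # Hypothetical manipulated model: identical to the original close to
    # the data (x0 roughly equal to x1), constant everywhere else.
    near_data = np.abs(X[:, 0] - X[:, 1]) < 0.1
    return np.where(near_data, original_model(X), 1.0)

grid = np.linspace(0.0, 1.0, 5)
pd_original = partial_dependence(original_model, X, 0, grid)
pd_modified = partial_dependence(modified_model, X, 0, grid)

# Share of observed rows on which both models give the same prediction.
agreement = np.mean(original_model(X) == modified_model(X))

print("PD (original model):", np.round(pd_original, 2))
print("PD (modified model):", np.round(pd_modified, 2))
print("Agreement on observed data:", agreement)
```

In this sketch the PD curve of the original model rises steadily with feature 0, while the curve of the modified model stays close to flat, even though the two models agree on nearly every observed row.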

(Joint work with Xi Xin and Giles Hooker)

For further information, please contact RSFAS Seminars.

All information collected by the University is governed by the ANU Privacy Policy.

Details
Venue: CBE LT1
Presenter(s): Dr Fei Huang