Attexplore: Attribution for Explanation with Model Parameters Exploration
Date
2024
Authors
Zhu, Z.
Chen, H.
Zhang, J.
Wang, X.
Jin, Z.
Xue, J.
Salim, F.D.
Type
Conference paper
Citation
Proceedings of the 12th International Conference on Learning Representations (ICLR 2024), 2024, pp.32790-32816
Statement of Responsibility
Zhiyu Zhu, Huaming Chen, Jiayu Zhang, Xinyi Wang, Zhibo Jin, Jason Xue and Flora D. Salim
Conference Name
12th International Conference on Learning Representations (ICLR) (7 May 2024 - 11 May 2024 : Vienna, Austria)
Abstract
Due to real-world noise and human-added perturbations, attaining the trustworthiness of deep neural networks (DNNs) is a challenging task. It is therefore essential to offer explanations for the decisions made by these non-linear and complex parameterized models. Attribution methods are promising for this goal, yet their performance can be further improved. In this paper, we show for the first time that the decision boundary exploration approaches of attribution are consistent with the process of transferable adversarial attacks. Specifically, transferable adversarial attacks craft general adversarial samples from a source model, which is consistent with the generation of adversarial samples that can cross multiple decision boundaries in attribution. Utilizing this consistency, we introduce a novel attribution method via model parameter exploration. Furthermore, inspired by the capability of frequency exploration to investigate the model parameters, we provide enhanced explainability for DNNs by manipulating the input features based on frequency information to explore the decision boundaries of different models. Large-scale experiments demonstrate that our Attribution method for Explanation with model parameter eXploration (AttEXplore) outperforms other state-of-the-art interpretability methods. Moreover, by employing other transferable attack techniques, AttEXplore can explore potential variations in attribution outcomes. Our code is available at: https://github.com/LMBTough/ATTEXPLORE.
Rights
© the author(s). Authors do not transfer the copyright of their paper to ICLR, instead they grant ICLR a non-exclusive, perpetual, royalty-free, fully-paid, fully-assignable license to copy, distribute and publicly display all or part of the paper.