Bayesian Learned Models Can Detect Adversarial Malware for Free
Date
2024
Authors
Doan, B.G.
Nguyen, D.Q.
Montague, P.
Abraham, T.
De Vel, O.
Camtepe, S.
Kanhere, S.S.
Abbasnejad, E.
Ranasinghe, D.C.
Editors
Garcia-Alfaro, J.
Kozik, R.
Choras, M.
Katsikas, S.
Type
Conference paper
Citation
Lecture Notes in Artificial Intelligence, 2024 / Garcia-Alfaro, J., Kozik, R., Choras, M., Katsikas, S. (ed./s), vol.14982, pp.45-65
Statement of Responsibility
Bao Gia Doan, Dang Quang Nguyen, Paul Montague, Tamas Abraham, Olivier De Vel, Seyit Camtepe, Salil S. Kanhere, Ehsan Abbasnejad, and Damith C. Ranasinghe
Conference Name
European Symposium on Research in Computer Security (ESORICS) (16 Sep 2024 - 20 Sep 2024 : Bydgoszcz, Poland)
Abstract
The vulnerability of machine learning-based malware detectors to adversarial attacks has prompted the need for robust solutions. Adversarial training is an effective method, but it is computationally expensive to scale up to large datasets and sacrifices model performance for robustness. We hypothesize that adversarial malware exploits the low-confidence regions of models and can be identified using the epistemic uncertainty of ML approaches; epistemic uncertainty in a machine learning-based malware detector results from a lack of similar training samples in regions of the problem space. In particular, a Bayesian formulation can capture the distribution of the model parameters and quantify epistemic uncertainty without sacrificing model performance. To verify our hypothesis, we consider Bayesian learning approaches with a mutual information-based formulation to quantify uncertainty and detect adversarial malware in the Android, Windows, and PDF malware domains. We found that quantifying uncertainty through Bayesian learning methods can defend against adversarial malware. In particular, Bayesian models: (1) are generally capable of identifying adversarial malware in both feature and problem space, (2) can detect concept drift by measuring uncertainty, and (3) when trained with a diversity-promoting approach (or better posterior approximations), yield parameter instances from the posterior that significantly enhance a detector's ability.
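The mutual information-based uncertainty measure mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the paper's exact formulation, so the code below assumes the standard definition for a binary detector: the entropy of the averaged prediction minus the average entropy of the individual predictions, computed over a set of parameter samples drawn from the posterior. The function names, the example probabilities, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with P(malware) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mutual_information(member_probs):
    """Epistemic uncertainty estimate for a single input.

    member_probs holds P(malware | x, theta_i), one value per parameter
    sample theta_i drawn from the (approximate) posterior.
    Returns H[mean prediction] - mean(H[each prediction]).
    """
    n = len(member_probs)
    mean_p = sum(member_probs) / n
    total_uncertainty = binary_entropy(mean_p)
    expected_entropy = sum(binary_entropy(p) for p in member_probs) / n
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, total_uncertainty - expected_entropy)


def flag_as_suspicious(member_probs, threshold):
    """Flag an input when the posterior samples disagree strongly.

    The threshold is an assumption; in practice it would be tuned on
    held-out clean data.
    """
    return mutual_information(member_probs) > threshold


# Hypothetical predictions from four posterior samples.
agree = [0.97, 0.98, 0.96, 0.99]     # samples agree: low mutual information
disagree = [0.05, 0.95, 0.10, 0.90]  # samples disagree: high mutual information

print(flag_as_suspicious(agree, threshold=0.1))
print(flag_as_suspicious(disagree, threshold=0.1))
```

Under these assumptions, an input on which the posterior samples agree receives a score near zero, while an input on which they disagree receives a high score and is flagged. This matches the abstract's hypothesis that adversarial malware lies in regions where the model is uncertain.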
Rights
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024