Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Bane or boon: measuring the effect of evasive malware on system call classifiers

Nunes, Matthew ORCID: https://orcid.org/0000-0003-1990-5814, Burnap, Peter ORCID: https://orcid.org/0000-0003-0396-633X, Reinecke, Philipp ORCID: https://orcid.org/0000-0002-2411-0891 and Lloyd, Kaelon 2022. Bane or boon: measuring the effect of evasive malware on system call classifiers. Journal of Information Security and Applications 67, 103202. 10.1016/j.jisa.2022.103202

PDF - Published Version
Available under License Creative Commons Attribution.

Abstract

Malware is software designed to achieve a malicious purpose, usually to benefit its creator. To accomplish this, malware hides its true purpose from its target and from malware analysts until it has established a foothold on the victim's machine. Malware analysts therefore have to find increasingly sophisticated methods to detect malware, prompting malware authors to increase the number of evasive techniques their malware employs. Dynamic malware analysis has been framed as a potential solution, as it runs malware in its preferred environment to ensure that its true behaviour is observed. However, the analysis environment is usually a restricted form of the preferred environment, and malware may only be run for two minutes or less. This means that if malware does not demonstrate its malicious intent within that time frame and environment, the behaviour observed and subsequently learned may not be the behaviour that needs to be prevented. There is a risk that classifiers trained using the standard dynamic malware analysis process will only recognise malware by its evasive behaviour rather than by a mix of behaviours. In this paper, we study the extent to which classifiers depend on evasive behaviour when identifying malware. We achieve this by training them on real ransomware and benignware and then testing their ability to detect carefully crafted simulated ransomware. The simulated ransomware gives us the freedom to create samples with different levels of evasive and malicious behaviour. The simulated samples, like the real samples, are run in a sandboxed environment where data is collected at both the user level and the kernel level. The results of our experiments indicated that, in general, the classifiers were more likely to label the simulated samples as malicious once the amount of evasive behaviour present in a sample went beyond a threshold. Generally, this threshold was crossed when the simulated ransomware waited 2 seconds or more between each file it encrypted. Additionally, the classifiers trained on the user-level data were not as robust against small changes in the system calls made, whereas the results of classifiers trained on system calls gathered at a kernel, system-wide level were less variable. Finally, in attempting to simulate malware for our experiments, we discovered that the field of malware simulation is relatively unstudied despite its potential, and we therefore provide recommendations for simulating malware for system-call analysis.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Publisher: Elsevier
ISSN: 2214-2126
Funders: Engineering and Physical Sciences Research Council
Date of First Compliant Deposit: 26 May 2022
Date of Acceptance: 29 April 2022
Last Modified: 24 May 2023 16:26
URI: https://orca.cardiff.ac.uk/id/eprint/149485
