Burnap, Pete (ORCID: https://orcid.org/0000-0003-0396-633X), French, Richard, Turner, Frederick and Jones, Kevin. 2018. Malware classification using self organising feature maps and machine activity data. Computers and Security 73, pp. 399-410. doi:10.1016/j.cose.2017.11.016
PDF: Published Version (1MB), available under License Creative Commons Attribution.
Abstract
In this article, we use machine activity metrics to automatically distinguish between malicious and trusted portable executable software samples. The motivation stems from the growth of cyber attacks using techniques that have been employed to surreptitiously deploy Advanced Persistent Threats (APTs). APTs are becoming more sophisticated and are able to obfuscate many of their identifiable features through encryption, custom code bases and in-memory execution. Our hypothesis is that we can achieve a high degree of accuracy in distinguishing malicious from trusted samples using Machine Learning with features derived from the inescapable footprint left behind on a computer system during execution. This footprint includes CPU, RAM and swap use, and network traffic counted in bytes and packets. These features are continuous and allow us to be more flexible with the classification of samples than discrete features such as API calls (which can also be obfuscated), which form the main feature of the extant literature. Using these continuous data, we develop a novel classification method based on Self Organizing Feature Maps. The method reduces overfitting during training by creating unsupervised clusters of similar ‘behaviour’ that are subsequently used as features for classification, rather than using the raw data. We compare our method to a set of machine classification methods that have been applied in previous research and demonstrate an increase of between 7.24% and 25.68% in classification accuracy on an unseen dataset.
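The idea of using self-organising map clusters as classifier features, rather than the raw activity metrics, can be illustrated with a minimal sketch. This is not the authors' implementation: the grid size, learning schedule, toy metric values, and the helper names `min_max_scale`, `train_som`, `best_matching_unit` and `som_features` are all assumptions made for illustration, and the one-hot encoding of the winning node is only one plausible way to turn clusters into features.

```python
import math
import random

def min_max_scale(rows):
    """Scale each column to [0, 1] so CPU %, MB and byte counts are comparable."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum(
        (w - v) ** 2 for w, v in zip(weights[node], x)))

def train_som(samples, grid_rows=3, grid_cols=3, epochs=50, lr0=0.5, seed=0):
    """Train a small SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(grid_rows) for c in range(grid_cols)}
    sigma0 = max(grid_rows, grid_cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for x in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights

def som_features(weights, x):
    """One-hot vector marking which SOM node (cluster) x falls into."""
    bmu = best_matching_unit(weights, x)
    return [1 if node == bmu else 0 for node in sorted(weights)]

# Invented toy data: [CPU %, RAM MB, swap MB, packets, bytes] per sample.
raw = [
    [5.0, 120.0, 0.0, 10.0, 4000.0],
    [7.0, 130.0, 0.0, 12.0, 5200.0],
    [85.0, 900.0, 64.0, 900.0, 750000.0],
    [90.0, 950.0, 80.0, 1100.0, 910000.0],
]
scaled = min_max_scale(raw)
weights = train_som(scaled)
features = [som_features(weights, x) for x in scaled]
```

The `features` vectors, paired with malicious or trusted labels, would then be passed to a supervised classifier in place of the raw metrics.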
| Item Type: | Article | 
|---|---|
| Date Type: | Publication | 
| Status: | Published | 
| Schools: | Schools > Computer Science & Informatics; Research Institutes & Centres > Data Innovation Research Institute (DIURI) | 
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science | 
| Additional Information: | This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). | 
| Publisher: | Elsevier | 
| ISSN: | 0167-4048 | 
| Funders: | Engineering and Physical Sciences Research Council | 
| Date of First Compliant Deposit: | 13 December 2017 | 
| Date of Acceptance: | 24 November 2017 | 
| Last Modified: | 05 May 2023 09:18 | 
| URI: | https://orca.cardiff.ac.uk/id/eprint/107377 | 
Citation Data
Cited 78 times in Scopus.