Identifying Polymorphic Malware Variants Using Biosequence Analysis Techniques

Naidu, Vijay Jeevanantham

Date

2018

Authors

Naidu, Vijay Jeevanantham

Supervisor

Narayanan, Ajit

Whalley, Jacqueline

Pears, Russel

Item type

Thesis

Degree name

Doctor of Philosophy

Publisher

Auckland University of Technology

Abstract

Modern antivirus systems (AVSs) cannot detect new polymorphic malware variants until they emerge, even when signatures of one or more variants belonging to the same polymorphic malware family are known. Polymorphic malware can transform itself into functionally identical variants. Polymorphism typically changes the order of the viral code, but not the code itself, to avoid signature-based detection. Current AVSs detect malware using signatures based on the most essential parts of a known virus, such as execution traces and instruction sequences. Virus writers exploit the weaknesses of malware signature databases by creating new variants with the same engine employed by an existing polymorphic malware family. This thesis presents virus detection and signature extraction techniques developed by exploring string matching techniques traditionally employed in biosequence analysis. The main contribution of these matching techniques is the extraction of syntactic patterns (i.e. conserved regions or sequences) from semantically rich polymorphic hex code. The extracted patterns act as signatures and are used to identify polymorphic malware variants belonging to the same family. Moreover, they can help identify new variants produced by making simple alterations to existing ones. The string matching approaches presented in this thesis may revolutionise our knowledge of polymorphic variant generation and give rise to a new era of string-based syntactic AVSs.
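
To make the idea of extracting a conserved region from hex code concrete, the following is a minimal sketch of Smith-Waterman local alignment (named in the keywords) applied to two sequences of hex byte tokens. It is an illustration only, not the thesis's implementation: the function name `smith_waterman`, the scoring values (match 2, mismatch -1, gap -1), the `--` gap placeholder and the sample byte sequences are all assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Locally align two token lists; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from the thesis.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("--")  # hypothetical gap marker
            i -= 1
        else:
            out_a.append("--")
            out_b.append(b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


# Two made-up byte sequences sharing a conserved run of four bytes.
variant_a = "90 8b ec 33 c0 41".split()
variant_b = "42 8b ec 33 c0 90".split()
score, region_a, region_b = smith_waterman(variant_a, variant_b)
print(score, " ".join(region_a))
```

In this toy case the aligned region is the byte run the two sequences share, which is the kind of conserved pattern the abstract describes using as a signature. A real pipeline would need to align many variants and choose scoring parameters suited to hex code.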

Keywords

Smith-Waterman algorithm, Dynamic programming, Polymorphic malware, Syntactic approach, Sequence alignment techniques, String matching algorithm, Biological sequences, Bioinformatics, Data mining, Automatic signature generation, Phylogenetics

Permanent link

https://hdl.handle.net/10292/12064

Collections

Doctoral Theses