
Data Mining Log File Streams for the Detection of Anomalies

Supervisor

Russel, Pears

Item type

Thesis

Degree name

Master of Computer and Information Sciences

Publisher

Auckland University of Technology

Abstract

Log files play an important part in the day-to-day running of many systems and services, giving administrators and other users insight into operational, performance and even security issues. At today's log volumes, however, examining them manually is impractical. Existing tools in this space largely work by detecting anomalies in log files that have already been stored, or by comparing log entries against known errors (signatures). Data mining log file streams instead allows administrators to significantly reduce the time required to detect anomalies, with no signatures or complex settings to maintain. This paper presents the experimental work undertaken to define a generic, practical and scalable method for anomaly detection in streaming log files, based on detecting changes in the mix of log events occurring. This was achieved by following a modified CRISP-DM (Cross Industry Standard Process for Data Mining) methodology, which enabled a broader, more flexible approach to the data mining process. The resulting solution employs common log file features together with a weighted earth mover's distance metric, yielding a framework that can be applied broadly to many log file types. Anomaly detection in streaming log files is then achieved by setting a simple percentile threshold that indicates an acceptable level of change.
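
The idea of flagging a change in the mix of log events can be illustrated with a minimal sketch. This is not the thesis's implementation: the ordered event categories in LEVELS, the per-step costs in GAP_WEIGHTS, the fixed-size windows and the nearest-rank percentile are all assumptions made for illustration. The sketch compares the event mix of consecutive windows with a weighted one-dimensional earth mover's distance and flags any window whose distance exceeds a percentile of the distances seen during a baseline period.

```python
import math
from collections import Counter

# Hypothetical ordered event categories (an assumption, not from the thesis).
LEVELS = ["DEBUG", "INFO", "WARN", "ERROR"]
# Hypothetical cost of moving probability mass between adjacent categories.
GAP_WEIGHTS = [1.0, 1.0, 2.0]


def event_mix(window):
    """Return the proportion of each category in a non-empty window of events."""
    counts = Counter(window)
    total = sum(counts[level] for level in LEVELS)
    return [counts[level] / total for level in LEVELS]


def weighted_emd(p, q, gap_weights=GAP_WEIGHTS):
    """Weighted 1-D earth mover's distance between two distributions.

    Mass left over at each category is carried to the next one, and each
    carry is charged the weight of the gap it crosses.
    """
    distance = 0.0
    carried = 0.0
    for i in range(len(p) - 1):
        carried += p[i] - q[i]
        distance += gap_weights[i] * abs(carried)
    return distance


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def detect(windows, baseline_count, pct=95):
    """Return indices of windows whose change from the previous window
    exceeds the pct-th percentile of the first baseline_count distances."""
    mixes = [event_mix(w) for w in windows]
    distances = [weighted_emd(a, b) for a, b in zip(mixes, mixes[1:])]
    threshold = percentile(distances[:baseline_count], pct)
    return [
        i + 1
        for i, d in enumerate(distances)
        if i >= baseline_count and d > threshold
    ]


# Usage: five routine windows followed by a burst of ERROR events.
calm_a = ["INFO"] * 9 + ["WARN"] * 1
calm_b = ["INFO"] * 8 + ["WARN"] * 2
burst = ["ERROR"] * 8 + ["INFO"] * 2
windows = [calm_a, calm_b, calm_a, calm_b, calm_a, burst]
flagged = detect(windows, baseline_count=4)
```

In this toy run only the final window is flagged, because its event mix shifts far more than any shift observed during the baseline. The single tunable setting is the percentile, which corresponds to the acceptable level of change described in the abstract.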
