Near closed frequent itemsets to accelerate the generation of association rules in a data stream environment

Date
2010
Authors
Viriyarattanaporn, Nathaphong
Supervisor
Pears, Russel
Item type
Thesis
Degree name
Master of Computer and Information Sciences
Publisher
Auckland University of Technology
Abstract

The subject of this research is data stream mining, one of the most challenging and widely researched areas in Knowledge Discovery and Data Mining (KDD). A data stream is a continuous, voluminous, and unpredictable flow of data that occurs in many application domains. In a previous study, a Data Stream Mining (DSM) algorithm was proposed to address the problems that these characteristics pose for association rule mining. It was built using various techniques such as closed frequent itemsets, tree data structures, itemset pruning, and statistical sampling. In this study, we look into the characteristics of closed frequent itemsets and propose a novel concept called Near Closed Nodes (NCN). This concept was thoroughly explored and later developed in conjunction with the existing DSM algorithm. The resulting Near Closed Nodes algorithms can be applied to algorithms for mining association rules that utilise a closed itemset structure. By incorporating NCN into the DSM algorithm, we were able to improve both its speed and its memory usage. A comprehensive experimental study was performed to compare the performance of DSM and DSM-NCN using both simulated and real-world datasets. Based on the results of this study, we concluded that DSM-NCN outperformed DSM in most circumstances, especially when the datasets were dense.
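As background for the closed frequent itemsets mentioned in the abstract, the following is a minimal sketch of the standard definition only: an itemset is closed when no proper superset has the same support. It is a brute-force illustration and is not the DSM or NCN algorithm from the thesis, which this record does not describe in detail. The function names, the toy transactions, and the support threshold are all assumptions made for the example.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Enumerate all itemsets whose support is at least `min_support`.

    Brute force over every combination of items; suitable only for toy data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support."""
    closed = {}
    for itemset, count in frequent.items():
        has_equal_superset = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not has_equal_superset:
            closed[itemset] = count
    return closed


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ad"),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

for itemset, count in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

On this toy data, seven itemsets are frequent but only three are closed, which is why closed itemsets are attractive as a compact structure for generating association rules.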

Keywords
Association rules, Data stream mining, Near closed node, Computer science