HI-Tree: Mining High Influence Patterns Using External and Internal Utility Values

Koh, YS; Pears, RL

doi:10.1007/978-3-319-22729-0_4

Files

dawak paper id 9051.pdf

Size: 270.34 KB, File format: Adobe PDF

Date

2015-09-01

Authors

Koh, YS

Pears, RL

Item type

Conference Contribution

Publisher

Springer

Abstract

We propose an efficient algorithm, called HI-Tree, for mining high influence patterns from an incremental dataset. In traditional pattern mining, one finds the complete set of patterns and then applies a post-pruning step to it. The complete mining result is typically prohibitively large, even though only a small percentage of high utility patterns are interesting. It is therefore inefficient to wait for the mining algorithm to complete and then apply feature selection to post-process the large number of resulting patterns. Instead of generating the complete set of frequent patterns, we directly mine patterns with high utility values in an incremental manner. In this paper we propose a novel utility measure, called the influence factor, based on the concepts of external utility and internal utility of an item. The influence factor of an item takes into consideration its connectivity with its neighborhood as well as its importance within a transaction. The measure is especially useful in problem domains that exhibit network or interaction characteristics among items, such as social networks or web click-stream data. We compared our technique against state-of-the-art incremental mining techniques and show that it achieves better rule generation and runtime performance.

Keywords

Prefix-tree; High influence patterns; FP-growth

Source

In: Madria S., Hara T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2015. Lecture Notes in Computer Science, vol 9263. Springer, Cham

DOI

10.1007/978-3-319-22729-0_4

Rights statement

An author may self-archive an author-created version of his/her article on his/her own website and/or in his/her institutional repository. He/she may also deposit this version on his/her funder’s or funder’s designated repository at the funder’s request or as a result of a legal obligation, provided it is not made publicly available until 12 months after official publication. He/she may not use the publisher's PDF version, which is posted on www.springerlink.com, for the purpose of self-archiving or deposit. Furthermore, the author may only post his/her version provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer's website. The link must be accompanied by the following text: "The final publication is available at www.springerlink.com". (Please also see Publisher’s Version and Citation).

Permanent link

https://hdl.handle.net/10292/9370

Collections

School of Engineering, Computer and Mathematical Sciences - Te Kura Mātai Pūhanga, Rorohiko, Pāngarau
