Optimised X-HYBRIDJOIN for near-real-time data warehousing

Naeem, M; Dobbie, G; Weber, G

Files

Published-version.pdf

Size: 748 KB, File format: Adobe PDF

Date

2012-01-30

Authors

Naeem, M

Dobbie, G

Weber, G

Item type

Conference Contribution

Publisher

Australian Computer Society

Abstract

Stream-based join algorithms are needed in modern near-real-time data warehouses. A particular class of stream-based join algorithms, with MESHJOIN as a typical example, computes the join between a stream and a disk-based relation. We recently presented a new algorithm in that class, X-HYBRIDJOIN (Extended Hybrid Join). X-HYBRIDJOIN outperforms earlier algorithms by pinning frequently accessed data from the disk-based relation in main memory. Apart from holding it in main memory, X-HYBRIDJOIN treats this frequently accessed data no differently from other data in the disk-based relation. In this paper we investigate whether performance can be improved by treating the frequently accessed data differently. We present a new algorithm called Optimised X-HYBRIDJOIN, which consists of two phases. The stream-probing phase deals with the frequently accessed part of the disk-based relation; the disk-probing phase deals with the rest of the relation. In experiments we found that Optimised X-HYBRIDJOIN performs significantly better than X-HYBRIDJOIN. We derive a cost model for our algorithm, which allows us to tune the components of Optimised X-HYBRIDJOIN, and we validate the cost model against the experimental results.
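
The two phases named in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes a finite batch of stream tuples, a unique join key in the relation, and a relation already split into a pinned in-memory part and a list of disk pages. The names `two_phase_join`, `hot_part`, `disk_pages` and `key` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def two_phase_join(stream, hot_part, disk_pages, key="key"):
    """Toy two-phase join of a stream batch with a partitioned relation.

    stream:     iterable of dicts (stream tuples)
    hot_part:   dict mapping join key -> relation row; stands in for the
                frequently accessed part pinned in main memory (assumption)
    disk_pages: list of lists of relation rows; each inner list stands in
                for one page read from disk (assumption)
    Returns (matches, unmatched), where matches is a list of
    (stream_tuple, relation_row) pairs and unmatched maps a join key to the
    stream tuples that found no partner.
    """
    matches = []
    pending = defaultdict(list)  # stream tuples waiting for the disk part

    # Stream-probing phase: each stream tuple probes the in-memory part.
    for s in stream:
        row = hot_part.get(s[key])
        if row is not None:
            matches.append((s, row))
        else:
            pending[s[key]].append(s)

    # Disk-probing phase: each loaded relation row probes the pending tuples.
    for page in disk_pages:
        for row in page:
            for s in pending.pop(row[key], []):
                matches.append((s, row))

    return matches, dict(pending)


if __name__ == "__main__":
    hot = {1: {"key": 1, "name": "a"}}
    pages = [[{"key": 2, "name": "b"}], [{"key": 3, "name": "c"}]]
    batch = [{"key": 1, "v": 10}, {"key": 2, "v": 20}, {"key": 9, "v": 40}]
    joined, leftover = two_phase_join(batch, hot, pages)
    print(len(joined), sorted(leftover))
```

In this sketch, tuples that hit the pinned part are joined immediately and never wait for a disk read, which is the intuition behind treating frequently accessed data separately. The queueing, memory budgeting and cost-model tuning described in the paper are deliberately left out.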

Source

ADC '12: Proceedings of the Twenty-Third Australasian Database Conference, Volume 124, pages 21-30.

Publisher's version

http://dl.acm.org/citation.cfm?id=2483739.2483744

Rights statement

Copyright 2012, Australian Computer Society, Inc. This paper appeared at the 23rd Australasian Database Conference (ADC 2012), Melbourne, Australia, January-February 2012. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 124, Rui Zhang and Yanchun Zhang, Ed. Reproduction for academic, not-for-profit purposes permitted provided this text is included.

Permanent link

https://hdl.handle.net/10292/7521

Collections

School of Engineering, Computer and Mathematical Sciences - Te Kura Mātai Pūhanga, Rorohiko, Pāngarau
