HYBRIDJOIN for near-real-time Data Warehousing
aut.researcher | Naeem, Muhammad | |
dc.contributor.author | Naeem, MA | |
dc.contributor.author | Dobbie, G | |
dc.contributor.author | Weber, G | |
dc.contributor.editor | Taniar, D | |
dc.date.accessioned | 2012-04-26T00:42:42Z | |
dc.date.available | 2012-04-26T00:42:42Z | |
dc.date.copyright | 2011 | |
dc.date.issued | 2011 | |
dc.description.abstract | An important component of near-real-time data warehouses is the near-real-time integration layer. One important element in near-real-time data integration is the join of a continuous input data stream with a disk-based relation. For high-throughput streams, stream-based algorithms, such as Mesh Join (MESHJOIN), can be used. However, in MESHJOIN the performance of the algorithm is inversely proportional to the size of the disk-based relation. The Index Nested Loop Join (INLJ) can be set up so that it processes stream input, and can deal with intermittences in the update stream, but it has low throughput. This paper introduces a robust stream-based join algorithm called Hybrid Join (HYBRIDJOIN), which combines the two approaches. A theoretical result shows that HYBRIDJOIN is asymptotically as fast as the faster of the two algorithms. The authors present performance measurements of the implementation. In experiments using synthetic data based on a Zipfian distribution, HYBRIDJOIN performs significantly better for typical parameters of the Zipfian distribution, and in general performs in accordance with the theoretical model, while the other two algorithms are unacceptably slow under different settings. | |
dc.identifier.citation | International Journal of Data Warehousing and Mining, vol.7(4), pp.21 - 42 | |
dc.identifier.doi | 10.4018/jdwm.2011100102 | |
dc.identifier.issn | 1548-3924 | |
dc.identifier.uri | https://hdl.handle.net/10292/4051 | |
dc.publisher | IGI Publishers | |
dc.relation.isreplacedby | 10292/4176 | |
dc.relation.isreplacedby | http://hdl.handle.net/10292/4176 | |
dc.relation.uri | http://dx.doi.org/10.4018/jdwm.2011100102 | |
dc.rights | Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited. | |
dc.rights.accessrights | OpenAccess | |
dc.subject | Data transformation | |
dc.subject | Data Warehousing | |
dc.subject | Near-real-time | |
dc.subject | Performance and tuning | |
dc.title | HYBRIDJOIN for near-real-time Data Warehousing | |
dc.type | Journal Article | |
pubs.organisational-data | /AUT | |
pubs.organisational-data | /AUT/Design & Creative Technologies | |
pubs.organisational-data | /AUT/Design & Creative Technologies/School of Computing & Mathematical Science |
Files
Original bundle
- Name: HYBRIDJOIN-for-Near-Real-Time-Data-Warehousing.pdf
- Size: 440.89 KB
- Format: Adobe Portable Document Format