An event-based near real-time data integration architecture

Date
2008
Authors
Naeem, M
Dobbie, G
Weber, G
Item type
Conference Contribution
Publisher
IEEE Computer Society
Abstract

Extract-Transform-Load (ETL) tools feed data from operational databases into data warehouses. Traditionally, these tools run as offline batch jobs at regular intervals, for example nightly or weekly. Users naturally prefer up-to-date data for their decisions, so there is a demand for real-time ETL tools. In this paper we investigate an event-based, near real-time ETL layer for transferring and transforming data from the operational database to the data warehouse. One of our main concerns is master data management in the ETL layer. We present the architecture of a novel, general-purpose, event-driven, near real-time ETL layer that uses a Database Queue (DBQ), works on a push principle, and directly supports content enrichment. We also observe that the system architecture is consistent with the information architecture of a classical Online Transaction Processing (OLTP) application, which lets us distinguish between different kinds of data and so makes the design clearer.

Keywords: event-based architecture, content enrichment, master data, extract-transform-load, enterprise service bus.
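
The push-based flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `DatabaseQueue` class only stands in for the DBQ, and the names `ChangeEvent`, `make_etl_handler`, `MASTER_CUSTOMERS`, and the `orders` table are hypothetical, chosen for illustration. A change in the operational database is published as an event, pushed to the ETL handler, enriched with master data, and appended to a list that plays the role of the data warehouse.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class ChangeEvent:
    """One row-level change captured in the operational database (hypothetical shape)."""
    table: str
    operation: str  # "INSERT", "UPDATE" or "DELETE"
    row: Dict[str, object]


class DatabaseQueue:
    """In-memory stand-in for the DBQ: every published event is pushed to subscribers."""

    def __init__(self) -> None:
        self._pending: Deque[ChangeEvent] = deque()
        self._subscribers: List[Callable[[ChangeEvent], None]] = []

    def subscribe(self, handler: Callable[[ChangeEvent], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: ChangeEvent) -> None:
        # Push principle: the queue calls the handlers; nobody polls on a schedule.
        self._pending.append(event)
        while self._pending:
            next_event = self._pending.popleft()
            for handler in self._subscribers:
                handler(next_event)


# Assumed master data: customer attributes keyed by customer id.
MASTER_CUSTOMERS: Dict[int, Dict[str, object]] = {
    42: {"customer_name": "Acme Ltd", "region": "EMEA"},
}


def make_etl_handler(
    master_data: Dict[int, Dict[str, object]],
    warehouse: List[Dict[str, object]],
) -> Callable[[ChangeEvent], None]:
    """Build a handler that enriches new order rows and loads them into the warehouse."""

    def handle(event: ChangeEvent) -> None:
        if event.table != "orders" or event.operation != "INSERT":
            return  # this sketch only handles newly inserted orders
        enriched = dict(event.row)
        # Content enrichment: merge in master-data attributes for the customer, if known.
        enriched.update(master_data.get(event.row["customer_id"], {}))
        warehouse.append(enriched)

    return handle


warehouse: List[Dict[str, object]] = []
dbq = DatabaseQueue()
dbq.subscribe(make_etl_handler(MASTER_CUSTOMERS, warehouse))
dbq.publish(ChangeEvent("orders", "INSERT", {"order_id": 1, "customer_id": 42, "amount": 99.5}))
print(warehouse)
```

In a real deployment the queue would presumably be persistent and fed by the operational database itself (for example through triggers); the sketch keeps everything in memory so that the event, enrichment, and load steps are visible in one place.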

Source
12th Enterprise Distributed Object Computing Conference Workshops, Germany, pages 401 - 404
Rights statement
Copyright © 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.