A Reference Architecture for Large Scale Distributed Domain-Driven Big Data Systems
Abstract
Today, people are ceaseless generators of structured, semi-structured, and unstructured data that, if gleaned and processed, can reveal game-changing patterns. Additionally, advancements in technology have made it easier and faster to collect and analyse this data. This has led to the age of big data, which began when the volume, variety, and velocity of data overwhelmed traditional systems.
Many businesses have attempted to harness the power of big data; nevertheless, the success rate is low. According to multiple surveys, only 20% of big data projects are successful. This is due to the challenges of adopting big data, such as organisational culture, rapid technological change, system complexity, and data architecture. This thesis aims to address the data architecture challenges of adopting big data by introducing a domain-driven, decentralised big data reference architecture.
This reference architecture is designed specifically to mitigate big data challenges by providing a scalable data architecture for big data systems, flexible and rapid data processing for varied velocity, adaptable management for a wide variety of data formats, a maintainable approach to data discovery and aggregation, and increased attention to cross-cutting concerns such as metadata, privacy, and security. This research uses design science research as the underlying research framework and follows empirically grounded reference architecture guidelines for the development of the artefact. The evaluation of the artefact involves two distinct methods, a case-mechanism experiment and expert opinion, ensuring a comprehensive assessment of the big data reference architecture.
The reference architecture’s usefulness and effectiveness are supported by this evaluation, which shows that it can handle the volume, velocity, and variety of big data by processing data quickly, scaling, and adapting to different data formats. Additionally, the reference architecture’s design mitigates the complexity of monolithic data pipelines, decentralises data ownership to avoid bottlenecks, and fosters a more integrated, agile approach to big data systems. This study positions itself as a progressive step in big data reference architectures, directly targeting the existing shortcomings of big data architectures and offering solutions to them. It is aimed primarily at data architects and researchers seeking innovative approaches to big data system design and development, as well as practitioners looking to understand and apply the latest advancements in big data architectures.