Transforming Open Data to Linked Open Data: An Ontology Framework Based on New Zealand Case Study

Paramjeet Kaur
Nand, Parma
Nguyen, Minh
Degree name
Doctor of Philosophy
Auckland University of Technology

Open government initiatives are gaining momentum and are becoming a critical part of the democratic fabric of both developing and developed nations. The primary motivation behind these initiatives is to provide transparency and to encourage wider use of data, not only by government-employed professionals but also by data scientists and the general public. As part of open governance, governments around the globe are increasingly releasing governance-related data that was either classified or undisclosed before the open data initiative. However, consumption of this data has been low, largely because the data is disparate and published in heterogeneous formats. This has given rise to a need for frameworks that can transform the data into an easily consumable form and make it accessible to stakeholders as well as the general public. This thesis proposes the design, implementation, and usage of an ontology-driven approach that captures the knowledge in a subset of the data released by the New Zealand government as part of its open data initiative. The proposed framework transforms open data into linked open data based on a novel ontology developed as part of this research.

While open data is commonly available in huge quantities, it lacks quality, accuracy, consistency, and completeness. It can be challenging to find the information in this data that is needed for analysis towards a given objective. There are many rich open data repositories globally. However, they are challenging to understand and use, because the data can only be accessed through a complex set of key-phrase search options. Even then, a search may retrieve data that is irrelevant, incomplete, or both. To mitigate this, ontology-based search, which uses semantics rather than keywords, has been shown to be more effective at improving the quality of queries for content in repositories.

This thesis presents a novel framework for semantically linking and archiving disparate open datasets. The framework and its end-to-end process are demonstrated using open datasets from the agriculture, land, and rainfall sectors in New Zealand. The framework is used to generate ontologies, which are then populated with the data and stored in a knowledge base. We then demonstrate how the knowledge base can be used to extract rich, valuable information pertaining to an objective. We also demonstrate how ontologies can be linked both manually and semi-automatically. Manual linking requires domain experts, whereas semi-automatic linking reduces the dependency on domain experts to link the concepts by hand. The results of this approach are promising in terms of enhancing both the quality of the data and the efficiency of search.
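
As a rough illustration of the pipeline described above (populate an ontology from tabular data, store the result in a knowledge base, link concepts across datasets, then query), the following minimal Python sketch may help. It is not the implementation used in the thesis: the class names, field names, the `linkedTo` predicate, the shared-value linking rule, and all data values are assumptions invented for this example, and a plain set of triples stands in for a real knowledge base.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def populate(rows, class_name, key_field):
    """Turn tabular rows into triples: one typed individual per row."""
    triples = set()
    for row in rows:
        subject = f"{class_name}/{row[key_field]}"
        triples.add(Triple(subject, "rdf:type", class_name))
        for field, value in row.items():
            triples.add(Triple(subject, field, str(value)))
    return triples


def link_on_shared_value(triples, predicate, link_name="linkedTo"):
    """Hypothetical semi-automatic linking rule: connect individuals of
    different classes that share the same value for `predicate`."""
    by_value = {}
    for t in triples:
        if t.predicate == predicate:
            by_value.setdefault(t.obj, []).append(t.subject)
    links = set()
    for subjects in by_value.values():
        for a in subjects:
            for b in subjects:
                # The class name is the part of the subject before "/".
                if a.split("/")[0] != b.split("/")[0]:
                    links.add(Triple(a, link_name, b))
    return links


def query(triples, subject=None, predicate=None, obj=None):
    """Return triples matching every field that is not None, sorted."""
    return sorted(
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


# Invented sample rows; these are not values from the New Zealand datasets.
rainfall = [{"id": "r1", "region": "Waikato", "annual_mm": 1200}]
land = [
    {"id": "l1", "region": "Waikato", "use": "dairy"},
    {"id": "l2", "region": "Otago", "use": "sheep"},
]

knowledge_base = populate(rainfall, "Rainfall", "id") | populate(land, "LandUse", "id")
knowledge_base |= link_on_shared_value(knowledge_base, "region")

for triple in query(knowledge_base, predicate="linkedTo"):
    print(triple.subject, triple.predicate, triple.obj)
```

In this sketch the rainfall record and the land-use record for the same region become linked, while the record for the other region stays unlinked. A manual approach would instead have a domain expert assert each such link directly.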

An expert evaluation was conducted to assess the efficiency and effectiveness of the ontology framework. Seven domain experts were given access to the knowledge base and carried out an end-to-end evaluation of the proposed framework. The evaluators were asked to answer questions on five criteria: usability, reliability, correctness, usefulness, and effectiveness. A thematic review of the collated expert feedback was then conducted using NVivo. The results demonstrate that the proposed scheme can effectively link open data by generating ontologies for disparate open datasets, which can then be used to supply useful information derived from a conflation of the original datasets.
