Materialization strategies for web based search computing applications

aut.embargo: No (en_NZ)
aut.thirdpc.contains: No (en_NZ)
aut.thirdpc.permission: No (en_NZ)
aut.thirdpc.removed: No (en_NZ)
dc.contributor.advisor: Pears, Russel
dc.contributor.author: Zagorac, Srdan
dc.date.accessioned: 2015-11-25T23:02:28Z
dc.date.available: 2015-11-25T23:02:28Z
dc.date.copyright: 2014
dc.date.created: 2015
dc.date.issued: 2014
dc.date.updated: 2015-11-25T08:49:59Z
dc.description.abstract: In this thesis we characterize view materialization in the context of multi-domain heterogeneous search applications. Web data view materialization is presented as a solution to the technical constraints and problems that arise from the unknown structure of web data sources. The web data materialization model extends the search computing (SeCo) multi-layered model, in which search services are registered in a service repository that describes the functional information (e.g. invocation end-point, input and output attributes) of data end-points. Our first research goal is to find a sequence of access patterns which, when executed, produces a materialization output. For this goal we make the following novel contributions: 1) Formulation of the building blocks for the materialization feasibility analysis; 2) Definition of the materialization feasibility analysis method and the accompanying algorithms; 3) A detailed empirical study conducted on a set of materialization tasks that vary in schema dependency complexity. Our second research goal is to optimize the materialization process so that the best sequence, in terms of materialization output efficiency and quality, is executed at all times. For this goal we make the following novel contributions: 1) Formulation of a set of performance dimensions and their metrics for web source materialization; 2) A cost model that uses optimization metrics to qualitatively differentiate between web services in terms of materialization time; 3) A query optimization procedure that explores the characteristics of the underlying source data domain in order to prioritize the execution of the queries that are most productive in terms of data harvesting power; 4) Materialization process optimization strategies based on the web source performance dimension metrics and the query optimization procedure; 5) A detailed empirical study conducted on several relevant web-based data sources that shows the effectiveness of the proposed solution. (en_NZ)
dc.identifier.uri: https://hdl.handle.net/10292/9274
dc.language.iso: en (en_NZ)
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: Multi-domain search (en_NZ)
dc.subject: Materialization feasibility analysis (en_NZ)
dc.subject: Materialization optimization (en_NZ)
dc.subject: Data surfacing (en_NZ)
dc.subject: Deep web mining (en_NZ)
dc.subject: Web data materialization (en_NZ)
dc.subject: Web data services (en_NZ)
dc.title: Materialization strategies for web based search computing applications (en_NZ)
dc.type: Thesis
thesis.degree.discipline:
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Doctoral Theses
thesis.degree.name: Doctor of Philosophy (en_NZ)
Files
Original bundle
Name: ZagoracS.pdf
Size: 5.38 MB
Format: Adobe Portable Document Format
Description: Whole thesis
License bundle
Name: license.txt
Size: 889 B
Format: Item-specific license agreed upon to submission
Description: