Optimizing the computation of the Fourier Spectrum from decision trees

aut.embargoNoen_NZ
aut.thirdpc.containsNoen_NZ
aut.thirdpc.permissionNoen_NZ
aut.thirdpc.removedNoen_NZ
dc.contributor.advisorPears, Russel
dc.contributor.authorNugraha, Johan Jonathan
dc.date.accessioned2015-07-09T03:38:55Z
dc.date.available2015-07-09T03:38:55Z
dc.date.copyright2015
dc.date.created2015
dc.date.issued2015
dc.date.updated2015-07-08T22:11:54Z
dc.description.abstractIn the field of mining data stream, processing time is one of the most important factor because data instances continuously arrive at high-speed. Thus, it is crucial to process each instance in timely manner. This concern is further emphasized when Discrete Fourier Transform (DFT) is applied to Decision Tree systems. DFT is highly beneficial in data stream mining as it allows us to represent a Decision Tree with less memory usage while preserving the accuracy of the prediction model. However, DFT is notorious for being computationally expensive to perform. An existing solution to mitigate this problem is by computing coefficients for smaller order. This approach, however, is highly inefficient for deep trees as the underlying problem space grows exponentially. We seek to improve the efficiency of computing Fourier coefficients of Decision Trees in this study. We propose a divide and conquer solution by decomposing a Decision Tree into small sub-trees. Local Fourier coefficient table are then built for each sub-tree. The final Fourier coefficient table is then compiled by tabulating all the local Fourier coefficient tables. Local coefficient tables are build incrementally as it allows us to build a coefficient table from an existing coefficient table. This is particularly useful for two purposes. First is to track and synchronize Decision tree and its Fourier spectrum on the fly. Second is to exploit the possibility for reoccurring concepts. The latter purpose will be explored further in future works. In our experiments, we compare the runtime of our solution to that of the normal DFT process with various order cutoff on both real and artificial datasets. The results show that our solution generally performed better than normal DFT process with various degrees of success from moderate to significant improvements. We also measured the runtime performances of our solution in eager synchronization setup for comparison.en_NZ
dc.identifier.urihttps://hdl.handle.net/10292/8931
dc.language.isoenen_NZ
dc.publisherAuckland University of Technology
dc.rights.accessrightsOpenAccess
dc.subjectRecurrent conceptsen_NZ
dc.subjectData stream miningen_NZ
dc.subjectDiscrete Fourier Transformen_NZ
dc.subjectDivide and conqueren_NZ
dc.subjectRecurrent concept miningen_NZ
dc.titleOptimizing the computation of the Fourier Spectrum from decision treesen_NZ
dc.typeThesis
thesis.degree.discipline
thesis.degree.grantorAuckland University of Technology
thesis.degree.grantorAuckland University of Technology
thesis.degree.levelMasters Theses
thesis.degree.nameMaster of Computer and Information Sciencesen_NZ
Files
Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
NugrahaJ.pdf
Size:
1.42 MB
Format:
Adobe Portable Document Format
Description:
Whole thesis
License bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
897 B
Format:
Item-specific license agreed upon to submission
Description:
Collections