Audio segmentation, classification and visualization

aut.embargo: No (en)
aut.thirdpc.contains: No
aut.thirdpc.permission: No (en)
aut.thirdpc.removed: No (en)
dc.contributor.advisor: Whalley, Jacqueline
dc.contributor.advisor: Brooks, Stephen
dc.contributor.advisor: Macdonell, Stephen
dc.contributor.author: Zhang, Xin
dc.date.accessioned: 2009-12-09T02:38:22Z
dc.date.available: 2009-12-09T02:38:22Z
dc.date.copyright: 2009
dc.date.issued: 2009
dc.description.abstract: This thesis presents a new approach to the visualization of audio files that simultaneously illustrates general audio properties and the component sounds that comprise a given input file. New audio segmentation and classification methods are reported that outperform existing methods. In order to visualize audio files, the audio is segmented (separated into component sounds) and then classified in order to select matching archetypal images or video that represent each audio segment and are used as templates for the visualization. Each segment's template image or video is then subjected to image processing filters that are driven by audio features. One visualization method reported represents heterogeneous audio files as a seamless image mosaic along a time axis where each component image in the mosaic maps directly to a discovered component sound. The second visualization method, video texture mosaics, builds on the ideas developed in time mosaics. A novel adaptive video texture generation method was created by using acoustic similarity detection to produce a resultant video texture that more accurately represents an audio file. Compared with existing visualization methods such as oscilloscopes and spectrograms, both approaches yield more accessible illustrations of audio files and are more suitable for casual and non-expert users.
dc.identifier.uri: https://hdl.handle.net/10292/802
dc.language.iso: en (en)
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: Audio
dc.subject: Segmentation
dc.subject: Classification
dc.subject: Visualization
dc.subject: Time mosaics
dc.subject: Video textures
dc.title: Audio segmentation, classification and visualization
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Doctoral Theses
thesis.degree.name: Doctor of Philosophy
Files
Original bundle
Name: ZhangX.pdf
Size: 9.54 MB
Format: Adobe Portable Document Format
Description: Whole thesis
License bundle
Name: license.txt
Size: 969 B
Format: Item-specific license agreed upon to submission