Single-channel Speech Enhancement Using Statistical Modelling

aut.embargo: No (en_NZ)
aut.thirdpc.contains: No (en_NZ)
aut.thirdpc.permission: No (en_NZ)
aut.thirdpc.removed: No (en_NZ)
dc.contributor.advisor: Moir, Tom
dc.contributor.advisor: Collins, John
dc.contributor.author: Chehrehsa, Sarang
dc.date.accessioned: 2017-02-23T22:17:59Z
dc.date.available: 2017-02-23T22:17:59Z
dc.date.copyright: 2016
dc.date.created: 2017
dc.date.issued: 2016
dc.date.updated: 2017-02-23T21:10:36Z
dc.description.abstract: A new speech enhancement method is introduced, based on Maximum A-Posteriori (MAP) estimation over Gaussian Mixture Models (GMMs) of speech and of different noise types. The GMMs model the distribution of speech and noise periodograms in a high-dimensional space and hence decrease the complexity of the estimation procedure. From the GMMs, the Probability Density Functions (PDFs) of clean speech and noise can be calculated, and applying MAP to these PDFs yields estimates of the speech and noise periodograms that together form the periodogram of the observed noisy speech frame. These estimates are then used in a Wiener filter to enhance the noisy speech and recover a speech signal as close as possible to the original. Because the PDFs are complicated, and realizing a MAP criterion on them can be more complicated still, some approximations are used to find the MAP criterion. Improvements to this MAP estimation, based on the characteristics of periodograms, are also introduced; they refine the approximations and lead to more accurate estimates of the speech and noise periodograms. Since the accuracy of the introduced MAP estimate depends strongly on the accuracy of speech and noise power estimation in the noisy frame, a new power estimation method using Gamma modelling is introduced to replace older methods such as Minimum Statistics. The results of all the estimation methods are used in a classic Wiener filter, which is applied to the noisy frame to enhance it. Since all the estimation algorithms can have some errors, we introduce an improved Wiener filter that attenuates the effect of these errors on the enhanced speech signal. The performance of all the introduced methods is analyzed and reported in terms of quality and intelligibility. (en_NZ)
dc.identifier.uri: https://hdl.handle.net/10292/10339
dc.language.iso: en (en_NZ)
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: Speech enhancement (en_NZ)
dc.subject: Gaussian Mixture Modelling (en_NZ)
dc.subject: Wiener filter (en_NZ)
dc.subject: Maximum A-Posteriori estimation (en_NZ)
dc.title: Single-channel Speech Enhancement Using Statistical Modelling (en_NZ)
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Doctoral Theses
thesis.degree.name: Doctor of Philosophy (en_NZ)
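
The abstract in this record describes feeding estimated speech and noise periodograms into a classic Wiener filter. The following is a minimal sketch of that final step only, assuming the per-bin estimates are already available (for example from the MAP stage). It is not taken from the thesis: the function names and the optional `floor` parameter are hypothetical, and the thesis's improved Wiener filter is not reproduced here.

```python
def wiener_gain(speech_psd, noise_psd, floor=0.0):
    """Classic per-bin Wiener gain G = S / (S + N).

    speech_psd, noise_psd: estimated periodograms (non-negative floats),
    one value per frequency bin. `floor` is a hypothetical lower bound
    on the gain, added here for illustration only.
    """
    gains = []
    for s, n in zip(speech_psd, noise_psd):
        total = s + n
        # Guard against empty bins where both estimates are zero.
        g = s / total if total > 0.0 else 0.0
        gains.append(max(g, floor))
    return gains


def enhance_frame(noisy_spectrum, speech_psd, noise_psd, floor=0.0):
    """Scale each complex bin of the noisy frame by its Wiener gain."""
    gains = wiener_gain(speech_psd, noise_psd, floor)
    return [g * x for g, x in zip(gains, noisy_spectrum)]
```

A bin where the speech estimate dominates is passed almost unchanged, while a bin estimated to be mostly noise is attenuated towards zero. This is also why errors in the periodogram estimates translate directly into errors in the enhanced signal.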
Files
Original bundle
Name: ChehrehsaS.pdf
Size: 6.37 MB
Format: Adobe Portable Document Format
Description: Whole thesis
License bundle
Name: license.txt
Size: 889 B
Format: Item-specific license agreed upon to submission
Description: