Impact of Different Speech Interfaces of Personal Devices on Users' Perception

Wadea, Mazen

aut.embargo	No	en_NZ
aut.supplementaryupload	Yes
aut.thirdpc.contains	No	en_NZ
aut.thirdpc.permission	No	en_NZ
aut.thirdpc.removed	No	en_NZ
dc.contributor.advisor	Symonds, Judith
dc.contributor.author	Wadea, Mazen
dc.date.accessioned	2011-11-29T23:37:13Z
dc.date.available	2011-11-29T23:37:13Z
dc.date.copyright	2011
dc.date.created	2011
dc.date.issued	2011
dc.date.updated	2011-11-29T23:16:08Z
dc.description.abstract	Because Text-to-Speech (TTS) lacks both the clarity and the prosody of normal human speech, it sounds unnatural and is unpleasant to listen to. It is generally accepted practice to use natural speech for static prompts and synthetic speech for dynamic content. However, most commercial applications on the market mix human speech and TTS within the same sentence and/or between sentences, and this mixing leads to an inconsistent interface (Gong & Lai, 2001). An immediate issue in the design of such a speech interface is therefore which type of speech should be used. The goal of this project is to explore users’ perceptions of different types of speech in order to investigate the acceptability of personal speech interfaces. The study is aimed at general users of mobile applications. The project explored redevelopment of the speech interface of the Goal Management Training (GMT) system, based on results from testing different speech samples with the VoiceTester mobile application. VoiceTester was developed for the iPhone in this study to facilitate the listening task; by simulating the environment of speech interfaces on personal devices, it adds validity to participants’ responses. The contribution of this study is to give developers and health researchers some knowledge of the impact of different types of speech interface on users’ perception. The findings are ultimately helpful to Traumatic Brain Injury (TBI) patients, as the recommended software will help them undertake activities with support that helps prevent them from making errors (McPherson, Kayes, & Weatherall, 2009). Six participants from different age groups were chosen in the form of three couples, each made up of both genders. The examined types of speech are computer-generated voice (CV), natural voice (NV), and familiar voice (FV).
The synthetic voices were generated by computer software, the natural speech samples were provided by two native speakers of New Zealand English, and the familiar voices for each couple were recordings of each other’s voices. Participants completed a paper-and-pencil self-perception of task performance scale three times, once after each listening test, followed by an interview. The evaluative data were used to inform the participants and the researcher about the study and to guide the interview process. The methods were largely qualitative, using semi-structured interviews to explore users’ perceptions of the manner of speaking and the speaker in the three examined speech samples, and to investigate the importance of the voice characteristics used. The interviews were analysed to discover themes and patterns related to an analysis framework derived from the literature review. The findings revealed differences between the three couples in their perceptions of the different types of speech. A slight gender effect was present: subjects showed a more positive attitude towards the opposite gender. Both human voices, NV and FV, were acceptable to the majority of participants, with many reporting improved mood and goal attainment. Participants found working with CV both challenging and rewarding. NV seemed particularly helpful in engaging people in the task process, while FV appeared particularly helpful in providing a structured framework for error prevention in attempting goal performance.	en_NZ
dc.identifier.uri	https://hdl.handle.net/10292/2865
dc.language.iso	en	en_NZ
dc.publisher	Auckland University of Technology
dc.rights.accessrights	OpenAccess
dc.subject	Computer generated voice	en_NZ
dc.subject	Perception of TTS interfaces	en_NZ
dc.subject	Acceptability of Text-to-Speech	en_NZ
dc.subject	Interpretive, exploratory, qualitative research approaches	en_NZ
dc.subject	Methodological triangulation	en_NZ
dc.title	Impact of Different Speech Interfaces of Personal Devices on Users' Perception	en_NZ
dc.type	Thesis
thesis.degree.grantor	Auckland University of Technology
thesis.degree.level	Masters Theses
thesis.degree.name	Master of Computer and Information Sciences	en_NZ

Files

Original bundle

Name: WadeaJ.pdf
Size: 2.38 MB
Format: Adobe Portable Document Format
Description: Whole thesis

Name: Male TTS.mp3
Size: 2.17 MB
Format: MPEG Audio
Description: Male TTS (audio)

Name: Female TTS.mp3
Size: 2.08 MB
Format: MPEG Audio
Description: Female TTS (audio)

License bundle

Name: license.txt
Size: 897 B
Format: Item-specific license agreed upon to submission
Description:

Collections

Masters Theses