Impact of different speech interfaces of personal devices on users' perception

aut.embargo: No [en_NZ]
aut.supplementaryupload: Yes
aut.thirdpc.contains: No [en_NZ]
aut.thirdpc.permission: No [en_NZ]
aut.thirdpc.removed: No [en_NZ]
dc.contributor.advisor: Symonds, Judith
dc.contributor.author: Wadea, Mazen
dc.date.accessioned: 2011-11-29T23:37:13Z
dc.date.available: 2011-11-29T23:37:13Z
dc.date.copyright: 2011
dc.date.created: 2011
dc.date.issued: 2011
dc.date.updated: 2011-11-29T23:16:08Z
dc.description.abstract: Because Text-to-Speech (TTS) lacks both the clarity and the prosody of normal human speech, it sounds unnatural and is unpleasant to listen to. It is generally accepted that natural speech should be used for static prompts and synthetic speech for dynamic content. However, most commercial applications on the market mix human speech and TTS within the same sentence and/or between sentences, an approach that leads to an inconsistent interface (Gong & Lai, 2001). An immediate issue in the design of such a speech interface is therefore what type of speech should be used. The goal of this project is to explore users’ perception of different types of speech in order to investigate the acceptability of personal speech interfaces. The study is aimed at public users of mobile applications. The project explored the redevelopment of the speech interface of the Goal Management Training (GMT) system, based on results from testing different speech samples with the VoiceTester mobile application. VoiceTester was developed for the iPhone in this study to facilitate the listening task, adding validity to participants’ responses by simulating the environment of speech interfaces on personal devices. The contribution of this study is to provide developers and health researchers with some knowledge about the impact of different types of speech interfaces on users’ perception. The findings are ultimately helpful to Traumatic Brain Injury (TBI) patients, as the recommended software will help them undertake activities with support that helps prevent them from making errors (McPherson, Kayes, & Weatherall, 2009). Six participants from different age groups were chosen in the form of three couples, each consisting of one man and one woman. The types of speech examined are computer-generated voice (CV), natural voice (NV), and familiar voice (FV).
The synthetic voices were generated by computer software, the natural speech samples were provided by two native speakers of New Zealand English, and the familiar voices for each couple were simply recordings of each other’s voices. Participants completed a post-test paper-and-pencil self-perception of task performance scale three times, after each listening test, and were then interviewed. The evaluative data were used to inform the participants and the researcher about the study and to guide the interview process. The methods were largely qualitative, using semi-structured interviews to explore the users’ perception of the manner of speaking and the speaker in the three examined speech samples, and to investigate the importance of the voice characteristics used. The interviews were analysed to discover themes and patterns related to an analysis framework derived from the literature review. The findings revealed differences among the three couples in their perceptions of the different types of speech. The effect of gender was slightly present, as the subjects showed a more positive attitude towards the opposite gender. Both human voices, NV and FV, were acceptable to the majority of participants, with many reporting improved mood and goal attainment. Participants found working with CV both challenging and rewarding. NV seemed particularly helpful in engaging people in the task process, while FV appeared particularly helpful in providing a structured framework for error prevention in attempting goal performance. [en_NZ]
dc.identifier.uri: https://hdl.handle.net/10292/2865
dc.language.iso: en [en_NZ]
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: Computer generated voice [en_NZ]
dc.subject: Perception of TTS interfaces [en_NZ]
dc.subject: Acceptability of Text-to-Speech [en_NZ]
dc.subject: Interpretive, exploratory, qualitative research approaches [en_NZ]
dc.subject: Methodological triangulation [en_NZ]
dc.title: Impact of different speech interfaces of personal devices on users' perception [en_NZ]
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Masters Theses
thesis.degree.name: Master of Computer and Information Sciences [en_NZ]
Files
Original bundle

Name: WadeaJ.pdf
Size: 2.38 MB
Format: Adobe Portable Document Format
Description: Whole thesis

Name: Male TTS.mp3
Size: 2.17 MB
Format: MPEG Audio

Name: Female TTS.mp3
Size: 2.08 MB
Format: MPEG Audio

License bundle

Name: license.txt
Size: 897 B
Format: Item-specific license agreed upon to submission
Collections