Author Identification in Free Texts

Wang, Yahui (Kay)

aut.embargo	No	en_NZ
aut.thirdpc.contains	No	en_NZ
dc.contributor.advisor	Nand, Parma
dc.contributor.author	Wang, Yahui (Kay)
dc.date.accessioned	2020-07-01T22:28:21Z
dc.date.available	2020-07-01T22:28:21Z
dc.date.copyright	2020
dc.date.issued	2020
dc.date.updated	2020-07-01T22:25:35Z
dc.description.abstract	Information Extraction is a popular topic in Natural Language Processing. This thesis focuses on author identification in free text. The study divides the author identification task into two subtasks: quotation extraction and speaker attribution. The entire system contains two parts, a rule-based model for quotation extraction and a machine learning model for speaker attribution. The source domain used in this thesis is literary narrative. There is also a generalisation test on the news domain. The experimental results show that the rule-based model achieves a 0.88 F-score on quotation extraction and that the best machine learning model reaches 85.7% accuracy. The overall test on the entire system returns 77.9% accuracy on the literary source domain and 73.6% on the news domain.	en_NZ
dc.identifier.uri	https://hdl.handle.net/10292/13480
dc.language.iso	en	en_NZ
dc.publisher	Auckland University of Technology
dc.rights.accessrights	OpenAccess
dc.subject	Natural Language Processing (NLP)	en_NZ
dc.subject	Author Identification	en_NZ
dc.subject	Quotation Extraction	en_NZ
dc.subject	Speaker Attribution	en_NZ
dc.subject	Conditional Random Field (CRF)	en_NZ
dc.subject	Support Vector Machine (SVM)	en_NZ
dc.title	Author Identification in Free Texts	en_NZ
dc.type	Thesis	en_NZ
thesis.degree.grantor	Auckland University of Technology
thesis.degree.level	Masters Theses
thesis.degree.name	Master of Computer and Information Sciences	en_NZ
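
The abstract describes a rule-based model for quotation extraction as the first stage of the system. The record does not give the thesis's actual rules, so the following is only a minimal sketch of the general idea, assuming a single hypothetical rule: any text enclosed in a pair of straight or curly double quotation marks counts as a quotation. The names `QUOTE_PATTERN` and `extract_quotations` are illustrative and do not come from the thesis.

```python
import re

# Assumed rule (not taken from the thesis): a quotation is any text
# enclosed in matching straight ("...") or curly (“...”) double quotes.
QUOTE_PATTERN = re.compile(r'"([^"]+)"|“([^”]+)”')


def extract_quotations(text):
    """Return a list of (start, end, quote_text) tuples for each quoted span.

    start and end are character offsets of the whole match, including
    the quotation marks; quote_text excludes them.
    """
    spans = []
    for match in QUOTE_PATTERN.finditer(text):
        # Exactly one of the two groups is filled, depending on quote style.
        quote_text = match.group(1) or match.group(2)
        spans.append((match.start(), match.end(), quote_text))
    return spans


if __name__ == "__main__":
    sample = '"I am late," said Alice. Bob replied, "So am I."'
    for start, end, quote in extract_quotations(sample):
        print(start, end, quote)
```

A second stage, which the abstract describes as a machine learning model, would then assign a speaker to each extracted span; that stage is not sketched here.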

Files

Original bundle

Name: WangK.pdf
Size: 453.3 KB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 897 B
Format: Item-specific license agreed upon to submission
Description:

Collections

Masters Theses