The Comparison Between the Tools for Named Entity Recognition

aut.embargo: No
aut.thirdpc.contains: No
dc.contributor.advisor: Nand, Parma
dc.contributor.author: Zhang, Wenjie
dc.date.accessioned: 2020-06-02T01:03:18Z
dc.date.available: 2020-06-02T01:03:18Z
dc.date.copyright: 2020
dc.date.issued: 2020
dc.date.updated: 2020-06-01T07:30:35Z
dc.description.abstract: Natural language processing (NLP) is now widely used in modern daily life, for tasks such as spam email detection, prediction of potential criminals from available information, and sentiment analysis. Traditionally, NLP tasks were solved by hand or by computers following strictly defined rules; these approaches are tiresome and often too slow when a huge amount of data or information must be processed, yet some companies and researchers still use them today. Thanks to the development of computer hardware, we now have relatively new, prevalent, and highly accurate alternatives, from machine learning (such as support vector machines, SVMs) to deep neural networks (DNNs) and pretrained models (such as ELMo and OpenAI GPT); in modern data companies and online shopping enterprises, the older approaches are therefore gradually falling out of use, as they are obsolete and no longer suitable for the data era. A brand-new approach to NLP tasks is BERT (Bidirectional Encoder Representations from Transformers), introduced by the Google development team in late 2018. BERT is a language model pretrained in an unsupervised way that serves multiple language processing tasks and multiple languages. It is powerful on realistic language processing tasks, having set new state-of-the-art results on eleven NLP benchmarks in 2018, and it is simple to use: it needs only fine-tuning and a single additional output layer, without task-specific architectures for tasks such as labelling, classification, or question answering (a brief illustrative sketch of such fine-tuning follows this record). This new technique is introduced in detail in the thesis. The method used to complete this thesis is based on a literature review and a comparison between BERT and a prevailing, elaborate DNN technique called BiLSTM-CRF-CNN (Bidirectional Long Short-Term Memory with a Conditional Random Field and Convolutional Neural Networks) on Named Entity Recognition (NER); the results of the comparison are presented. My opinions about BERT, together with suggestions and potential issues for future research, are also given.
dc.identifier.uri: https://hdl.handle.net/10292/13364
dc.language.iso: en
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: BERT
dc.subject: NLP
dc.subject: BILSTM-CRF-CNN
dc.subject: NER
dc.title: The Comparison Between the Tools for Named Entity Recognition
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Masters Theses
thesis.degree.name: Master of Computer and Information Sciences
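
The abstract states that BERT needs only fine-tuning with a single additional output layer to handle a task such as NER. As a minimal illustrative sketch, and not the code used in the thesis itself, the following shows that setup with the Hugging Face transformers library; the "bert-base-cased" checkpoint, the toy label set, the example sentence, and the dummy gold labels are all assumptions made for demonstration.

```python
# Minimal sketch of BERT fine-tuning for NER via a token-classification head.
# Assumptions (not from the thesis): Hugging Face transformers is installed,
# "bert-base-cased" is the pretrained checkpoint, and a toy CoNLL-style
# label set is used.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
# The "one additional layer" from the abstract: a linear classifier over
# BERT's per-token representations, added automatically by this head class.
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(labels)
)

# One hypothetical training example; real fine-tuning loops over a dataset.
encoding = tokenizer("Wenjie Zhang studies at AUT", return_tensors="pt")
# Dummy gold labels, one per wordpiece token (including [CLS] and [SEP]).
gold = torch.zeros(encoding["input_ids"].shape, dtype=torch.long)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(**encoding, labels=gold)  # cross-entropy loss over tokens
outputs.loss.backward()
optimizer.step()

# Inference: argmax over the per-token label logits.
predictions = outputs.logits.argmax(dim=-1)
print([labels[i] for i in predictions[0].tolist()])
```

The design point the abstract makes is visible here: unlike a task-specific architecture such as BiLSTM-CRF-CNN, nothing beyond the pretrained encoder and one linear classification layer is constructed by hand.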
Files

Original bundle
Name: ZhangWenjie.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format
Description: Thesis
License bundle
Name: license.txt
Size: 897 B
Description: Item-specific license agreed upon to submission