The Comparison Between the Tools for Named Entity Recognition

aut.embargo: No
aut.thirdpc.contains: No
dc.contributor.advisor: Nand, Parma
dc.contributor.author: Zhang, Wenjie
dc.date.accessioned: 2020-06-02T01:03:18Z
dc.date.available: 2020-06-02T01:03:18Z
dc.date.copyright: 2020
dc.date.issued: 2020
dc.date.updated: 2020-06-01T07:30:35Z
dc.description.abstract: Natural language processing (NLP) is now widely used in modern daily life, for tasks such as spam email detection, prediction of potential criminals from available information, and sentiment analysis. Traditionally, NLP tasks were solved by hand or by computers following strictly defined rules; these approaches are tiresome and often too slow when a huge amount of data or information must be processed, yet some companies and researchers still use them today. Thanks to the development of computer hardware, we now have relatively new, prevalent, and highly accurate alternatives, from machine learning (such as support vector machines, SVMs) to deep neural networks (DNNs) and pretrained models (such as ELMo and OpenAI GPT); in modern data companies and online shopping enterprises, the older approaches are therefore gradually falling out of use, as they are obsolete and no longer suitable for the data era. A brand-new approach to NLP tasks is BERT (Bidirectional Encoder Representations from Transformers), introduced by the Google development team in late 2018. BERT is a language model pretrained in an unsupervised way that serves multiple language processing tasks and multiple languages. It is powerful on realistic language processing tasks, having set new state-of-the-art results on eleven NLP benchmarks in 2018, and it is simple to use: it needs only fine-tuning and a single additional output layer, without task-specific architectures for tasks such as labelling, classification, or question answering (a brief illustrative sketch of such fine-tuning follows this record). This new technique is introduced in detail in the thesis. The method used to complete this thesis is based on a literature review and a comparison between BERT and a prevailing, elaborate DNN technique called BiLSTM-CRF-CNN (Bidirectional Long Short-Term Memory with a Conditional Random Field and Convolutional Neural Networks) on Named Entity Recognition (NER); the results of the comparison are presented. My opinions about BERT, together with suggestions and potential issues for future research, are also given.
dc.identifier.uri: https://hdl.handle.net/10292/13364
dc.language.iso: en
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.subject: BERT
dc.subject: NLP
dc.subject: BILSTM-CRF-CNN
dc.subject: NER
dc.title: The Comparison Between the Tools for Named Entity Recognition
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Masters Theses
thesis.degree.name: Master of Computer and Information Sciences
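
The abstract states that BERT needs only fine-tuning with a single additional output layer to handle a task such as NER. As a minimal illustrative sketch, and not the code used in the thesis itself, the following shows that setup with the Hugging Face transformers library; the "bert-base-cased" checkpoint, the toy label set, the example sentence, and the dummy gold labels are all assumptions made for demonstration.

```python
# Minimal sketch of BERT fine-tuning for NER via a token-classification head.
# Assumptions (not from the thesis): Hugging Face transformers is installed,
# "bert-base-cased" is the pretrained checkpoint, and a toy CoNLL-style
# label set is used.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
# The "one additional layer" from the abstract: a linear classifier over
# BERT's per-token representations, added automatically by this head class.
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(labels)
)

# One hypothetical training example; real fine-tuning loops over a dataset.
encoding = tokenizer("Wenjie Zhang studies at AUT", return_tensors="pt")
# Dummy gold labels, one per wordpiece token (including [CLS] and [SEP]).
gold = torch.zeros(encoding["input_ids"].shape, dtype=torch.long)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs = model(**encoding, labels=gold)  # cross-entropy loss over tokens
outputs.loss.backward()
optimizer.step()

# Inference: argmax over the per-token label logits.
predictions = outputs.logits.argmax(dim=-1)
print([labels[i] for i in predictions[0].tolist()])
```

The design point the abstract makes is visible here: unlike a task-specific architecture such as BiLSTM-CRF-CNN, nothing beyond the pretrained encoder and one linear classification layer is constructed by hand.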
Files

Original bundle
Name: ZhangWenjie.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format
Description: Thesis
License bundle
Name: license.txt
Size: 897 B
Description: Item-specific license agreed upon to submission