A Novel Transformer Pre-training Objective and a Novel Fine-Tuning Method for Abstractive Summarization

aut.embargo: No
aut.thirdpc.contains: No
dc.contributor.advisor: Nand, Parma
dc.contributor.author: Zhang, Cangge
dc.date.accessioned: 2022-05-19T22:15:42Z
dc.date.available: 2022-05-19T22:15:42Z
dc.date.copyright: 2022
dc.date.issued: 2022
dc.date.updated: 2022-05-19T21:45:35Z
dc.description.abstract: Pre-trained Transformers have been widely used in many NLP tasks, including document summarization. Researchers have designed many different self-supervised objectives for pre-training Transformer models, which are then fine-tuned as sequence-to-sequence models on downstream tasks. However, most of these self-supervised objectives are designed for NLP tasks in general; how well a self-supervised objective can serve a specific task such as abstractive document summarization has not been explored in depth. This thesis proposes a novel self-supervised objective, MSLM (Mask Summary Language Model), for document summarization. MSLM pre-trains on a labelled document-summary corpus in which some words have been removed or masked from the summary: the input is the source text concatenated with the masked summary, the target is the summary containing the original masked words, and the objective is to predict those masked words. We first pre-trained three MSLM variants that mask nouns, verbs, and all remaining words from the summary, respectively, and found that masking nouns yielded the best ROUGE scores on the downstream abstractive summarization task. Then, inspired by BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019), we pre-trained on the combination of MLM (the Masked Language Model objective first proposed in BERT) and our best MSLM variant, and found that fine-tuning the model pre-trained on the combined MLM and MSLM objectives achieved higher ROUGE scores than the model pre-trained on MLM alone.
dc.identifier.uri: https://hdl.handle.net/10292/15143
dc.language.iso: en
dc.publisher: Auckland University of Technology
dc.rights.accessrights: OpenAccess
dc.title: A Novel Transformer Pre-training Objective and a Novel Fine-Tuning Method for Abstractive Summarization
dc.type: Thesis
thesis.degree.grantor: Auckland University of Technology
thesis.degree.level: Masters Theses
thesis.degree.name: Master of Computer and Information Sciences
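
The abstract describes how each MSLM pre-training example is built: in the best-performing variant, nouns are masked out of the reference summary, the source document concatenated with the masked summary forms the model input, and the original summary words are the prediction target. The following is a minimal, hypothetical sketch of that data construction, assuming spaCy for part-of-speech tagging and placeholder mask/separator tokens; it illustrates the idea rather than reproducing the thesis's actual implementation.

```python
# Minimal sketch of building one MSLM pre-training example.
# Assumptions (not from the thesis): spaCy provides POS tagging, "<mask>" is
# the mask token, and " </s> " separates the source text from the masked
# summary; the thesis's actual tokenizer, vocabulary, and separators may differ.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed English POS tagger

MASK = "<mask>"
SEP = " </s> "


def build_mslm_example(source_text, summary, pos_to_mask=("NOUN", "PROPN")):
    """Mask summary words whose POS tag is in pos_to_mask (here: nouns), then
    return (encoder input = source text + masked summary,
            decoder target = original summary)."""
    masked_tokens = [
        MASK if tok.pos_ in pos_to_mask else tok.text for tok in nlp(summary)
    ]
    masked_summary = " ".join(masked_tokens)
    model_input = source_text + SEP + masked_summary
    return model_input, summary  # objective: recover the masked summary words


# Example usage with toy data
src = "The city council approved a new budget for public transport on Monday."
summ = "Council approves new transport budget."
inp, tgt = build_mslm_example(src, summ)
print(inp)  # source text + separator + summary with nouns replaced by <mask>
print(tgt)  # "Council approves new transport budget."
```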
Files
Original bundle:
ZhangC.pdf (1.13 MB, Adobe Portable Document Format): Thesis
License bundle:
license.txt (897 B): Item-specific license agreed upon to submission