Information Extraction from TV Series Scripts for Uptake Prediction

Wang, Junshu

aut.embargo	No	en_NZ
aut.thirdpc.contains	No	en_NZ
aut.thirdpc.permission	No	en_NZ
aut.thirdpc.removed	No	en_NZ
dc.contributor.advisor	Nand, Parma
dc.contributor.advisor	Naeem, Muhammad Asif
dc.contributor.author	Wang, Junshu
dc.date.accessioned	2017-11-12T23:21:04Z
dc.date.available	2017-11-12T23:21:04Z
dc.date.copyright	2017
dc.date.created	2017
dc.date.issued	2017
dc.date.updated	2017-11-11T09:55:35Z
dc.description.abstract	The script of a movie, or of an episode of a television series, describes the setting, the storyline, and the scene changes. It also details the movements, actions, non-verbal expressions, and dialogue of the characters. The script is assessed by potential investors. If it is judged suitable, a decision is made to commit funds and other resources to create the actual product, i.e. a movie or a television series. This act of approving the project is known as green-lighting. Many studies have built models to predict the success of movies. However, the majority of these studies exploit factors that only become known after the green-lighting decision, or after the release of the product. Only a few studies have focused on predictive models based on pre-greenlighting factors, which are available before the green-lighting decision. There are even fewer models that forecast the performance of television series using pre-greenlighting factors. This study aims to extract features from the scripts of pilot episodes, which are the first episodes of television series, and to use these features to construct predictive models for the uptake of the television series. Three data sources were employed: the IMDB, the OpenSubtitles2016 corpus, and television series scripts retrieved from multiple websites. The scripts were parsed and their structures were analysed. Subsequently, features were extracted and data matrices were generated. These features and data matrices were used to train classification algorithms and construct predictive models. The output of the predictive models was then used to predict uptake. However, the results were not as compelling as expected. The present research is compared with previous studies on the same topic, the evaluation results are discussed, and suggestions for future work are given.	en_NZ
dc.identifier.uri	https://hdl.handle.net/10292/10968
dc.language.iso	en	en_NZ
dc.publisher	Auckland University of Technology
dc.rights.accessrights	OpenAccess
dc.subject	Information extraction	en_NZ
dc.subject	Feature extraction	en_NZ
dc.subject	NLP	en_NZ
dc.subject	Prediction	en_NZ
dc.subject	TV Series Scripts	en_NZ
dc.subject	Distributed representation	en_NZ
dc.subject	Dependency parsing	en_NZ
dc.title	Information Extraction from TV Series Scripts for Uptake Prediction	en_NZ
dc.type	Thesis
thesis.degree.grantor	Auckland University of Technology
thesis.degree.level	Masters Theses
thesis.degree.name	Master of Computer and Information Sciences	en_NZ

Files

Original bundle

Name: WangJ.pdf
Size: 3.28 MB
Format: Adobe Portable Document Format
Description: Whole thesis

Collections

Masters Theses