Information extraction from free text comments in questionnaires

Ramachandran, Kartik

Files

Whole thesis

Size: 2.13 MB, File format: Adobe PDF

Date

2018

Authors

Ramachandran, Kartik

Supervisor

Tegginmath, Shoba

Nand, Parma

Item type

Thesis

Degree name

Master of Computer and Information Sciences

Publisher

Auckland University of Technology

Abstract

The last 15 years have seen a tremendous explosion in the amount of information available, encoded both in structured forms such as databases and XML files and in free, naturally occurring forms such as HTML pages and Word documents. This availability of free text has created a need for automated text processing tools so that information can be extracted in a timely and effective manner. This research investigated the extraction of information from free text responses to open-ended questions in questionnaires. It set out to develop a framework for analyzing open question responses to extract structured information, in particular the sentiment expressed in each response, which can then be combined with the closed question responses to produce a more informative report from the survey. Specifically, the research helps in understanding the positive or negative nature of respondents’ answers through software tools built with the Natural Language Toolkit (NLTK), data mining, and Natural Language Processing techniques, and helps surveyors (health centers, doctors, data analysts) obtain additional information from surveys. The thesis also discusses existing sentiment analysis solutions, the different components and ways of analyzing sentiment, and the creation of a Natural Language Processing tool, which should interest future developers of such systems. The research successfully classified free text responses as positive or negative. While more time to fine-tune the application and perform more training and testing would have been useful, the results obtained are promising. We have developed a platform that can be used to generate a custom corpus and that gives interested developers a starting framework for building sentiment analysis tools.

Keywords

Natural Language Processing, Data mining, Information extraction, NLTK

Permanent link

https://hdl.handle.net/10292/11141

Collections

Masters Theses
