
Ensemble Classifier Modelling for Dealing with Missing Values

Hasan, Mohammad Rajib
Permanent link
http://hdl.handle.net/10292/13085
Abstract
Ensemble classifiers are considered among the most capable methods for classifying life-critical data that suffers from missing values. An ensemble can improve the performance of a decision tree classifier, as ensembles have been found to outperform single classifiers. Nevertheless, the performance of an ensemble classifier depends on data quality and on missing values. In this study, we find that better classification accuracy is often achieved by missing value imputation. However, medical experts do not have confidence in missing value imputation (filling in the missing values using a statistical method), because each case and attribute is unique and carries different possibilities. Imputing missing values in life-critical data may lead to a wrong diagnosis and thereby misdirect medical decision making, which is dangerous and life threatening. This study therefore proposes a new ensemble model that can achieve an accuracy of over 96 percent without missing value imputation. The relevance of features such as HPV, HIV, AIDS, and smoking to cervical cancer has long been debated. This study selected some of these influential features and validated their relevance in terms of accuracy, together with two statistical error measures: root mean squared error and mean absolute error. It also considers true-positive and false-positive rates alongside accuracy. Finally, this study concludes that missing value imputation in life-critical data may not be necessary to obtain better accuracy, and that the selection of base classifiers in the ensemble method should take priority over missing value imputation.
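The evaluation measures named in the abstract (accuracy, true-positive rate, false-positive rate, mean absolute error, and root mean squared error) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `evaluate`, the 0.5 threshold, the toy data, and the choice to compute the error measures on predicted probabilities against 0/1 labels are all assumptions made here for illustration.

```python
import math


def evaluate(y_true, y_prob, threshold=0.5):
    """Illustrative helper (hypothetical, not from the thesis).

    y_true: list of 0/1 class labels.
    y_prob: list of predicted probabilities for the positive class.
    Returns accuracy, true-positive rate, false-positive rate,
    mean absolute error and root mean squared error.
    """
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal in length")

    # Turn probabilities into hard 0/1 predictions (threshold is an assumption).
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    n = len(y_true)
    # Error measures on probabilities versus 0/1 labels (an assumption).
    errors = [p - t for t, p in zip(y_true, y_prob)]

    return {
        "accuracy": (tp + tn) / n,
        # Rates are undefined (NaN) when a class is absent from y_true.
        "tpr": tp / (tp + fn) if (tp + fn) else float("nan"),
        "fpr": fp / (fp + tn) if (fp + tn) else float("nan"),
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }


# Toy data, invented purely for illustration.
labels = [1, 1, 0, 0]
probabilities = [1.0, 0.0, 0.0, 1.0]
print(evaluate(labels, probabilities))
```

Reporting the true-positive and false-positive rates next to accuracy matters for imbalanced medical data, where a classifier can score high accuracy while still missing most positive cases.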
Keywords
Ensemble; Missing value; Classifier; Machine learning; Cervical cancer; Ensemble_rh
Date
2020
Item Type
Thesis
Supervisor(s)
Narayanan, Ajit; Sarkar, Nurul
Degree Name
Master of Philosophy
Publisher
Auckland University of Technology


Hosted by Tuwhera, an initiative of the Auckland University of Technology Library

 

 
