Filter and Wrapper Stacking Ensemble (FWSE): A Robust Approach for Reliable Biomarker Discovery in High-Dimensional Omics Data

Date
2023
Authors
Budhraja, Sugam
Doborjeh, Maryam
Singh, Balkaran
Tan, Samuel
Doborjeh, Zohreh
Lai, Edmund
Merkin, Alexander
Lee, Jimmy
Goh, Wilson
Kasabov, Nikola
Item type
Journal Article
Publisher
Oxford University Press (OUP)
Abstract

Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging due to the small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than the ones obtained by existing methods while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics.
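
The two-stage idea in the abstract (filters on many data subsets to discard irrelevant features, then a wrapper to rank the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the filters or wrappers used, so the Welch t-score filter, the nearest-centroid leave-one-out wrapper, the synthetic data, and all function names and parameters below (filter_stage, wrapper_stage, n_subsets, top_k, keep) are assumptions chosen only for illustration.

```python
import random
import statistics


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples (assumed filter score)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    denom = (va / len(a) + vb / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / denom if denom else 0.0


def filter_stage(X, y, n_subsets=30, frac=0.8, top_k=10, keep=5, seed=0):
    """Score features on random sample subsets; keep those most often in the top_k."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [0] * p
    for _ in range(n_subsets):
        idx = rng.sample(range(n), int(frac * n))
        scores = []
        for j in range(p):
            a = [X[i][j] for i in idx if y[i] == 0]
            b = [X[i][j] for i in idx if y[i] == 1]
            scores.append(welch_t(a, b))
        top = sorted(range(p), key=scores.__getitem__, reverse=True)[:top_k]
        for j in top:
            votes[j] += 1
    return sorted(range(p), key=votes.__getitem__, reverse=True)[:keep]


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given features."""
    correct = 0
    for t in range(len(X)):
        dists = {}
        for label in (0, 1):
            rows = [X[i] for i in range(len(X)) if i != t and y[i] == label]
            centroid = [statistics.mean(r[j] for r in rows) for j in feats]
            dists[label] = sum((X[t][j] - c) ** 2 for j, c in zip(feats, centroid))
        if min(dists, key=dists.get) == y[t]:
            correct += 1
    return correct / len(X)


def wrapper_stage(X, y, candidates):
    """Greedy forward selection; the order in which features are added is their rank."""
    ranked, remaining = [], list(candidates)
    while remaining:
        best = max(remaining, key=lambda j: loo_accuracy(X, y, ranked + [j]))
        ranked.append(best)
        remaining.remove(best)
    return ranked


def make_data(n=40, p=50, informative=(0, 1, 2), shift=3.0, seed=1):
    """Synthetic two-class data: only the 'informative' columns differ between classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(shift * y[i] if j in informative else 0.0, 1.0) for j in range(p)]
         for i in range(n)]
    return X, y


X, y = make_data()
kept = filter_stage(X, y)            # stage 1: eliminate irrelevant features
ranking = wrapper_stage(X, y, kept)  # stage 2: rank the surviving features
print("kept by filter stage:", kept)
print("wrapper ranking:", ranking)
```

Voting across random subsets is what gives the filter stage its stability in this sketch: a feature survives only if it scores well repeatedly, not on one lucky split. A fuller ensemble would presumably combine several filter and wrapper methods rather than one of each.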

Keywords
biomarker discovery, ensemble learning, feature selection, genomics, high-dimensional data, proteomics, 0601 Biochemistry and Cell Biology, 0802 Computation Theory and Mathematics, 0899 Other Information and Computing Sciences, Bioinformatics, 3101 Biochemistry and cell biology, 3102 Bioinformatics and computational biology, 3105 Genetics
Source
Brief Bioinform, ISSN: 1477-4054 (Print); 1477-4054 (Online), Oxford University Press (OUP), 24(6), bbad382. doi: 10.1093/bib/bbad382
Rights statement
© The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com