The Importance of Outlier Rejection and Significant Explanatory Variable Selection for Pinot Noir Wine Soft Sensor Development
Sensory attributes are essential factors in determining the quality of wines. However, it can be challenging for consumers, and even for experts, to differentiate and quantify wines' sensory attributes for quality control. Soft sensors based on rapid chemical analysis offer a potential solution. The current limitation in developing soft sensors for wines is the need for a large number of input parameters, at least 12, which necessitates costly and time-consuming analyses. While such a comprehensive approach provides high accuracy in sensory quality mapping, its cost and duration make it unsuitable for the industry's routine quality control activities. In this work, box plots, Tucker-1 plots, and Principal Component Analysis (PCA) score plots were used to identify and reject outliers in the output data (sensory attributes) to improve model quality. More importantly, this work shows that the number of chemical analyses required to quantify sensory attributes with regression models and to qualify them with classification models can be significantly reduced. With regression models, only four key chemical parameters (total flavanols, total tannins, A520nmHCl, and pH) were required to predict 35 sensory attributes of a wine simultaneously with R² values above 0.6. Likewise, with classification models, only four key chemical parameters (A280nmHCl, A520nmHCl, chemical age, and pH) were required to predict the 35 sensory attributes at once with prediction accuracy above 70%. These reduced-parameter models complement each other in sensory quality mapping and provide acceptable accuracy. Applying the soft sensor with these reduced sets of key chemical parameters translates to a potential reduction in analytical and labour costs of 56% for the regression model and 83% for the classification model, making these models suitable for routine quality control use.