Document Type : Research Paper
Authors
1 PhD student, Department of Agricultural Machinery Engineering, Faculty of Agriculture, College of Agriculture & Natural Resources, University of Tehran, Karaj, Iran
2 Professor, Department of Agricultural Machinery Engineering, Faculty of Agricultural Engineering and Technology, University College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran
3 Professor, Department of Agricultural Machinery Engineering, Faculty of Agriculture, College of Agriculture & Natural Resources, University of Tehran, Karaj, Iran
Abstract
The Use of Gradient Boost Regression Model to Modeling of Gas Sensors in Diagnosis of Sun-dried, Sulphurous and Acidic solution dried Raisins
EXTENDED ABSTRACT
Residues of sulfur dioxide and its compounds, which are widely used as preservatives in dried fruits because they are available and affordable, are among the factors that importing countries consider when purchasing raisins from Iran. Exported raisins should therefore be tested for these residues in the final product. Machine learning modeling can help overcome some limitations of gas sensors, such as demanding operating conditions, drift errors, limited selectivity, the need for large amounts of labeled data, and cost and manufacturing challenges. The gradient boost regression model is a machine learning model for regression problems that builds its prediction by adding weak learners one at a time, each fitted to the errors of the ensemble so far.
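To make the idea of gradient boosting for regression concrete, the following is a minimal sketch under stated assumptions, not the implementation used in the study. It boosts one-split decision stumps under squared loss on a synthetic, saturating "sensor response" over a 60-minute exposure. The curve shape, noise level, number of rounds, learning rate, and all function names are illustrative assumptions.

```python
import math
import random

def fit_stump(x, residuals):
    """Return (threshold, left_value, right_value) of the best single split."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no valid threshold between equal feature values
        left = [residuals[i] for i in order[:k]]
        right = [residuals[i] for i in order[k:]]
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2, lmean, rmean)
    return best[1:]

def fit_gradient_boosting(x, y, n_rounds=200, learning_rate=0.1):
    """Boost stumps on the residuals (negative gradient of squared loss)."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, lval, rval = fit_stump(x, residuals)
        stumps.append((thr, lval, rval))
        pred = [p + learning_rate * (lval if xi <= thr else rval)
                for p, xi in zip(pred, x)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, lr = model
    return [base + lr * sum(l if xi <= t else r for t, l, r in stumps)
            for xi in x]

# Synthetic data (assumption): response saturating over 60 minutes plus noise.
rng = random.Random(0)
minutes = [i * 0.5 for i in range(121)]
response = [1 - math.exp(-m / 15) + rng.gauss(0, 0.01) for m in minutes]

# Hold out every fourth sample as test data.
train = [i for i in range(len(minutes)) if i % 4 != 0]
test = [i for i in range(len(minutes)) if i % 4 == 0]

model = fit_gradient_boosting([minutes[i] for i in train],
                              [response[i] for i in train])
test_true = [response[i] for i in test]
test_pred = predict(model, [minutes[i] for i in test])

rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(test_true, test_pred))
                 / len(test_true))
mean_true = sum(test_true) / len(test_true)
baseline_rmse = math.sqrt(sum((a - mean_true) ** 2 for a in test_true)
                          / len(test_true))
print(f"test RMSE: {rmse:.4f} (mean-only baseline: {baseline_rmse:.4f})")
```

Each round fits a stump to what the current ensemble still gets wrong and adds only a fraction of it (the learning rate), which is the core mechanism of gradient boosting. Library implementations use deeper trees and more features.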
In this study, three treatments (sun-dried, acidic-solution-dried, and sulfur-treated raisins), each with three replicates, were prepared and exposed to the gas sensors for 60 minutes to record the sensor responses to each treatment. The data were then analyzed with several machine learning models so that the accuracy of the modeling methods could be compared. The model evaluation metrics were examined, and the interpretation of each was discussed in detail. Finally, an analysis of variance was performed on each prediction quality metric of the gradient boost regression model, separately for treatments, for sensors, and for sensor-treatment combinations, and the findings from each are interpreted in the results and discussion section.
Based on the charts and results, the gradient boost regression model gave more accurate predictions than the other models for all sensors. Its results were therefore analyzed further using the model quality metrics. Given the high values of these metrics, it can be concluded that the designed gradient boost regression model fits the dataset well and can effectively predict the target variable. The significance test also showed significant differences between the treatment means for treatments 1 to 3 (acidic-solution-dried, sulfur-treated, and sun-dried raisins, respectively). The coefficient of determination for the acidic solution treatment differed significantly from those of the sun-dried and sulfur treatments and was higher than both. The sulfur treatment also differed significantly from the sun-dried treatment and gave better results. The acidic solution treatment therefore had the highest modeling capability and predictability of sensor responses, which may be explained by the more noticeable odor produced by the acid. The sun-dried treatment showed the lowest modeling capability, which may be explained by the lack of any distinct processing compared with the other treatments. However, no significant differences between the treatment means were observed in the root mean squared error, mean absolute error, or root mean squared relative error.
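As a reference for the evaluation metrics named above, the sketch below computes the coefficient of determination, root mean squared error, mean absolute error, and a root mean squared relative error. The relative-error definition, in which each error is divided by the observed value, is an assumption, because the abstract does not give the formula. The function name and the sample values are illustrative only and are not data from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE and RMSRE for paired observations and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        # Assumed definition: errors relative to the (non-zero) observed value.
        "rmsre": math.sqrt(
            sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / n
        ),
    }

# Illustrative values, not measurements from the paper.
observed = [0.50, 0.60, 0.70, 0.80]
predicted = [0.52, 0.58, 0.71, 0.79]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The coefficient of determination is scale-free, whereas RMSE and MAE are in the units of the sensor response. This is one reason treatments can differ significantly in the first metric but not in the others.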
Comparing the mean responses of the sensors showed that for the sensor pairs 1 and 2, 1 and 3, 2 and 3, 2 and 5, 4 and 5, and 6 and 7 the reject value was False: the null hypothesis could not be rejected, so there is no significant difference between these sensors. For all other pairs the reject value was True: the null hypothesis was rejected, and the sensors differ significantly.
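The pairwise reject flags described above can be illustrated with a small sketch. The abstract does not name the post-hoc test that was used, so this example substitutes an exact two-sided permutation test on the difference of means with a Bonferroni-corrected threshold. The group names and values are hypothetical and are not results from the study.

```python
from itertools import combinations

def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = a + b
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    extreme = count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group a.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / count

def pairwise_reject(groups, alpha=0.05):
    """Map each pair of groups to (p_value, reject) with Bonferroni correction."""
    pairs = list(combinations(sorted(groups), 2))
    threshold = alpha / len(pairs)
    results = {}
    for g1, g2 in pairs:
        p = permutation_p_value(groups[g1], groups[g2])
        results[(g1, g2)] = (p, p < threshold)
    return results

# Hypothetical per-replicate scores for three sensors (not data from the study).
scores = {
    "sensor_A": [0.90, 0.91, 0.89, 0.92, 0.90, 0.88],
    "sensor_B": [0.91, 0.90, 0.90, 0.89, 0.92, 0.91],
    "sensor_C": [0.70, 0.72, 0.69, 0.71, 0.73, 0.70],
}
results = pairwise_reject(scores)
for pair, (p, reject) in results.items():
    print(pair, f"p={p:.4f}", f"reject={reject}")
```

A reject value of False means only that the data do not give enough evidence of a difference between the two sensors. It does not show that their responses are identical.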
The results showed that the gradient boost regression model, with a coefficient of determination of 0.9972 and a root mean square error of 0.0209 on the test data, was able to model the response of the gas sensors to the treatments. Analysis of the results also determined the type and degree of correlation of the sensor responses with one another and with time, for use in predicting sensor behavior. The modeling further showed that the MQ9, MQ3, MQ5, and TGS2620 sensors, with coefficients of determination of 0.8668, 0.8786, 0.9458, and 0.9074 and root mean square errors of 0.0163, 0.0168, 0.0083, and 0.0227, respectively, gave more accurate and predictable responses than the MQ135, TGS822, TGS810, and MQ4 sensors.