Identification and Localization of Chickpea Impurities Using SVM and KNN Classifiers

Article type: Research article

Authors

Department of Biosystems Engineering, Faculty of Agriculture, Bu-Ali Sina University, Hamedan, Iran

Abstract

At harvest, chickpeas contain various types of impurities that must be identified and separated before the product goes to market. Although many of these impurities are easy to remove, conventional methods cannot separate items such as stones the same size as chickpeas or unripe and discolored chickpeas. The aim of this study was to identify the type and determine the position of different chickpea impurities using two intelligent models, support vector machine (SVM) and K-nearest neighbors (KNN). To this end, 400 RGB images were acquired, each containing six classes: healthy, green, black, and colored chickpeas, stones, and split chickpeas. To identify the class of each object in an image and extract its features, each object was cropped from the original images after localization and sorted into one of six categories. This brought the total number of object images to 3,840. The extracted features included the mean, median, variance, skewness, histogram, and entropy, together with texture features from the gray-level co-occurrence matrix: contrast, correlation, energy, and homogeneity. In the SVM model, the RBF function performed best compared with the other functions. In the KNN model, the best results were obtained with k = 13, the City Block distance metric, and the weighting 1/(c + D²) with c = 1. Object localization was based on the coordinates of each object's center and was carried out in MATLAB. According to the results, the highest accuracies of the SVM and KNN models, at a resolution of 250×250, were 98.09% and 90.88%, respectively.


Article Title [English]

Identification and Localization of Chickpea Impurities Using SVM and KNN Classifiers

Authors [English]

  • Hossein Bagherpour
  • Siavash Shamohammadi
Department of Biosystems Engineering, Faculty of Agriculture, Bu-Ali Sina University, Hamedan, Iran
Abstract [English]

During chickpea harvesting, various types of impurities are present in the product and must be identified and removed before market distribution or use as seed. Although pneumatic and mechanical methods can eliminate a substantial portion of these impurities, conventional techniques are insufficient for separating objects such as small stones of similar size to chickpeas or unripe and discolored grains. The objective of this study was to identify the type and determine the location of different chickpea impurities using two intelligent classifiers: Support Vector Machine (SVM) and k-Nearest Neighbors (KNN). For this purpose, 400 RGB images were acquired, encompassing six classes: healthy, green, black, and colored chickpeas, stones, and split chickpeas. After object segmentation and classification into six groups, the total number of samples reached 3,840. The extracted features included mean, median, variance, skewness, histogram, entropy, and texture descriptors derived from the gray-level co-occurrence matrix (GLCM), such as contrast, correlation, energy, and homogeneity. In the SVM model, the RBF kernel exhibited superior performance compared to other kernels. For KNN, the optimal results were obtained with k = 13, the City Block distance metric, and a weighting scheme of 1/(c + D²) with c = 1. Object localization was performed in MATLAB by determining the coordinates of each object's center. Based on the results, the highest classification accuracies of the SVM and KNN models at a resolution of 250×250 pixels were 98.09% and 90.88%, respectively.

Keywords [English]

  • Classification
  • Image processing
  • Beans
  • Pea impurities

Introduction

Chickpeas are a high-protein legume whose consumption has increased worldwide. Ensuring their quality requires the removal of foreign impurities and uniformity in size, shape, and color. Manual separation is labor-intensive, making automated systems necessary for efficient and accurate impurity detection. Machine vision has shown great potential for identifying impurities and classifying varieties in small-seeded legumes. However, the effective deployment of such systems requires crop-specific algorithms tailored to local cultivars. This study aimed to develop an algorithm for native Iranian chickpea varieties that can accurately detect and classify impurities while determining their spatial locations—a factor largely unaddressed in previous research. The ultimate goal is to reduce reliance on manual labor and enhance the performance of automatic grading machines.

Methods

Images were pre-processed to remove background noise and isolate the primary objects. Several color spaces (HSV, Lab, YCbCr) were evaluated, and YCbCr provided the best discrimination between objects and background. Following conversion from RGB to YCbCr, thresholds in the Cr channel were tested, and Cr > 105 was selected for effective segmentation. Small particles were removed using the MATLAB bwareaopen function, and the final binary masks were multiplied with the original images to produce segmented outputs. Each object was assigned to one of six classes (healthy, green, black, colored, stone, and split chickpeas), resulting in 3,840 images (640 per class). Feature extraction combined color, texture, and geometric descriptors. Texture features (correlation, energy, homogeneity, entropy) were derived from grayscale images using the gray-level co-occurrence matrix (GLCM) at four angles (0°, 45°, 90°, 135°) and pixel distances of 1–10, yielding 160 features (4 features × 4 angles × 10 distances). Statistical descriptors (kurtosis, skewness, maximum, median, mean, variance) were computed for the RGB channels (18 features) and grayscale images (5 features), along with perimeter-to-area ratio and entropy (2 features), for a total of 185 features. Seventy percent of the dataset was used for training and 30% for testing. All features were normalized before classification. To identify and classify chickpea impurities, two well-known machine learning models, Support Vector Machine (SVM) and K-Nearest Neighbors (KNN), were employed, allowing a comparative evaluation of their performance in distinguishing between the six classes.
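
The segmentation steps described above can be sketched in a few lines of code. The following is a minimal pure-Python illustration, not the authors' MATLAB implementation. It assumes the BT.601 studio-range Cr formula (the convention that MATLAB's rgb2ycbcr is documented to use for 8-bit input), 8-connected components, and a hypothetical minimum object size MIN_AREA, because the source does not report the value passed to bwareaopen. The centroid and bounding box are returned only to show how object positions could follow from the same mask. The demo image and its colors are synthetic.

```python
from collections import deque

CR_THRESHOLD = 105  # Cr threshold reported in the Methods text
MIN_AREA = 4        # hypothetical minimum object size in pixels (not given in the source)


def cr_channel(r, g, b):
    """Cr value of one 8-bit RGB pixel (assumed BT.601 studio-range conversion)."""
    return 128 + (112.0 * r - 93.786 * g - 18.214 * b) / 255.0


def segment(image, min_area=MIN_AREA):
    """Threshold Cr, drop small components, and describe each remaining object.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Returns one dict per object with its area, centroid (row, col) and
    bounding box (min_row, min_col, max_row, max_col).
    """
    rows, cols = len(image), len(image[0])
    mask = [[cr_channel(*image[y][x]) > CR_THRESHOLD for x in range(cols)]
            for y in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over the 8-connected neighbourhood.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) < min_area:
                continue  # small-particle removal, in the spirit of bwareaopen
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            objects.append({
                "area": len(pixels),
                "centroid": (sum(ys) / len(ys), sum(xs) / len(xs)),
                "bbox": (min(ys), min(xs), max(ys), max(xs)),
            })
    return objects


# Synthetic demo: green background, one beige 2x3 object, one stray pixel.
BG = (0, 200, 0)
FG = (200, 170, 120)
image = [[BG] * 8 for _ in range(6)]
for y in (1, 2):
    for x in (1, 2, 3):
        image[y][x] = FG
image[4][6] = FG  # isolated speck, smaller than MIN_AREA

objects = segment(image)
print(objects)
```

In this toy image the isolated pixel is discarded by the area filter and only the 2×3 block is reported. With real photographs the threshold and minimum area would need to be checked against the actual background and object sizes.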

Results

The SVM classifier demonstrated high stability and reliability due to its convergence to a global minimum. Analysis of the cost parameter C indicated that optimal performance was achieved at C = 10. Furthermore, the RBF kernel outperformed both the linear and polynomial kernels, yielding approximately 14.9% and 13.8% higher accuracy, respectively. For the KNN classifier, k values from 3 to 21 were examined, and the best performance (90.79% accuracy) was obtained using k = 13, the City Block distance metric, and the weighting scheme 1/(c + D²). Evaluation of distance metrics showed that weighted approaches performed better than simple distance measures. Investigating the effect of image size revealed that increasing the resolution did not significantly improve classification accuracy, and the 250×250 resolution provided the optimal trade-off between speed and accuracy. Confusion matrix results indicated that the black chickpea class exhibited the highest separability, while misclassifications mainly occurred in classes with similar characteristics. Overall comparison demonstrated that SVM achieved superior performance, with an accuracy of 98.09%, outperforming KNN, which achieved 90.88%. In the SVM model, the F1-score for all classes except the stone class exceeded 97%. The slightly lower performance for the stone class was attributed to its visual similarity to certain chickpea types. Comparison with previous studies showed that the improved SVM model increased accuracy by approximately 6.8%. Additionally, object localization and bounding box generation were successfully accomplished, with an average processing time of 25.6 ms per image, which can be further reduced using more advanced hardware.
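
The KNN configuration reported above (City Block distance with the weighting 1/(c + D²)) can be illustrated with a small sketch. This is a minimal pure-Python example under stated assumptions, not the authors' code: it takes D to be the City Block distance itself, uses c = 1 as given in the abstract, and runs on invented two-dimensional toy features rather than the 185 normalized features of the study. An unweighted majority vote is included only for comparison.

```python
from collections import Counter, defaultdict


def city_block(a, b):
    """City Block (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_nearest(train_x, train_y, query, k):
    """Return the k closest (distance, label) pairs to the query."""
    pairs = [(city_block(x, query), label) for x, label in zip(train_x, train_y)]
    pairs.sort(key=lambda pair: pair[0])
    return pairs[:k]


def weighted_knn_predict(train_x, train_y, query, k=13, c=1.0):
    """Each neighbour votes with weight 1 / (c + D^2); D is assumed to be
    the City Block distance to the query."""
    votes = defaultdict(float)
    for dist, label in k_nearest(train_x, train_y, query, k):
        votes[label] += 1.0 / (c + dist ** 2)
    return max(votes, key=votes.get)


def majority_knn_predict(train_x, train_y, query, k=13):
    """Plain majority vote among the k nearest neighbours, for comparison."""
    labels = [label for _, label in k_nearest(train_x, train_y, query, k)]
    return Counter(labels).most_common(1)[0][0]


# Invented toy data: two nearby "healthy" samples, three distant "stone" samples.
train_x = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (4.0, 5.0), (5.0, 4.0)]
train_y = ["healthy", "healthy", "stone", "stone", "stone"]
query = (0.5, 0.5)

weighted = weighted_knn_predict(train_x, train_y, query, k=5, c=1.0)
majority = majority_knn_predict(train_x, train_y, query, k=5)
print(weighted, majority)
```

With k = 5 every training sample votes. The two close neighbours outweigh the three distant ones under the 1/(c + D²) weighting, while the plain majority vote follows the more numerous class. That is one way to see why a distance-weighted vote can behave differently from an unweighted one.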

Conclusion

In this study, two well-known classifiers—Support Vector Machine (SVM) and K-Nearest Neighbors (KNN)—were used to identify and differentiate healthy chickpea seeds from various impurities. SVM outperformed KNN, with KNN performance influenced by the number of neighbors, distance metrics, and weighting schemes, while SVM accuracy depended on the kernel function. A key feature of this study is the inclusion of classes with highly similar characteristics, improving discrimination of subtle differences and enabling high-purity chickpea batches. Although common Iranian varieties and typical impurities were used, variations in color, texture, or shape in other cultivars may limit generalizability. Future studies should include a broader range of cultivars and impurities to enhance robustness. With the growing adoption of robotic and automated systems, the algorithm can be integrated into automatic grading machines for detecting both the type and spatial location of impurities. Further testing in laser-based grading systems is recommended to assess industrial performance.

Author Contributions

Both authors contributed equally to the research and the preparation of the manuscript, under the guidance and supervision of the corresponding author.

Data Availability Statement

Data are available on request from the authors.

Acknowledgments

The authors would like to express their sincere gratitude to Bu-Ali Sina University for providing the facilities and support necessary to conduct this research.

Ethical Considerations

The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of Interest

The authors declare no conflict of interest.
