Keywords
Machine learning, Brain tumor MRI, Feature extraction, Data augmentation, Image classification
Document Type
Research Paper
Abstract
Brain tumors are among the most serious neurological diseases, and they pose significant diagnostic challenges because of their diverse nature, their complexity, and the class imbalance found in medical imaging datasets. Machine learning (ML) has shown high potential for brain tumor classification; nevertheless, its performance can suffer when minority classes are underrepresented. This study examines the impact of the Synthetic Minority Over-sampling Technique (SMOTE) on MRI brain tumor classification using handcrafted features, namely the Gray Level Co-occurrence Matrix (GLCM), the Histogram of Oriented Gradients (HOG), and their combination (HOG + GLCM), with three classifiers: Logistic Regression (LR), Support Vector Machines (SVM), and k-Nearest Neighbours (KNN). Experiments were carried out on two publicly available datasets. For Dataset 1, SVM with GLCM increased from 55.21% to 57.40% (full SMOTE), while LR with HOG + GLCM increased from 52.4% to 53.49%. For Dataset 2, fold-wise SMOTE increased LR with GLCM from 69.85% to 70.60% and SVM with HOG from 92.87% to 93.10% compared with full SMOTE. The results show that the effect of SMOTE depends on the feature type, the classifier, and the augmentation strategy, with fold-wise SMOTE typically improving generalization while avoiding information leakage. These findings support SMOTE as a viable method for improving overall classification performance in imbalanced medical imaging tasks, particularly for weaker texture-based descriptors.
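The difference between full and fold-wise SMOTE can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`smote`, `balance`, `fold_wise_splits`), the toy two-dimensional features, the neighbour count `k=3`, and the unstratified three-fold split are all assumptions made for illustration. The sketch oversamples only the training part of each fold, so every test fold contains original samples only; applying `balance` to the whole dataset before splitting would instead let synthetic points interpolated from test samples reach the training set.

```python
import math
import random
from collections import Counter


def smote(samples, n_new, k=3, rng=None):
    """Create n_new synthetic points, each interpolated between a sample
    and one of its k nearest neighbours from the same class."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        if not neighbours:
            # A class with a single member cannot be interpolated; copy it.
            synthetic.append(base)
            continue
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def balance(X, y, rng):
    """Oversample every class up to the size of the largest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    target = max(len(members) for members in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, members in by_class.items():
        new_points = smote(members, target - len(members), rng=rng)
        X_out.extend(new_points)
        y_out.extend([label] * len(new_points))
    return X_out, y_out


def fold_wise_splits(X, y, n_folds=3, seed=0):
    """Yield (X_train, y_train, X_test, y_test) per fold, with SMOTE
    applied to the training part only."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    for fold in range(n_folds):
        test_idx = set(indices[fold::n_folds])
        X_train = [X[i] for i in indices if i not in test_idx]
        y_train = [y[i] for i in indices if i not in test_idx]
        X_test = [X[i] for i in sorted(test_idx)]
        y_test = [y[i] for i in sorted(test_idx)]
        X_train, y_train = balance(X_train, y_train, rng)
        yield X_train, y_train, X_test, y_test


# Toy imbalanced data (hypothetical): 12 majority and 6 minority samples.
data_rng = random.Random(42)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(12)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(6)]
y = [0] * 12 + [1] * 6

for X_train, y_train, X_test, y_test in fold_wise_splits(X, y):
    print("train:", dict(Counter(y_train)), "test:", dict(Counter(y_test)))
```

In practice a library implementation placed inside a cross-validation pipeline would serve the same purpose; the point of the sketch is only the ordering of the two steps, split first and oversample second.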
Highlights
Two MRI brain tumor datasets were used, covering glioma, meningioma, pituitary, and healthy cases.
Images were preprocessed with resizing, sharpening, CLAHE, and Otsu thresholding for clarity.
GLCM and HOG features, along with their combination, were extracted to capture texture and area.
SMOTE was applied to balance class distributions and enhance classifier performance.
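As a rough illustration of what a GLCM texture descriptor measures, the sketch below builds a co-occurrence matrix for a tiny four-level image and derives three common statistics from it. The function names (`glcm`, `texture_features`), the single horizontal offset, the non-symmetric counting, and the toy image are assumptions for illustration; the distances, angles, and gray-level quantization used in the paper are not stated on this page.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    Entry [i][j] is the fraction of pixel pairs where a pixel of level i
    has a pixel of level j at offset (dy, dx).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Contrast, angular second moment, and homogeneity of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "asm": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Toy 4x4 image with gray levels 0..3 (hypothetical data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

matrix = glcm(image, levels=4)
features = texture_features(matrix)
print(features)
```

Smooth regions concentrate mass on the diagonal of the matrix, which lowers contrast and raises homogeneity; the resulting statistics form the feature vector passed to a classifier.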
Recommended Citation
Fadhil, Farah and Sultani, Zainab (2025) "The effect of synthetic minority oversampling for enhancing brain tumor image classification," Engineering and Technology Journal: Vol. 43: Iss. 10, Article 3.
DOI: https://doi.org/10.30684/etj.2025.163902.2001
DOI
10.30684/etj.2025.163902.2001
First Page
804
Last Page
821





