dc.description.abstract |
Soil-transmitted helminth infections and schistosomiasis are widespread parasitic diseases in tropical
areas, especially in Africa, causing significant health impacts. Prompt treatment offers both
health and economic benefits. Current diagnosis, mainly microscopy-based, is time-intensive
and challenging in low-resource settings like Ethiopia. This study develops an innovative system
that analyzes microscope images of parasite eggs. Unlike previous CNN-only approaches,
it combines machine learning and deep learning for faster, more accurate disease
identification, enhancing diagnostic efficiency and reliability.
This study compared a hybrid predictive model (deep feature extractors paired with machine learning
classifiers) with standalone deep learning models, focusing on five classes: ascariasis, hookworm,
schistosomiasis, Trichuris, and negative samples. The dataset, from the Ethiopian Public Health
Institute’s research center, contained 1,490 images (300 for each of the four parasite classes and
290 negatives). Image processing steps, including resizing, normalization, and augmentation, were
applied. VGG16, ResNet50, DenseNet121, MobileNetV2, EfficientNetB0, and Vision Transformer served as
both classifiers and feature extractors. Additionally, machine learning classifiers such as XGBoost,
SVM, KNN, Random Forest, and Decision Trees were combined with the deep feature extractors for
classification.
The predictive model demonstrated higher accuracy. Strong results were obtained with SVM, where
VGG16 and DenseNet121 as feature extractors led to 99.31% test accuracy. The combination of VGG16 and
XGBoost showed the highest test accuracy, 99.35%. However, CNN-only models showed lower accuracy. VGG16
achieved 79.98% test accuracy and 83.4% training accuracy, while DenseNet121 reached 84.12% test
and 88.56% training accuracy. ResNet50’s training accuracy was 92.23%, with 86.01% on testing;
EfficientNetB0 achieved 91.80% training and 84.33% testing accuracy; MobileNetV2 reached 90.49%
training and 87.02% test accuracy, and Vision Transformer recorded 93.75% training and 87.43% test
accuracy. At the class level, negative samples were classified with high accuracy, while accuracy for
the other classes varied by model type.
The system improves diagnostic utility by providing real-time information and is convenient to use
in areas where even primary healthcare may not be available. Working with a small, long-stored
dataset posed challenges due to limited diversity and sample degradation, which hindered accurate
class distinction and affected the model’s generalization performance. To overcome these dataset
limitations, future work should collect fresh samples to increase diversity and represent all classes
adequately, implement systematic field collection under varied conditions to ensure data quality, and
collaborate with relevant institutions or stakeholders to expand the dataset, emphasizing consistency
and accuracy. |
en_US |