Abstract:
This study investigates Word Sense Disambiguation (WSD) for Afan Oromo. WSD is a
Natural Language Processing task whose goal is to identify the sense in which an
ambiguous word is used in a particular context: a word may have multiple senses, and
only one of them is appropriate in a given context. The study presents a WSD strategy
that combines an unsupervised approach, which exploits sense information in a corpus,
with manually crafted rules. The idea behind the approach is to overcome the
training-data bottleneck. The context of a given word is captured using term
co-occurrences within a defined window of words. Similar contexts of the senses of an
ambiguous word are clustered using hierarchical and partitional clustering.
Each cluster represents a unique sense. The ambiguous words have between two and five
senses. The optimal window sizes for extracting semantic context are one and two
words to the left and right of the ambiguous word. The results show that WSD achieves
an accuracy of 56.2% with the unsupervised machine learning approach and 65.5% with
the hybrid approach.
On this basis, integrating deep linguistic knowledge with machine learning improves
disambiguation accuracy. The result is encouraging given the approach's low resource
requirements. Nevertheless, further experiments that extend this work with different
approaches are needed to improve performance.
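
The context-window and clustering steps described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the English toy sentences, the target word "bank", the function names, the cosine similarity measure, and the average-link merge rule are all assumptions chosen for illustration, standing in for an Afan Oromo corpus and for whichever hierarchical clustering variant the study used.

```python
from collections import Counter
from math import sqrt


def context_window(tokens, index, size):
    """Return the tokens within `size` positions left and right of tokens[index]."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


def context_vectors(sentences, target, size):
    """Build one bag-of-words Counter per occurrence of `target`."""
    vectors = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == target:
                vectors.append(Counter(context_window(tokens, i, size)))
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def agglomerative(vectors, k):
    """Average-link agglomerative clustering down to k clusters.

    Returns a list of clusters, each a list of occurrence indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[x] for j in clusters[y]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical toy data: English stands in for an Afan Oromo corpus.
SENTENCES = [
    "she sat on the bank of the river",
    "he walked along the bank of the river",
    "my bank approved a loan",
    "her bank approved a loan",
]

# Window of 2 words on each side, clustered into 2 assumed senses.
clusters = agglomerative(context_vectors(SENTENCES, "bank", 2), 2)
print(clusters)
```

On this toy data the two river contexts fall into one cluster and the two loan contexts into the other, so each cluster can be read as one sense. In a hybrid setup, hand-written rules would then be applied alongside these clusters; that rule layer is not sketched here because the abstract does not describe its form.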