Abstract:
This paper presents sense clustering of multi-sense words in Afan Oromo. The main idea of this work is to
cluster contexts, which provides a useful way to discover semantically related senses. The similar contexts of the
senses of a target word are clustered using three hierarchical and two partitional clustering algorithms. All contexts of
related senses are included, so the clustering is performed over all the contexts in the corpus. The underlying hypothesis
is that clustering captures the unity reflected among the contexts and that each cluster reveals possible relationships
existing among them. The experiments show that, of the five clustering algorithms, EM and K-means yield
significantly higher accuracy than the hierarchical algorithms (single-link, complete-link and average-link clustering).
For Afan Oromo, EM and K-means thus produce more accurate sense clusters than the hierarchical clustering algorithms.
Each cluster represents a unique sense. The target words have from two to five senses. The results show an
average accuracy of 85.5% on the test set, which is encouraging for unsupervised machine learning work. With
this approach, finding the right number of clusters is equivalent to finding the number of senses. The achieved result
is encouraging given the low resource requirements of the approach.
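
The context-clustering idea summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: the bag-of-words representation, the farthest-first initialization, the function names (`vectorize`, `kmeans`) and the English toy contexts for the ambiguous word "bank" are all assumptions made for illustration only. Each context is turned into a normalized word-count vector, and a plain K-means groups the contexts so that each cluster stands for one sense.

```python
from collections import Counter
import math


def vectorize(contexts):
    """Turn each context into a length-normalized bag-of-words vector."""
    vocab = sorted({w for c in contexts for w in c.split()})
    vectors = []
    for c in contexts:
        counts = Counter(c.split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means with deterministic farthest-first initialization (an assumption)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the vector farthest from its nearest already-chosen centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each context joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy contexts (English, for illustration; not from the Afan Oromo corpus).
contexts = [
    "river bank water fish",
    "bank water river flood",
    "bank money loan account",
    "money bank account deposit",
]
labels = kmeans(vectorize(contexts), k=2)
print(labels)
```

In this sketch the number of clusters `k` plays the role of the number of senses, so choosing `k` corresponds to deciding how many senses the target word has.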