Jimma University Open access Institutional Repository

Towards the sense disambiguation of afaan oromoo words using hybrid approach (unsupervised machine learning and rule based)

dc.contributor.author Workineh Tesema
dc.contributor.author Debela Tesfaye
dc.contributor.author Teferi Kebebew
dc.date.accessioned 2020-12-07T14:01:23Z
dc.date.available 2020-12-07T14:01:23Z
dc.date.issued 2015-10
dc.identifier.uri http://10.140.5.162//handle/123456789/1907
dc.description.abstract Word Sense Disambiguation (WSD) is a task in Natural Language Processing whose goal is to identify the sense in which an ambiguous word is used in a particular context. It is a fundamental problem for many natural language technology applications (Machine Translation, Text Summarization, Question Answering, Information Extraction, Text Mining, and Information Retrieval). A word may have multiple senses, and the problem is to determine which sense is appropriate in a given context. Ambiguity is a cause of poor performance in search and retrieval systems. The objective of this work is to develop a hybrid word sense disambiguation system that finds the sense of a word from its surrounding context. To that end, this study presents a WSD strategy that combines an unsupervised approach, which exploits sense information in a corpus, with manually crafted rules. The idea behind the approach is to overcome a bottleneck of machine learning approaches: the hybrid method can improve accuracy and is suitable when training data is scarce. This makes the approach suitable for disambiguation when resources and sense definitions are lacking. In this study, the context of a given word is captured using term co-occurrences within a defined window of words. The optimal window sizes for extracting semantic context are one and two words to the left and right of the ambiguous word. Similar contexts of a given sense of an ambiguous word are clustered using hierarchical and partitional clustering, with each cluster representing a unique sense. The number of senses per ambiguous word ranges from two to five. The results show that WSD achieves an accuracy of 70% with unsupervised machine learning and 81.1% with the hybrid approach. Machine learning was a useful information source for disambiguation, but it was not as robust as the linguistic (rule-based) approach [89].
Based on this, the integration of deep linguistic knowledge with machine learning improves disambiguation accuracy. The study therefore concludes that, for Afan Oromo semantics, the senses of words are closely connected to the statistics of word usage. The achieved result is encouraging, particularly since the approach requires few resources. Further experiments that extend this work with different approaches are needed for better performance. en_US
dc.language.iso en en_US
dc.subject Word Sense Disambiguation en_US
dc.subject Afan Oromo en_US
dc.subject Ambiguous Word en_US
dc.subject Disambiguation en_US
dc.subject Rule Based en_US
dc.subject Hybrid en_US
dc.title Towards the sense disambiguation of afaan oromoo words using hybrid approach (unsupervised machine learning and rule based) en_US
dc.type Thesis en_US
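
The abstract describes capturing the context of an ambiguous word from co-occurring terms inside a small window and then clustering similar contexts so that each cluster stands for one sense. The following is a minimal sketch of that idea, not the thesis's implementation: the function names, the Jaccard similarity measure, the single-link merging rule, the 0.3 threshold, and the toy English sentences are all assumptions made for illustration.

```python
from collections import Counter


def context_window(tokens, index, size):
    """Return up to `size` tokens on each side of tokens[index]."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


def collect_contexts(sentences, target, size=2):
    """Build one bag of context words per occurrence of `target`."""
    contexts = []
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == target:
                contexts.append(Counter(context_window(tokens, i, size)))
    return contexts


def jaccard(a, b):
    """Jaccard similarity between the word sets of two context bags."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def cluster_contexts(contexts, threshold=0.3):
    """Single-link agglomerative clustering (an assumed, simplified scheme).

    Two clusters are merged whenever any pair of their members has a
    Jaccard similarity of at least `threshold`. Each returned cluster is
    a list of indices into `contexts` and stands for one candidate sense.
    """
    clusters = [[i] for i in range(len(contexts))]
    merged = True
    while merged:
        merged = False
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                if any(
                    jaccard(contexts[i], contexts[j]) >= threshold
                    for i in clusters[x]
                    for j in clusters[y]
                ):
                    clusters[x].extend(clusters[y])
                    del clusters[y]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical toy data in English; the thesis works on an Afaan Oromoo corpus.
sentences = [
    "she sat on the river bank near the water",
    "the river bank was muddy after rain",
    "he went to the bank to deposit money",
    "the bank will deposit money tomorrow",
]

contexts = collect_contexts(sentences, "bank", size=2)
print(cluster_contexts(contexts, threshold=0.3))  # [[0, 1], [2, 3]]
```

With a window of two words, the first two occurrences share "river" and the last two share "deposit", so they fall into two clusters. A real system would likely remove function words and use weighted co-occurrence vectors rather than raw word sets.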

