Abstract:
Ambiguity is a common problem in natural language that must be resolved to determine the
intended meaning of a sentence. Humans are naturally adept at word sense disambiguation and
can readily recognize the different senses a word carries in context; computers, on the other hand,
have difficulty identifying the correct meaning of a word. Natural language exhibits several kinds
of ambiguity; this thesis focuses on semantic ambiguity. The goal of this thesis is to present
machine learning models based on supervised and unsupervised neural word embeddings for
resolving the semantic lexical ambiguities found in the Amharic language. We developed two
approaches: a supervised and an unsupervised word sense disambiguation approach. In the
supervised approach, documents containing ambiguous words are manually sense-labeled with
the support of linguists. In the unsupervised approach, sense definitions gathered from BabelNet
are encoded as sense vectors, the input sentence is converted to a context vector, and the sense
whose vector is most similar to the context vector is selected.
The dataset used for development and testing comprises more than 36,000 documents, amounting
to roughly 1 million words. The model achieved an F1 measure of 92.5% and an accuracy of 86.0%.
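The unsupervised approach summarized above compares a context vector built from the input sentence against sense vectors built from BabelNet glosses. The following Python sketch illustrates that similarity check under simplifying assumptions: the embeddings, glosses, and tokens are hypothetical placeholders for illustration, not the thesis's actual Amharic data, embeddings, or model.

```python
import numpy as np

# Hypothetical pre-trained word embeddings (placeholder values, not real data).
EMBEDDINGS = {
    "bank":    np.array([0.2, 0.8, 0.1]),
    "money":   np.array([0.1, 0.9, 0.0]),
    "deposit": np.array([0.2, 0.7, 0.1]),
    "river":   np.array([0.9, 0.1, 0.2]),
    "water":   np.array([0.8, 0.2, 0.1]),
}

def embed(tokens):
    """Average the embeddings of known tokens to get a context or sense vector."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(3)

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def disambiguate(context_tokens, sense_glosses):
    """Pick the sense whose gloss vector is most similar to the context vector."""
    context_vec = embed(context_tokens)
    scores = {sense: cosine(context_vec, embed(gloss))
              for sense, gloss in sense_glosses.items()}
    return max(scores, key=scores.get), scores

# Toy usage: two invented sense glosses for the ambiguous word "bank".
senses = {
    "bank_financial": ["money", "deposit"],
    "bank_river":     ["river", "water"],
}
best_sense, scores = disambiguate(["deposit", "money"], senses)
print(best_sense, scores)
```

This sketch uses averaged word embeddings and cosine similarity as one plausible realization of the context-to-sense comparison; the thesis's actual vectorization and similarity measure may differ.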