Abstract:
Natural Language Processing (NLP) is concerned with the interaction between
computers and human natural languages. The most difficult task in NLP is enabling the computer
to learn natural languages. Parsing is one of the most important tasks in natural language
processing. It is the task of analyzing the structural relationships between the words in a sentence.
For a free word order language like Afaan Oromo, a parser is well suited to extracting the relations
between the words in a sentence. Developing a hybrid sentence parser for Afaan Oromo
avoids the large amount of time otherwise spent manually processing sentences in the language to
show their syntactic structure. The parser is also useful for semantic parsing, which extracts
meaning from a sentence, and for checking the well-formedness of a sentence, which is
useful in a number of applications such as language teaching. The corpus used in this study as
training and test sets was manually parsed by the researchers with a linguistic advisor. The manually
parsed sentences are given to the machine as training data for machine learning.
In this thesis, the Weka tool is used for machine learning. The learning algorithm is the
support vector machine (SVM), implemented using sequential minimal optimization (SMO).
The features used for machine learning include part-of-speech, word, and lexicalized features.
The algorithm achieved precision and recall of 82% for the complex sentence parser and 89.5%
for the simple sentence parser. The accuracy of the result is 73.11%.
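
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision and recall can be computed for a parser by comparing predicted constituents against manually parsed (gold) constituents. The span representation `(label, start, end)`, the function name `precision_recall`, and the toy data are assumptions made for illustration only; they are not taken from the thesis, which may define the scores differently.

```python
from typing import Set, Tuple

# Hypothetical representation: a constituent is (label, start, end).
Span = Tuple[str, int, int]


def precision_recall(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted constituents against gold ones."""
    correct = len(predicted & gold)  # constituents matching in label and span
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy example (invented data, not from the thesis corpus).
gold = {("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3), ("NP", 1, 2)}
predicted = {("S", 0, 3), ("NP", 0, 1), ("VP", 2, 3), ("NP", 1, 2)}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case three of the four predicted constituents match the gold parse, so precision and recall coincide; in general they differ whenever the parser proposes more or fewer constituents than the gold parse contains.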
The model created for the parser differs from previous work in that it incorporates a
machine learning technique and uses a different tag set. Overall, the
developed model gives satisfactory results.