Afaan Oromoo Dependency Parser Using RNN

Mendasa Tesfa; Getachew Mamo; Mizanu Zelalem

dc.contributor.author	Mendasa Tesfa
dc.contributor.author	Getachew Mamo
dc.contributor.author	Mizanu Zelalem
dc.date.accessioned	2022-07-25T13:02:03Z
dc.date.available	2022-07-25T13:02:03Z
dc.date.issued	2022-06
dc.identifier.uri	https://repository.ju.edu.et//handle/123456789/7452
dc.description.abstract	Dependency parsing is the task of extracting the relations among words or morphemes, using dependency types to resolve ambiguities between a head and its modifiers. Humans use facial expressions, tone of speech, body language, and other cues to make natural language clearer and more understandable. Unlike humans, machines need well-formed and well-studied language structures for both natural language understanding and generation. This is achieved by developing natural language processing applications. Hence, an Afaan Oromoo dependency parser was developed to resolve ambiguities among Afaan Oromoo morphemes. Even though constituent parsers and universal dependency parsers exist, they do not handle morpheme information effectively. Among dependency parsing approaches, the data-driven approach was selected in order to capture morpheme and word-order features in Afaan Oromoo. Within the data-driven approach, a transition-based system was selected because it is simpler and faster than graph-based dependency parsers. In particular, the arc-standard algorithm is used to generate an unlabeled dependency graph. The Afaan Oromoo dependency parser consists of two sub-models that work independently. The first predicts the transition and then generates an unlabeled dependency graph (tree). The second predicts the relation types and generates a labeled dependency graph. An RNN was selected to handle sequences of Afaan Oromoo morphemes and extract the language patterns. The treebank was constructed from 500 sentences. In the first model, 3480 and 1740 configuration instances were used as training and test data, respectively. In the second model, 1000 and 415 head-dependent pairs were used for training and testing, respectively. LSTM and BiLSTM were both evaluated, and the BiLSTM showed better accuracy for classifying both transitions and relations.
The first model achieves an accuracy of 90% using BiLSTM and 89% using LSTM. The second model scores 71% with BiLSTM and 69% with LSTM. Additionally, using BiLSTM the parser reaches an unlabeled attachment score (UAS) of 60% and a labeled attachment score (LAS) of 40%. To sum up, the performance of the deep learning models improves with corpus size, and increasing the number of dependency labels improves the disambiguation between morphemes.	en_US
dc.language.iso	en_US	en_US
dc.subject	Transition predictor, Relation predictor, Root	en_US
dc.title	Afaan Oromoo Dependency Parser Using RNN	en_US
dc.type	Thesis	en_US
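
The abstract refers to arc-standard transitions and to UAS/LAS scoring without showing either. The following is a minimal illustrative sketch in Python, not the thesis implementation: the function names, the three-token toy sentence, and the UD-style labels (`nsubj`, `obj`, `root`) are all assumptions made for the example, and the transition sequence is supplied by hand rather than predicted by an LSTM or BiLSTM.

```python
# Illustrative sketch only: names, toy data, and labels are assumptions,
# not identifiers or annotations taken from the thesis.

def parse(n_tokens, transitions):
    """Apply arc-standard transitions to tokens 1..n_tokens (0 is ROOT).

    Returns a dict mapping each dependent index to its head index.
    No precondition checks are performed on the transitions.
    """
    stack = [0]
    buffer = list(range(1, n_tokens + 1))
    heads = {}
    for action in transitions:
        if action == "SHIFT":
            # Move the next buffer token onto the stack.
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":
            # Second-from-top becomes a dependent of the top item.
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif action == "RIGHT-ARC":
            # Top item becomes a dependent of the second-from-top.
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {action}")
    return heads


def attachment_scores(gold, predicted):
    """gold/predicted map dependent -> (head, label). Returns (UAS, LAS)."""
    total = len(gold)
    uas_hits = sum(
        1 for dep, (head, _) in gold.items()
        if predicted.get(dep, (None, None))[0] == head
    )
    las_hits = sum(1 for dep, arc in gold.items() if predicted.get(dep) == arc)
    return uas_hits / total, las_hits / total


# Toy verb-final sentence: tokens 1 and 2 both depend on token 3.
toy_transitions = ["SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC", "RIGHT-ARC"]
toy_heads = parse(3, toy_transitions)

# Hypothetical gold and predicted labeled arcs for the same sentence.
toy_gold = {1: (3, "nsubj"), 2: (3, "obj"), 3: (0, "root")}
toy_pred = {1: (3, "nsubj"), 2: (3, "nsubj"), 3: (0, "root")}
toy_uas, toy_las = attachment_scores(toy_gold, toy_pred)
```

In this toy case every head is correct but one label is wrong, so UAS is higher than LAS. This is the same kind of gap as the one reported in the abstract, where a separate relation predictor labels the arcs produced by the transition predictor.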