Afaan Oromoo Dependency Parser Using RNN

Mendasa Tesfa; Getachew Mamo; Mizanu Zelalem

URI: https://repository.ju.edu.et//handle/123456789/7452

Date: 2022-06

Abstract:

Dependency parsing is the task of extracting typed relations among words or morphemes in order to resolve ambiguities between a head and its modifiers. Humans use facial expressions, tone of speech, body language, and other cues to make natural language clearer and more understandable. Unlike humans, machines need well-formed and well-studied language structures for both natural language understanding and generation. This is achieved by developing natural language processing applications. Hence, an Afaan Oromoo dependency parser was developed to resolve ambiguities among Afaan Oromoo morphemes. Even though constituent parsers and universal dependency parsers exist, they do not handle morpheme information effectively. Among dependency parsing approaches, the data-driven approach was selected to capture the morpheme and word-order features of Afaan Oromoo. Within the data-driven approach, a transition-based system was selected because it is simpler and faster than graph-based dependency parsers. In particular, the arc-standard algorithm is used to generate an unlabeled dependency graph. The Afaan Oromoo dependency parser consists of two sub-models that work independently. The first predicts the transition and then generates an unlabeled dependency graph (tree). The second predicts the relation types and generates a labeled dependency graph. An RNN was selected to handle sequences of Afaan Oromoo morphemes and extract the language patterns. The treebank was constructed from 500 sentences. For the first model, 3480 and 1740 configuration instances were used as training and test data, respectively. For the second model, 1000 and 415 head-dependent pairs were used for training and testing, respectively. Both LSTM and BiLSTM were evaluated, and BiLSTM showed better accuracy for classifying both transitions and relations. The first model achieves an accuracy of 90% with BiLSTM and 89% with LSTM. The second model scores 71% with BiLSTM and 69% with LSTM. Additionally, with BiLSTM the parser scores 60% for UAS (unlabeled attachment score) and 40% for LAS (labeled attachment score). In summary, the performance of the deep learning models improves as the corpus size grows, and increasing the number of dependency labels improves disambiguation between morphemes.
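
The arc-standard transition system and the UAS/LAS metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the three-token example, and the labels (nsubj, obj, nmod, root) are hypothetical, and the transition sequence is supplied by hand, whereas in the thesis it is predicted by the LSTM/BiLSTM classifier.

```python
from typing import List, Sequence, Tuple


def arc_standard_parse(n_tokens: int, transitions: Sequence[str]) -> List[int]:
    """Apply arc-standard transitions; return heads[i] for tokens 1..n (0 = ROOT)."""
    stack = [0]                              # index 0 is the artificial ROOT
    buffer = list(range(1, n_tokens + 1))    # token indices still to be read
    heads = [-1] * (n_tokens + 1)            # -1 means "no head assigned yet"
    for t in transitions:
        if t == "SHIFT":
            if not buffer:
                raise ValueError("SHIFT with empty buffer")
            stack.append(buffer.pop(0))
        elif t == "LEFT-ARC":
            # Second-from-top becomes a dependent of the top; ROOT cannot be a dependent.
            if len(stack) < 2 or stack[-2] == 0:
                raise ValueError("invalid LEFT-ARC")
            dependent = stack.pop(-2)
            heads[dependent] = stack[-1]
        elif t == "RIGHT-ARC":
            # Top becomes a dependent of the second-from-top.
            if len(stack) < 2:
                raise ValueError("invalid RIGHT-ARC")
            dependent = stack.pop()
            heads[dependent] = stack[-1]
        else:
            raise ValueError(f"unknown transition: {t}")
    return heads


def attachment_scores(
    gold: Sequence[Tuple[int, str]], pred: Sequence[Tuple[int, str]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) over aligned (head, label) pairs."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    uas_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))   # head correct
    las_hits = sum(g == p for g, p in zip(gold, pred))         # head and label correct
    return uas_hits / len(gold), las_hits / len(gold)


# Hypothetical verb-final sentence of three tokens: tokens 1 and 2 depend on token 3.
heads = arc_standard_parse(
    3, ["SHIFT", "SHIFT", "SHIFT", "LEFT-ARC", "LEFT-ARC", "RIGHT-ARC"]
)

# Hypothetical labels: the second label is deliberately wrong in the prediction.
gold = [(3, "nsubj"), (3, "obj"), (0, "root")]
pred = list(zip(heads[1:], ["nsubj", "nmod", "root"]))
uas, las = attachment_scores(gold, pred)
```

In this example every head is attached correctly but one label is wrong, so UAS is higher than LAS. The same kind of gap appears in the reported scores (60% UAS against 40% LAS), since LAS additionally requires the relation type to be correct.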
