dc.description.abstract |
Word prediction is one of the most widely used approaches for increasing the
communication rate in augmentative and alternative communication. Next-word prediction
entails guessing the word that will follow the text entered so far. A variety of word-sequence
prediction algorithms are available in many languages to help users enter text. Given a
sequence of words drawn from a corpus, the task is to predict the next word with the highest
likelihood of occurrence; it is therefore a predictive modelling problem for language, also
known as language modelling. Word sequence prediction benefits physically challenged
individuals who have difficulty typing, increases typing speed by minimizing keystrokes, aids
in spelling and error detection, and supports speech and handwriting recognition.
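As an illustration of the language-modelling view described above, the short Python sketch below selects the next word with the highest likelihood from a model's output distribution; the toy vocabulary and probabilities are placeholders for illustration only, not data from this work.

import numpy as np

# Toy vocabulary and an illustrative output distribution for some context;
# a trained language model would produce these probabilities.
vocab = ["mana", "barumsaa", "guddaa", "deeme", "dhufe"]
probs = np.array([0.05, 0.55, 0.10, 0.20, 0.10])

# The suggested word is the one with the highest likelihood of occurrence.
predicted = vocab[int(np.argmax(probs))]
print(predicted)  # -> "barumsaa"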
Although Afaan Oromo is one of the most widely spoken and written languages in Ethiopia, no
significant research has been undertaken on word sequence prediction for it. Word sequence
prediction is especially important for Afaan Oromo because its long and short vowels, vowel
combinations, and repeated consonants are entered by pressing the same keys, often together
with special keys, which increases the number of keystrokes required. We therefore designed
and implemented a deep learning-based network model for Afaan Oromo word sequence
prediction. To achieve the objectives, corpus data were collected from different sources and
split into a training set (70% of the dataset) used to train the models and a testing set (30%)
used to evaluate the designed models. To identify the best-performing model for Afaan Oromo
word sequence prediction, we conducted a total of six experiments with advanced RNN
variants, used individually and as hybrids: LSTM, BLSTM, GRU, BGRU, BLSTM-GRU, and
BLSTM-BGRU, all trained on the collected and preprocessed Afaan Oromo dataset with the
same layers and hyperparameters.
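A minimal Keras sketch of this setup is given below; it is not the thesis implementation, and the vocabulary size, sequence length, layer sizes, and random placeholder data are illustrative assumptions. It shows a stacked BLSTM next-word model trained on a 70%/30% split with categorical cross-entropy loss and accuracy, as used in the experiments.

import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

vocab_size, seq_len = 2000, 5                                    # illustrative sizes
X = np.random.randint(1, vocab_size, size=(1000, seq_len))       # placeholder integer-encoded contexts
y = np.eye(vocab_size)[np.random.randint(1, vocab_size, 1000)]   # placeholder one-hot next words

# 70% of the data for training, 30% held out for testing, as in the experiments.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

model = Sequential([
    Embedding(vocab_size, 100),                        # word embeddings
    Bidirectional(LSTM(128, return_sequences=True)),   # stacked BLSTM layers
    Bidirectional(LSTM(128)),
    Dense(vocab_size, activation="softmax"),           # next-word probability distribution
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, validation_data=(X_test, y_test))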
We evaluated the designed models using accuracy and the categorical cross-entropy loss as
evaluation measures. The proposed models were trained and tested on 42,575 Afaan Oromo
sentences, yielding accuracies of 93.5% for LSTM, 83% for GRU, 97.4% for BLSTM, 80.8%
for BGRU, 89.9% for BLSTM-GRU, and 88.9% for BLSTM-BGRU. The experimental results
show that the designed stacked BLSTM network model outperforms all the other models
tested and is therefore recommended for Afaan Oromo word sequence prediction, for which it
yields promising results |
en_US |