NEXT WORD PREDICTION for SILTIGNA LANGUAGE USING RNN

YILMA, BEHREDIN REDI

URI: https://repository.ju.edu.et//handle/123456789/9433

Date: 2024

Abstract:

Natural Language Processing (NLP) is a branch of Artificial Intelligence focused on the analysis and understanding of natural language. One primary application of NLP is next-word prediction, which involves predicting the next word in a sentence by presenting a list of the most likely candidates for that position. Siltigna belongs to the Semitic language group and is spoken by the Silte people in the central Ethiopian region. The language is characterized by unique syntactic and semantic structures and requires specialized models for effective language processing. The lack of a next-word prediction model causes problems for Siltigna users: writing is time-consuming and prone to spelling and other errors, and physically disabled people who have difficulty typing cannot easily use the language to communicate with each other. This study addresses the problem by proposing an approach to next-word prediction for Siltigna based on recurrent neural networks (RNNs). The objective of the study is to investigate the possibility of building a next-word prediction model for the Siltigna language using the RNN algorithm. To achieve this objective, 70,434 sentences were collected from different sources. The corpus was divided into a training set (80%) for training the models and a testing set (20%) for testing the designed model. To find the best-performing model, we ran six experiments using advanced RNN variants, both individually and as hybrids, namely LSTM, BLSTM, GRU, BGRU, BLSTM-GRU, and BLSTM-BGRU, on the collected and preprocessed Siltigna dataset with different layers and hyperparameters. We evaluated the models using accuracy and the categorical cross-entropy loss function. The proposed models were trained and evaluated on Siltigna sentences and achieved scores of 94.4% for BGRU, 92.14% for BLSTM, 88.5% for LSTM, 86.35% for GRU, 83.54% for BLSTM-BGRU, and 81.89% for BLSTM-GRU. The experimental findings show that the BGRU network model outperforms all the other models tested and yields encouraging results; it is therefore recommended for Siltigna next-word prediction tasks.
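
The data preparation and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the context size, the random seed, and the placeholder tokens are all assumptions made for illustration, and no RNN is trained here. It only shows how sentences can be turned into (context, next word) pairs, how an 80/20 split can be made, how the categorical cross-entropy for a single prediction is computed, and how a list of the most likely candidates can be read off a predicted distribution.

```python
import math
import random


def make_pairs(sentences, context_size=2):
    """Turn each sentence into (context words, next word) pairs.

    context_size is an illustrative assumption, not a value from the study.
    """
    pairs = []
    for sentence in sentences:
        words = sentence.split()
        for i in range(1, len(words)):
            context = tuple(words[max(0, i - context_size):i])
            pairs.append((context, words[i]))
    return pairs


def split_80_20(items, seed=0):
    """Shuffle and split items into 80% training and 20% testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def categorical_cross_entropy(predicted, target):
    """Loss for one prediction.

    predicted maps each candidate word to its probability;
    target is the word that actually came next.
    """
    # Clamp to avoid log(0) when the target received no probability mass.
    return -math.log(max(predicted.get(target, 0.0), 1e-12))


def top_k(predicted, k=3):
    """Return the k most likely next-word candidates, most likely first."""
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


# Placeholder tokens stand in for real Siltigna words.
example_pairs = make_pairs(["w1 w2 w3 w4", "w2 w3 w5"])
train_pairs, test_pairs = split_80_20(example_pairs)
```

In a full system, the probability distribution passed to `categorical_cross_entropy` and `top_k` would come from the softmax output of a trained network such as the BGRU model described above.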
