Abstract:
The volume of textual data has increased rapidly in recent years, creating a valuable resource
for extracting and analyzing information. This information must be summarized to retrieve useful
knowledge within a reasonable time. Text summarization has become one of the main research
problems in natural language processing. It is a technique for generating concise summaries
from long texts that focus on the most relevant information while preserving the overall
meaning of the text. In this thesis, we propose a method of generating short abstractive
summaries for Afaan Oromo texts using basic NLP techniques and sequence-to-sequence recurrent
neural network (RNN) models. The dataset was collected from various
Afaan Oromo online resources, including BBC Afaan Oromo news, Kallacha Oromiyaa
newspapers, Fana Afaan Oromo news, Afaan Oromo Watchtower study text, Afaan Oromo Bible
text, Ethiopian News Agency (ENA), Ethiopian Press Agency (EPA), and Jehovah's Witnesses (JW)
publication texts. The dataset was then preprocessed, and the abstractive summaries were
generated using sequence-to-sequence RNN deep learning techniques. To evaluate the proposed
system, we used the well-known ROUGE metric. The performance of four summarizers (E1, E2, E3,
and E4) was measured
using ROUGE-1. The average F1-scores obtained were 0.16, 0.24, 0.26, and 0.34,
respectively. E4 achieved the highest score, outperforming the other three
summarizers.
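
As an illustration of the evaluation metric, ROUGE-1 F1 can be understood as the harmonic mean of unigram precision and recall between a generated summary and a reference summary. The Python sketch below assumes lowercase whitespace tokenization and a single reference per summary; the function names are hypothetical, and this is not the evaluation code used in the thesis.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1 from clipped unigram overlap (assumed whitespace tokenization)."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Counter intersection keeps, for each word, the smaller of the two counts,
    # so a repeated word is credited at most as often as it occurs in the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def average_rouge1_f1(pairs):
    """Mean ROUGE-1 F1 over (candidate, reference) pairs; 0.0 for an empty list."""
    scores = [rouge1_f1(candidate, reference) for candidate, reference in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Averaging the per-summary scores over a test set in this way yields a single figure per summarizer, comparable in form to the average F1-scores reported for E1 to E4.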