Abstract:
This thesis presents research on a probabilistic information retrieval (IR) system for Afan
Oromo text. The primary purpose of an information retrieval system is to retrieve all the
documents that are relevant to the user query. Information retrieval is no longer an optional
technology; it is important to everyone and has become a necessity. As a considerable amount of
information is being produced in Afan Oromo rapidly and continuously, experimenting on the
applicability of an information retrieval system for Afan Oromo is important. The main objective of
this study is to design a prototype architecture of an Afan Oromo text retrieval system based on
a probabilistic model in order to increase its effectiveness in retrieving relevant documents as per
the user's information need. A probabilistic retrieval model capable of reweighting
query terms based on relevance feedback was used, and the potential of the model was
investigated. The study presents the design and implementation of a probabilistic model for Afan
Oromo free-text documents. Both indexing and searching modules were constructed, and text
operations were applied in both modules. The retrieval system was then tested using two
hundred (200) Afan Oromo free-text documents and ten (10) queries. Other types of documents
like video, images and audio were not included. The system prototype was developed in the
Python 3.6.5 programming language. The experimental results show that the
probabilistic IR system for Afan Oromo free-text documents returned encouraging results.
After user relevance feedback, the system registered an average precision, recall and
F-measure of 60%, 91.56% and 72.5%, respectively. This result was achieved without controlling the
problems of synonymy and polysemy that exist in Afan Oromo text. Though the
performance of the system is greatly affected by word variants, the result obtained is
encouraging. It can be concluded that when terms are added to the user query and user
relevance feedback is applied, the performance of the retrieval system increases. It is
recommended that further research be done to assess the retrieval effectiveness of an Afan
Oromo IR system using other probabilistic models such as Bayesian networks, Bayesian belief
networks, and Bayesian inference network models.
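
The two quantitative ideas in this abstract, reweighting query terms from relevance feedback and summarising results with the F-measure, can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not name the exact weighting formula, so the Robertson/Sparck Jones relevance weight is assumed here as one common choice for probabilistic retrieval, and the function names `rsj_weight` and `f_measure` as well as the term counts in the usage lines are hypothetical. Only the collection size (200) and the reported precision and recall come from the text above.

```python
import math


def rsj_weight(N, R, n, r):
    """Robertson/Sparck Jones relevance weight for one query term.

    Assumed formula, not confirmed by the thesis abstract.
    N: documents in the collection
    R: documents the user judged relevant
    n: documents that contain the term
    r: judged-relevant documents that contain the term
    The 0.5 constants are the usual smoothing that avoids division by zero.
    """
    if not (0 <= r <= R <= N and r <= n <= N and N - n - R + r >= 0):
        raise ValueError("inconsistent document counts")
    odds_in_relevant = (r + 0.5) / (R - r + 0.5)
    odds_in_non_relevant = (n - r + 0.5) / (N - n - R + r + 0.5)
    return math.log(odds_in_relevant / odds_in_non_relevant)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: 200 documents, 10 judged relevant, term in 20 documents.
    strong = rsj_weight(N=200, R=10, n=20, r=8)  # term appears in most relevant documents
    weak = rsj_weight(N=200, R=10, n=20, r=1)    # term appears in few relevant documents
    print(strong > weak)

    # Precision and recall as reported in the abstract.
    print(round(100 * f_measure(0.60, 0.9156), 1))  # 72.5
```

A term that occurs mostly in the documents the user marked relevant receives a larger weight after feedback, which is the reweighting effect the abstract describes. The F-measure computed from the reported average precision and recall agrees with the reported 72.5%; in general, an F-measure averaged per query need not equal the F-measure of the averaged precision and recall.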