Jimma University Open access Institutional Repository

Automatic Text Categorization For Afan Oromo News: Machine Learning Approach

dc.contributor.author Mubarak Taha
dc.date.accessioned 2021-01-04T07:11:17Z
dc.date.available 2021-01-04T07:11:17Z
dc.date.issued 2017-01
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/4559
dc.description.abstract Automatic text categorization is a supervised learning task in which category labels are assigned to new documents based on the likelihood suggested by a training set of labeled documents. The world is changing rapidly, and the impact of the technology and communications revolution is greater today than ever. People have long recognized the importance of archiving and finding information, but only with the advent of computers and the progress of information technology has it become possible to store and share large amounts of information, and finding useful information in such collections has become a necessity. Currently, the Oromia Radio and Television Organization categorizes its news items manually in its day-to-day activities, even though it uses a computer system to store and dispatch information through databases whose contents are unorganized. The objective of this research is to apply machine learning techniques to Afan Oromo news text categorization, using the Naïve Bayes, Sequential Minimal Optimization (SMO), and J48 classifier algorithms, and to recommend the best one for the problem at hand. The classifiers are trained and tested on Afan Oromo news items in five classes, collected from the Oromia Radio and Television Organization and the Voice of America Afaan Oromoo program. Before the classifiers are applied, the prepared documents are preprocessed: digits, punctuation marks, and extra characters are removed; compound words are then merged; stop words are removed; and finally the documents are transformed into a term matrix with weighted values for use in categorization. en_US
dc.language.iso en en_US
dc.title Automatic Text Categorization For Afan Oromo News: Machine Learning Approach en_US
dc.type Thesis en_US
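
The preprocessing steps named in the abstract (removing digits and punctuation, dropping stop words, and building a weighted term matrix) can be illustrated with a minimal Python sketch. Everything specific here is an assumption, not the thesis's implementation: the three-word stop list is a placeholder, the weighting is plain TF-IDF (the abstract says only "weighted values"), apostrophes inside words are kept, and the compound-word merging step is omitted because the record does not describe its rules.

```python
import math
import re
from collections import Counter

# Placeholder stop-word list (assumption); the thesis's actual
# Afan Oromo stop-word list is not given in this record.
STOP_WORDS = {"fi", "kan", "akka"}

# A token is a run of letters, optionally joined by apostrophes.
# Digits, underscores, and punctuation never match, so they are dropped.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, strip digits and punctuation, and remove stop words."""
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in stop_words]


def tfidf_matrix(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty docs
        row = [
            (counts[t] / total) * math.log(n_docs / doc_freq[t])
            for t in vocab
        ]
        matrix.append(row)
    return vocab, matrix


if __name__ == "__main__":
    # Two made-up example headlines, for illustration only.
    docs = ["Oduu 2017: Finfinnee fi Jimma!", "Oduu ispoortii"]
    vocab, matrix = tfidf_matrix(docs)
    print(vocab)
    for row in matrix:
        print([round(w, 3) for w in row])
```

Each row of the resulting matrix could then be passed to a classifier such as Naïve Bayes; a term that occurs in every document receives a weight of zero under this weighting, since it does not help separate the classes.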

