Kafi Noonoo Named Entity Recognition Using Ensemble Methods

Getachew, Mintesinot; Beshah, Getachew; Zemene, Gashaw

URI: https://repository.ju.edu.et//handle/123456789/8093

Date: 2022-12-26

Abstract:

This study builds a Named Entity Recognition (NER) system for the Kafi Noonoo language. NER is frequently used in Information Extraction (IE), with the goal of classifying each token of a given sentence into predefined Named Entity categories. We follow an ensemble approach that combines four techniques: a Hidden Markov Model (HMM) machine learning algorithm, a rule-based technique, pattern matching, and a dictionary-based technique. A voting and priority technique selects the final Named Entity from the candidate Named Entities recognized by each model. Using these methods together lets us draw on the strengths of each technique, and the combination increases the efficiency of our NER system. We collected the data from three main sources: Kaffa TV (70% of the data), the Kaffa Zone Administration bureau (11% of the data), and elementary school books (13% of the data). The corpus contains a total of 18,090 words. The experiments were conducted on the Jupyter Notebook computing platform. Unlabeled Kafi Noonoo sentences were given to the system for evaluation. Comparing the output of the proposed model (actual output) with the human-annotated data (expected output) gives a Precision of 87.54%, a Recall of 86.85%, and an F1 measure of 87.19%. Our model is relatively effective at recognizing miscellaneous named entities (date-time, currency, and percentage values). The machine learning and dictionary-based techniques are highly dependent on the training data and the gazetteer, respectively.
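
The voting and priority step described in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `combine`, the tag labels (`PER`, `LOC`, `O`), the technique keys, and above all the priority order are hypothetical, since the abstract does not specify them. The sketch takes a majority vote over the four techniques' predictions for one token and falls back to the priority order only when the vote is tied.

```python
from collections import Counter

# Hypothetical priority order, used only to break ties in the vote.
# The abstract does not state the actual order; this one is an assumption.
PRIORITY = ["rule", "dictionary", "pattern", "hmm"]


def combine(predictions):
    """Pick one tag for a token from per-technique predictions.

    predictions maps a technique name (e.g. "hmm") to the tag that
    technique proposed for the token.
    """
    counts = Counter(predictions.values())
    top = max(counts.values())
    # All tags that received the highest number of votes.
    tied = {tag for tag, n in counts.items() if n == top}
    if len(tied) == 1:
        return next(iter(tied))
    # Tie: take the tag proposed by the highest-priority technique
    # among the tied candidates.
    for name in PRIORITY:
        if predictions.get(name) in tied:
            return predictions[name]
    raise ValueError("no prediction from a known technique")
```

For example, if the HMM and rule-based techniques both propose `PER` while the other two disagree, the vote alone settles the tag; if two techniques propose `PER` and two propose `LOC`, the assumed priority order decides.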
