dc.description.abstract |
The overabundance of digital data has made extracting usable, structured information
a challenge. Information extraction techniques were developed to simplify the
process of retrieving information relevant to a query using AI technology. Named Entity Recognition is an information extraction technique that extracts proper names
from unstructured text. Such systems have been developed for many of the world's languages,
including English, Chinese, Spanish, Amharic, and Afaan Oromo. Hadiyya is a language
spoken in the southern part of Ethiopia with a rapidly increasing number of users.
The major problem for this research is that Hadiyya severely lacks annotated computational data resources. As a result, data collection was difficult in this study.
To address this, data had to be gathered from several sources: the
Hadiyya language FM Radio Station, HTV, Wachamo University’s Hadiyya language
department, and Hossaena Teacher Training College (TTC). For this research, a newly
annotated dataset of 26,098 tokens was prepared. The current study focuses on extracting primary
named entities from unstructured text, namely person, location, organization, time,
and Other, using a deep learning approach in the Keras environment. The research was conducted in Python, which provides a rich set of tools and
frameworks for developing NLP systems. The models evaluated in this study are RNN,
LSTM, and BiGRU, which achieved accuracies of 85%, 91%, and 95.5%, respectively. Of these models, BiGRU performed better than the other two.
However, adding the suffix feature had little effect on this model. Beyond the models built here, the newly prepared dataset can serve as a valuable resource for
future researchers in the area. |
en_US |