Event and Temporal Information Extraction from Amharic Text

Ephrem Tadesse; Debela Tesfaye; Tesfu Mekonen

dc.contributor.author	Ephrem Tadesse
dc.contributor.author	Debela Tesfaye
dc.contributor.author	Tesfu Mekonen
dc.date.accessioned	2021-02-04T10:54:13Z
dc.date.available	2021-02-04T10:54:13Z
dc.date.issued	2017
dc.identifier.uri	https://repository.ju.edu.et//handle/123456789/5371
dc.description.abstract	The rapid growth of data on the web makes it cumbersome to find relevant information. To tackle this problem, many information extraction (IE) tasks have been studied in the literature. Event and temporal information extraction is one such task: it identifies important events in large collections of text along with their chronological order, answering both what happened in a given situation and when it happened. Unlike other IE tasks such as entity extraction, event and temporal information extraction still needs research, and for Amharic in particular there is no prior work on this IE task. As the first comprehensive work, we designed a model for event and temporal information extraction from Amharic text. The model comprises several components, including common preprocessing, learning and classification, event extraction, and temporal information extraction. To develop the proposed model, we used a different approach for each task. For the event extraction component we used a machine learning classifier, but the classifier fails to detect deverbal events. To address this limitation, in which deverbal entities are missed because of their ambiguity, we used a rule-based approach that relies on syntactic features such as POS tags, a morphological analyzer, and gazetteer lists. In practice, it is difficult to stay within the boundary of a single event extraction method. Because both approaches have advantages and disadvantages, we developed a hybrid approach for event extraction that combines their results to gain the advantages of both the machine learning classifier and the rule-based approach. For the temporal information extraction component, regular expressions and a list of temporal gazetteers are used in combination with some rules. The preprocessing component prepares and normalizes input texts, the event extraction component extracts events, and the temporal information extractor extracts and normalizes temporal expressions. Various experiments were conducted for each approach under different scenarios. The hybrid approach for the event extraction component outperforms the other two approaches, with a precision, recall, and F-measure of 97.7%, 96.3%, and 96.99%, respectively. The rule-based approach for temporal information extraction achieves a precision, recall, and F-measure of 84.6%, 89.7%, and 87.1%, respectively.	en_US
dc.language.iso	en	en_US
dc.subject	Event and Temporal information Extraction from Amharic Text	en_US
dc.subject	Information extraction	en_US
dc.subject	machine learning classifier for event Extraction	en_US
dc.subject	Rule based approach for Event Extraction	en_US
dc.subject	Hybrid approach for event extraction	en_US
dc.subject	Rule based temporal information extraction	en_US
dc.subject	Deverbal entities	en_US
dc.subject	Regular expression	en_US
dc.title	Event and Temporal Information Extraction from Amharic Text	en_US
dc.type	Thesis	en_US