Jimma University Open access Institutional Repository

Automatic Afan Oromo Sentence Identification and Simplification Using Rule Based Approach

dc.contributor.author JUNDA, ABDUREHMAN MAHMUD
dc.date.accessioned 2022-02-02T11:55:26Z
dc.date.available 2022-02-02T11:55:26Z
dc.date.issued 2021-12-20
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/6155
dc.description.abstract In NLP, sentence identification and simplification are needed for machine translation, parsing, question generation, information extraction, summarization, semantic role labeling, opinion mining, and similar tasks. Most of these applications rely on simple sentences produced in a preprocessing step to improve their performance, and sentence simplification also helps a wide range of readers with language difficulties, such as aphasics, children, and adults learning the language (non-native speakers). This study presents a new automatic syntactic sentence identification and simplification system for Afan Oromo, using a rule-based method that operates on POS tags. The work is divided into two tasks. The first is the identification and separation of Afan Oromo declarative sentences into simple, compound, complex, and compound-complex sentences. The second is the simplification of compound sentences into simple, self-contained sentences while preserving the original meaning as much as possible. Sentence identification and separation are performed first to improve the performance of sentence simplification. A recursive algorithm is developed for both sentence identification and simplification, based on the syntactic structure of the sentences. To determine the syntactic structure of a sentence, POS tagging is used as a preprocessing step, and then the sentence type indicators and sentence simplification features are handled. To evaluate the algorithms, a dataset of 480 sentences was collected from an Afan Oromo textbook and annotated with the help of an expert. The sentence identification and compound sentence simplification algorithms are evaluated separately in terms of precision and recall, based on expert judgments. The expert classifies each identified or simplified sentence as correct or incorrect by comparing the system's output with the gold standard produced by the language expert. The evaluation criteria for sentence simplification include the grammaticality and fluency of the simplified sentence and the retention of the original meaning. Sentence identification and compound sentence simplification achieve F-scores of 90% and 84.4%, respectively. The evaluation results suggest that the proposed algorithm is promising as a starting point for a less resource-intensive line of study. en_US
dc.language.iso en_US en_US
dc.subject Afan Oromo Sentences en_US
dc.subject Sentence Identification en_US
dc.subject Syntactic Sentence Simplification en_US
dc.title Automatic Afan Oromo Sentence Identification and Simplification Using Rule Based Approach en_US
dc.type Thesis en_US
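
The abstract reports precision, recall, and F-score computed from expert correct/incorrect judgments. As a minimal sketch of how those three measures relate, assuming the standard balanced F-score (F1) is meant, the snippet below computes them from raw counts. The function name and the counts in the usage line are hypothetical and are not taken from the thesis.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw counts.

    Assumption: the F-score in the abstract is the balanced F1,
    i.e. the harmonic mean of precision and recall.
    """
    predicted = true_positives + false_positives   # items the system output
    expected = true_positives + false_negatives    # items in the gold standard
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / expected if expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(true_positives=8, false_positives=2, false_negatives=2)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

The zero-denominator guards return 0.0 instead of raising an error, which is one common convention for the case where a system produces no output for a class.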

