Context based spell checker for Amharic

Nigusu Yitayal; Getachew Mamo; Teferi Kebebew

URI: http://10.140.5.162//handle/123456789/4379

Date: 2016-06

Abstract:

Developing language applications or localizing software is a resource-intensive task that requires the active participation of stakeholders with various backgrounds. Spell checking is one significant application of computational linguistics. It is the process of detecting incorrectly spelled words in a text and, in some cases, suggesting corrections for them. The volume of text in local languages is also growing fast, which calls for text-processing tools that support those languages. A spell checker is therefore vital for detecting and correcting spelling errors in under-resourced languages like Amharic. This thesis describes the development, implementation and testing of a model that detects and corrects non-word and real-word typing errors made by writers of Amharic. The aim of the study is to develop a context-based spell checker and corrector for Amharic that relies on the spelling error patterns of the language and on the sequence of words in the input sentence. Training and testing data sets were collected from various sources covering different topics to keep the corpus balanced and inclusive. The texts were prepared and cleaned manually to remove content that is not needed for detection and correction, such as numbers and punctuation. An experimental research design was used to evaluate the performance of the prototype system. In the experiment, 10,000 sentences were used to train the model and 500 sentences to test it. According to the experimental results, the spell checker classifies Amharic words with a prediction accuracy of 95.62%, a lexical recall of 95.52% and a lexical precision of 35.18% for non-word spelling errors. For real-word errors, the context-sensitive spell checker scored a prediction accuracy of 64.93%, a lexical recall of 63.42% and an error precision of 5.49%.
Finally, because a comprehensive spell checker has to detect errors and then resolve and rank correction candidates using complementary contextual and linguistic knowledge, we plan to extend the coverage of the system with more syntactic and semantic knowledge, using rule-based approaches, to improve the quality of the developed system.
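
The abstract does not spell out the model, so the following is only a minimal sketch of one common way to use word-sequence context for non-word correction, not the thesis's actual method. It assumes a dictionary built from the training sentences, candidates within edit distance 1, and add-one smoothed bigram counts for ranking; the names `train`, `correct` and `edit_distance`, and the toy English sentences standing in for Amharic text, are all hypothetical. Real-word errors are not handled here.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]

def train(sentences):
    """Count unigrams and bigrams, with a sentence-start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def correct(sentence, unigrams, bigrams, max_dist=1):
    """Replace out-of-vocabulary tokens with the candidate that best fits the previous word."""
    vocab = {w for w in unigrams if w != "<s>"}
    out, prev = [], "<s>"
    for tok in sentence.split():
        if tok not in vocab:  # non-word error: token is not in the dictionary
            candidates = [w for w in vocab if edit_distance(tok, w) <= max_dist]
            if candidates:
                # Add-one smoothed estimate of P(candidate | previous word)
                tok = max(candidates,
                          key=lambda w: (bigrams[(prev, w)] + 1)
                                        / (unigrams[prev] + len(vocab)))
        out.append(tok)
        prev = tok
    return " ".join(out)

# Toy training data (placeholder for a real Amharic corpus)
unigrams, bigrams = train(["the cat sat", "the cat ran", "a car ran"])
fixed_1 = correct("the caz ran", unigrams, bigrams)  # "the cat ran"
fixed_2 = correct("a caz ran", unigrams, bigrams)    # "a car ran"
```

The same misspelling is resolved differently depending on the preceding word, which is the point of using context: both "cat" and "car" are one edit away from "caz", and only the bigram counts separate them.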
