Online Hate Speech Detection for Afaan Oromo Using Deep Learning

MESKELE, OLIYAD SEBOKA

URI: https://repository.ju.edu.et//handle/123456789/6169

Date: 2021-07-12

Abstract:

Social networking has become a part of everyday life. People share information, feelings, and emotions on social sites such as Facebook and Twitter. As social networking grows day by day, cyber hate on these sites is also increasing rapidly. Social media, especially Twitter and Facebook, has a very large impact on the success or destruction of a person's image. Many social movements take place on social media, particularly Facebook and Twitter, and they successfully influence users. Alongside well-intentioned movements, there are also movements with the malicious goal of spreading hatred toward others. Hate speech can take any form, such as images, videos, songs, and text. Detecting hate speech is one of the most important steps in limiting its influence on social media. A hate speech detection system helps remove hateful comments or posts that incite people to take part in violent activities, and it enables social media users to communicate without harm. In this research, we present hate speech detection for the Afaan Oromo language to tackle hate speech on social media. To accomplish this, we prepared a dataset of 14,077 labeled instances to train and test our models. The collected dataset was labeled into three classes: strong hate, weak hate, and neutral. We trained three deep learning models: a convolutional neural network (CNN), a bidirectional long short-term memory network (BLSTM), and a hybrid of the two (CNN-BLSTM). We used the same dataset for each model. Additionally, word embeddings were created by applying the word2vec algorithm with a CBOW model to a corpus collected from different social media. We explore the effect of using these pre-trained word embeddings with the models.
Experimental results show that using word embeddings with neural networks improves performance in terms of run time and accuracy. The CNN, BLSTM, and CNN-BLSTM models achieved accuracies of 98.15%, 97.91%, and 97.98%, respectively. These results indicate that the CNN model is better suited to Afaan Oromo hate speech detection than the CNN-BLSTM and BLSTM models.
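
The abstract mentions word2vec with a CBOW model but does not describe its configuration. As a rough illustration only, the sketch below shows how CBOW frames its training examples: each word is predicted from the words around it. The function name `cbow_pairs`, the window size, and the placeholder tokens are assumptions made for this example and are not taken from the thesis.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context_words, target_word) pairs in the CBOW style.

    For each position, the context is up to `window` tokens on each side
    and the target is the token at that position. The window size is an
    assumed value; the abstract does not state the one actually used.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # a one-token sentence has no context to learn from
            pairs.append((context, target))
    return pairs


# Placeholder tokens stand in for a tokenized social media post.
sentence = ["w1", "w2", "w3", "w4"]
for context, target in cbow_pairs(sentence, window=1):
    print(context, "->", target)
```

In a full word2vec implementation, the context vectors are averaged and used to predict the target word, and the learned weights become the word embeddings that are then fed to the CNN, BLSTM, or CNN-BLSTM classifier.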
