Abstract:
Social networking has nowadays become a part of human life. People share their information, feelings,
and emotions through social sites such as Facebook and Twitter. As social networking grows day by
day, cyber hate on these sites is also increasing rapidly. Social media, especially Twitter and
Facebook, have a very big impact on the success or destruction of a person's image. Many social
movements are carried out on social media, particularly Facebook and Twitter, and they successfully
influence users. Alongside well-intentioned movements, there are also movements with malicious goals
that spread hatred toward others. Hate speech can take any form, such as images, videos,
songs, as well as text. Detecting hate speech is one of the most important steps in limiting its
influence on social media. A hate speech detection system helps remove hateful comments or posts
that incite people to participate in violent activities, and it also allows social media users
to communicate without harm. In this research, we present hate speech detection for the Afaan Oromo
language to tackle hate speech on social media. To accomplish this, we prepared a dataset of
14,077 labeled instances to train and test our models. The collected dataset was labeled into three
classes: strong hate, weak hate, and neutral. We trained three different deep learning models: a
convolutional neural network (CNN), a bidirectional long short-term memory neural network (BLSTM),
and a hybrid of the two (CNN-BLSTM). We used the same dataset for each model. Additionally,
word embeddings were created by applying the word2vec algorithm with a CBOW model to a corpus
collected from different social media platforms. We explored the effect of using these pre-trained
word embeddings with the models. Experimental results show that using word embeddings with neural
networks improves performance in terms of both run time and accuracy. The results
achieved by the CNN, BLSTM, and CNN-BLSTM models are 98.15%, 97.91%, and 97.98% accuracy,
respectively. This research indicates that the CNN model is more applicable to Afaan Oromo hate speech
detection than the CNN-BLSTM and BLSTM models.
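
As a rough illustration of the CBOW setup mentioned above, the sketch below shows how CBOW training examples pair a window of context words with the center word to be predicted. The function name `cbow_pairs`, the window size, and the placeholder tokens are assumptions made for illustration and are not taken from this work; an actual pipeline would typically rely on an existing word2vec implementation.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context, target) training examples in the CBOW style.

    For each position, the target is the word at that position and the
    context is up to `window` words on each side of it.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # a one-word sentence has no context, so it is skipped
            pairs.append((context, target))
    return pairs


# Placeholder tokens (hypothetical, not drawn from the paper's corpus).
sentence = ["w1", "w2", "w3", "w4", "w5"]
pairs = cbow_pairs(sentence, window=1)
for context, target in pairs:
    print(context, "->", target)
```

In CBOW, the model combines the context word vectors and is trained to predict the target word; the learned input vectors then serve as the word embeddings fed to the downstream classifiers.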