Jimma University Open access Institutional Repository

Offensive And Hate Speech Detection For Amharic Language On Social Media Using Deep Learning Algorithm

dc.contributor.author Miftah Adem
dc.contributor.author Kinde Anlay
dc.contributor.author Fetulhak Abdurahman
dc.date.accessioned 2022-03-11T11:44:11Z
dc.date.available 2022-03-11T11:44:11Z
dc.date.issued 2021-06
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/6692
dc.description.abstract The number of social media users is increasing rapidly, both worldwide and in Ethiopia, and social media has become an essential communication tool whose use has grown tremendously in recent years. This growth also opens the door to trolls who poison these platforms with offensive language and hate speech directed at others. To address this problem, this research proposes offensive and hate speech detection for Amharic text using a deep learning model. Data were collected from public Facebook and YouTube pages and manually labeled as hate speech (including its target), offensive language, or not hate speech. The final dataset consists of 10,125 posts and comments. Deep learning models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have recently been applied to offensive and hate speech detection with impressive results. CNNs are good at extracting local information but are weaker at capturing context. RNNs, on the other hand, can extract context dependencies and classify well, but they take a long time to train. In this research, we used a combined CNN-RNN structure to exploit the strengths of both CNN and RNN. The convolution layer extracts local features, and the GRU layer uses the sequence of those features to learn about the input. The feature maps extracted and learned by the CNN and GRU are passed to SoftMax and to machine learning classifiers such as SVM and RF to generate the final classification. We used word2vec and Fasttext word embeddings with CBOW and skip-gram model architectures to represent words as vectors.
The best results were obtained with Fasttext (skip-gram) embeddings and the CNN-GRU-SVM model, which achieved an accuracy of 95.56%, a precision of 95.33%, a recall of 95.44%, and an F1 score of 95.37% when classifying comments and posts into religious hate speech, ethnic hate speech, offensive language, and not hate speech. However, the models tend to misclassify offensive language as not hate speech. In general, replacing the SoftMax layer with an SVM classifier achieves good performance for offensive and hate speech detection, including the target of hate speech, for the Amharic language. en_US
dc.language.iso en_US en_US
dc.subject Hate speech, Offensive language, word2vec, Fasttext, Deep learning, Amharic text en_US
dc.title Offensive And Hate Speech Detection For Amharic Language On Social Media Using Deep Learning Algorithm en_US
dc.type Thesis en_US
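
The abstract reports accuracy, precision, recall, and F1 score over four classes. The record does not say how the per-class scores were averaged, so the following is only a minimal sketch that assumes macro-averaging (an unweighted mean over classes). The label names and the toy predictions are hypothetical and are not taken from the thesis data; the toy example merely mirrors the reported tendency to confuse offensive language with the not hate speech class.

```python
from collections import Counter

# Hypothetical label names for the four classes named in the abstract.
LABELS = ["religious_hate", "ethnic_hate", "offensive", "not_hate"]


def macro_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro-averaging; the source does not state the scheme used.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy data (not from the thesis): one offensive comment is predicted as not_hate.
y_true = ["religious_hate", "religious_hate", "ethnic_hate", "ethnic_hate",
          "offensive", "offensive", "not_hate", "not_hate"]
y_pred = ["religious_hate", "religious_hate", "ethnic_hate", "ethnic_hate",
          "offensive", "not_hate", "not_hate", "not_hate"]

metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

In this toy case the single confusion lowers recall for the offensive class and precision for the not hate speech class, which is why the four summary numbers can differ from one another even when most predictions are correct.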

