Jimma University Open Access Institutional Repository

Ethiopian Sign Language Recognition from Video Sequences Using CNN and RNN Models

dc.contributor.author Abdissa Getachew
dc.contributor.author Kinde Anlay
dc.contributor.author Fetulhak A.
dc.date.accessioned 2023-07-04T11:35:32Z
dc.date.available 2023-07-04T11:35:32Z
dc.date.issued 2023-04
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/8241
dc.description.abstract An estimated one to two million people in Ethiopia are deaf or hard of hearing, according to the Ethiopian National Association for the Deaf and a 2019 report on empowering the Deaf in Africa from the Department of Linguistics at Addis Ababa University. For these people, sign language, a visual form of communication used by people with impaired hearing and speech, is the primary means of communication. Although sign language is widely used within the hearing-impaired community, its users struggle to communicate with hearing people because of the language barrier. This communication gap causes many problems in daily life, since hearing-impaired people live among people who communicate in spoken languages and few hearing people know sign language. Interpreters can bridge the gap, but hiring a personal interpreter is expensive and inconvenient when privacy is required. It is therefore important to develop a system that fills the communication gap between hearing-impaired and hearing people.

Many researchers have studied Ethiopian Sign Language recognition, but most work is restricted to word-level (isolated) recognition. A few researchers have attempted sentence-level recognition with various techniques, but their results showed signer dependence and insufficient accuracy. This work therefore proposes Ethiopian Sign Language recognition from video sequences using pretrained CNN and RNN models, recognizing continuous gestures performed by different signers in a video stream. The main focus is a vision-based continuous sign language recognition system that identifies Ethiopian Sign Language gestures from video sequences using CNN and RNN models.

The proposed model comprises three major stages: preprocessing (hand, pose, and face landmark detection with MediaPipe Holistic), feature extraction with a CNN, and feature learning and classification with an LSTM. In the feature extraction stage, characteristic features are extracted through operations such as convolution, pooling, and activation layers; the distinguishing temporal features are then learned in the feature learning stage, for which we applied a Bidirectional Long Short-Term Memory (BiLSTM) model. Hosanna Deaf School and the Jimma Zone Disability Center provided the data for our experiments. The dataset consists of continuous gestures: around 300 videos belonging to 5 gesture categories, performed by different signers. Each video is split into frames with the OpenCV library, and each frame is passed through MediaPipe Holistic for preprocessing. The proposed system is implemented with Keras and TensorFlow on Google Colab.

We present continuous gesture recognition using two neural network architectures. In the first, a CNN is combined with an RNN, with the CNN retrained from the pretrained VGG16 model; in the second, a CNN is followed by a GRU. The two models achieved 85% and 70% accuracy, respectively, in recognizing 5 sentences used in daily communication. Because some gesture signs begin with the same movement and others overlap in the middle, the model becomes confused when recognizing continuous gesture signs. Improving system accuracy will require collecting more data with high-resolution cameras and applying different holistic algorithms. (Illustrative code sketches of the preprocessing and classification pipeline follow this record.) en_US
dc.language.iso en_US en_US
dc.subject Deep Learning, Sign Language, Gesture Recognition, MediaPipe Holistic en_US
dc.title Ethiopian Sign Language Recognition from Video Sequences Using CNN and RNN Models en_US
dc.type Thesis en_US
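
As a hedged illustration of the preprocessing stage described in the abstract, the sketch below uses OpenCV to read a gesture video frame by frame and MediaPipe Holistic to extract pose, face, and hand landmarks from each frame. The file path, the fixed frame cap, and the zero-filling of missing detections are assumptions made for the example, not details taken from the thesis.

```python
import cv2
import mediapipe as mp
import numpy as np

mp_holistic = mp.solutions.holistic


def flatten_landmarks(landmark_list, num_points):
    """Flatten a MediaPipe landmark list to [x1, y1, z1, ...]; zeros if not detected."""
    if landmark_list is None:
        return np.zeros(num_points * 3)
    return np.array([[p.x, p.y, p.z] for p in landmark_list.landmark]).flatten()


def video_to_landmarks(video_path, max_frames=30):
    """Extract per-frame holistic landmark vectors from one gesture video."""
    cap = cv2.VideoCapture(video_path)
    sequence = []
    with mp_holistic.Holistic(static_image_mode=False) as holistic:
        while cap.isOpened() and len(sequence) < max_frames:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            # Concatenate pose (33), face (468), and hand (21 each) landmarks.
            sequence.append(np.concatenate([
                flatten_landmarks(results.pose_landmarks, 33),
                flatten_landmarks(results.face_landmarks, 468),
                flatten_landmarks(results.left_hand_landmarks, 21),
                flatten_landmarks(results.right_hand_landmarks, 21),
            ]))
    cap.release()
    return np.array(sequence)  # shape: (frames_read, 1629)
```

A training set could then be assembled by running video_to_landmarks over each of the roughly 300 gesture videos and pairing the resulting sequences with their sentence labels.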
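The classification stage combines a pretrained CNN with a recurrent network. The minimal Keras sketch below assumes a frozen VGG16 backbone applied to each frame through TimeDistributed, followed by a bidirectional LSTM and a softmax over the 5 gesture classes; the frame count, image size, and LSTM width are illustrative choices, and the abstract does not specify how the MediaPipe output feeds the CNN, so raw frames are assumed here as the CNN input.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

NUM_FRAMES, IMG_SIZE, NUM_CLASSES = 30, 224, 5  # illustrative values

# Frozen VGG16 backbone used as a per-frame feature extractor.
backbone = VGG16(weights="imagenet", include_top=False,
                 input_shape=(IMG_SIZE, IMG_SIZE, 3))
backbone.trainable = False

inputs = layers.Input(shape=(NUM_FRAMES, IMG_SIZE, IMG_SIZE, 3))
x = layers.TimeDistributed(backbone)(inputs)                    # CNN features per frame
x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)  # one vector per frame
x = layers.Bidirectional(layers.LSTM(128))(x)                   # temporal feature learning
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)    # 5 gesture sentences

model = models.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Replacing the Bidirectional LSTM layer with layers.GRU(128) would give the second, GRU-based variant mentioned in the abstract.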

