Amharic Text To Image Generation Model Using Conditional Generative Adversarial Network

Getnet Zeru; Amanuel Ayde; Elsabet Wedajo

URI: https://repository.ju.edu.et//handle/123456789/9278

Date: 2024-06

Abstract:

Amharic text-to-image generation using a conditional generative adversarial network (CGAN) is a novel concept made possible by advances in deep learning. The aim of this study is to develop a model for Amharic text-to-image generation using the CGAN algorithm. The study used an experimental research design. A total of 2575 images of clothes and shoes were acquired, and the corresponding Amharic texts were written manually. Amharic text preprocessing consisted of stop-word removal, punctuation removal, tokenization, and the creation of word embeddings with Word2Vec. Image preprocessing consisted of noise removal, image segmentation, resizing, normalization, and conversion to NumPy arrays. Of the paired Amharic texts and images, 80% were used to train the generator and discriminator networks for 1000 epochs with a batch size of 32. During training, the generator network achieved 100% accuracy and the discriminator achieved 40–50% accuracy, meaning the discriminator was unable to distinguish the generated images from the real ones. Finally, the trained generator was tested on the testing data, producing fake images that were compared with the real test images. The generator achieved a Fréchet inception distance (FID) score of 4.99e+108 and an inception score of 417.2, which are quantitative measures of the quality of the generated images. These numbers indicate that the images produced by the trained generator are not comparable with the real images. As the testing results show, training both the generator and the discriminator with updated parameter values performs much better than training with the default parameter values. It is possible to develop a perfect model for Amharic text-to-image generation given a sufficiently large dataset, sufficient computational resources, and other variants of CGAN.
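The text preprocessing steps named in the abstract (punctuation removal, tokenization, and stop-word removal) can be illustrated with a minimal sketch. This is not the authors' code: the function name `preprocess`, the tiny stop-word list, and the choice of punctuation marks are assumptions made for illustration only, and the Word2Vec embedding step is left out.

```python
import re

# Ethiopic punctuation (U+1361 to U+1368): wordspace, full stop, comma,
# semicolon, colon, preface colon, question mark, paragraph separator.
ETHIOPIC_PUNCTUATION = "፡።፣፤፥፦፧፨"
ASCII_PUNCTUATION = ".,;:!?\"'()[]{}-"

# Hypothetical, deliberately tiny stop-word list (and, is, on, to).
# A real system would use a much larger, curated list.
STOP_WORDS = {"እና", "ነው", "ላይ", "ወደ"}

# Character class matching any single punctuation mark listed above.
_PUNCT_RE = re.compile(
    "[" + re.escape(ETHIOPIC_PUNCTUATION + ASCII_PUNCTUATION) + "]"
)


def preprocess(text):
    """Return the tokens of an Amharic caption with punctuation
    and stop words removed (illustrative sketch only)."""
    # Replace punctuation with spaces so adjacent words stay separated.
    no_punct = _PUNCT_RE.sub(" ", text)
    # Tokenize on whitespace.
    tokens = no_punct.split()
    # Drop stop words.
    return [tok for tok in tokens if tok not in STOP_WORDS]
```

For example, `preprocess("ቀይ ጫማ ነው።")` ("It is a red shoe.") keeps only the tokens for "red" and "shoe". The resulting token lists are the kind of input a word-embedding model such as Word2Vec would then be trained on.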
