dc.description.abstract |
Artificial Intelligence (AI) has emerged as a transformative force in various domains,
with Natural Language Processing (NLP) playing a pivotal role in enabling machines
to comprehend and generate human language. As AI advances, NLP becomes crucial
for communicating with intelligent systems in diverse languages, including
Amharic. Sentiment Analysis (SA), a subfield of NLP, is particularly valuable for
extracting actionable insights from product reviews, helping organizations
understand consumer sentiment.
This research addresses the challenges of sentiment analysis for Amharic product
reviews, namely the absence of labeled data and the intricacies of the language.
The study focuses on XLNet, an attention-based Transformer model whose permutation
language modeling objective is designed to overcome limitations associated with
masked language modeling.
Motivated by the cultural significance of Amharic as the second most spoken Semitic
language globally, the research leverages XLNet to develop robust sentiment analysis
models. The morphological richness and complex grammar of Amharic pose
challenges, prompting an investigation of XLNet's ability to capture nuanced
sentiment and sensitivity to word order.
The research poses key questions on addressing the lack of labeled data, handling
Amharic linguistic features, tokenizing Amharic pretraining data for XLNet
compatibility, and tailoring preprocessing techniques to the language's unique
characteristics. The overarching goal is to contribute to both Amharic sentiment
analysis and the broader field of NLP.
To meet these objectives, we conducted experiments comparing the performance of
XLNet with other models. The findings underscore the importance of capturing
Amharic nuances, as XLNet consistently outperformed the base cased models.
Augmentation techniques such as random insertion, swapping, and deletion
increased dataset variability. The XLNet model, configured with specific
hyperparameters and trained on the augmented dataset, achieved an accuracy of
98.10% in Amharic sentiment analysis. In a parallel experiment, BERT reached
90.79% accuracy. With both models trained on the same product review dataset and
pretraining data, the custom XLNet model outperformed the BERT model, further
validating its efficacy in Amharic product review sentiment analysis. The
experimental results demonstrate the value of XLNet in handling Amharic product
reviews and its potential to inform business strategies, product refinement, and
global competitiveness. The study addresses a research gap in Amharic SA and
shows the potential of XLNet for advancing NLP applications in morphologically
rich languages. |
en_US |