dc.description.abstract |
This thesis addresses the problem of interpretable semantic textual similarity in the English language.
Given a pair of sentences, the system identifies the chunks in each sentence according to
the gold-standard chunks, aligns corresponding chunks, assigns a degree-of-similarity score, and
predicts the reason for similarity or dissimilarity of each aligned chunk pair.
For this computation, a distributional-hypothesis approach blended with knowledge-based
resources was selected. Latent semantic analysis (LSA) is a purely statistical technique that
leverages word co-occurrence information from a large unlabeled corpus of text; it relies on the
distributional hypothesis that words occurring in similar contexts tend to have similar meanings.
Accordingly, LSA word similarity is computed from a statistical analysis of a preprocessed
Wikipedia corpus and boosted by WordNet and string similarity.
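To make the LSA step concrete, the following minimal sketch computes word similarity from a
truncated SVD of a term-document count matrix with scikit-learn; the toy corpus stands in for
the preprocessed Wikipedia dump, and the function name lsa_sim is an assumption for
illustration.

# Sketch of LSA word similarity: factorize a term-document count matrix
# and compare the resulting low-rank term vectors by cosine similarity.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the cat sat on the mat",
    "the dog sat on the mat",
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell on wall street",
]

vectorizer = CountVectorizer()
term_doc = vectorizer.fit_transform(corpus).T      # terms x documents
svd = TruncatedSVD(n_components=2, random_state=0)
term_vecs = svd.fit_transform(term_doc)            # low-rank term vectors
vocab = vectorizer.vocabulary_                     # term -> row index

def lsa_sim(w1: str, w2: str) -> float:
    v1 = term_vecs[vocab[w1]].reshape(1, -1)
    v2 = term_vecs[vocab[w2]].reshape(1, -1)
    return cosine_similarity(v1, v2)[0, 0]

print(lsa_sim("cat", "dog"))  # high: "cat" and "dog" share contexts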
Furthermore, semantic similarity measures between corresponding chunks are introduced in the
theoretical part, where ten similarity measures were selected and implemented. In the experimental
part we propose five chunk similarity measures inspired by the state-of-the-art measures described
in chapter three. The evaluation covers two runs (Run1 and Run2) on two data sets (Images
and Headlines).
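As one example of how a word-level similarity can be lifted to the chunk level, the sketch below
uses a symmetric best-match average over the tokens of two chunks; this is a common aggregation
scheme shown for illustration only, not necessarily one of the five proposed measures, and
word_sim is a trivial stand-in for the LSA, WordNet and string similarities described above.

# Illustrative chunk similarity: average each token's best match in the
# other chunk, in both directions, then take the mean of the two averages.
from typing import Callable

def word_sim(w1: str, w2: str) -> float:
    return 1.0 if w1 == w2 else 0.0  # stand-in for LSA/WordNet similarity

def chunk_sim(chunk1: str, chunk2: str,
              sim: Callable[[str, str], float] = word_sim) -> float:
    t1, t2 = chunk1.split(), chunk2.split()
    best1 = sum(max(sim(a, b) for b in t2) for a in t1) / len(t1)
    best2 = sum(max(sim(b, a) for a in t1) for b in t2) / len(t2)
    return (best1 + best2) / 2  # symmetric score in [0, 1]

print(chunk_sim("a red car", "the red car"))  # ~0.67 with the toy word_sim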
It can be concluded that the performance of the obtained system is promising, with the best
result achieved by Run1, which depends on POSim. |
en_US |