Jimma University Open access Institutional Repository

Interpretable Semantic Textual Similarity

dc.contributor.author Abdo Ababor
dc.date.accessioned 2021-01-04T08:08:05Z
dc.date.available 2021-01-04T08:08:05Z
dc.date.issued 2017-11
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/4568
dc.description.abstract This thesis addresses the problem of interpretable semantic textual similarity for English. The system takes a pair of sentences, identifies the chunks in each sentence according to the standard gold chunks, aligns corresponding chunks, assigns a similarity score to each aligned pair, and predicts the reason for the similarity or dissimilarity of each aligned pair. For this computation, an approach that blends the distributional hypothesis with knowledge-based methods was selected. Latent semantic analysis (LSA) is a purely statistical technique that leverages word co-occurrence information from a large unlabeled corpus of text. It relies on the distributional hypothesis, which holds that words occurring in similar contexts tend to have similar meanings. LSA word similarity is computed from a statistical analysis of a preprocessed Wikipedia corpus and is boosted by WordNet and string similarity. Semantic similarity measures between corresponding chunks are introduced in the theoretical part, where we selected and implemented 10 similarity measures. In the experimental part we propose five chunk similarity measures inspired by the state-of-the-art measures described in Chapter 3. The evaluation reports two runs (Run1 and Run2) on two data sets (Images and Headlines). We conclude that the performance of the system is promising and that the best result is obtained by Run1, which depends on 𝑃𝑂𝑆𝑖𝑚. en_US
dc.language.iso en en_US
dc.title Interpretable Semantic Textual Similarity en_US
dc.type Thesis en_US
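
The abstract describes word similarity derived from LSA and boosted by string similarity. The following is a minimal sketch of that general idea, not the thesis's implementation: the toy corpus, the function names (`lsa_word_vectors`, `cosine`, `word_similarity`), the blending weight `alpha`, and the use of `difflib` for string similarity are all assumptions made for illustration. The WordNet component mentioned in the abstract is omitted, and a real system would build the vectors from a large corpus such as Wikipedia.

```python
import difflib

import numpy as np


def lsa_word_vectors(docs, k=2):
    """Build k-dimensional LSA word vectors from a list of documents.

    Hypothetical helper: counts word occurrences per document, then
    keeps the top-k singular directions of the word-by-document matrix.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            counts[index[w], j] += 1.0
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    # Each word is represented by its row of U scaled by the singular values.
    return {w: u[index[w], :k] * s[:k] for w in vocab}


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def word_similarity(w1, w2, vectors, alpha=0.8):
    """Blend LSA similarity with string similarity (alpha is an assumed weight)."""
    if w1 == w2:
        return 1.0
    string_sim = difflib.SequenceMatcher(None, w1, w2).ratio()
    if w1 in vectors and w2 in vectors:
        # Clamp to [0, 1] so negative cosines and rounding noise do not leak out.
        lsa_sim = min(1.0, max(0.0, cosine(vectors[w1], vectors[w2])))
        return alpha * lsa_sim + (1.0 - alpha) * string_sim
    # Out-of-vocabulary words fall back to string similarity alone.
    return string_sim


# Toy corpus (an assumption): "cat" and "dog" share contexts, "car" does not.
corpus = [
    "cat dog pet",
    "cat dog animal",
    "car road drive",
    "car road traffic",
]
vectors = lsa_word_vectors(corpus, k=2)
print(word_similarity("cat", "dog", vectors))
print(word_similarity("cat", "car", vectors))
```

In this toy corpus, "cat" and "dog" occur in the same documents, so their LSA vectors coincide and the pair scores higher than "cat" and "car", even though the latter pair is closer as strings. This is the behavior the distributional hypothesis predicts.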

