Jimma University Open access Institutional Repository

Investigating the possibility of developing Tigrinya language interface to database

dc.contributor.author Hagos Hailemaryam
dc.contributor.author Getachew Mamo
dc.contributor.author Teferi Kebebew
dc.date.accessioned 2021-02-11T07:27:51Z
dc.date.available 2021-02-11T07:27:51Z
dc.date.issued 2020-01
dc.identifier.uri https://repository.ju.edu.et//handle/123456789/5529
dc.description.abstract Nowadays, many organizations use databases whose contents are written in a local language to manage their work. These databases hold large amounts of information in electronic form, and users are expected to know SQL to access and manipulate it. Using SQL on information written in a local language is, moreover, difficult and tedious. It is therefore preferable to let users access and manipulate database contents in natural language, which is simple and comfortable and makes it easy to form queries with conjunctions and negations. Because natural language is simple and comfortable for ordinary users, much research has been carried out on natural language interfaces since 1970. For these reasons, a Tigrinya language interface to database (TLIDB) was proposed. With the developed TLIDB prototype, database contents were accessed and manipulated without knowledge of SQL. To achieve this, the input Tigrinya sentences were first translated into the corresponding SQL statements, and the SQL statements were then executed against the database. The TLIDB was designed and developed using neural machine translation, a robust and effective approach. An encoder-decoder long short-term memory (LSTM) network, a technique well suited to sequence-to-sequence problems, was used to translate each input Tigrinya sentence into the corresponding SQL statement. Word embeddings were also used to estimate the similarity of words and to obtain a dense representation, which solved the sparse-data problem found in traditional approaches. The developed TLIDB prototype was evaluated on a healthcare database containing patients, diseases and employees tables. The disease records were prepared with health professionals.
The prototype handles list queries, conditional queries, aggregate functions, complex queries (join, union), update, delete and other operations. To develop the prototype, 6338 sentences were prepared with their corresponding SQL statements. The model was trained on 80% of the dataset and tested on the remaining 20%, using the percentage split evaluation technique, in which the model is evaluated on data that were not included in training. In this evaluation the model scored an overall accuracy above 98.5%. en_US
dc.language.iso en en_US
dc.subject TLIDB en_US
dc.subject Natural Language Interface to Database en_US
dc.subject Natural Language Processing en_US
dc.title Investigating the possibility of developing Tigrinya language interface to database en_US
dc.type Thesis en_US
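
The percentage split evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function names, the fixed random seed, the placeholder sentence/SQL pairs and the whitespace-normalized exact-match accuracy are all assumptions made for illustration, since the record does not say how the split was drawn or how accuracy was computed.

```python
import random


def percentage_split(pairs, train_fraction=0.8, seed=0):
    """Shuffle (sentence, sql) pairs and split them into train and test sets.

    Hypothetical helper: the seed and the shuffling step are assumptions.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def exact_match_accuracy(predicted, reference):
    """Share of predicted SQL strings equal to the reference.

    Assumed metric: strings are compared after collapsing whitespace.
    """
    def normalize(sql):
        return " ".join(sql.split())

    matches = sum(normalize(p) == normalize(r) for p, r in zip(predicted, reference))
    return matches / len(reference)


# Placeholder data standing in for the 6338 Tigrinya sentence / SQL pairs.
pairs = [(f"sentence {i}", f"SELECT {i}") for i in range(6338)]
train, test = percentage_split(pairs)
print(len(train), len(test))
```

With 6338 pairs and an 80% training fraction, the split leaves 5070 pairs for training and 1268 held-out pairs for testing, so accuracy is measured only on sentences the model has not seen.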

