Abstract:
Evaluating students’ capacity to construct a sustained argument through subjective questions allows mentors to assess learners’ implicit understanding. However, manual evaluation of subjective questions is a challenging process and results in grading inconsistency. Since the early 1960s, several approaches have been proposed to automate subjective question marking, with most attention given to essays. Recently, with the advent of deep learning techniques, automatic essay assessment has shown improved results that approach human raters without the need for handcrafted features.
The aim of this study was to build a model that can evaluate both essay and short answer questions without handcrafted features using deep learning techniques. Given an essay or short answer word sequence, our model first embeds word-level context using FastText word vectors together with subword embeddings built by a character-based convolutional neural network.
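To illustrate this embedding step, the following is a minimal PyTorch sketch; the class name WordEmbedder and all hyperparameters are illustrative assumptions, not the exact configuration used in this work:

```python
import torch
import torch.nn as nn

class WordEmbedder(nn.Module):
    """Sketch: concatenate pretrained FastText word vectors with subword
    features from a character-level CNN with max-over-time pooling."""

    def __init__(self, fasttext_weights, n_chars, char_dim=16, n_filters=50):
        super().__init__()
        # fasttext_weights: (vocab_size, d_word) tensor of pretrained vectors.
        self.word_emb = nn.Embedding.from_pretrained(fasttext_weights, freeze=True)
        self.char_emb = nn.Embedding(n_chars, char_dim, padding_idx=0)
        self.char_cnn = nn.Conv1d(char_dim, n_filters, kernel_size=3, padding=1)

    def forward(self, word_ids, char_ids):
        # word_ids: (B, T); char_ids: (B, T, L) character ids per word.
        w = self.word_emb(word_ids)                  # (B, T, d_word)
        B, T, L = char_ids.shape
        c = self.char_emb(char_ids.view(B * T, L))   # (B*T, L, char_dim)
        c = self.char_cnn(c.transpose(1, 2))         # (B*T, n_filters, L)
        c = c.max(dim=2).values.view(B, T, -1)       # max-over-time pooling
        return torch.cat([w, c], dim=-1)             # (B, T, d_word + n_filters)
```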
For essays, the model encodes the embedded essay vectors hierarchically by applying a two-level bidirectional recurrent neural network. We applied hierarchical word- and sentence-level attention to extract the most salient words within each sentence and the most salient sentences within the essay, respectively.
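A minimal sketch of this hierarchical encoder follows, continuing the assumptions above; the BiGRU cell and the additive attention scoring are illustrative choices, and the thesis may differ in the exact recurrent unit and attention form:

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    """Additive attention pooling: score each timestep, softmax the scores,
    and return the attention-weighted sum of the hidden states."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, h):                                    # h: (B, T, dim)
        a = torch.softmax(self.score(h).squeeze(-1), dim=1)  # (B, T)
        return torch.bmm(a.unsqueeze(1), h).squeeze(1)       # (B, dim)

class HierarchicalEncoder(nn.Module):
    """Two-level encoder: a word-level BiGRU + attention builds sentence
    vectors; a sentence-level BiGRU + attention builds the essay vector."""
    def __init__(self, emb_dim, hidden=100):
        super().__init__()
        self.word_rnn = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.word_attn = AttentionPool(2 * hidden)
        self.sent_rnn = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.sent_attn = AttentionPool(2 * hidden)

    def forward(self, x):                            # x: (B, n_sents, n_words, emb_dim)
        B, S, W, D = x.shape
        h, _ = self.word_rnn(x.view(B * S, W, D))
        sents = self.word_attn(h).view(B, S, -1)     # attended sentence vectors
        h, _ = self.sent_rnn(sents)
        return self.sent_attn(h)                     # essay representation (B, 2*hidden)
```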
For short answers, we used the same encoder as for essays on both the model (reference) answer and the student answer vectors. Then, we applied reference attention on the encoded student answer vectors, using the model answer vector as the attention weight. Finally, answer-to-answer attention is applied in both directions, model-to-student and student-to-model, to measure the relatedness between the resulting vector and the encoded model answer.
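The two attention steps for short answers can be sketched as below; the dot-product scoring and the function names reference_attention and answer_to_answer are illustrative assumptions rather than the thesis’s exact formulation:

```python
import torch
import torch.nn.functional as F

def reference_attention(student_h, model_vec):
    """Weight each encoded student-answer position by its similarity to the
    pooled model-answer vector, then pool (dot-product scoring for brevity)."""
    # student_h: (B, T, d); model_vec: (B, d)
    scores = torch.bmm(student_h, model_vec.unsqueeze(-1)).squeeze(-1)  # (B, T)
    alpha = F.softmax(scores, dim=1)
    return torch.bmm(alpha.unsqueeze(1), student_h).squeeze(1)          # (B, d)

def answer_to_answer(student_h, model_h):
    """Bidirectional attention between the encoded answers: the student answer
    attends to the model answer (s2m) and vice versa (m2s)."""
    # student_h: (B, Ts, d); model_h: (B, Tm, d)
    sim = torch.bmm(student_h, model_h.transpose(1, 2))                 # (B, Ts, Tm)
    s2m = torch.bmm(F.softmax(sim, dim=2), model_h)                     # (B, Ts, d)
    m2s = torch.bmm(F.softmax(sim, dim=1).transpose(1, 2), student_h)   # (B, Tm, d)
    return s2m, m2s
```

The matched vectors would then be pooled and fed to a final scoring layer; the exact combination is left to the body of the thesis.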
We evaluated our model on three datasets: the Kaggle essay and short answer English datasets and an Amharic short answer dataset prepared for this thesis work. Experimental results on the Kaggle datasets show that our model achieves state-of-the-art performance for both essays and short answers, improving weighted Kappa by +2 and +4 points, respectively. The experiment on the Amharic dataset shows promising results, achieving 66% Pearson correlation and 62% Kappa on a small-sized dataset. This suggests that our model is capable of evaluating both short answer and essay questions from any domain in a very human-like way if trained on enough data. Our work did not consider subjective questions containing formulas and diagrams; we leave these for future work. We also recommend adding feedback that shows how the model scored an answer and which points the student’s answer missed.