Automatic Generation of Multiple Choice Questions using Surface-based Semantic Relations
Naveed Afzal
Pages - 26 - 44     |    Revised - 31-08-2015     |    Published - 30-09-2015
Volume - 6   Issue - 3    |    Publication Date - September 2015
E-Learning, Automatic Assessment, Educational Assessment, Natural Language Processing, Information Extraction, Unsupervised Relation Extraction, Multiple Choice Questions Generation, Biomedical Domain.
Multiple Choice Questions (MCQs) are a popular large-scale assessment tool. MCQs make it much easier for test-takers to take tests and for examiners to interpret the results; however, they are very expensive to compile manually, and they often need to be produced on a large scale and within short iterative cycles. We examine the problem of automated MCQ generation with the help of unsupervised Relation Extraction, a technique used in a number of related Natural Language Processing problems. Unsupervised Relation Extraction aims to identify the most important named entities and terminology in a document and then recognize semantic relations between them, without any prior knowledge of the semantic types of the relations or their specific linguistic realization. We investigated a number of relation extraction patterns and tested a number of assumptions about the linguistic expression of semantic relations between named entities. Our findings indicate that an optimized configuration of our MCQ generation system can achieve high precision, which is much more important than recall in the automatic generation of MCQs. Enhancing the system with linguistic knowledge further helps to produce significantly better patterns. We also carried out a user-centric evaluation of the system, in which domain experts from the biomedical domain evaluated automatically generated MCQ items in terms of readability, usefulness of semantic relations, relevance, acceptability of questions and distractors, and overall MCQ usability. The results of this evaluation allow us to draw conclusions about the utility of the approach in practical e-Learning applications.
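As a rough illustration of the idea summarized in the abstract, the following Python sketch collects surface patterns (the words appearing between two adjacent named entities), ranks them by frequency, and turns a matching sentence into an MCQ item by blanking out one entity. It is not the system described in the paper: the inline entity markup (`<PROTEIN>...</PROTEIN>`), the function names, the gap-length limit, and the same-type distractor selection are all assumptions made for this example, and named entity recognition is assumed to have been done already.

```python
import re
from collections import Counter

# Hypothetical inline markup: <TYPE>text</TYPE> marks an already-recognized entity.
ENTITY = re.compile(r"<(?P<type>[A-Z]+)>(?P<text>[^<]+)</(?P=type)>")


def surface_patterns(sentence, max_gap_words=5):
    """Return (type1, words_between, type2) for each pair of adjacent entities."""
    entities = list(ENTITY.finditer(sentence))
    patterns = []
    for left, right in zip(entities, entities[1:]):
        gap = sentence[left.end():right.start()].strip().lower()
        # Keep only short, non-empty gaps (an assumed heuristic).
        if gap and len(gap.split()) <= max_gap_words:
            patterns.append((left["type"], gap, right["type"]))
    return patterns


def rank_patterns(sentences):
    """Count how often each surface pattern occurs across the sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(surface_patterns(sentence))
    return counts


def make_mcq(sentence, entity_pool, n_distractors=3):
    """Blank out the last entity and pick same-type distractors from entity_pool."""
    entities = list(ENTITY.finditer(sentence))
    if not entities:
        return None
    answer = entities[-1]
    stem = sentence[:answer.start()] + "_____" + sentence[answer.end():]
    stem = ENTITY.sub(lambda m: m["text"], stem)  # strip the remaining markup
    in_sentence = {m["text"] for m in entities}
    # Distractors: entities of the same type that do not occur in the sentence.
    candidates = [e for e in entity_pool.get(answer["type"], []) if e not in in_sentence]
    return {
        "stem": stem,
        "answer": answer["text"],
        "distractors": candidates[:n_distractors],
    }


sentences = [
    "<PROTEIN>Mdm2</PROTEIN> binds to <PROTEIN>p53</PROTEIN>.",
    "<PROTEIN>Cyclin D1</PROTEIN> binds to <PROTEIN>CDK4</PROTEIN>.",
    "<PROTEIN>p21</PROTEIN> inhibits <PROTEIN>CDK2</PROTEIN>.",
]
pool = {"PROTEIN": ["p53", "BRCA1", "Mdm2", "AKT1"]}

ranked = rank_patterns(sentences)
item = make_mcq(sentences[0], pool)
```

In this sketch a frequency threshold on `ranked` would act as a crude precision filter: keeping only patterns seen several times trades recall for precision, in line with the abstract's point that precision matters more than recall when generating questions.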
CITED BY (1)  
1 Afzal, N., & Bawakid, A. (2015). Comparison between Surface-based and Dependency-based Relation Extraction Approaches for Automatic Generation of Multiple-Choice Questions.
Dr. Naveed Afzal
Mayo Clinic - United States of America