“C’mon – You Should Read This”: Automatic Identification of Tone from Language Text
Lisa Pearl, Mark Steyvers
Pages - 12 - 30     |    Revised - 01-08-2013     |    Published - 30-08-2013
Volume - 4   Issue - 1    |    Publication Date - August 2013
KEYWORDS
Language Text, Mental States, Tone, Game With a Purpose, Information Extraction, Natural Language Processing
ABSTRACT
Information extraction researchers have recently recognized that linguistic features in text can communicate more subtle information beyond the basic semantic content of a message, such as sentiments, emotions, perspectives, and intentions. One way to describe this information is that it represents something about the generator’s mental state, which is often interpreted as the tone of the message. A current technical barrier to developing a general-purpose tone identification system is the lack of reliable training data, that is, messages annotated with their tone. We first describe a method for creating the necessary annotated data using human-based computation, based on interactive games in which humans try to generate and interpret messages conveying different tones. This approach draws on games-with-a-purpose methods from computer science and wisdom-of-the-crowds methods from cognitive science. We then demonstrate the utility of this kind of database and the advantage of human-based computation by examining the performance of two machine learning classifiers trained on the database, each of which uses only shallow linguistic features. Though we already find near-human levels of performance with one classifier, we also suggest more sophisticated linguistic features and alternate implementations for the database that may improve tone identification results further.
CITED BY (1)  
1 Nagarsekar, U., Mhapsekar, A., Kulkarni, P., & Kalbande, D. R. (2013, December). Emotion detection from “the SMS of the internet”. In Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in (pp. 316-321). IEEE.
Professor Lisa Pearl
University of California, Irvine - United States of America
lpearl@uci.edu
Professor Mark Steyvers
University of California, Irvine - United States of America