Named Entity Recognition for Telugu Using Conditional Random Field
G.V.S.Raju, B.Srinivasu, S. Viswanadha Raju, Allam Balaram
Pages - 36 - 44     |    Revised - 30-11-2010     |    Published - 20-12-2010
Volume - 1   Issue - 3    |    Publication Date - December 2010
KEYWORDS
Named entity, Conditional Random Field, NER, Telugu
ABSTRACT
Named Entity (NE) recognition is the task of extracting proper nouns and numerical information from documents and classifying them into predefined categories such as person names, organization names, location names, and miscellaneous (dates and others). It is a key technology for Information Extraction, Question Answering systems, Machine Translation, Information Retrieval, etc. This paper reports on the development of an NER system for Telugu using Conditional Random Fields (CRF). Though this state-of-the-art machine learning technique has been widely applied to NER in several well-studied languages, its use for Telugu is very new. The system makes use of the contextual information of words along with a variety of features that help predict the four named entity (NE) classes: person name, location name, organization name, and miscellaneous (dates and others).
Keywords: Named entity, Conditional Random Field, NE, CRF, NER, named entity recognition
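As a rough illustration of what "contextual information of the words along with a variety of features" can mean for a CRF tagger, the sketch below builds one feature dictionary per token from the word itself and a small window of neighbours. It is not the authors' feature set: the function names, the window size of 2, the prefix/suffix length of 3, and the `<BOS>`/`<EOS>` padding markers are all assumptions made for this example, and the sample tokens are romanized placeholders rather than data from the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int, window: int = 2) -> Dict[str, str]:
    """Return a feature dict for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    feats = {
        "word": word,                      # the surface form itself
        "prefix3": word[:3],               # short prefix (assumed length 3)
        "suffix3": word[-3:],              # short suffix (assumed length 3)
        "is_digit": str(word.isdigit()),   # numeric tokens hint at dates etc.
        "length": str(len(word)),
    }
    # Contextual features: neighbouring words inside the window.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        if j < 0:
            feats[key] = "<BOS>"           # padding before sentence start
        elif j >= len(tokens):
            feats[key] = "<EOS>"           # padding after sentence end
        else:
            feats[key] = tokens[j]
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, str]]:
    """Feature dicts for every token in a sentence, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    # Placeholder tokens; a real system would read Telugu text.
    for feats in sentence_features(["Raju", "Hyderabad", "2010"]):
        print(feats)
```

A list of such dictionaries, paired with one label per token (for instance the four NE classes plus an "other" label, which is an assumption here), is the kind of input that CRF toolkits typically train on.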
CITED BY (2)  
1 Kulkarni, S., & Sagar, B. M. (2014). A Survey on Named Entity Recognition for South Indian Languages.
2 Althobaiti, M., Kruschwitz, U., & Poesio, M. (2012, September). Identifying named entities on a university intranet. In Computer Science and Electronic Engineering Conference (CEEC), 2012 4th (pp. 94-99). IEEE.
Professor G.V.S.Raju
IIET - India
letter2raju@gmail.com
Associate Professor B.Srinivasu
IIET - India
S. Viswanadha Raju
- India
Assistant Professor Allam Balaram
- India

