OVERVIEW AND LANGUAGE MODELLING:
Overview: Origins and challenges of NLP - Language and Grammar - Processing Indian Languages - NLP Applications - Information Retrieval. Language Modelling: Various Grammar-based Language Models - Statistical Language Model.
WORD LEVEL AND SYNTACTIC ANALYSIS:
Word Level Analysis: Regular Expressions - Finite-State Automata - Morphological Parsing - Spelling Error Detection and Correction - Words and Word Classes - Part-of-Speech Tagging. Syntactic Analysis: Context-free Grammar - Constituency - Parsing - Probabilistic Parsing.
Extracting Relations from Text:
From Word Sequences to Dependency Paths: Introduction, Subsequence Kernels for Relation Extraction, A Dependency-Path Kernel for Relation Extraction and Experimental Evaluation. Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles: Introduction, Domain Knowledge and Knowledge Roles, Frame Semantics and Semantic Role Labelling, Learning to Annotate Cases with Knowledge Roles and Evaluations. A Case Study in Natural Language Based Web Search: InFact System Overview, The GlobalSecurity.org Experience.
Evaluating Self-Explanations in iSTART:
Word Matching, Latent Semantic Analysis, and Topic Models: Introduction, iSTART: Feedback Systems, iSTART: Evaluation of Feedback Systems. Textual Signatures: Identifying Text-Types Using Latent Semantic Analysis to Measure the Cohesion of Text Structures: Introduction, Cohesion, Coh-Metrix, Approaches to Analysing Texts, Latent Semantic Analysis, Predictions, Results of Experiments. Automatic Document Separation: A Combination of Probabilistic Classification and Finite-State Sequence Modelling: Introduction, Related Work, Data Preparation, Document Separation as a Sequence Mapping Problem, Results. Evolving Explanatory Novel Patterns for Semantically Based Text Mining: Related Work, A Semantically Guided Model for Effective Text Mining.
INFORMATION RETRIEVAL AND LEXICAL RESOURCES:
Information Retrieval: Design features of Information Retrieval Systems - Classical, Non-classical, Alternative Models of Information Retrieval - Evaluation. Lexical Resources: WordNet - FrameNet - Stemmers - POS Tagger - Research Corpora.
Course outcomes:
At the end of the course the student will be able to:
Question paper pattern:
The SEE question paper will be set for 100 marks and the marks scored will be proportionately reduced to 60.
Textbook/Textbooks
1. Natural Language Processing and Information Retrieval, Tanveer Siddiqui and U.S. Tiwary, Oxford University Press, 2008
2. Natural Language Processing and Text Mining, Anne Kao and Stephen R. Poteet, Springer-Verlag London Limited, 2007
Reference Books
1. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Daniel Jurafsky and James H. Martin, Prentice Hall, 2nd Edition, 2008
2. Natural Language Understanding, James Allen, Benjamin/Cummings Publishing Company, 2nd Edition, 1995
3. Information Storage and Retrieval Systems, Gerald J. Kowalski and Mark T. Maybury, Kluwer Academic Publishers, 2000
4. Natural Language Processing with Python, Steven Bird, Ewan Klein and Edward Loper, O'Reilly Media, 2009
5. Foundations of Statistical Natural Language Processing, Christopher D. Manning and Hinrich Schütze, MIT Press, 1999