Overview and language modeling:
Overview: Origins and challenges of NLP, Language and Grammar, Processing Indian Languages, NLP Applications, Information Retrieval. Language Modeling: Various Grammar-based Language Models, Statistical Language Model.
Textbook 1: Ch. 1,2
RBT: L1, L2, L3
Word level and syntactic analysis:
Word Level Analysis: Regular Expressions, Finite-State Automata, Morphological Parsing, Spelling Error Detection and Correction, Words and Word Classes, Part-of-Speech Tagging. Syntactic Analysis: Context-Free Grammar, Constituency, Parsing, Probabilistic Parsing.
Textbook 1: Ch. 3,4
RBT: L1, L2, L3
Extracting Relations from Text: From Word Sequences to Dependency Paths:
Introduction, Subsequence Kernels for Relation Extraction, A Dependency-Path Kernel for Relation Extraction and Experimental Evaluation.
Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles:
Introduction, Domain Knowledge and Knowledge Roles, Frame Semantics and Semantic Role Labeling, Learning to Annotate Cases with Knowledge Roles and Evaluations.
A Case Study in Natural Language Based Web Search:
InFact System Overview, The GlobalSecurity.org Experience.
Textbook 2: Ch. 3,4,5
RBT: L1, L2, L3
Evaluating Self-Explanations in iSTART: Word Matching, Latent Semantic Analysis, and Topic Models:
Introduction, iSTART: Feedback Systems, iSTART: Evaluation of Feedback Systems.
Textual Signatures: Identifying Text-Types Using Latent Semantic Analysis to Measure the Cohesion of Text Structures:
Introduction, Cohesion, Coh-Metrix, Approaches to Analyzing Texts, Latent Semantic Analysis, Predictions, Results of Experiments.
Automatic Document Separation: A Combination of Probabilistic Classification and Finite-State Sequence Modeling:
Introduction, Related Work, Data Preparation, Document Separation as a Sequence Mapping Problem, Results.
Evolving Explanatory Novel Patterns for Semantically-Based Text Mining:
Related Work, A Semantically Guided Model for Effective Text Mining.
Textbook 2: Ch. 6,7,8,9
RBT: L1, L2, L3
Information retrieval and lexical resources:
Information Retrieval: Design Features of Information Retrieval Systems, Classical, Non-classical, and Alternative Models of Information Retrieval, Evaluation. Lexical Resources: WordNet, FrameNet, Stemmers, POS Tagger, Research Corpora.
Textbook 1: Ch. 9,12
RBT: L1, L2, L3
Course outcomes:
The students should be able to:
Question paper pattern:
Text Books:
1. Tanveer Siddiqui, U.S. Tiwary, “Natural Language Processing and Information Retrieval”, Oxford University Press, 2008.
2. Anne Kao and Stephen R. Poteet (Eds), “Natural Language Processing and Text Mining”, Springer-Verlag London Limited, 2007.
Reference Books:
1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition”, 2nd Edition, Prentice Hall, 2008.
2. James Allen, “Natural Language Understanding”, 2nd Edition, Benjamin/Cummings Publishing Company, 1995.
3. Gerald J. Kowalski and Mark T. Maybury, “Information Storage and Retrieval Systems”, Kluwer Academic Publishers, 2000.