Overview and language modeling:
Overview: Origins and Challenges of NLP, Language and Grammar, Processing Indian Languages, NLP Applications, Information Retrieval. Language Modeling: Various Grammar-based Language Models, Statistical Language Model.
Word level and syntactic analysis:
Word Level Analysis: Regular Expressions, Finite-State Automata, Morphological Parsing, Spelling Error Detection and Correction, Words and Word Classes, Part-of-Speech Tagging. Syntactic Analysis: Context-free Grammar, Constituency, Parsing, Probabilistic Parsing.
Extracting Relations from Text: From Word Sequences to Dependency Paths:
Introduction, Subsequence Kernels for Relation Extraction, A Dependency-Path Kernel for Relation Extraction and Experimental Evaluation.
Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles:
Introduction, Domain Knowledge and Knowledge Roles, Frame Semantics and Semantic Role Labeling, Learning to Annotate Cases with Knowledge Roles and Evaluations.
A Case Study in Natural Language Based Web Search:
InFact System Overview, The GlobalSecurity.org Experience.
Evaluating Self-Explanations in iSTART: Word Matching, Latent Semantic Analysis, and Topic Models:
Introduction, iSTART: Feedback Systems, iSTART: Evaluation of Feedback Systems.
Textual Signatures: Identifying Text-Types Using Latent Semantic Analysis to Measure the Cohesion of Text Structures:
Introduction, Cohesion, Coh-Metrix, Approaches to Analyzing Texts, Latent Semantic Analysis, Predictions, Results of Experiments.
Automatic Document Separation: A Combination of Probabilistic Classification and Finite-State Sequence Modeling:
Introduction, Related Work, Data Preparation, Document Separation as a Sequence Mapping Problem, Results.
Evolving Explanatory Novel Patterns for Semantically-Based Text Mining:
Related Work, A Semantically Guided Model for Effective Text Mining.
Information Retrieval and Lexical Resources:
Information Retrieval: Design Features of Information Retrieval Systems, Classical, Non-classical, and Alternative Models of Information Retrieval, Evaluation. Lexical Resources: WordNet, FrameNet, Stemmers, POS Tagger, Research Corpora.
Course outcomes:
The students should be able to:
• Analyze natural language text.
• Generate natural language.
• Perform text mining.
• Apply information retrieval techniques.
Question paper pattern:
Text Books:
1. Tanveer Siddiqui, U.S. Tiwary, “Natural Language Processing and Information Retrieval”, Oxford University Press, 2008.
2. Anne Kao and Stephen R. Poteet (Eds), “Natural Language Processing and Text Mining”, Springer-Verlag London Limited, 2007.
Reference Books:
1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition”, 2nd Edition, Prentice Hall, 2008.
2. James Allen, “Natural Language Understanding”, 2nd Edition, Benjamin/Cummings Publishing Company, 1995.
3. Gerald J. Kowalski and Mark T. Maybury, “Information Storage and Retrieval Systems”, Kluwer Academic Publishers, 2000.