18CS743 Natural Language Processing syllabus for CS



Module-1 Overview and language modeling 8 hours

Overview and language modeling:

Overview: Origins and challenges of NLP-Language and Grammar-Processing Indian Languages-NLP Applications-Information Retrieval. Language Modeling: Various Grammar-based Language Models-Statistical Language Model.

Textbook 1: Ch. 1,2

RBT: L1, L2, L3

Module-2 Word level and syntactic analysis 8 hours

Word level and syntactic analysis:

Word Level Analysis: Regular Expressions-Finite-State Automata-Morphological Parsing-Spelling Error Detection and Correction-Words and Word Classes-Part-of-Speech Tagging. Syntactic Analysis: Context-free Grammar-Constituency-Parsing-Probabilistic Parsing.

Textbook 1: Ch. 3,4

RBT: L1, L2, L3

Module-3 Extracting Relations from Text: From Word Sequences to Dependency Paths 8 hours

Extracting Relations from Text: From Word Sequences to Dependency Paths:

Introduction, Subsequence Kernels for Relation Extraction, A Dependency-Path Kernel for Relation Extraction and Experimental Evaluation.

Mining Diagnostic Text Reports by Learning to Annotate Knowledge Roles:

Introduction, Domain Knowledge and Knowledge Roles, Frame Semantics and Semantic Role Labeling, Learning to Annotate Cases with Knowledge Roles and Evaluations.

A Case Study in Natural Language Based Web Search:

InFact System Overview, The GlobalSecurity.org Experience.

Textbook 2: Ch. 3,4,5

RBT: L1, L2, L3

Module-4 Evaluating Self-Explanations in iSTART 8 hours

Evaluating Self-Explanations in iSTART:

Word Matching, Latent Semantic Analysis, and Topic Models: Introduction, iSTART: Feedback Systems, iSTART: Evaluation of Feedback Systems.

Textual Signatures: Identifying Text-Types Using Latent Semantic Analysis to Measure the Cohesion of Text Structures:

Introduction, Cohesion, Coh-Metrix, Approaches to Analyzing Texts, Latent Semantic Analysis, Predictions, Results of Experiments.

Automatic Document Separation: A Combination of Probabilistic Classification and Finite-State Sequence Modeling:

Introduction, Related Work, Data Preparation, Document Separation as a Sequence Mapping Problem, Results.

Evolving Explanatory Novel Patterns for Semantically-Based Text Mining:

Related Work, A Semantically Guided Model for Effective Text Mining.

Textbook 2: Ch. 6,7,8,9

RBT: L1, L2, L3

Module-5 INFORMATION RETRIEVAL AND LEXICAL RESOURCES 8 hours

INFORMATION RETRIEVAL AND LEXICAL RESOURCES:

Information Retrieval: Design features of Information Retrieval Systems-Classical, Non-classical, Alternative Models of Information Retrieval-Evaluation. Lexical Resources: WordNet-FrameNet-Stemmers-POS Tagger-Research Corpora.

Textbook 1: Ch. 9,12

RBT: L1, L2, L3

 

Course outcomes:

The students should be able to:

  • Analyze the natural language text.
  • Define the importance of natural language.
  • Understand the concepts of text mining.
  • Illustrate information retrieval techniques.

 

Question paper pattern:

  • The question paper will have ten questions.
  • There will be 2 questions from each module.
  • Each question will have sub-questions covering all the topics under a module.
  • The students will have to answer 5 full questions, selecting one full question from each module.

 

Text Books:

1. Tanveer Siddiqui, U.S. Tiwary, “Natural Language Processing and Information Retrieval”, Oxford University Press, 2008.

2. Anne Kao and Stephen R. Poteet (Eds), “Natural Language Processing and Text Mining”, Springer-Verlag London Limited, 2007.

 

Reference Books:

1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition”, 2nd Edition, Prentice Hall, 2008.

2. James Allen, “Natural Language Understanding”, 2nd Edition, Benjamin/Cummings Publishing Company, 1995.

3. Gerald J. Kowalski and Mark T. Maybury, “Information Storage and Retrieval Systems”, Kluwer Academic Publishers, 2000.

Last Updated: Tuesday, January 24, 2023