
Introduction to Information Retrieval

By Prof. Dwaipayan Roy   |   Indian Institute of Science Education and Research, Kolkata
Learners enrolled: 6423   |  Exam registration: 2679
ABOUT THE COURSE:

This course offers a comprehensive introduction to Information Retrieval (IR), the ‘science behind search engines’ and document retrieval systems. It covers fundamental concepts such as indexing, ranking, retrieval models, evaluation metrics, and relevance feedback. Learners will also be briefly introduced to modern IR applications, including web search, recommender systems, and semantic retrieval techniques. The course balances theory with hands-on components to prepare students for both academic research and industry roles.

PREREQUISITES: Basic knowledge of Java/Python. Knowledge of data structures and algorithms.

INDUSTRY SUPPORT: Information Retrieval is a foundational skill for multiple technology-driven industries, particularly those focusing on search engines, recommender systems, natural language processing, and large-scale data analytics. This course will be valued by companies involved in developing search infrastructure, enterprise solutions, and AI applications. Companies that will recognize the value of this course include:
● Google – for its core work in web search, document retrieval, and question answering systems.
● Microsoft – particularly in Bing, Azure Cognitive Search, and Office 365 search features.
● Amazon – in both product search and AWS services like Amazon Kendra.
Summary
Course Status : Ongoing
Course Type : Elective
Language for course content : English
Duration : 12 weeks
Category :
  • Computer Science and Engineering
Credit Points : 3
Level : Undergraduate/Postgraduate
Start Date : 19 Jan 2026
End Date : 10 Apr 2026
Enrollment Ends : 02 Feb 2026
Exam Registration Ends : 20 Feb 2026
Exam Date : 25 Apr 2026 IST
NCrF Level   : 4.5 — 8.0

Note: This exam date is subject to change based on seat availability. You can check the final exam date on your hall ticket.



Course layout

Week 1: Introduction, Text processing, Document Representation, Tokenization, Term filtering, Term Document Incidence Matrix, Boolean Retrieval, Inverted Index, query processing, optimization, skip pointers

Week 2: Inverted Index, Storing the index, BSBI, SPIMI, Zipf's and Heaps' Law, Dictionary compression, Postings Compression

Week 3: Getting started with PyLucene - Indexing, Lucene Document and Field Options

Week 4: Indexing Wiki Movies, Non-English Text Analysis, Luke - Index viewer, Different options in Indexing, PyLucene Practice Programming

Week 5: Ranked Retrieval, Jaccard Similarity, Term Frequency, Scaling TF, TF-IDF weighting, Inner product, Euclidean Distance and their problem, Cosine Similarity, VSM Algorithm, SMART Notation

Week 6: VSM Problem Solving, Probabilistic Model - Introduction, Probability Ranking Principle, BIM for ranked retrieval, BM1, BM11, BM15, BM25, Dissecting BM25, BM25 vs VSM, BM25 for long queries, BM25F, BM25+, Why BM25 is still relevant?

Week 7: Language Model for Information Retrieval, Unigram Language Model, Estimating Document Language Model, Zero Frequency Problem and Introduction to Smoothing, Jelinek-Mercer and Dirichlet Smoothed Language Model, Comparing Smoothing with IDF and Summary of LM-based Retrieval

Week 8: Using KLD, JSD in Information Retrieval, PyLucene - Retrieval, PyLucene - Various Query Classes - TermQuery, PhraseQuery, TermRangeQuery, Numerical Range Query, PyLucene - Various Query Classes - PrefixQuery, BooleanQuery, WildcardQuery, FuzzyQuery, MatchAllDocsQuery

Week 9: Evaluation - Set-based evaluation metrics, Precision, Recall, F measure, Precision at K, R-Prec, Incorporating Ranking in Precision and Recall, AP, MAP, GMAP, MRR, Graded relevance, nDCG, Hypothesis Testing, Role of Evaluation Forums, Kappa measure

Week 10: Indexing and Retrieval of Benchmark Datasets, Indexing and Retrieval in TREC-like Benchmark Datasets, Evaluation using `trec_eval`, Hypothesis testing in IR, Relevance Feedback - Rocchio, RLM

Week 11: Web Search And Crawler, Shingling, PageRank - Random Surfer's Algorithm, HITS, SEO

Week 12: Learning to Rank, Latent Semantic Indexing, An Introduction to Embeddings - Word, Sentence and Document embeddings, Applications of BERT and LLMs in IR

Books and references

1. Introduction to Information Retrieval. C. Manning, P. Raghavan and H. Schütze. https://nlp.stanford.edu/IR-book/information-retrieval-book.html
2. Information Retrieval: Implementing and Evaluating Search Engines. S. Büttcher, C. L. A. Clarke and G. Cormack.

Instructor bio

Prof. Dwaipayan Roy

Indian Institute of Science Education and Research, Kolkata
Prof. Dwaipayan Roy received his PhD in Computer Science from the Indian Statistical Institute, Kolkata, in June 2019. He is presently an assistant professor at the Indian Institute of Science Education and Research, Kolkata, after working as a post-doctoral researcher at GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany.

Course certificate

The course is free to enroll and learn from. But if you want a certificate, you have to register and write the proctored exam conducted by us in person at any of the designated exam centres.
The exam is optional for a fee of Rs 1000/- (Rupees one thousand only).
Date and Time of Exams: April 25, 2026 Morning session 9am to 12 noon; Afternoon Session 2pm to 5pm.
Registration URL: Announcements will be made when the registration form is open for registrations.
The online registration form has to be filled in and the certification exam fee has to be paid. More details will be made available when the exam registration form is published. If there are any changes, they will be mentioned then.
Please check the form for more details on the cities where the exams will be held, the conditions you agree to when you fill in the form, etc.

CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

Please note that assignments encompass all types (including quizzes, programming tasks, and essay submissions) available in the specific week.

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
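
The scoring rules above can be sketched as a short calculation. This is only an illustrative sketch, not an official NPTEL tool: the function name `certificate_result` is hypothetical, and it assumes that each weekly assignment is scored out of 100 and that an assignment not submitted counts as 0.

```python
def certificate_result(assignment_scores, exam_score):
    """Apply the certificate criteria stated above.

    assignment_scores: per-week assignment scores, each assumed to be out of 100
                       (an unsubmitted assignment is assumed to count as 0).
    exam_score:        proctored exam score out of 100.
    Returns (assignment_component, exam_component, final_score, eligible).
    """
    # Average of the best 8 assignments, scaled to 25 marks.
    best_eight = sorted(assignment_scores, reverse=True)[:8]
    assignment_component = 0.25 * sum(best_eight) / 8

    # Proctored exam score, scaled to 75 marks.
    exam_component = 0.75 * exam_score

    final_score = assignment_component + exam_component

    # Both thresholds must be met; a high final score alone is not enough.
    eligible = assignment_component >= 10 and exam_component >= 30
    return assignment_component, exam_component, final_score, eligible


# A learner who scores 30 on every assignment and 60 in the exam has a final
# score above 40 but misses the assignment threshold, so is not eligible.
print(certificate_result([30] * 12, 60))
```

Under these assumptions, a learner needs an average of at least 40/100 across the best 8 assignments and at least 40/100 in the proctored exam to meet both thresholds.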

Certificate will have your name, photograph and the score in the final exam with the breakup. It will have the logos of NPTEL and IISER Kolkata. It will be e-verifiable at nptel.ac.in/noc.

Only the e-certificate will be made available. Hard copies will not be dispatched.

Once again, thanks for your interest in our online courses and certification. Happy learning.

- NPTEL team