
Exploratory Data Analysis for Data Science with R Software (English)

By Prof. Shalabh   |   IIT Kanpur
Learners enrolled: 2143   |  Exam registration: 41
ABOUT THE COURSE:

Any data analysis requires statistical tools. This course describes the basic statistical tools and related concepts used in exploratory data analysis, and explains the use of analytical and graphical tools in data science. Their implementation in the open-source R software is demonstrated with the relevant commands, along with the interpretation of the resulting output.

INTENDED AUDIENCE: All UG students in Mathematics, Engineering, Management and Data Science. 

PREREQUISITES: A mathematics background up to class 10 is needed. Some preliminary knowledge is helpful but not mandatory.

INDUSTRY SUPPORT: All analytics companies and industries that are involved in mathematical and statistical computation, programming and simulation, and that have an R&D setup, will find this course useful.
Summary
Course Status : Upcoming
Course Type : Elective
Language for course content : English
Duration : 12 weeks
Category :
  • Computer Science and Engineering
Credit Points : 3
Level : Undergraduate
Start Date : 19 Jan 2026
End Date : 10 Apr 2026
Enrollment Ends : 26 Jan 2026
Exam Registration Ends : 13 Feb 2026
Exam Date : 25 Apr 2026 IST
NCrF Level   : 4.5 — 8.0

Note: This exam date is subject to change based on seat availability. You can check the final exam date on your hall ticket.


Course layout

Week 1: Introduction to various topics and commands in R software
 
Week 2: Data Preparation, Basic concepts of exploratory statistical data analysis, frequency and frequency distribution, cumulative distribution functions and their use with R software
 
Week 3: Graphical procedures with various graphs in one dimension
 
Week 4: Graphical procedures with various graphs using ggplot2 package
 
Week 5: Measures of central tendency and their use with R software
 
Week 6: Measures of variation and their use with R software
 
Week 7: Moments and their use with R software
 
Week 8: Skewness, Kurtosis, Scaling of data, and Graphs for visualising the association of variables 
 
Week 9: Graphical procedures for association of variables, Analytical procedures for the association of continuous variables, correlation coefficients and their use with R software
 
Week 10: Rank correlation, Association of discrete variables and their use with R software
 
Week 11: Fitting of linear models, Handling text data and their use with R software
 
Week 12: Analysis of text data, Selection of samples and simple random sampling, Multivariate exploratory data analysis 

Books and references

  • Introduction to Statistics and Data Analysis: With Exercises, Solutions and Applications in R, by Christian Heumann, Michael Schomaker and Shalabh, Springer, 2022.
  • Modern Data Science with R, by Benjamin S. Baumer, Daniel T. Kaplan and Nicholas J. Horton, CRC Press, 2021.
  • Text Mining with R: A Tidy Approach, by Julia Silge and David Robinson, O'Reilly, 2017.

Instructor bio

Prof. Shalabh

IIT Kanpur
Dr. Shalabh is a Professor of Statistics at IIT Kanpur. His research interests are linear models, regression analysis and econometrics. He has about 30 years of experience in teaching and research. He has developed several web-based and MOOC courses for NPTEL and has conducted several workshops on statistics for teachers, researchers and practitioners. He has received several national and international awards and fellowships. He has authored more than 100 research papers in national and international journals. He has written four books, one of which, on linear models, is coauthored with Prof. C.R. Rao. Another seminal book, on statistics with R software, has been downloaded more than 5.5 million times.

Course certificate

The course is free to enroll in and learn from. If you want a certificate, however, you have to register for and write the proctored exam, which we conduct in person at any of the designated exam centres.
The exam is optional and carries a fee of Rs 1000/- (Rupees one thousand only).
Date and Time of Exams: April 25, 2026. Morning session: 9 am to 12 noon; Afternoon session: 2 pm to 5 pm.
Registration URL: An announcement will be made when the registration form opens.
The online registration form has to be filled in and the certification exam fee has to be paid. More details will be made available when the exam registration form is published. If there are any changes, they will be mentioned then.
Please check the form for more details on the cities where the exams will be held, the conditions you agree to when you fill in the form, etc.

CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of the average of the best 8 assignments out of the 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100.

Final score = Average assignment score + Exam score

Please note that assignments encompass all types (including quizzes, programming tasks, and essay submissions) available in the specific week.

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >= 10/25 AND EXAM SCORE >= 30/75. If either of the two criteria is not met, you will not get the certificate even if the Final score >= 40/100.
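
As a rough illustration of the scoring rule above, the following Python sketch computes the two components and checks eligibility. It assumes that every assignment and the exam are marked out of 100 and that unsubmitted assignments count as zero; the function name `certificate_result` and the sample scores are hypothetical and are not part of any NPTEL tool.

```python
# Illustrative sketch of the certificate criteria; not an official NPTEL calculator.
# Assumption: every assignment and the exam are marked out of 100.

BEST_N = 8  # the best 8 of the 12 assignments count


def certificate_result(assignment_scores, exam_score):
    """Return (assignment component /25, exam component /75, final /100, eligible)."""
    # Keep the highest BEST_N scores; with fewer than BEST_N entries,
    # the missing ones effectively count as zero (an assumption).
    best = sorted(assignment_scores, reverse=True)[:BEST_N]
    assignment_component = 0.25 * sum(best) / BEST_N
    exam_component = 0.75 * exam_score
    final = assignment_component + exam_component
    # Both thresholds must hold; the final score alone does not decide eligibility.
    eligible = assignment_component >= 10 and exam_component >= 30
    return assignment_component, exam_component, final, eligible


# Hypothetical learner: twelve weekly scores and an exam score of 38/100.
scores = [80, 60, 0, 90, 70, 0, 50, 100, 40, 0, 85, 75]
print(certificate_result(scores, 38))  # (19.0625, 28.5, 47.5625, False)
```

In this made-up case the final score is 47.5625, which is above 40, yet the learner is not eligible because the exam component (28.5) is below 30/75.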

The certificate will have your name, photograph and the score in the final exam with the break-up. It will have the logos of NPTEL and IIT Kanpur. It will be e-verifiable at nptel.ac.in/noc.

Only the e-certificate will be made available. Hard copies will not be dispatched.

Once again, thanks for your interest in our online courses and certification. Happy learning.

- NPTEL team