Deep Learning for Computer Vision

By Prof. Vineeth N Balasubramanian | IIT Hyderabad

Learners enrolled: 7730 | Exam registration: 683

ABOUT THE COURSE :
Computer Vision, the automatic analysis and understanding of images and videos, is central to applications in security, healthcare, entertainment, mobility, and more. The recent success of deep learning methods has revolutionized the field, bringing new developments ever closer to deployments that benefit end users. This course introduces students to traditional computer vision topics before presenting deep learning methods for computer vision. It covers both the basics and recent advancements in these areas, helping students build a solid foundation and become proficient in applying these methods to real-world applications. The course assumes that the student has already completed a full course in machine learning, preferably with some introduction to deep learning, and builds on these topics with a focus on computer vision.

INTENDED AUDIENCE : Senior undergraduate students + post-graduate students

PREREQUISITES :

● Completion of a basic course in Machine Learning
● (Recommended, but not mandatory) Completion of a course in Deep Learning, or exposure to topics in neural networks
● Knowledge of the basics of probability, linear algebra, and calculus
● Programming experience, preferably in Python

If you are unsure whether you meet the background requirements for the course, please look at Assignment 0 (both theory and programming). If you are comfortable solving/following these assignments, you are ready for the course.

INDUSTRY SUPPORT : All companies that use computer vision in their products/services (Microsoft, Google, Facebook, Apple, TCS, Cognizant, L&T, etc.)

Summary

Course Status :	Completed
Course Type :	Elective
Language for course content :	English
Duration :	12 weeks
Category :	Computer Science and Engineering
Credit Points :	3
Level :	Undergraduate/Postgraduate
Start Date :	24 Jul 2023
End Date :	13 Oct 2023
Enrollment Ends :	07 Aug 2023
Exam Registration Ends :	18 Aug 2023
Exam Date :	28 Oct 2023 IST

Note: This exam date is subject to change based on seat availability. You can check the final exam date on your hall ticket.


Course layout

Week 1: Introduction and Overview:
● Course Overview and Motivation; History of Computer Vision; Image Representation; Linear Filtering, Correlation, Convolution; Image in Frequency Domain
● (Optional) Image Formation; Image Sampling
Week 2: Visual Features and Representations:
● Edge Detection; From Edges to Blobs and Corners; Scale Space, Image Pyramids and Filter Bank; SIFT and Variants; Other Feature Spaces
● (Optional) Image Segmentation, Human Visual System
Week 3: Visual Matching:
● Feature Matching; From Points to Images: Bag-of-Words and VLAD Representations; Image Descriptor Matching; From Traditional Vision to Deep Learning
● (Optional) Hough Transform; Pyramid Matching
Week 4: Deep Learning Review:
● Neural Networks: A Review; Feedforward Neural Networks and Backpropagation; Gradient Descent and Variants; Regularization in Neural Networks; Improving Training of Neural Networks
Week 5: Convolutional Neural Networks (CNNs):
● Convolutional Neural Networks: An Introduction; Backpropagation in CNNs; Evolution of CNN Architectures for Image Classification; Recent CNN Architectures; Finetuning in CNNs
Week 6: Visualization and Understanding CNNs:
● Explaining CNNs: Visualization Methods; Early Methods (Visualization of Kernels; Backprop-to-image/Deconvolution Methods); Class Attribution Map Methods (CAM, Grad-CAM, Grad-CAM++, etc.); Going Beyond Explaining CNNs
● (Optional) Explaining CNNs: Recent Methods
Week 7: CNNs for Recognition, Verification, Detection, Segmentation:
● CNNs for Object Detection; CNNs for Segmentation; CNNs for Human Understanding: Faces
● (Optional) CNNs for Human Understanding: Human Pose and Crowd; CNNs for Other Image Tasks
Week 8: Recurrent Neural Networks (RNNs):
● Recurrent Neural Networks: Introduction; Backpropagation in RNNs; LSTMs and GRUs; Video Understanding using CNNs and RNNs
Week 9: Attention Models:
● Attention in Vision Models: An Introduction; Vision and Language: Image Captioning; Self-Attention and Transformers
● (Optional) Beyond Captioning: Visual QA, Visual Dialog; Other Attention Models
Week 10: Deep Generative Models:
● Deep Generative Models: An Introduction; Generative Adversarial Networks; Variational Autoencoders; Combining VAEs and GANs
● (Optional) Beyond VAEs and GANs: Other Deep Generative Models
Week 11: Variants and Applications of Generative Models in Vision:
● GAN Improvements; Deep Generative Models across Multiple Domains; Deep Generative Models: Image Applications
● (Optional) VAEs and Disentanglement; Deep Generative Models: Video Applications
Week 12: Recent Trends:
● Few-shot and Zero-shot Learning; Self-Supervised Learning; Adversarial Robustness; Course Conclusion
● (Optional) Pruning and Model Compression; Neural Architecture Search

Books and references

Deep learning is a rapidly evolving field, so we will draw on multiple reference sources, including books, blogs and articles. The relevant sources will be pointed out at the end of each topic.
References for deep learning:
● Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016
● Michael Nielsen, Neural Networks and Deep Learning, 2016
● Yoshua Bengio, Learning Deep Architectures for AI, 2009
References for computer vision:
● Richard Szeliski, Computer Vision: Algorithms and Applications, 2010
● Simon Prince, Computer Vision: Models, Learning, and Inference, 2012
● David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002
Tools
We will use PyTorch for our assignments.
Other useful references:
● Bishop, Christopher. Neural Networks for Pattern Recognition. New York, NY: Oxford University Press, 1995. ISBN: 9780198538646.
● Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006. ISBN 978-0-387-31073-2
● Duda, Richard, Peter Hart, and David Stork. Pattern Classification. 2nd ed. New York, NY: Wiley-Interscience, 2000. ISBN: 9780471056690.
● Mitchell, Tom. Machine Learning. New York, NY: McGraw-Hill, 1997. ISBN: 9780070428072.
● Richard Hartley, Andrew Zisserman, Multiple View Geometry in Computer Vision, 2004.
● David Marr, Vision, 1982.

Instructor bio

Prof. Vineeth N Balasubramanian

IIT Hyderabad

Vineeth N Balasubramanian is an Associate Professor in the Department of Computer Science and Engineering at the Indian Institute of Technology, Hyderabad (IIT-H). He was also the Founding Head of the Department of Artificial Intelligence at IIT-H from 2019 to 2022, and a Fulbright-Nehru Visiting Faculty at Carnegie Mellon University in 2022-23. His research interests include deep learning, machine learning, and computer vision. His research has resulted in over 160 peer-reviewed publications at various international venues, including top-tier venues such as ICML, CVPR, NeurIPS, ICCV, KDD, AAAI, and IEEE TPAMI, with Best Paper Awards at recent venues such as CODS-COMAD 2022 and the CVPR 2021 Workshop on Causality in Vision. He served as a General Chair for ACML 2022, and regularly serves as a Senior PC/Area Chair for conferences such as CVPR, ICCV, AAAI, IJCAI and ECCV. He is a recipient of the Google Research Scholar Award (2021), the NASSCOM AI Gamechanger Award (2022, both Winner and Runner-up), the Teaching Excellence Award at IIT-H (2017 and 2021), and the Research Excellence Award at IIT-H (2022), among others. For more details, please see https://people.iith.ac.in/vineethnb/.

Course certificate

The course is free to enroll in and learn from. If you want a certificate, however, you must register for and write the proctored exam, which we conduct in person at the designated exam centres.
The exam is optional and carries a fee of Rs 1000/- (Rupees one thousand only).
Date and Time of Exams: 28 October 2023; Morning session 9am to 12 noon; Afternoon session 2pm to 5pm.
Registration URL: An announcement will be made when the registration form opens.
To register, you must fill in the online registration form and pay the certification exam fee. More details will be made available when the exam registration form is published. Any changes will be mentioned then.
Please check the form for more details on the cities where the exams will be held, the conditions you agree to when you fill in the form, etc.

CRITERIA TO GET A CERTIFICATE

● Average assignment score = 25%, with 18% from MCQ assignments and 7% from programming assignments
○ MCQ assignment score = 18% of the average of the best 4 assignments out of the total 6 assignments given in the course.
○ Coding assignment score = 7% of the average of the best 3 assignments out of the total 5 assignments given in the course.
● Exam score = 75% of the proctored certification exam score out of 100.
● Final score = Average assignment score + Exam score.

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >= 10/25 AND EXAM SCORE >= 30/75. If either of the two criteria is not met, you will not get the certificate even if the Final score >= 40/100.
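
The scoring rule above can be worked through with a short Python sketch. It is only an illustration, not an official NPTEL calculator: the function names are hypothetical, and it assumes that every assignment and the exam are marked out of 100 and that an unsubmitted assignment counts as 0.

```python
def best_average(scores, keep):
    """Average of the `keep` highest scores.

    Assumption: if fewer than `keep` scores are given, the missing
    ones count as 0, because the sum is always divided by `keep`.
    """
    top = sorted(scores, reverse=True)[:keep]
    return sum(top) / keep


def certificate_result(mcq_scores, coding_scores, exam_score):
    """Apply the stated weights: 18% MCQ, 7% coding, 75% exam."""
    mcq_part = 0.18 * best_average(mcq_scores, keep=4)        # best 4 of 6
    coding_part = 0.07 * best_average(coding_scores, keep=3)  # best 3 of 5
    assignment_part = mcq_part + coding_part                  # out of 25
    exam_part = 0.75 * exam_score                             # out of 75
    final = assignment_part + exam_part                       # out of 100
    # Both thresholds must hold; a high final score alone is not enough.
    eligible = assignment_part >= 10 and exam_part >= 30
    return {
        "assignment": assignment_part,
        "exam": exam_part,
        "final": final,
        "eligible": eligible,
    }


if __name__ == "__main__":
    # Hypothetical learner: six MCQ scores, five coding scores, exam 60/100.
    result = certificate_result(
        mcq_scores=[80, 60, 100, 40, 90, 70],
        coding_scores=[50, 100, 0, 75, 25],
        exam_score=60,
    )
    for key, value in result.items():
        print(key, round(value, 2) if isinstance(value, float) else value)
```

For this hypothetical learner, the best four MCQ scores average 85 and the best three coding scores average 75, giving an assignment score of 15.3 + 5.25 = 20.55 out of 25 and an exam score of 45 out of 75, so the final score is 65.55 and both criteria are met. By contrast, a learner who scores 100 in the exam but submits no assignments reaches 75/100 overall yet is not eligible, because the assignment score is below 10/25.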

The certificate will have your name, photograph and the score in the final exam with the breakup. It will have the logos of NPTEL and IIT Hyderabad. It will be e-verifiable at nptel.ac.in/noc.

Only the e-certificate will be made available. Hard copies will not be dispatched.

Once again, thanks for your interest in our online courses and certification. Happy learning.

- NPTEL team
