GPU Architectures and Programming

By Prof. Soumyajit Dey   |   IIT Kharagpur
Learners enrolled: 327
The course covers the basics of conventional CPU architectures, their extensions for single instruction multiple data (SIMD) processing, and finally the generalization of this concept in the form of single instruction multiple thread (SIMT) processing as is done in modern GPUs. We cover GPU architecture basics in terms of functional units and then dive into the popular CUDA programming model commonly used for GPU programming. In this context, architecture-specific details such as memory access coalescing, shared memory usage, and GPU thread scheduling, which primarily affect program performance, are also covered in detail. We next switch to a different SIMD programming language called OpenCL, which can be used for programming both CPUs and GPUs in a generic manner. Throughout the course we provide different architecture-aware optimization techniques relevant to both CUDA and OpenCL. Finally, we provide the students with detailed application development examples in two well-known GPU computing scenarios.

INTENDED AUDIENCE : Computer Science, Electronics, Electrical Engineering students
PREREQUISITES : Programming and Data Structures, Digital Logic, Computer Architecture
INDUSTRY SUPPORT : NVIDIA, AMD, Google, Amazon and most big-data companies
Course Status : Upcoming
Course Type : Elective
Duration : 12 weeks
Start Date : 24 Jan 2022
End Date : 15 Apr 2022
Exam Date : 24 Apr 2022 IST
Category :
  • Computer Science and Engineering
  • Systems
Credit Points : 3
Level : Undergraduate/Postgraduate

Course layout

Week 1 : Review of Traditional Computer Architecture - Basic five-stage RISC Pipeline, Cache Memory, Register File, SIMD instructions
Week 2 : GPU architectures - Streaming Multiprocessors, Cache Hierarchy, The Graphics Pipeline
Week 3 : Introduction to CUDA programming
Week 4 : Multi-dimensional mapping of dataspace, Synchronization
Week 5 : Warp Scheduling, Divergence
Week 6 : Memory Access Coalescing
Week 7 : Optimization examples : optimizing Reduction Kernels
Week 8 : Optimization examples : Kernel Fusion, Thread and Block
Week 9 : OpenCL basics
Week 10 : OpenCL for Heterogeneous Computing
Week 11-12 : Application Design : Efficient Neural Network Training/Inferencing

Books and references

1. "Computer Architecture: A Quantitative Approach" - John L. Hennessy and David A. Patterson
2. "Programming Massively Parallel Processors" - David Kirk and Wen-mei Hwu
3. "Heterogeneous Computing with OpenCL" - Benedict Gaster, Lee Howes, David R. Kaeli

Instructor bio

Prof. Soumyajit Dey

IIT Kharagpur
Dr. Soumyajit Dey joined the Dept. of CSE, IIT Kgp in May 2013. He worked at IIT Patna as an assistant professor in the CSE dept. from the beginning of Spring 2012 to the end of Spring 2013. He received a B.E. degree in Electronics and Telecommunication Engg. from Jadavpur University, Kolkata in 2004. He received an M.S. followed by a PhD degree in Computer Science from the Indian Institute of Technology, Kharagpur in 2007 and 2011 respectively.

His research interests include 1) Synthesis and Verification of Safe, Secure and Intelligent Cyber Physical Systems, and 2) Runtime Systems for Heterogeneous Platforms. More specifically, as part of his second research interest, he works on building GPGPU application scheduling frameworks considering both a) embedded real-time applications, and b) GPGPU cluster-level workloads. He has been successfully running a popular course titled “High Performance Parallel Programming” (http://cse.iitkgp.ac.in/~soumya/hp3/hp3.html) in CSE IITKGP for the last three years jointly with Prof. Pralay Mitra.

Course certificate

The course is free to enroll and learn from. But if you want a certificate, you have to register and write the proctored exam conducted by us in person at any of the designated exam centres.
The exam is optional for a fee of Rs 1000/- (Rupees one thousand only).
Date and Time of Exams: 24 April 2022 Morning session 9am to 12 noon; Afternoon Session 2pm to 5pm.
Registration url: Announcements will be made when the registration form is open for registrations.
The online registration form has to be filled and the certification exam fee needs to be paid. More details will be made available when the exam registration form is published. If there are any changes, they will be mentioned then.
Please check the form for more details on the cities where the exams will be held, the conditions you agree to when you fill the form, etc.


Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
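
As an illustration of the scoring rules above, here is a minimal Python sketch. It assumes that every assignment and the proctored exam are marked out of 100; the function name final_score and its parameters are hypothetical and not part of any NPTEL tool.

```python
from statistics import mean


def final_score(assignment_scores, exam_score, best_n=8):
    """Apply the certificate rules stated above.

    assignment_scores: marks for the assignments, each assumed to be out of 100.
    exam_score: proctored exam mark, assumed to be out of 100.
    Returns (assignment_component, exam_component, total, eligible).
    """
    # Keep only the best `best_n` assignments (best 8 of 12 in this course).
    best = sorted(assignment_scores, reverse=True)[:best_n]
    assignment_component = 0.25 * mean(best)   # out of 25
    exam_component = 0.75 * exam_score         # out of 75
    total = assignment_component + exam_component
    # Both thresholds must be met; the total alone is not sufficient.
    eligible = assignment_component >= 10 and exam_component >= 30
    return assignment_component, exam_component, total, eligible


# Eight assignments at 80, four not attempted, exam mark 60.
print(final_score([80] * 8 + [0] * 4, 60))   # (20.0, 45.0, 65.0, True)

# Full marks in assignments but exam mark 36: total is 52, yet not eligible.
print(final_score([100] * 12, 36))           # (25.0, 27.0, 52.0, False)
```

The second call shows the case described above: the final score is above 40/100, but the exam component (27/75) is below the 30/75 threshold, so no certificate is awarded.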

The certificate will have your name, photograph and the score in the final exam with the breakup. It will have the logos of NPTEL and IIT Kharagpur. It will be e-verifiable at nptel.ac.in/noc.

Only the e-certificate will be made available. Hard copies will not be dispatched.

Once again, thanks for your interest in our online courses and certification. Happy learning.

- NPTEL team

