
Computational Genomics

By Prof. Vineet Kumar Sharma   |   IISER Bhopal
Learners enrolled: 2112   |  Exam registration: 512
ABOUT THE COURSE:
With large volumes of biological data now available, including sequences, genomes, and transcriptomes, students and researchers need the skills to analyse this data comprehensively. This course therefore emphasises building concepts and providing insights into the process of genomic analysis, and understanding the algorithms and basic methods commonly needed for biological data analysis and computational genomics.

INTENDED AUDIENCE: BSc., MSc., MPhil, PhD

PREREQUISITES: Basic knowledge of biology, for example from courses in Molecular Biology, Microbiology, Biochemistry, Genetics, etc.

INDUSTRY SUPPORT: Any bioinformatics or life science company, such as TCS Life Sciences, Reliance Life Sciences, WIPRO life sciences, Siemens Healthineers, etc., will find this course relevant.
Summary
Course Status : Completed
Course Type : Core
Language for course content : English
Duration : 12 weeks
Category :
  • Biological Sciences & Bioengineering
  • Computational Biology
Credit Points : 3
Level : Postgraduate
Start Date : 22 Jan 2024
End Date : 12 Apr 2024
Enrollment Ends : 05 Feb 2024
Exam Registration Ends : 16 Feb 2024
Exam Date : 28 Apr 2024 IST

Note: This exam date is subject to change based on seat availability. You can check the final exam date on your hall ticket.





Course layout

Week 1: 
Day 1: Introduction to Computational genomics, Transcriptomics, Proteomics, Epigenomics, Metagenomics and their applications, The BIG data of biological sciences
Day 2: Organization of genetic information in prokaryotic and eukaryotic cells, genome maps, Eukaryotic genome structure, High-throughput technologies to translate this information into genomic data
Day 3: How genomic data is organized in public databases, Genomics web resources, Nucleic acid and protein sequence databases, gene expression databases, Metabolic and metabolomic databases. Examples: NCBI GenBank and Expasy, EBI, Ensembl, UCSC, KEGG

Week 2: 
Day 1: First- and second-generation sequencing technologies, including Sanger and Illumina, and their data output
Day 2: Long read sequencing and linked read sequencing (Nanopore, PacBio, TELL-Seq)
Day 3: Sequence formats: FASTA, GenBank, EMBL, XML, FASTQ, FAST5, etc., genomic database versions and archives, NCBI SRA, BioProject, accessions, data retrieval using wget, FTP, FileZilla, and scripts provided by the database team for genomic analysis

Week 3: 
Day 1: Introduction to Linux, basic commands for file handling
Day 2: Running jobs on Linux, processing, installation of genomic packages
Day 3: Introduction to R, commonly used packages, applications in genomic analysis

Week 4: 
Day 1: Introduction to genomes and packages for genomic analysis such as EMBOSS; Specifications of workstations needed for genomic analysis, Introduction to High Performance Computing and servers, and their need in genomic analysis
Day 2: Overview and concepts in genomic and transcriptomic analysis of an organism with examples and case studies
Day 3: Sample collection, DNA extraction and quantification, and identification of the species to be sequenced. RNA extraction and transcriptome sequencing approaches

Week 5: 
Day 1: Methods to estimate the amount of sequencing coverage needed for genomic assembly, use of hybrid sequencing approaches for appropriate coverage and assembly
Day 2: Short and long reads, paired-end reads, quality filtering of sequence data, Genome complexity assessment, Jellyfish and GenomeScope for generating k-mer count histograms and calculating genomic heterozygosity
Day 3: Concept of genome assembly, contigs, scaffolds, complete genome, draft genome, chromosomal level assembly, Genome assembly algorithms such as de Bruijn graph, Overlap-Layout-Consensus (OLC), Hybrid assembly

Week 6: 
Day 1: Introduction to common assembly tools ABySS, SOAPdenovo, Flye, Supernova
Day 2: 10x Genomics linked-read sequencing, use of the proc10xG set of Python scripts to pre-process the 10x Genomics raw reads, removal of barcode sequences
Day 3: Nanopore long reads analysis: Guppy for base calling of raw reads, adaptor removal using Porechop, Genome assembly workflow using three different assemblers: wtdbg, SMARTdenovo, and Flye, parameters for assembly

Week 7: 
Day 1: de novo assembly using Supernova, parameters, usage of genomic and transcriptomic reads to increase assembly contiguity
Day 2: Merging assemblies to create a hybrid assembly, gap closing and polishing of the assembly, fixing small indels, base errors, and local misassemblies, determining assembly quality using N50, BUSCO scores, coverage, etc.
Day 3: Chromosomal level assembly using Hi-C, concept of reference genome, finished genome, draft genome, case studies

Week 8: 
Day 1: Annotation of repeats in final genome assembly using RepeatMasker, Determining the simple and complex repeat content of a genome
Day 2: de novo transcriptome assembly, Determining the coding gene set using MAKER pipeline
Day 3: Prediction of tRNA, rRNA and miRNA in a genome, Identification of metabolic pathways by KEGG

Week 9: 
Day 1: Comprehensive functional annotation of predicted genes or protein sequences by homology-based alignment using BLAST or BLAT, COGs, Gene Ontology based annotation, InterProScan, PROSITE, Pfam, PRINTS, patterns, motifs and fingerprints
Day 2: Evolutionary analysis using homologs, paralogs and orthologs, Multiple signs of adaptation, gene family expansion and contraction
Day 3: Taxonomic classification, marker sequences such as 16S rDNA and ITS, taxonomic hierarchy, Phylogeny reconstruction using multiple sequence alignment, Distance based approaches such as Neighbour joining, Character based approaches such as Maximum parsimony, Maximum likelihood, RAxML

Week 10: 
Day 1: Epigenetics, ChIP-seq, transcriptome and microarrays for regulation of expression
Day 2: Single cell genomics, 10x Chromium linked-reads and Illumina sequencing, single cell gene expression
Day 3: Application of multiomics approaches in human health and diseases such as cancer, diabetes, etc.

Week 11: 
Day 1: Prokaryotic genome sequencing and assembly approaches, draft and complete genomes, taxonomic identification
Day 2: Gene prediction approaches and common methods, annotation of a bacterial genome, tRNA, rRNA, operon prediction
Day 3: Phylogenetic, metabolic and comparative analysis

Week 12: 
Day 1: Microbiome and Metagenome, Human, organismal and environmental microbiomes
Day 2: Sequencing and assembly of metagenomes, gene prediction, annotation, MAGs
Day 3: Taxonomic analysis using amplicon sequence variants, Statistical analysis

Books and references

1. Bioinformatics: Sequence and Genome Analysis by David Mount
2. Computational Genome Analysis: An Introduction by Richard C. Deonier, Simon Tavaré, Michael S. Waterman, Springer India

Instructor bio

Prof. Vineet Kumar Sharma

IISER Bhopal
Prof. Vineet K. Sharma has been an Associate Professor at the Indian Institute of Science Education and Research Bhopal since July 2011. He obtained his PhD in Bioinformatics and Biomedical Sciences from IGIB, New Delhi in 2006. He worked as a Postdoctoral Researcher for two years and then as a Scientist at RIKEN, Japan for the next three years. He joined IISER Bhopal after returning to India. The focus of Prof. Sharma’s lab is to gain functional insights into novel eukaryotic genomes and the healthy human microbiome in Indian and other populations, and to compare the latter with selected disease microbiomes. Prof. Sharma’s group was the first in the world to sequence several significant bird, animal and plant genomes, including Peacock, Indian Tiger, Turmeric, Giloy, Aloe vera, Banyan tree, Peepal tree, and four native cow breeds, and carried out the largest gut microbiome and scalp microbiome study in the Indian population. His group also employs machine learning and artificial intelligence approaches to carry out large-scale human gut data analysis and to develop new algorithms and software using the BIG data of Biology.

Course certificate

The course is free to enroll and learn from. But if you want a certificate, you have to register and write the proctored exam conducted by us in person at any of the designated exam centres.
The exam is optional for a fee of Rs 1000/- (Rupees one thousand only).
Date and Time of Exams: 28 April 2024 Morning session 9am to 12 noon; Afternoon Session 2pm to 5pm.
Registration URL: Announcements will be made when the registration form is open for registrations.
The online registration form has to be filled and the certification exam fee needs to be paid. More details will be made available when the exam registration form is published. If there are any changes, it will be mentioned then.
Please check the form for more details on the cities where the exams will be held, the conditions you agree to when you fill the form etc.

CRITERIA TO GET A CERTIFICATE

Average assignment score = 25% of average of best 8 assignments out of the total 12 assignments given in the course.
Exam score = 75% of the proctored certification exam score out of 100

Final score = Average assignment score + Exam score

YOU WILL BE ELIGIBLE FOR A CERTIFICATE ONLY IF AVERAGE ASSIGNMENT SCORE >=10/25 AND EXAM SCORE >= 30/75. If one of the 2 criteria is not met, you will not get the certificate even if the Final score >= 40/100.
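
The scoring rule above can be sketched as a short calculation. This is only an illustrative sketch, not an official NPTEL tool: the function name certificate_result is hypothetical, and it assumes each assignment and the exam are scored out of 100 and that any of the best 8 assignments not submitted count as zero.

```python
def certificate_result(assignment_scores, exam_score):
    """Apply the certificate criteria stated above.

    assignment_scores: scores (assumed out of 100) for up to 12 assignments.
    exam_score: proctored exam score out of 100.
    Returns (assignment_component, exam_component, final_score, eligible).
    """
    # Take the best 8 assignments; dividing by 8 treats missing ones as zero
    # (an assumption, not stated in the criteria).
    best_eight = sorted(assignment_scores, reverse=True)[:8]
    assignment_component = 0.25 * sum(best_eight) / 8  # out of 25
    exam_component = 0.75 * exam_score                 # out of 75
    final_score = assignment_component + exam_component

    # Both thresholds must be met independently of the final score.
    eligible = assignment_component >= 10 and exam_component >= 30
    return assignment_component, exam_component, final_score, eligible


if __name__ == "__main__":
    # Strong assignments but a weak exam: final score exceeds 40, yet not eligible.
    print(certificate_result([100] * 12, 39))
```

In the example, a learner with full marks on every assignment but 39/100 in the exam has a final score above 40 and still does not qualify, because the exam component falls below 30/75.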

The certificate will have your name, photograph and the score in the final exam with the breakup. It will have the logos of NPTEL and IISER Bhopal. It will be e-verifiable at nptel.ac.in/noc.

Only the e-certificate will be made available. Hard copies will not be dispatched.

Once again, thanks for your interest in our online courses and certification. Happy learning.

- NPTEL team

