Wednesday, March 30, 2016

Bioinformatics MOOC Courses

There are a few free bioinformatics MOOC courses:
Coursera:
  1. Biology Meets Programming: Bioinformatics for Beginners,
  2. Bioinformatics: Introduction and Methods,
  3. Genomic Data Science 1: Introduction to Genomic Technologies,
  4. Genomic Data Science 2: Genomic Data Science with Galaxy,
  5. Genomic Data Science 3: Python for Genomic Data Science,
  6. Genomic Data Science 4: Algorithms for DNA Sequencing,
  7. Genomic Data Science 5: Command Line Tools for Genomic Data Science,
  8. Genomic Data Science 6: Bioconductor for Genomic Data Science,
  9. Genomic Data Science 7: Statistics for Genomic Data Science, and
  10. Genomic Data Science 8: Genomic Data Science Capstone.
 
EdX:
  1. Data Analysis for Life Sciences 1: Statistics and R,
  2. Data Analysis for Life Sciences 2: Introduction to Linear Models and Matrix Algebra,
  3. Data Analysis for Life Sciences 3: Statistical Inference and Modeling for High-throughput Experiments,
  4. Data Analysis for Life Sciences 4: High-Dimensional Data Analysis,
  5. Data Analysis for Life Sciences 5: Introduction to Bioconductor: Annotation and Analysis of Genomes and Genomic Assays,
  6. Data Analysis for Life Sciences 6: High-performance Computing for Reproducible Genomics, and
  7. Data Analysis for Life Sciences 7: Case Studies in Functional Genomics.

When Biology Meets Computer Science


Here are my ongoing study notes at the entry point of this fascinating multi-dimensional space - Bioinformatics.
http://prn.fm/wp-content/uploads/2015/05/DNA.jpg
  1. Genome: a full set of chromosomes; all the inheritable traits of an organism.
  2. DNA: Deoxyribonucleic acid (DNA) is a molecule that carries most of the genetic instructions used in the development, functioning and reproduction of all known living organisms and many viruses. Most DNA molecules consist of two biopolymer strands coiled around each other to form a double helix.
  3. Nucleic Acid: DNA (along with RNA) is a nucleic acid; alongside carbohydrates, lipids, and proteins, nucleic acids are one of the four major types of macromolecules essential for all known forms of life.
  4. Nucleotide: DNA is a long polymer made from repeating units called nucleotides.
    Each nucleotide is composed of three parts: a nitrogen-containing nucleobase, which is either adenine (A), guanine (G), cytosine (C), or thymine (T); a monosaccharide sugar called deoxyribose; and a phosphate group.
  5. Nucleotides are summarized in the table:
    https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4GMr7eOQYz7S7_U9XI3Cf-rXfoc2ifGiKIuIVSc4IXVFw8mgTsN4-Nn7B9dJ5s13CyRPznsQUs9jqv_2dKS-Pcp_iN-scviwAQE4CuWXCM7MT9n7UEsE1j338QkCeIs8cwFxCnl7C3X6d/s1600/Screen+Shot+2016-03-26+at+11.55.20+PM.png
  6. Complementary nucleotides: adenine and thymine are complements of each other, as are cytosine and guanine; complementary bases bind to each other in DNA.
    https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGuUjRQCPs-IfXFT8AmBzRcamMEPLApBEGw_dGsaQtl472yDtiKk9BRVdak7WsVovKBoD6SzJGUBvwjvLQm8BTZVceYl4lt7uun_5AH5L-GVYQgCIyJGYvppuB-E0DkfHFhqTp1osiONdR/s1600/Screen+Shot+2016-03-25+at+6.05.40+PM.png
    Complementary nucleotides: A & T, G & C
  7. Purine (C5H4N4) and Pyrimidine (C4H4N2) are the parent compounds of the two groups of nitrogenous bases found in nucleotides: adenine and guanine are purines, while cytosine and thymine are pyrimidines. Both purine and pyrimidine are heterocyclic aromatic organic compounds.
    https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhveAX92pqWLrhNIpJ2EJl94AH_mQ2bce9U9_gbr-UXGjguU2N-pwqKZqaXyb4fzCJve51xtalKWrYREKM72Qg8jb7koVWuCH5fF8fu3BikUGK8qaKwUO-F1CscTpEbPAr_uA5F1ONkKD4_/s1600/purine+and+pyrimidine.png
    Purine (left) and Pyrimidine (right)
  8. Replication: Replication begins in a genomic region called the replication origin (denoted oriC) and is performed by molecular copy machines called DNA polymerases. Locating oriC is an important task not only for understanding how cells replicate but also for various biomedical problems.
  9. Computational Analysis: To find the replication origin, computational methods are much faster than experimental approaches; in addition, the results of many experiments cannot be interpreted without computational analysis.
  10. k-mer: In computational genomics, k-mers are all the possible substrings of length k contained in a string, such as a read obtained through DNA sequencing. A string of length L contains L-k+1 k-mers (counting repeats), whereas the number of distinct k-mers possible over an alphabet of n symbols (n = 4 for DNA: A, C, G, T) is n^k. k-mers are typically used during sequence assembly, but can also be used in sequence alignment.
    http://bioinformaticsalgorithms.com/images/Replication/patterncount.png
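
To make notes 6 and 10 a bit more concrete, here is a minimal Python sketch of complementary base pairing and k-mer enumeration. The function names (reverse_complement, kmers, pattern_count, num_possible_kmers) are my own picks for illustration and do not come from any course or library. I am also assuming that pattern counting should include overlapping occurrences, and that input strings contain only the uppercase letters A, C, G and T.

```python
# Watson-Crick pairs from note 6: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna):
    """Return the complementary strand, written in reverse order.

    Assumes dna contains only uppercase A, C, G, T; any other
    character raises a KeyError.
    """
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def kmers(text, k):
    """List every length-k substring of text, left to right.

    A string of length L yields L - k + 1 k-mers (repeats included).
    If k is larger than L, the result is an empty list.
    """
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def pattern_count(text, pattern):
    """Count occurrences of pattern in text, overlapping ones included."""
    return sum(1 for kmer in kmers(text, len(pattern)) if kmer == pattern)


def num_possible_kmers(k, alphabet_size=4):
    """Number of distinct k-mers over an alphabet of n symbols: n^k."""
    return alphabet_size ** k


if __name__ == "__main__":
    print(reverse_complement("AAAACCCGGT"))  # ACCGGGTTTT
    print(kmers("ACGTAC", 3))                # ['ACG', 'CGT', 'GTA', 'TAC']
    print(pattern_count("GCGCG", "GCG"))     # 2
    print(num_possible_kmers(3))             # 64
```

Note that "ACGTAC" has length 6, so for k = 3 it contains 6 - 3 + 1 = 4 k-mers, while there are 4^3 = 64 possible DNA 3-mers in total. The built-in str.count would report only 1 for "GCGCG".count("GCG") because it skips overlapping matches, which is why the sketch slides a window one position at a time instead.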