Monday, April 25, 2016

Today: National DNA Day!

Today, April 25, is National DNA Day! The day commemorates the completion of the Human Genome Project in April 2003 (see the PBS Nova video: Cracking the Code of Life) and the discovery of DNA's double-helix molecular structure in 1953 by James Watson and Francis Crick.

63 years later, DNA analysis and manipulation play major roles in nearly every aspect of our lives. They open the door to new possibilities in medicine, such as preventive medicine, pharmacogenomics, and gene therapy. With the help of computer science and technology (bioinformatics), gene sequencing has become dramatically faster, and its cost has dropped just as dramatically (even faster than Moore's Law!). As a result, the available genomics data has grown at a rate faster than researchers can analyze it.

With this much data, machine learning is a leading candidate technology for tackling the challenge. In the past few years, machine learning has brought breakthroughs in computer vision, natural language processing, and data mining. Now it is set to become a powerful tool in bioinformatics, with the potential to revolutionize biology and medicine. While celebrating National DNA Day, let's start our adventure!

Thursday, April 14, 2016

Machine Learning Applications in Biology


I visited Professor Weigang Qiu at Hunter College last Wednesday. He leads the Evolutionary Bioinformatics Lab, and his research focuses on comparative analysis of multiple genomes of the Lyme disease pathogen. We discussed the possibility of running a research project to support high-school students conducting advanced research in the bioinformatics field. I also explained our machine learning-related activities to him, and our intention to expand our research activities into the biological sciences. He believes that we can run projects focusing on "machine-learning applications in biology". Possible applications include gene prediction, cancer type classification based on gene expression, and gene function prediction (e.g., Borrelia, Lyme pathogen lipoproteins). Professor Qiu will explore this area further and come up with a more detailed project description for us.

In the meantime, we will focus on three areas:
  1. Biology: Please review/preview chapters in your Biology/AP Biology textbook related to molecular genetics, gene expression, and genomes. If you need an AP Biology textbook, I can get one for you.
  2. Machine Learning: If you have not yet worked through the perceptron at the code level, please set aside time to do so.
  3. Bioinformatics: You are also encouraged to take any of the MOOC bioinformatics courses listed in the last post to build up basic programming skills and knowledge of bioinformatics.
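For anyone starting on item 2, here is a minimal perceptron sketch in plain Python. It is only an illustration under my own assumptions, not the version from any particular course: the function names, the learning rate, the number of epochs, and the toy AND-gate data are all made up for this example.

```python
def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    activation = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Train with the classic perceptron rule: w <- w + rate * (target - prediction) * x.

    learning_rate and epochs are illustrative defaults, not tuned values.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            # Weights and bias only change when the prediction is wrong (error != 0).
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Toy data (an assumption for illustration): the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions)  # [0, 0, 0, 1]
```

AND is linearly separable, so the perceptron rule is guaranteed to find a separating line here. A good exercise is to swap in the XOR labels [0, 1, 1, 0] and see that the predictions never match all four targets, which is the classic limitation of a single perceptron.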
Please feel free to use our blog to share your study notes, programming examples, online resources, etc. You are also encouraged to invite like-minded friends to join our group.