Tuesday, August 26, 2008

The DNA Network

Research Update: what do all these genes do? [Tomorrow's Table]

Posted: 26 Aug 2008 07:26 PM CDT

From a molecular geneticist's perspective, there are just too many genes in rice. Forty-one thousand by my last count. Many seem to be carrying out identical functions. If we knock out one, another one compensates. This makes it very difficult to elucidate the role of each gene.

Why bother with this work then? Well, if we can figure out what each rice gene does, this information can be used to develop new rice varieties with useful properties such as tolerance to flooding or resistance to disease.

In our new paper, we report a method to efficiently overcome the obstacle of gene redundancy by combining information on gene expression data with mutant phenotypes. We first developed a 45K oligo-microarray (PLoS One in press) and used it to examine genes expressed in the light vs. the dark. We then screened for rice lines with mutations in the strongly light-induced genes. This analysis effectively provided candidate functions for genes of previously unknown function.
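As a rough illustration of the screening logic described above (not the code used in the paper), the following Python sketch ranks genes by light-versus-dark induction and keeps only those for which a mutant line exists. The gene names, expression values, the `min_log2_fc` cutoff, and the helper names are all hypothetical.

```python
import math

# Hypothetical normalized expression values (arbitrary units) per gene.
# Gene names and numbers are invented for illustration only.
expression = {
    "geneA": {"light": 800.0, "dark": 50.0},
    "geneB": {"light": 120.0, "dark": 110.0},
    "geneC": {"light": 640.0, "dark": 40.0},
    "geneD": {"light": 30.0, "dark": 300.0},
}

# Hypothetical set of genes for which a mutant rice line is available.
mutant_lines = {"geneA", "geneD"}


def log2_fold_change(light, dark, pseudocount=1.0):
    """log2(light / dark), with a pseudocount to avoid division by zero."""
    return math.log2((light + pseudocount) / (dark + pseudocount))


def light_induced_candidates(expression, mutant_lines, min_log2_fc=2.0):
    """Return (gene, log2 fold change) for strongly light-induced genes
    that also have a mutant line, strongest induction first."""
    candidates = []
    for gene, values in expression.items():
        fc = log2_fold_change(values["light"], values["dark"])
        if fc >= min_log2_fc and gene in mutant_lines:
            candidates.append((gene, round(fc, 2)))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(light_induced_candidates(expression, mutant_lines))
```

In this toy data set only geneA passes both filters: geneC is just as strongly induced but has no mutant line, and geneD has a mutant line but is repressed in the light.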

We also examined the expression of a set of genes in specific biochemical pathways. This analysis allowed us to predict genes likely involved in these pathways. We validated this model by analyzing rice lines carrying mutations in ten of these genes.

We are now using this strategy to identify all the genes that control disease resistance in rice.


ronald, .Identification and Functional Analysis of Light-Responsive Unique Genes and Gene Family Members in Rice.

Art, Genomics and Daily Life [Retail Genomics: The Science and Business of Consumer Genomics & Consumer Bioinformatics]

Posted: 26 Aug 2008 04:49 PM CDT

A couple months ago, I started my investigation on the interaction between visual arts and genomics. The goal is to stimulate public debate on the possibilities and impacts of genomics on everybody's daily life.

In collaboration with my colleagues, I have created two digital frames in the first series of works. They were displayed at the ISMB conference in Toronto in 2008 and are currently on display at the ECCB conference in Europe.

Work #1:

Portrait of James D. Watson in his own Word, 2008

Jared Flatow, Brian Chamberlain, and Simon Lin

According to Wikipedia, "a portrait is a painting, photograph, sculpture, or other artistic representation of a person". Instead of simply using color pigments, we use unique portions of Dr. James Watson's DNA sequence to portray him. Dr. Watson was a co-discoverer of the structure of DNA and helped to establish the Human Genome Project. DNA, as the primary genetic material, defines the molecular signature of a person. Dr. Watson's DNA was fully sequenced and made public in 2007 by The Baylor College of Medicine Genome Sequencing Center, 454 Life Sciences Technology, and The Rothberg Institute. We used the SNPs, which define the small differences in DNA from person to person, to uniquely represent Dr. Watson. To do this, we took the variant allele base pairs from Dr. Watson's genome (Cold Spring Harbor Laboratory distribution, 6/6/2007) that had a sequence observation count greater than 12, and generated a portrait capturing his phenotype.
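The filtering step mentioned above (keep variant alleles with an observation count greater than 12) could look roughly like the Python sketch below. The record layout, field names, and sample values are assumptions made for illustration; the actual file format of the genome distribution is not described in the post.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical SNP record; field names are illustrative only."""
    chromosome: str
    position: int
    variant_allele: str
    observation_count: int


def select_variant_alleles(snps, min_count=12):
    """Return variant alleles observed strictly more than min_count times."""
    return [s.variant_allele for s in snps if s.observation_count > min_count]


# Invented example records, not real genome data.
snps = [
    Snp("chr1", 1000, "A", 20),
    Snp("chr1", 2000, "G", 12),   # exactly 12: excluded, the cutoff is "greater than 12"
    Snp("chr2", 3000, "T", 13),
    Snp("chr2", 4000, "C", 3),
]

if __name__ == "__main__":
    print(select_variant_alleles(snps))
```

The resulting list of bases is the kind of raw material that could then be mapped onto the picture; how the authors did that mapping is not specified here.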

Work #2:

DNA and Community, 2008

Simon Lin and Jared Flatow

Artists constantly explore the interactions between science and society. We looked into the public understanding of DNA in the Web 2.0 era by retrieving Creative Commons (CC)-licensed photos from the Flickr (photo-sharing) website. We retrieved 899 images using the topics of DNA and myself on April 6, 2008. We rearranged these images using a mosaic algorithm to reveal the hidden message of "DNA and Community". Traditional art uses oil and brush; we are using Python and the internet to experiment with new building blocks of CC-licensed photos. By integrating the photos through the lens of 899 individuals, we are investigating how people share their life stories (Flickr) and how people share their creative responsibility (CC license). It is interesting to note that our work is also licensed under CC and thus has 899 lines of acknowledgements.
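The post does not say which mosaic algorithm was used. One common approach, sketched below in plain Python under that assumption, is to reduce every photo to its mean color and then, for each cell of the target image, pick the photo whose mean color is closest. The tile names, colors, and function names are hypothetical.

```python
def mean_color(pixels):
    """Average an iterable of (r, g, b) pixels into one (r, g, b) tuple."""
    pixels = list(pixels)
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def color_distance(a, b):
    """Squared Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_mosaic(target_cells, tile_colors):
    """For each target cell color, choose the tile id with the closest mean color.

    target_cells: 2-D list of (r, g, b) colors, one per mosaic cell.
    tile_colors:  dict mapping tile id -> mean (r, g, b) of that photo.
    """
    mosaic = []
    for row in target_cells:
        mosaic.append([
            min(tile_colors, key=lambda t: color_distance(tile_colors[t], cell))
            for cell in row
        ])
    return mosaic


if __name__ == "__main__":
    tiles = {"dark_photo": (10, 10, 10), "light_photo": (240, 240, 240)}
    target = [[(0, 0, 0), (255, 255, 255)],
              [(200, 200, 200), (30, 30, 30)]]
    print(build_mosaic(target, tiles))
```

A real mosaic would also need image decoding and resizing (for example with an imaging library), and might limit how often each photo is reused so that all 899 contributions appear.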

Hypothesizing about Hypotheses [Bayblab]

Posted: 26 Aug 2008 02:35 PM CDT

I guess that it is important to properly pose your question, model, etc. when designing experiments and doing science in general. This article is right up Bayman's alley of interest. Published in Cell, it gives a brief history of the hypothesis and offers some interesting thoughts about the way in which science is actually done, often without hypothesizing. Access to Cell is required to read the full text.
"Hypotheses are not to be regarded in experimental Philosophy" (Newton, 1721).

Science on Tap: Science communication at its best [Discovering Biology in a Digital World]

Posted: 26 Aug 2008 02:17 PM CDT

Last night we went to a pub to hear about some new technology for diagnostic testing. A wonderful speaker, Karen Hedine from Micronics, came and told us about the work that her company is doing. She brought along a demonstration machine and passed it and several plastic test chambers around the pub so we could all take a look.

The technology, microfluidics, is fascinating stuff. I've written about it a little before ("From Louis Pasteur to 'Lab on a chip'").

Extended DNA Testing Using Non-Autosomal Markers [The DNA Testing Blog]

Posted: 26 Aug 2008 02:09 PM CDT

As part of DDC’s commitment to provide the most comprehensive relationship testing in the industry, our laboratory has been investigating additional DNA markers that can be used to analyze complex relationship cases. To complement the standard set of STR markers that are routinely used in paternity testing, DDC scientists have recently been undertaking research and [...]

Twisted Tree of Life Award #1: Salk Institute Press Release on Kinases [The Tree of Life]

Posted: 26 Aug 2008 01:26 PM CDT

I am starting a new award here --- for people or sites that do something silly in regard to the "Tree of Life" but should know better. That is, this is for scientists or sciency sites that do something unseemly with the Tree of Life. And the first award goes to the Salk Institute for their press release relating to a paper on kinases in single-celled choanoflagellates (OK - the press release is a month and a half old but I was out sick - and I drafted this 7/8/08). The press release discusses a PNAS paper by Gerard Manning and colleagues (Manning does some really great comparative work on kinases and helped me look at kinases in a few genomes such as that of Tetrahymena thermophila). I note - it seems Manning or someone has paid the OA fee for this paper so anyone can read it.

The paper seems both sound and interesting. And it has a really really cool tree figure.

But the press release has a few doozies. The worst (or best, I guess, depending on your point of view) is the following:
It commands a signaling network more elaborate and diverse than found in any multicellular organism higher up on the evolutionary tree, researchers at the Salk Institute for Biological Studies have discovered.
Yup, that is right. A modern organism, living today, is somehow "lower" on the tree of life than we are. Too bad the person who wrote the press release did not read Amy Harmon's recent Times story on evolution education. Or they could have gotten help from the high school science teacher Harmon featured, who taught his students that modern organisms did not evolve from other modern organisms.

And for using one of my most hated metaphors in all of evolution (higher and lower organisms), Salk gets my first "Twisted Tree of Life Award".

Genomics to Save Your Life and Touch Your Heart [The Gene Sherpa: Personalized Medicine and You]

Posted: 26 Aug 2008 11:54 AM CDT

Ladies and gentlemen, this video is a must-watch for all of my readers. Please watch and pass along. This wonderful artist will present her play in New York City. It will debut this October. I...

Webicina: Frequently Asked Questions [ScienceRoll]

Posted: 26 Aug 2008 09:21 AM CDT

One and a half days after I presented Webicina.com, an online service focusing on Medicine 2.0, I’m still receiving plenty of questions, so I thought a FAQ page would make things clear.

What is Webicina?

Webicina is a privately held company aiming to build a bridge between physicians and e-patients. Webicina is also open to collaboration and new partners.

How can Webicina help physicians enter the web 2.0 era?

With personalized Medicine 2.0 Packages, step-by-step tutorials, webinars and online image building solutions. Webicina was designed to help physicians from all the medical specialties to get closer to the web 2.0 based world.

Who are e-patients?

They are patients trying to find reliable medical information on the web; they want to communicate with their doctors via e-mail or Skype; and they store their medical files online.

They will want you to answer their web-related questions and recommend reliable medical sites to them. Will you be able to help them properly?

Is Webicina only for doctors?

Webicina is primarily for medical professionals and healthcare workers, but is open to other customers as well.

Why do we need a bridge between physicians and e-patients?

The number of e-patients is growing rapidly while the number of web-savvy doctors is not.

The basics of practicing medicine will never change dramatically due to new technologies or the World Wide Web, but these will change the way healthcare is delivered.

Not because of the technology itself or the attention it receives, but because patients will need this kind of knowledge and expertise. And physicians of the 21st century must be ready and qualified to meet these expectations.

What does my membership get me? Which service should I choose?

If you would like an even more efficient medical practice, or a more productive research or pharma team, Medicine 2.0 packages are created for you.

Medicine 2.0 Package is a personalized set of web 2.0 tools designed to solve your problems. If you would like to know which part of the web you should follow, which websites and services could be useful in your work (from medical blogs through medical wikis to the educational purposes of Second Life), that is what Webicina can help you with. Your membership also gets you a bi-weekly newsletter about the latest improvements regarding your package.

If you would like to learn how to follow the medical papers of your field of interest more easily, how to create a medical blog or how to organize meetings in Second Life, Webicina’s online courses are made for you. You tell us what your problems with effectiveness are, and you will get access to the online materials and tutorials through which you can easily learn to use the tools and methods you need.

If your patients make a search for your name in search engines and cannot find content that can represent your practice properly, you should choose our online image building solutions.

If you would like your colleagues or employees to know more about the possible implications of web 2.0 in your field, Webicina can help you through webinars, workshops and in-person presentations.

How do I pay?

Please contact us for pricing details.

What is next?

In the near future, Webicina also aims to serve as a community platform for those who are interested in the impact of web 2.0 on their specific fields of interest. Webicina will try to connect e.g. cardiologists to let them share thoughts and links; and let them educate each other about medicine 2.0.

If you have another question, please let us know.

NIH and English as the Language of Science [Bitesize Bio]

Posted: 26 Aug 2008 05:49 AM CDT

Last October, Nobel laureate and biochemist Arthur Kornberg passed away, and I’ve finally gotten around to reading his book For the Love of Enzymes.

While there’s a lot in the book to talk about, for this post I’m focusing on just one passing reference that Kornberg makes (pages 129-134) on NIH and the use of English as the language of science. In it, Kornberg is describing the factors that made NIH a huge success, including 7 major policy decisions, the first four of which I think are most profound:

  1. To expend most of the budget extramurally in grants to universities and private research organizations.
  2. To award these grants to individuals, young and old, rather than to departments or institutions.
  3. To make these awards purely on scientific merit as judged by a panel of peers drawn from outside government.
  4. To be unswayed by political or geographical considerations, national boundaries included.

The first three are what made American science so productive - they freed the creative energies of individual investigators, enabling scientific progress to begin making leaps and bounds in ways that the old system (direction of research determined by the senior administration) could never have accomplished.

But it’s the fourth one that caught my attention. In elaborating, Kornberg continues:

An aspect of the NIH grants program which deserves more notice is the award of grants for support of research outside the United States. [...] The advantages of this international spirit in promoting science proved to be far greater than we expected. In addition, we had not anticipated the enormous boost this altruism gave to medical sciences and technology in the United States. By rejuvenating European and Japanese scientists and laboratories, we were able to enlist the vast reservoirs of talent on all three continents. [...]

As a consequence of the rebirth of science centers in Europe and Japan, a tide of gifted students and senior investigators flowed into the United States. We welcomed them, and many remained to enrich American universities, research institutes, and industries. At the NIH laboratories in Bethesda alone, many thousands of foreign scientists (over 3,000 from Japan alone) received postdoctoral training and became loyal alumni upon returning to their native countries. These developments also helped to create markets for American technology and pharmaceutical products and to establish English as the international language of science.

This surprised me, although I had never given this much thought. I also am far, far too young to appreciate this era of biomedical research.

It also probably was the impetus for the view commonly held today, that science is a unifying and globalizing enterprise. Science offers an exceedingly strong example of international cooperation, and this would not have been possible without policies such as these, which did away with nationalism in science.

Love thy neighbour… [Mailund on the Internet]

Posted: 26 Aug 2008 04:30 AM CDT

Thanks to bayblab for this video:

Matthew 7:12
“So in everything, do to others what you would have them do to you, for this sums up the Law and the Prophets.”

Luke 6:31
“Just as you want others to do for you, do the same for them.”


Replicating haplotype findings [Mailund on the Internet]

Posted: 26 Aug 2008 04:07 AM CDT

I have a small problem.

We have analysed some cancer data from DeCODE as part of the association mapping project PolyGene. We used Blossoc for this and we found some candidate regions worth examining further.

We have access to samples from Spain and the Netherlands, and we want to try to replicate the findings there. Now the problem is how to choose a strategy for replication.

Blossoc is a haplotype method that tries to infer the local genealogy in a region and then examines the clustering of phenotypes on this genealogy. The problem with such an approach is that you really need an entire region to replicate to try to do the same trick in the replication population. This means typing a lot of markers in the replication sample (expensive) and potentially correcting for a lot of tests (reducing power). It is not really the way to go.

We extended Blossoc to output what it considers the most important SNPs in the genealogy inference in each interesting region. This should contain the most important SNPs in the regions for the replication, and gave us 2-6 SNPs per candidate region (with only 43 SNPs all in all for three diseases, so not a small reduction).
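To make the power argument concrete, here is a small Python sketch of how a per-test significance threshold shrinks with the number of tests. It assumes a Bonferroni correction, which the post does not name, and the figure of 1000 markers for a fully typed region is a made-up comparison value; only the 43 SNPs come from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    # 43 selected SNPs (from the post) vs. a hypothetical 1000 markers
    # if whole regions were typed in the replication sample.
    for n_tests in (43, 1000):
        print(n_tests, bonferroni_threshold(alpha, n_tests))
```

With fewer tests, each individual test can pass at a less extreme p-value, which is the sense in which typing only the selected SNPs preserves power.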

We have typed these SNPs in the replication population, but now we need to figure out how to try to replicate the findings with only that.

It goes without saying that we need to decide exactly what to test for based on the original data. If we start searching for significant signals in the new data we are no longer replicating but data trawling and the risk of false positives drastically increases.

I have a program for listing all haplotype patterns in a data set and testing them for association, and I can run that on the old data to pick the patterns to test for in the new data.  There is a tradeoff, though, between association scores and the complexity of the pattern.  There is bound to be some overfitting in the old data, and we want to avoid that in the patterns to replicate.
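The program mentioned above is not shown, so the following Python sketch is only one way such a tool might work. It enumerates haplotype patterns over a handful of SNPs (with `*` as a wildcard), scores each with a Pearson chi-square on a 2x2 table of cases versus controls, and caps pattern complexity with a hypothetical `max_specified` parameter as a crude guard against overfitting. The allele coding and all names are assumptions.

```python
from itertools import product


def matches(haplotype, pattern):
    """True if the haplotype agrees with the pattern at every non-wildcard site."""
    return all(p == "*" or p == h for h, p in zip(haplotype, pattern))


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]]; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def score_patterns(cases, controls, alleles="01", max_specified=2):
    """Score every pattern with 1..max_specified fixed sites, best first.

    Returns a list of (pattern, number of fixed sites, chi-square).
    """
    n_snps = len(cases[0])
    results = []
    for pattern in product(alleles + "*", repeat=n_snps):
        specified = sum(p != "*" for p in pattern)
        if specified == 0 or specified > max_specified:
            continue
        a = sum(matches(h, pattern) for h in cases)      # cases carrying the pattern
        c = sum(matches(h, pattern) for h in controls)   # controls carrying the pattern
        b = len(cases) - a
        d = len(controls) - c
        results.append(("".join(pattern), specified, chi_square_2x2(a, b, c, d)))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    # Invented phased haplotypes over three SNPs.
    cases = ["110", "110", "111", "010"]
    controls = ["000", "001", "100", "011"]
    for pattern, size, score in score_patterns(cases, controls)[:5]:
        print(pattern, size, round(score, 2))
```

For replication, the patterns would be chosen on the original data only, and then exactly those patterns would be tested once in the new sample.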

It is a tricky problem…

Toothache [Mailund on the Internet]

Posted: 26 Aug 2008 04:03 AM CDT

I was supposed to be in Iceland this week, visiting DeCODE.  I’m not.  Late last week I got a toothache. I’ve been to both the dentist and the doctor, but they couldn’t figure out why I had it, so they’ve just put me on painkillers to see if it disappears by itself.  Pure symptom treatment.

I love painkillers.  They are a great invention.  For me, right now, they only work some of the time, but that is a lot better than nothing.

Still, I would much prefer treating the actual cause, but as long as that is unknown there is no choice.

It sucks, though, to be stuck at home when I should be analysing data at DeCODE.

Gene Genie #35 at MicrobiologyBytes [ScienceRoll]

Posted: 26 Aug 2008 03:29 AM CDT

The 35th edition is up at MicrobiologyBytes. A great compilation of articles and blogposts about human genetics and personalized medicine. Thank you, Alan J. Cann, for hosting Gene Genie.

It's high summer, and the internet is as dead as a doornail, but a few diligent bloggers are still slogging away at the keyboard while everyone else is at the beach.

Gene Genie is the blog carnival of genes and gene-related diseases. Our plan is to cover the whole genome before 2082 (it means 14-15 genes every two weeks). We accept articles on the news of genomics and clinical genetics. The news and articles of personalized genetics are also included. Check out Gene Genie for more about this unique field of medicine.

Many thanks to Ricardo Vidal for the logo!

Next edition is due to be published on the 31st of August 2008 at Human Genetics Disorders. Don't forget to submit your articles via the official page.

Let me know if you would like to host an edition.

Here are all the issues of Gene genie:

Medicine 2.0 Carnival at Michelle vs the Med Student [ScienceRoll]

Posted: 26 Aug 2008 03:17 AM CDT

NY Times on the Challenge of Teaching Evolution in Florida [adaptivecomplexity's column]

Posted: 26 Aug 2008 12:26 AM CDT

Teaching evolution to a bunch of graduate students is easy; conveying something about evolution to a diverse group of high school students, some of whom have been coached to be openly hostile towards the subject, is a major challenge. Sunday's NY Times covered the efforts of one skilled Florida high school teacher. If only we had more teachers like this one.

The teacher explained that science deals with ideas about the natural world that are testable by observation - unlike say, miracles. One student objected: God can be proven, he insisted. We have fragments of Noah's ark from Mt. Ararat to prove it.

What do you say to that? My response would not have been charitable, which is why I'm not a high school teacher.


EMRs at OSCON [business|bytes|genes|molecules]

Posted: 25 Aug 2008 10:47 PM CDT

One of the talks I wanted to attend at OSCON, but missed in the blur of booth duty and conflicting sessions, was Aaron Thul’s talk on Electronic Medical Office Logistics (EMOL). It was a PostgreSQL talk, but wrapped around what it takes to develop an EMOL solution.

The good news is that a lot of the OSCON talks are available online. Check out the pdf for Aaron’s talk, or follow on for some comments (based on just the slides) or do both.

The talk essentially goes through how a FLOSS stack (with one notable exception) can be used to build a system that can collect data from EMRs and other data sources, while maintaining regulatory standards and providing a degree of automation. I remember the first time I was sitting in on a talk on EMRs some years ago and it dawned on me that an EMR is a lot more than just a patient’s health records. It’s the system that makes a hospital tick, including billing, insurance, test management, inventory management, etc. No easy task, especially in the kinds of environments most facilities operate in.

The part I liked was that they were loading 10 GB of data, and another 10 GB of metadata, daily, leading to a pretty darn large DB (~17 TB) with one of the tables almost at 2 TB. What the presentation, at least via the slides, does not fully convey is the security challenge, although Aaron does a good job of going through the steps taken to protect the warehouse.

With all the data that’s building up in most biomedical systems now, whether they are hospital related or otherwise, with the need to potentially access patient data for clinical/translational research, and some of the new data/database paradigms, I wonder how the EMR field will evolve and what trends will be adopted. Postgres is a great, scalable system, so it’s a good choice. My primary interest is in the metadata. How can we go beyond what we do with available metadata today? Wish I could have asked Aaron.

I’d like to end with my favorite line in all the slides


The Joys of DIY Dynamic Programming [Omics! Omics!]

Posted: 25 Aug 2008 10:16 PM CDT

For nearly the first decade of my bioinformatics career I carried around a dirty little secret -- well, at least at times I felt it was one. I had coded many things, I could explain many algorithms, but I had never coded a dynamic programming alignment algorithm -- the core to so much I did. I had slightly hacked one version (just to have it do an all-all comparison of a database, doing each possible pairing only once). Finally, for a bunch of reasons, I sat down and did it -- my very own Smith-Waterman implementation.

I'm reminded of this because a couple of weeks ago I rolled back my sleeves and knocked one out again. Now, just the fact I did this reveals a bit about me. I did find at least two freely available C# implementations on-line (e.g. the C# version of JAlign) and there is a plethora of C implementations. There is also Ewan Birney's magnificent Dynamite, pretty much the catch-all for the field (Dynamite is a programming kit for doing this; in effect a programming language for dynamic programming). But, partly as a point of pride & partly because I saw I'd need to hack the one C# copy I looked at in detail, I did it. The first time around, I even wrote a schmancy version -- a simple cDNA to genomic sequence aligner with two classes of gaps (one being an intron, with a really trivial model of a splice junction -- I think it used dinucleotides). That was all coded in Perl -- no speed demon, but it solved the problem where we needed it.

Now, it took me a good few hours to do it -- better than the few days of the first time, but not instant. I can claim that this time I didn't fall back on any study aids, such as the many online descriptions or Eddy & Durbin & co.'s very well-written book.

The implementation says a lot about me too. I thought of many ways to code it and finally settled on one. For example, there is the question of how to represent the alignment matrix; I used a two-dimensional array scheme (actually implemented using dictionaries -- a holdover from my Perl-centric days) but I could have also made it a graph of nodes. There is also the actual thrashing through the matrix -- the algorithm is inherently recursive, but following familial idiosyncrasies I wrote the code to use loops -- well, actually I completely waffled and implemented it so it can use recursion, but actually loops through! The applications I'm considering are going to be short alignments, so I didn't worry about memory efficiency (who wants to bet that will bite me back!) nor did I fixate on speed (care to double the bet?) -- indeed, I wrote it to allow all sorts of baroque variations, such as different penalties for opening gaps in the two different sequences & for basic profile-to-sequence alignments. Plus it is either Smith-Waterman (local) or Needleman-Wunsch-Sellers (global), with a simple toggle.
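For readers who want a starting point, here is a minimal Python sketch of the idea: a loop-based dynamic programming aligner with a toggle between local (Smith-Waterman) and global (Needleman-Wunsch) modes. It is not the C# implementation described above. It assumes a single linear gap penalty (none of the baroque variations), a plain list-of-lists matrix, and arbitrary default scores; the function name `align` and its parameters are made up for this example.

```python
def align(seq_a, seq_b, local=True, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) using a linear gap penalty.

    local=True  -> Smith-Waterman style local alignment
    local=False -> Needleman-Wunsch style global alignment
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best, best_cell = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            value = max(score[i - 1][j - 1] + sub,   # align the two residues
                        score[i - 1][j] + gap,       # gap in seq_b
                        score[i][j - 1] + gap)       # gap in seq_a
            if local:
                value = max(value, 0)                # local alignments may restart
                if value > best:
                    best, best_cell = value, (i, j)
            score[i][j] = value

    # Traceback: from the best cell (local) or the bottom-right corner (global).
    i, j = best_cell if local else (rows - 1, cols - 1)
    final_score = score[i][j]
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if local and score[i][j] == 0:
            break
        sub = None
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if sub is not None and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return final_score, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(align("TTACGTTT", "GGACGAA", local=True))
    print(align("ACGT", "AGT", local=False))
```

The full matrix is kept in memory, which is fine for short alignments but costs time and space proportional to the product of the two sequence lengths. Affine gap penalties, per-sequence gap costs, or profile columns would each need extra state on top of this sketch.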

So now the pitch: If you are a bioinformatics programmer & you haven't written one, I urge you to do it. It's great practice & nothing illustrates an algorithm like trying to implement it. If you don't consider yourself a programmer, guess what? It's perhaps not the obviously easy first start, but just thinking about it will stretch your mind. Plus, you get a free bioinformatics Rorschach test from your implementation choices!

One last thought: who can think up (and execute) the most comically baroque -- but functional -- implementation of S-W/NWS? Has it already been done in PostScript? How about in a relational database (I've written some pretty baroque SQL this year, but I doubt I could tackle this)? S-W as an Excel spreadsheet? Coded with glider guns? A full description for a true Turing machine? Of course, the grand prize winner would clearly be a DNA computer built to compute an alignment -- but perhaps that could even be topped by implementing the algorithm with living cells as the alignment cells!

Building a new Alvin [The Tree of Life]

Posted: 25 Aug 2008 09:51 PM CDT

Quick post here. For those interested in deep-sea research, you should check out the story by William Broad in the NY Times on building a replacement for Alvin (New Submersible to Expand Deep-Sea Exploration). Alvin is a wonderful little submarine that I and many others have relied upon for much of our research. But it definitely has some issues. And it looks like Woods Hole Oceanographic Institution is in the process of building a replacement.

Tree of Life Imagery at Starbucks [The Tree of Life]

Posted: 25 Aug 2008 04:02 PM CDT

Not the best resolution (damn that iPhone camera) but just thought I would post the picture I saw at a Starbucks in San Francisco where I was for a DARPA meeting discussing the "laws of biology". You see - even Starbucks is a fan of the Tree of Life.
