The DNA Network

Anticipation for Carl Zimmer's New Book on E. coli [adaptivecomplexity's column]

Posted: 02 May 2008 08:23 PM CDT

Science writer Carl Zimmer is building anticipation for his new book, Microcosm: E. coli and The New Science of Life.

Google management opposes motion against censorship [the skeptical alchemist]

Posted: 02 May 2008 07:03 PM CDT

Now that you have got angry, please hold on and get ready to read the news (via il blog censurato).

Google's management is encouraging their stockholders to vote against two motions being raised at the annual stockholders' meeting - a motion against internet censorship, and another in favor of the creation of an internal Human Rights Commission, aimed at legally fighting violation of human rights in the countries where Google operates.

Stockholders' hearts might be bad for their pockets, apparently.

Also via il blog censurato, I found this very cool tool comparing English search results for China, the USA, Germany and France. Check out this screen shot, looking at the difference in results when we compare France and China in relation to the search "human rights".

This is a tool developed by Mark Meiss and Filippo Menczer at the Indiana University School of Informatics - you can find more information on the Censearchip website about how the tool works. Notice how, in the image, some of the terms most frequently arising from the French search about human rights are "China" and "Beijing". Not surprisingly, these terms do not emerge in the Chinese search.

But my favorite has got to be the image search, comparing US and Chinese search, using the term "Tienanmen". Here comes the screenshot.

I hope you are not surprised by what's missing on the left side...and no, I was not referring to the lesbian kiss.

Censorship on the net is everybody's business. I encourage you to cover, no matter what the main subject of your blog is, as much news about censorship as you can.

Fun: Icelandic DNA Runic Reading [Think Gene]

Posted: 02 May 2008 04:20 PM CDT

Last week, I visited deCODE Genetics in Reykjavík, Iceland. To culturally prepare (procrastinate work), I hit my trusty Wikipedia to research all about Iceland. Obviously, the first thing one needs to know when visiting Iceland is the ancient Viking runic system. So, for a bit of Icelandic-cultural bio-blog genomics flair:

DNA   Binary      Rune name   Translation               Unicode
AA    0000 (00)   fe          wealth                    0x16A0
AG    0001 (01)   ur          rain                      0x16A2
AC    0010 (02)   thurs       giant (as in Thursday)    0x16A6
AT    0011 (03)   as          aesir                     0x16AC
GA    0100 (04)   reidh       journey                   0x16B1
GG    0101 [...]
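The DNA-to-rune mapping listed above can be sketched in a few lines of Python. The two-bit code per base (A=00, G=01, C=10, T=11) is inferred from the rows that are shown; the function and dictionary names are made up for illustration, and only the five complete rows are included because the rest of the list is cut off here.

```python
# Two-bit code per base, inferred from the rows shown
# (AA=0000, AG=0001, AC=0010, AT=0011, GA=0100).
BASE_BITS = {"A": 0b00, "G": 0b01, "C": 0b10, "T": 0b11}

# Only the rows that appear in full: index -> (rune name, Unicode code point).
RUNES = {
    0: ("fe", 0x16A0),
    1: ("ur", 0x16A2),
    2: ("thurs", 0x16A6),
    3: ("as", 0x16AC),
    4: ("reidh", 0x16B1),
}


def dinucleotide_index(pair):
    """Turn a two-letter DNA string into its 4-bit index (first base = high bits)."""
    first, second = pair.upper()
    return (BASE_BITS[first] << 2) | BASE_BITS[second]


def rune_for(pair):
    """Return (name, glyph) for a dinucleotide, or None if the row is not listed."""
    idx = dinucleotide_index(pair)
    if idx not in RUNES:
        return None
    name, code_point = RUNES[idx]
    return name, chr(code_point)


print(dinucleotide_index("AT"), rune_for("AC"))
```

Dinucleotides beyond the listed rows return None rather than a guessed rune.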

Tranche in the news: More wins for Open Data [business|bytes|genes|molecules]

Posted: 02 May 2008 02:41 PM CDT

Proteome Commons Tranche is one of the cooler resources on the web. Ever since I met Jayson Falkner, I have liked their approach to open data, and their early support for CC0. Looks like Tranche has hit the big time with the announcement that the resource has been chosen to host all mouse model proteomics data collected by the National Cancer Institute. From the press release (which you can read in its entirety here):

The innovative scientific file sharing network and data repository, Tranche, has been chosen to host all Mouse Models proteomics data collected by the National Cancer Institute (NCI) Mouse Proteomic Technologies Initiative (MPTI) for public release.

In collaboration with Dr. Philip Andrews, University of Michigan, Department of Biological Chemistry and the Tranche team, the NCI MPTI project consortia deposited their mass spectrometry data sets into the Tranche data repository for storage and secure data sharing among participating research labs. See details about the MPTI projects below.

The mouse model data sets are already available on Tranche.

Further reading
MPTI
Science Commons blog

Photo Caption Contest [Bayblab]

Posted: 02 May 2008 02:29 PM CDT

I stole this idea from Rob, but he's a bit busy to post it himself.

This photo of the Governator in 'research mode' is just begging for a witty caption, so have at it! After an indeterminate amount of time, the funniest one will be chosen arbitrarily and the author showered with praise (sorry, our prize budget doesn't allow for much more than that).

(Photo source: Nature News)

Just to get things going, here's my entry: "It's not a too-mah."

Grad student motivation cabinet from Fisher [T Ryan Gregory's column]

Posted: 02 May 2008 01:59 PM CDT

Always on the lookout for new ways to motivate graduate students to get their work finished, I took note of this handy item while flipping through the Fisher catalogue in search of a flammables cabinet for the lab.

Confusion over Cloning [adaptivecomplexity's column]

Posted: 02 May 2008 01:24 PM CDT

Ethical debates over cloning are confusing enough, but even without the ethics issues the terminology of cloning is extremely confusing. Scientists bat around the word in many different contexts, often with subtly different meanings; if you don't know the biological background, it's easy to become disoriented.

1000th post! [Bayblab]

Posted: 02 May 2008 01:02 PM CDT

For the occasion I wish to share with you interesting tidbits about penguin sexuality:

Penguins are famous for engaging in animal homosexual behaviour. They even sometimes steal eggs from heterosexual couples and rear the young. A famous case at the NY zoo, and reports from zoos all over the world, have confirmed that not only are homosexual couples frequent, but the bonding between the gay partners is strong. Case in point: a zoo in Germany tried importing Swedish chicks and even they couldn't lure the males away!

Penguins are also known to prostitute themselves for a rock. They will let other penguins exchange sexual favors for nest building material.

Penguins sometimes engage in inter-species sex. Perhaps unwillingly. Sometimes sexually frustrated seals molest penguins on the beach. I am not even making that stuff up!

An MCMC-EM algorithm for statistical analysis of DNA evolution with neighbour-dependent substitution rates [Mailund on the Internet]

Posted: 02 May 2008 12:18 PM CDT

When estimating evolutionary parameters from DNA sequences, we usually assume that each site evolves independently of all the others. This is a choice of convenience (the mathematics is a lot simpler this way) and doesn’t actually match the data that well. In fact, we know several cases where the neighbouring sites greatly influence the evolutionary process. Just think of codons or CpG di-nucleotides.

The reason we like to consider the sites independent is that it gives us a very manageable model for the sequence evolution. Each site can be modelled as a continuous time finite state Markov model with four states and we can easily derive substitution probabilities from this. When the sites are not independent, then this idea breaks down. We cannot split the sequence into separate parts, so although we can still model the sequence evolution as a finite state continuous Markov model, the model now needs to consider the entire sequence as a whole. Instead of a 4-state Markov model we end up with a 4ⁿ-state Markov model, where n is the number of nucleotides. This is no longer all that manageable.
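To make the single-site case concrete, here is a minimal sketch in Python. It assumes the simplest possible rate matrix, with every substitution happening at the same rate alpha (the Jukes-Cantor model, which is my choice for illustration and not something taken from the paper), and derives the substitution probabilities as P(t) = exp(Qt) with a small hand-rolled scaling-and-squaring routine. All function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, squarings=10, terms=12):
    """Approximate exp(q * t): Taylor series on q*t/2**squarings, then repeated squaring."""
    n = len(q)
    scale = t / 2 ** squarings
    small = [[x * scale for x in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # After this step, term == small**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[r + s for r, s in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def equal_rates_matrix(alpha):
    """4x4 rate matrix: off-diagonal rate alpha, rows summing to zero (an assumed model)."""
    return [[-3 * alpha if i == j else alpha for j in range(4)] for i in range(4)]


alpha, t = 0.1, 2.0
p = mat_exp(equal_rates_matrix(alpha), t)
# Closed form for this particular model, used here only as a sanity check.
p_same = 0.25 + 0.75 * math.exp(-4 * alpha * t)
print(p[0][0], p_same)
```

With four states this is trivial; the trouble described above is that the same recipe applied to a whole sequence needs a matrix with 4ⁿ rows.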

One of my colleagues at BiRC, Asger Hobolth, just published a paper addressing this problem:

A Markov Chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates

Asger Hobolth

Journal of Computational & Graphical Statistics, Volume 17, Number 1, 2008, pp. 138-162(25)

Abstract

The evolution of DNA sequences can be described by discrete state continuous time Markov processes on a phylogenetic tree. We consider neighbor-dependent evolutionary models where the instantaneous rate of substitution at a site depends on the states of the neighboring sites. Neighbor-dependent substitution models are analytically intractable and must be analyzed using either approximate or simulation-based methods. We describe statistical inference of neighbor-dependent models using a Markov chain Monte Carlo expectation maximization (MCMC-EM) algorithm. In the MCMC-EM algorithm, the high-dimensional integrals required in the EM algorithm are estimated using MCMC sampling. The MCMC sampler requires simulation of sample paths from a continuous time Markov process, conditional on the beginning and ending states and the paths of the neighboring sites. An exact path sampling algorithm is developed for this purpose.

He did the work while at North Carolina State University, though, and not at BiRC, so I have no inside information. I only read the paper today, and I’m in Oxford, so I haven’t been able to discuss it with him yet. Anyway, I’ll try to briefly describe the results here, and maybe go back and correct any misunderstandings later, after I’ve discussed it with Asger.

The problem with neighbour dependencies

The problem is actually quite easy to understand. If the substitution rate of one nucleotide depends on its neighbours — say CpG changes to TpG or CpA faster than C changes to T or G to A in general — then, when modelling the substitution rate of one of the nucleotides, I need to know what the second is. If the second nucleotide didn’t change over time, then there wouldn’t be a problem, and I could model the substitution rate of the first as a Markov chain and derive a 4×4 substitution probability matrix. But there is no reason to think that one nucleotide can evolve but the other will just stay put, so I need to model the evolution of both. At any point in time, I need to capture their joint state. So I would have to work with a 16×16 rate and probability matrix.
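As a rough illustration of the two-nucleotide case, the sketch below builds a 16×16 rate matrix over joint states in Python. The parameters are my own assumptions, not values from the paper: every single-nucleotide change happens at a base rate mu, and the two CpG changes (CG to TG and CG to CA) are sped up by a factor kappa. Changing both nucleotides at the same instant gets rate zero.

```python
from itertools import product

BASES = "ACGT"
STATES = ["".join(p) for p in product(BASES, repeat=2)]  # 16 joint states


def pair_rate_matrix(mu=1.0, kappa=5.0):
    """Return (Q, index) for a dinucleotide; mu and kappa are hypothetical parameters."""
    index = {state: i for i, state in enumerate(STATES)}
    q = [[0.0] * len(STATES) for _ in STATES]
    for src in STATES:
        row = q[index[src]]
        for pos in (0, 1):
            for base in BASES:
                if base == src[pos]:
                    continue
                dst = src[:pos] + base + src[pos + 1:]
                rate = mu
                # Neighbour dependence: C->T or G->A is faster only in a CG context.
                if src == "CG" and (pos, base) in ((0, "T"), (1, "A")):
                    rate = mu * kappa
                row[index[dst]] = rate
        # Diagonal entry makes the row sum to zero.
        row[index[src]] = -sum(row)
    return q, index


q, index = pair_rate_matrix()
print(q[index["CG"]][index["TG"]], q[index["CG"]][index["AG"]])
```

The rate of the first nucleotide leaving C depends on whether the second is currently G, which is exactly why the two sites cannot be given separate 4×4 matrices.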

Of course, if there are three nucleotides, then the first and second might be dependent while the second and third are dependent, and that indirectly makes all three dependent on each other. I always need to model the full state, so I have 4ⁿ states and need to work with a 4ⁿ×4ⁿ rate matrix.

Theoretically, I just need to exponentiate the rate matrix, but in practice you cannot exponentiate a 4ⁿ×4ⁿ matrix for any realistic sequence length n.
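A quick back-of-the-envelope calculation in Python shows how fast this blows up. The only assumption is that the matrix is stored densely with 8 bytes per entry; the function names are made up.

```python
def state_count(n):
    """Number of joint states for a sequence of n nucleotides."""
    return 4 ** n


def dense_matrix_bytes(n, bytes_per_entry=8):
    """Memory for a dense 4^n x 4^n matrix, assuming 8-byte floats."""
    return state_count(n) ** 2 * bytes_per_entry


for n in (1, 2, 5, 10):
    print(n, state_count(n), dense_matrix_bytes(n))
```

Already at n = 10 there are 1,048,576 states, and the dense matrix would take 2⁴³ bytes, or 8 TiB, before a single exponentiation has been attempted.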

Sampling your way out of the problem

Asger solves this problem by calculating substitution probabilities by sampling instead of matrix exponentiation. Sampling doesn’t get rid of the problem of having to model the state of the entire string over time, of course, so he needs two tricks.

First he derives a way of sampling the evolutionary history of one nucleotide, conditional on the history of its neighbours. Some of the details here are not clear to me yet, so I’ll refrain from saying more — I might add details later after I’ve talked to Asger. Anyway, he derives a way of sampling the evolution of a single nucleotide, conditional on all other histories.
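To give a feel for what "sampling a history conditional on its endpoints" means, here is a deliberately naive Python sketch. It is not the exact path sampler from the paper: it ignores the neighbours entirely, assumes every substitution has the same rate, and simply simulates forward and throws away paths that end in the wrong state (rejection sampling). All names and parameters are hypothetical.

```python
import random

BASES = "ACGT"


def simulate_path(start, t_total, rate, rng):
    """Forward-simulate one site; returns a list of (jump time, new state) events."""
    path = [(0.0, start)]
    t, state = 0.0, start
    while True:
        # Each of the 3 other bases is reached at `rate`, so the exit rate is 3 * rate.
        t += rng.expovariate(3 * rate)
        if t >= t_total:
            return path
        state = rng.choice([b for b in BASES if b != state])
        path.append((t, state))


def sample_conditional_path(start, end, t_total, rate, rng, max_tries=100000):
    """Keep simulating until a path ends in `end` (naive rejection sampling)."""
    for _ in range(max_tries):
        path = simulate_path(start, t_total, rate, rng)
        if path[-1][1] == end:
            return path
    raise RuntimeError("no path accepted; end state too unlikely for rejection sampling")


rng = random.Random(1)
print(sample_conditional_path("C", "T", 1.0, 0.5, rng))
```

Rejection gets hopelessly slow when the end state is unlikely, which is one reason an exact conditional sampler is worth deriving.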

Secondly, and this now is the easy part, he constructs an MCMC to then sample from all histories. This is easy because if you can sample from conditionals like he can, then you can just use the Gibbs sampling machinery and Bob’s your uncle.
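The Gibbs machinery itself is easy to show on a toy problem. In the Python sketch below the "variables" are just two binary values with a made-up joint distribution, whereas in the paper each variable is the whole history of a site; the point is only that alternately resampling each variable from its conditional gives samples from the joint.

```python
import random

# Made-up joint distribution over two binary variables (x, y).
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}


def conditional_x(y):
    """P(x = 1 | y), read off the joint table."""
    return JOINT[(1, y)] / (JOINT[(0, y)] + JOINT[(1, y)])


def conditional_y(x):
    """P(y = 1 | x), read off the joint table."""
    return JOINT[(x, 1)] / (JOINT[(x, 0)] + JOINT[(x, 1)])


def gibbs(n_samples, rng, burn_in=100):
    """Alternate draws from the two conditionals; discard the first burn_in sweeps."""
    x, y = 0, 0
    samples = []
    for step in range(n_samples + burn_in):
        x = 1 if rng.random() < conditional_x(y) else 0
        y = 1 if rng.random() < conditional_y(x) else 0
        if step >= burn_in:
            samples.append((x, y))
    return samples


samples = gibbs(20000, random.Random(0))
print(sum(x for x, _ in samples) / len(samples))  # should be near P(x = 1) = 0.5
```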

OK, so now he can sample paths between two DNA sequences, which means he can estimate expectations over the set of all paths. From these expectations he can then estimate maximum likelihood parameters using the expectation maximisation (EM) machinery.
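A minimal Python sketch of how sampled paths feed an EM update, under assumptions of my own: a single site, and a completely unconstrained rate matrix, for which the standard maximisation step sets each rate to the expected number of jumps divided by the expected time spent in the source state. The paper's neighbour-dependent parameterisation will differ, and the path format and function names here are hypothetical.

```python
from collections import defaultdict


def sufficient_statistics(paths):
    """Monte Carlo E-step: average dwell time per state and jump count per (from, to).

    Each path is a list of (state, dwell_time) segments in time order.
    """
    dwell = defaultdict(float)
    jumps = defaultdict(float)
    for path in paths:
        for k, (state, duration) in enumerate(path):
            dwell[state] += duration
            if k + 1 < len(path):
                jumps[(state, path[k + 1][0])] += 1
    n = len(paths)
    return ({s: d / n for s, d in dwell.items()},
            {pair: c / n for pair, c in jumps.items()})


def m_step(dwell, jumps):
    """M-step for an unconstrained rate matrix: rate = expected jumps / expected dwell."""
    return {pair: count / dwell[pair[0]] for pair, count in jumps.items()}


# Two hand-written "sampled" paths over one unit of time.
paths = [[("C", 0.5), ("T", 0.5)], [("C", 1.0)]]
dwell, jumps = sufficient_statistics(paths)
print(m_step(dwell, jumps))
```

In a full MCMC-EM loop, the paths would be redrawn under the updated rates and the two steps repeated until the estimates settle.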

There are a few more tricks needed to deal with multiple sequences, but essentially, that is the idea.

No more from me now, I’m off to the pub with the Oxford guys…

Hobolth, A. (2008). A Markov Chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates. Journal of Computational & Graphical Statistics, 17(1), 138-162. DOI: 10.1198/106186008X289010

I am still alive [Mary Meets Dolly]

Posted: 02 May 2008 11:32 AM CDT

It may seem to all of you that I have disappeared. I am still here. I think about you all every day. Unfortunately, things in my life have been super crazy lately, causing me to have to quit my job.

The good news is, after my last day at the end of this month, I will be back in full blogging mode! I hope to see you then!

A gene controlling brain size and schizophrenia? [adaptivecomplexity's column]

Posted: 02 May 2008 08:58 AM CDT

When it comes to manipulating your body with drugs, you have no better friend than your G-protein coupled receptors. G-protein coupled receptors (ok, you can call them GPCRs) are proteins embedded within the membrane that makes up the outer border of our cells, and their exposed cell-surface position makes them great targets for drugs. If you've ever taken Claritin, Zantac, beta blockers like Lopressor, oxytocin, epinephrine, Zyprexa, antihistamines, some anti-HIV drugs, opioids, cannabis, or merely consumed a caffeinated beverage, then you've medicinally manipulated some of your GPCRs. Nearly 1,000 of our 24,000 genes encode GPCRs, which testifies to the major role this class of proteins plays in our physiology.

Given the importance of these receptors, it is not surprising that an interesting new study is describing a GPCR which may play a role in brain size, memory, and social interaction, and mutations in this GPCR could play an important role in schizophrenia.

Cancer Carnival #9 is Here! [Bayblab]

Posted: 02 May 2008 08:57 AM CDT

The 9th Edition of the Cancer Research Blog Carnival has gone live at Hematopoiesis. Your host, Alexey, has a nice collection of posts, including news from the recent AACR meeting. The next edition is set for June 6 so start penning your submissions. Contact the Bayblab if you're interested in hosting a future edition.

Thanks to Alexey for a great job and, as always, hats off to Ben for designing the logo.

Medical Bloggers’ Panel: Anyone interested? [ScienceRoll]

Posted: 02 May 2008 08:12 AM CDT

We are currently recruiting bloggers who would be interested in participating in a Medical Bloggers’ Panel during the Medicine 2.0 Congress taking place in Toronto, Canada, this September 4-5.

Some details:

Panels are 45-60 min presentations or debate sessions of a group of leaders in a field discussing a broad issue of general interest from various perspectives.

Please note that normally we will not be able to cover the registration fee, travel and accommodation for any of the panelists.

The abstract should be up to 500 words long, with a short overview of the common issues and 1-2 sentences per presenter about the contribution of each panelist.

So far we have:

Jen McCabe Gorman - Health Management Rx
Sam Solomon - Canadian Medicine Blog
Berci Mesko - ScienceRoll

If you’re interested, please send me an e-mail (berci.mesko at gmail.com) so we can submit our abstract in the next couple of days.

Also check out Jen’s post about it.

Evolutionary Applications and Evolution: Education and Outreach [T Ryan Gregory's column]

Posted: 02 May 2008 07:13 AM CDT

In case you have missed them, two issues are now available for each of the new evolution journals:

Evolutionary Applications

GraphJam on website insanity. [T Ryan Gregory's column]

Posted: 02 May 2008 07:00 AM CDT

I had mentioned GraphJam on the old Genomicron. It's like LOLcats for nerds. Erm, for even bigger nerds. This one seems highly apt, though ALL CAPS alone is usually sufficient.

Latest on Spectral Lines [Sciencebase Science Blog]

Posted: 02 May 2008 07:00 AM CDT

There have been 32 issues of my science news column on spectroscopynow.com since it was last officially called Spectral Lines, but I thought it was a nice name so occasionally resurrect it here when I highlight the latest research findings I cover on the site. It also gives me an excuse to re-use a logo I did in the early days of the site touting the line “David Bradley On Spec” (geddit?).

So this week, the first May issue is brought to you by the letter “F” with articles entitled: Fishing for amines, Fancy ants for arthritis, and Fixing chemotherapy. We also have Rewiring brains therapeutically, Hybrid contact, and Boning up with Raman, but they don’t start with an “F” so required a separate sentence. Anyway…

Those fancy ants are perhaps not the first organism one would think to turn to for medical assistance, but researchers in Hong Kong and Japan have now used spectroscopy to study the chemical structures of various compounds extracted from Chinese medicinal ants that are thought to have anti-arthritic activity and be beneficial in treating hepatitis. There are lessons to be learned here, regarding the harvesting of traditional knowledge from folk medicine as well as yet another reason to try and conserve biodiversity the world over.

In Rewiring brains therapeutically, Edward Taub and colleagues at UAB use MRI scans to lay to rest once and for all the medical myth that the adult brain cannot grow new neurons. They show that a form of therapy, developed by Taub in the early 1990s for helping stroke patients recover use of paralysed limbs, so-called constraint induced (CI) therapy, really does induce a remodelling of the brain.

And in my Hybrid contact item, I discuss how early attempts to create protein-polymer hybrid materials often foundered because the mixed chemistry was simply not up to the task. Now, a UCB team has developed a new approach to hooking up natural proteins with synthetic polymers that could work with almost any protein and any polymer and could be used to develop new types of chemical sensor for medical diagnostics, quality control and environmental analysis. Related materials might also work as highly targeted drug-delivery systems, or even as the components of a future nanomachine.

A post from David Bradley Science Writer

Taking My Eye Off DNA [Eye on DNA]

Posted: 02 May 2008 04:36 AM CDT

Eye on DNA celebrates its first birthday this week! In celebration, I’ve decided to reward myself by slowing down a bit.

As some of you already know, I am expecting my second child in a few weeks. Last night, I was reading The Last Lecture on my new Kindle (woohoo!), and this passage got my attention:

Ask yourself: Are you spending your time on the right things? You may have causes, goals, interests. Are they even worth pursuing?

In some ways, blogging is becoming an albatross around my neck. The wonderful aspects of blogging–learning, networking, educating–still outweigh the annoyances. But in my present condition, I’m not sure if I’m spending my time on the right things. The clock is ticking and my attention span is shortening along with my temper. And on top of welcoming a new member to our family, my family and I are also relocating to Singapore from London this summer.

While I’ll still be keeping my eye on DNA *cough* over these next few months, the rest of me will be quite busy doing other things. Instead of posting every day, I intend to spend much more of my usual blogging time having fun with my five-year-old and husband before our lives turn upside down.

I’ll still be here but maybe not jumping around as much as usual. (How can I when I’m about to pop?!)

Thank you all for a great year. I’ll be back before too long so don’t forget about me!

Books About DNA: Tomorrow’s Table [Eye on DNA]

Posted: 02 May 2008 03:02 AM CDT

Tomorrow’s Table: Organic Farming, Genetics, and the Future of Food by Pamela C. Ronald and R. W. Adamchak

From Dr. Ronald’s blog:

One of the major themes of our book “Tomorrow’s Table: Organic Farming, Genetics and the Future of Food” is that the judicious incorporation of two important strands of agriculture—genetic engineering and organic farming—is key to helping feed the growing population in an ecologically balanced manner. We are not suggesting that organic farming and GE alone will provide all the changes needed in agriculture. Other farming systems and technological changes, as well as modified government policies, undoubtedly are also needed. Yet it is hard to avoid the sense that organic farming and genetic engineering each will play an increasingly important role, and that they somehow have been pitted unnecessarily against each other. Our ambition in this book, therefore, is not to be comprehensive, but to identify roles for both GE and organic farming in the future of food production.

Another theme of the book is that the broader goals of ecologically responsible farming, and the adherence to those ideals, are more important than the methods used to develop new plant varieties. To this end, we have generated a list of key criteria to help guide policy decisions about the use of GE in food and farming.

Around the Blogs [Bitesize Bio]

Posted: 02 May 2008 02:24 AM CDT

As per tradition, it’s time for the weekly roundup of informative blog posts outside of your regular Bite of Bio. This week, it’s striking that the posts to choose from are heavy on the science and light on the personal or social commentary that bloggers enjoy so much. So this week, we’re focusing on the science itself: visit the posts, and leave comments if you find them interesting.

New Research on How Visual Memory Works -
A paper about memory, just published, is an example of one incremental step in this process. In short, this research works out some of the fine detail at the molecular level for the process of forming visual memories.

Whose Genome? -
“What is a genome?” and “whose genome was sequenced?” are legitimate questions, and what follows is an attempt at clarification that is, by necessity, as much philosophical as scientific.

The Human Genome is Old News. Next Stop: the Human Proteome -
A Nature News article describes the initial plans for an ambitious effort to begin mapping the complete human proteome: the set of all human proteins expressed in all of our cells at all points during our development and adult life.

Widdle Biddy Stem Cells -
Very small embryonic-like stem cells may provide a potential clue as to tissue renewal in adults.

The Individuality of Bacteria -
Larry debunks the common misconception about the biological study of life, which is that it promotes a determinism that denies individuality and freedom.

Where the Wild Microbes Are: A New Theory on How Pathogens Survive Food Processing -
Common sense says that washing and proper handling of our food should simply be enough to prevent illness outbreaks, but this isn’t always true.

Accuracy of Large-Scale Genome Scanning Services [The Genetic Genealogist]

Posted: 02 May 2008 02:00 AM CDT

Although the genome scanning services offered by companies such as 23andMe, deCODEme, and SeqWright have been front and center in the press the last few weeks, I’m sure that the following information will not be included in any of the reports.

Comparisons

Two different sources have concluded that the scanning services offered by 23andMe and deCODEme, which use different types of Illumina SNP chips, are highly reproducible. In January 2008, Ann Turner compared the results of testing at deCODEme and 23andMe, and concluded that of the 560,163 SNPs that overlapped and had a “call” (meaning there was a measurable result), they agreed on 560,128 and disagreed on 35. Ann wrote in January:

In all of [the disagreed calls], one company would make a homozygous call while the other company made a heterozygous call - there were no cases where they made a completely discordant call. All in all, I’d say that is pretty impressive.

The second analysis comes from Antonio C B Oliveira at Longa Vista, a new blog that appears to have been created to present these results and related information. Oliveira obtained results from 23andMe and deCODEme and compared the results, which are available here. He concluded that of the 560,299 SNPs that overlapped and had a call, the two scans agreed on 560,276 and disagreed on 23. The 23 disagreed upon SNPs are listed by chromosome. Oliveira writes:

This error rate seems to me to be quite acceptable and I wonder if this is the rate expected in scientific studies using the same technology.
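For readers who want to see what such a comparison involves, here is a minimal Python sketch. It is not Oliveira's program. It assumes each company's results have already been loaded into a dictionary from SNP identifier to a two-letter genotype, with "--" standing for a no-call; the identifiers, the no-call marker and the function names are all hypothetical.

```python
NO_CALL = "--"  # assumed marker for a SNP without a measurable result


def normalize(genotype):
    """Sort the two alleles so that 'AG' and 'GA' count as the same call."""
    return "".join(sorted(genotype))


def compare_calls(calls_a, calls_b):
    """Return (number of SNPs called by both, list of SNPs where the calls differ)."""
    shared = [snp for snp in calls_a
              if snp in calls_b and NO_CALL not in (calls_a[snp], calls_b[snp])]
    discordant = [snp for snp in shared
                  if normalize(calls_a[snp]) != normalize(calls_b[snp])]
    return len(shared), discordant


def is_homozygous(genotype):
    """True when both alleles are the same letter."""
    return genotype[0] == genotype[1]


def discordance_rate(disagreements, overlapping):
    """Fraction of overlapping called SNPs on which the two scans disagree."""
    return disagreements / overlapping


# Toy data, not real results.
scan_a = {"rs1": "AG", "rs2": "CC", "rs3": "--", "rs4": "TT"}
scan_b = {"rs1": "GA", "rs2": "CT", "rs3": "AA", "rs5": "GG"}
print(compare_calls(scan_a, scan_b))
print(discordance_rate(35, 560163), discordance_rate(23, 560299))
```

Plugging in the figures quoted above, 35 disagreements out of 560,163 is roughly 0.006% and 23 out of 560,299 is roughly 0.004%.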

Program to Compare Your Results

Interestingly, Oliveira created a computer program to analyze the results for him, and he has graciously made that program available “as a Windows executable and the source code is provided under the GNU General Public License.”

Conclusions/Thoughts

Note that Oliveira’s results contained 136 more overlapping results, presumably because of fewer no-calls in the data. Is Illumina able to produce more calls as they gain experience with the process, or is this an expected amount of variation from person to person? I would be interested to see more results and comparisons to determine the answer to this question.

HT: Genetic Future. If you’re interested in genome sequencing or personalized genomics, you should be reading Genetic Future. I highly recommend adding the feed to your reader. Genetic Future gave a hat tip about this information to Kevin Kelly at The Quantified Self. There, Kelly points out that none of the SNPs in Oliveira’s analysis are currently associated with any physical phenotype or disease. I hope Kelly plans to do a comparative analysis of his results, as that would be an interesting addition to the information provided by Turner and Oliveira.

Biobootcamp 2008 [business|bytes|genes|molecules]

Posted: 02 May 2008 01:06 AM CDT

Perhaps I was premature in bemoaning the lack of a startup school for life scientists. Adam Rubenstein points to biobootcamp 2008. Not exactly what I had in mind, but knowing some of the people involved, I suspect it will be quite useful to people.

On the road… [Mailund on the Internet]

Posted: 02 May 2008 12:47 AM CDT

I’m writing this sitting on Jotun Hein’s floor where I slept last night. I’m on a trip to the UK, starting here in Oxford. I arrived pretty late last night — the plane out of Billund was delayed — and I woke up early this morning when Jotun got up to prepare a talk for later today. Four hours of sleep is a bit on the small side, but the very strong coffee Jotun brews should get me through.

I’m looking forward to chatting with people here in Oxford. It’s been a while since I was last here, and I’ve somewhat lost touch with what is going on. I’ll only be here over the weekend, but a lot of people come to the office at the weekend anyway, so it shouldn’t be a problem.

Monday morning I’m moving on, to London, to visit David Balding’s group to do some work on HapCluster.

Genetic breakthrough explains dangerously high blood glucose levels [Think Gene]

Posted: 01 May 2008 11:56 PM CDT

Canadian, French and British researchers have identified a DNA sequence that controls the variability of blood glucose levels in people. This is a potentially significant discovery because high blood glucose levels in otherwise healthy people often are indications of heart disease and higher mortality rates. The results will be published May 1 in the online [...]

Wakame waste: Composting polluted seaweed [Think Gene]

Posted: 01 May 2008 11:55 PM CDT

Bacteria that feed on seaweed could help in the disposal of pollutants in the world’s oceans, according to a new study by researchers in China and Japan. The discovery is reported in the International Journal of Biotechnology, an Inderscience publication. Shinichi Nagata of the Environmental Biochemistry Group, at Kobe University, Japan, working with colleagues at Shimane [...]

Stanford researchers synthesize compound to flush HIV out of hiding [Think Gene]

Posted: 01 May 2008 11:34 PM CDT

Any hunter will tell you that when your quarry goes into hiding, you have to flush it out to get a good shot at it. Such is the case with HIV, the virus that causes AIDS. Though antiretroviral “cocktails” can target an active infection, they cannot get at the virus when it retreats inside the host’s [...]

UF scientists discover compound that could lead to new blood pressure drugs [Think Gene]

Posted: 01 May 2008 11:33 PM CDT

University of Florida researchers have identified a drug compound that dramatically lowers blood pressure, improves heart function and — in a remarkable finding — prevents damage to the heart and kidneys in rats with persistent hypertension. The findings, which appear in today's (May 1) edition of the American Heart Association journal Hypertension, could lead to a [...]

Friday, May 2, 2008
