Tuesday, November 25, 2008

The DNA Network

Using a "distributed grid of undergraduate students" to annotate genomes [Discovering Biology in a Digital World]

Posted: 25 Nov 2008 07:36 PM CST

I just love this title! It's nerdy and cute, all at the same time.

I read about this on www.researchblogging.org and had to check out the paper and the blog write-up from The Beagle Project. (BTW: some of you may be interested in knowing that The Beagle Project is not a blog about dogs.)

The role of neutral mutations in the evolution of phenotypes [The Seven Stones]

Posted: 25 Nov 2008 05:16 PM CST

Research highlight by Pedro Beltrao, University of California, San Francisco

In a recent opinion piece, Andreas Wagner tries to reconcile the tension between proponents of neutral evolution and selectionism (Wagner 2008). He argues that "neutral mutations prepare the ground for later evolutionary innovation". Wagner illustrates this point using a network model of genotype-phenotype relationships (Wagner 2005). In a so-called 'neutral network', nodes correspond to distinct genotypes associated with the same phenotype, and two nodes are connected by an edge if their genotypes differ by a single mutation event (e.g., a point mutation). Examples of neutral networks include different genotypes coding for the same RNA or protein structure. In this representation, highly connected networks correspond to robust phenotypes that are not very sensitive to changes in genotype. Wagner notes the zinc finger fold as an impressive example of a highly connected neutral network, as its structure remains essentially the same even after mutating all but seven of its 26 residues to alanine.
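To make the definition concrete, here is a minimal Python sketch of a neutral network. Everything in it is a toy assumption for illustration and does not come from Wagner's work: the two-letter alphabet, the genotype length of 4, and the hypothetical `phenotype` rule (a genotype counts as "folded" if it contains at least two 'A's). Robustness is taken here as the fraction of single mutants that keep the phenotype.

```python
from itertools import product

ALPHABET = "AB"  # toy alphabet (assumption)
LENGTH = 4       # toy genotype length (assumption)


def phenotype(genotype):
    """Hypothetical genotype-phenotype map: 'folded' if at least two 'A's."""
    return "folded" if genotype.count("A") >= 2 else "unfolded"


def single_mutants(genotype):
    """Yield every genotype that differs from `genotype` at exactly one position."""
    for i, old in enumerate(genotype):
        for new in ALPHABET:
            if new != old:
                yield genotype[:i] + new + genotype[i + 1:]


def neutral_network(target):
    """Return (nodes, edges) for all genotypes that share the phenotype `target`."""
    all_genotypes = ("".join(g) for g in product(ALPHABET, repeat=LENGTH))
    nodes = {g for g in all_genotypes if phenotype(g) == target}
    # An edge joins two genotypes in the network that are one mutation apart.
    edges = {frozenset((g, m)) for g in nodes for m in single_mutants(g) if m in nodes}
    return nodes, edges


def robustness(genotype, nodes):
    """Fraction of single mutants of `genotype` that stay inside the network."""
    mutants = list(single_mutants(genotype))
    return sum(m in nodes for m in mutants) / len(mutants)


nodes, edges = neutral_network("folded")
print(len(nodes), "genotypes,", len(edges), "edges")
print("robustness of AAAA:", robustness("AAAA", nodes))
print("robustness of AABB:", robustness("AABB", nodes))
```

In this toy map, a genotype deep inside the network (AAAA) tolerates every single mutation, while one at the edge (AABB) tolerates only half of them, which is the sense in which a highly connected network corresponds to a robust phenotype.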

Using this model, Wagner describes how highly robust phenotypes can lead to faster exploration of the genotype space. He further proposes that evolution of innovation occurs via cycles of exploration of nearly neutral spaces (dubbed neutralist regime) followed by a reduction in diversity once a new phenotype of higher fitness is discovered (selectionist regime).

Although these ideas were mostly developed using models of sequence-to-structure relationships, Wagner cites several examples suggesting that these concepts are equally valid for cellular phenotypes that depend on molecular interactions (e.g., gene expression patterns).

As Wagner points out, in order to understand the evolution of innovation we must fully understand the mapping from genotypes to phenotypes. This is why it is important to continue to develop richer evolutionary models that link changes at the DNA level with changes in molecular structures, interactions and ultimately phenotypes with a quantifiable impact on fitness. This is an area where systems biology should play an important role.

Models of RNA and protein structure stability upon mutation have existed for some time (Hofacker et al. 1994, Guerois et al. 2002). More recently, the study of large amounts of genomic information and/or systematic interaction studies has been providing us with accurate models for different types of molecular interactions (Berger et al. 2008, Burger & van Nimwegen 2008, Chen et al. 2008). In parallel, theoretical analysis has been used to aid in the understanding of cellular phenotypes (e.g., the cell cycle, signaling pathways) (Tyson et al. 2003). Connecting these different layers of abstraction is an important challenge that will allow us to better understand the origins of biological innovation.

Berger MF et al. (2008). Variation in homeodomain DNA binding revealed by high-resolution analysis of sequence preferences. Cell 133:1266-76

Burger L & van Nimwegen E (2008). Accurate prediction of protein-protein interactions from sequence alignments using a Bayesian method. Mol Syst Biol 4:165

Chen JR et al. (2008). Predicting PDZ domain-peptide interactions from primary sequences. Nat Biotechnol 26:1041-5

Guerois R, Nielsen JE & Serrano L (2002). Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369-87

Hofacker IL et al. (1994). Fast folding and comparison of RNA secondary structures. Monatshefte für Chemie / Chemical Monthly 125:167-188

Tyson JJ, Chen KC & Novak B (2003). Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol 15:221-31

Wagner A (2005). Robustness and Evolvability in Living Systems. Princeton University Press

Wagner A (2008). Neutralism and selectionism: a network-based reconciliation. Nat Rev Genet 9:965-974

Open Metagenomics Highlight - Metagenome Annotation using massively parallel undergrads. [The Tree of Life]

Posted: 25 Nov 2008 03:52 PM CST

Another fun metagenomics related paper in PLoS Biology. In it Pascal Hingamp et al discuss an Open Source, Open Science system for metagenome annotation (see PLoS Biology - Metagenome Annotation Using a Distributed Grid of Undergraduate Students).

They do this as part of a course on metagenome annotation, and the software for running it is all Open Source and available. They say:
"Teachers wishing to use the Annotathon for their courses are invited to create new teams on the public server at http://annotathon.univ-mrs.fr/ (course logistics and team management are detailed in the instructor manual: http://annotathon.univ-mrs.fr/Metagenes/index.php/Instructor_Manual). The underlying open-source software (PHP and MySQL scripts, under a General Public License) is also available for local installation (https://launchpad.net/annotathon/). In addition, a special "Open Access" team is available for freelance students (volunteer instructors are most welcome to help oversee the Open Access team)."
In a way this is a metagenomics version of the Undergraduate Genomics Research Initiative (UGRI), which was described in a PLoS Biology paper previously.

Well, this is really the end-all be-all for me, combining so many things I like: genomics, metagenomics, annotation, OA publishing, open source software, etc. Nice job Pascal et al ...

Genetic Privacy: T.J. Maxx and the NIH [PredictER Blog]

Posted: 25 Nov 2008 12:23 PM CST

What do T.J. Maxx, the V.A. and NIH have in common? They have all been involved in handling personal data in such a way that individual privacy and confidentiality may have been violated. In December 2006 the financial information of over 40 million customers of T.J. Maxx and Marshall's was accessed by a hacker, potentially exposing customers to identity theft. Also in 2006, a laptop computer containing personal information including names, addresses, dates of birth and social security numbers for 38,000 veterans went missing. This past August, large amounts of aggregate human DNA data that the National Institutes of Health and other groups had made open to researchers around the world were removed from public view due to privacy concerns. The reason behind this removal was a study (doi:10.1371/journal.pgen.1000167) released by the Translational Genomics Research Institute and the University of California showing that, using an algorithm and a microarray, a curious individual could possibly identify whether or not an individual's DNA was in a genome-wide association study (GWAS) database.

Why does this matter? NIH and other groups conducting GWA studies know that one of the core ethical components of their work, and a critical element for convincing people to participate in these studies, is being able to promise that their personal medical and genetic information will not be compromised and will never be used in such a way that might cause them harm. Being able to demonstrate, for example, that a representative of law enforcement armed with a DNA sample from a crime scene could successfully search an existing NIH database for a sample match undermines this promise in a way that might give us all pause. Researchers will still have access to the data, but they will now have to apply for it and agree to protect its confidentiality.

As researchers strive to use the information gained by the Human Genome Project for the improvement of health care and the prevention and treatment of disease, more and more of us will be asked to participate in efforts to establish enormous databases of our genotypic (DNA) and phenotypic (medical records) information. I still shop at Marshall's, but I am not sure I will be giving my DNA anytime soon. --Kimberly A. Quaid

Determining the genetic composition of an unknown founder population [Yann Klimentidis' Weblog]

Posted: 25 Nov 2008 11:29 AM CST

Using the HLA system, the authors describe a method to determine the haplotypes of an unknown founder population, given information on the haplotypes of the admixed population and those of the other founder population.

Re-creation of the genetic composition of a founder population.
Klitz W, Maiers M, Gragert L.
Human Genetics 2008 Nov;124(4):417-21.
Abstract: Human ethnic groups are frequently comprised of two or more founder populations. One of these founding populations is often available for contemporary sampling. We describe a method for reconstructing the composition of a missing founder population using the highly informative haplotypes comprising the HLA system. An application of the method is demonstrated using bone marrow registry samples of African Americans. We use contemporary samples of African Americans and European Americans to derive haplotypes of the West African founder populations. This approach may also be useful for reconstructing ancestral haplotypes for regions elsewhere in the genome.
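
One simple way to picture this kind of reconstruction is a linear admixture model, in which each haplotype frequency in the admixed population is a weighted average of its frequencies in the two founders. The Python sketch below illustrates that idea only; it is not necessarily the procedure used by Klitz et al. The function name, the haplotype labels, the frequencies and the admixture fraction are all made up for the example.

```python
def reconstruct_founder(admixed, known_founder, known_fraction):
    """Estimate haplotype frequencies of the missing founder population.

    Assumes admixed = known_fraction * known + (1 - known_fraction) * missing
    for every haplotype. Negative estimates are clipped to zero and the
    result is renormalised so the frequencies sum to 1.
    """
    haplotypes = set(admixed) | set(known_founder)
    raw = {
        h: max(admixed.get(h, 0.0) - known_fraction * known_founder.get(h, 0.0), 0.0)
        for h in haplotypes
    }
    total = sum(raw.values())
    return {h: value / total for h, value in raw.items()}


# Hypothetical frequencies, for illustration only.
known = {"H1": 0.1, "H2": 0.2, "H3": 0.7}       # sampled founder population
admixed = {"H1": 0.42, "H2": 0.44, "H3": 0.14}  # contemporary admixed population
missing = reconstruct_founder(admixed, known, known_fraction=0.2)

for name in sorted(missing):
    print(name, round(missing[name], 3))
```

With these invented numbers, haplotype H3 is fully explained by the sampled founder, so the sketch assigns it a frequency of about zero in the missing founder and splits the remainder evenly between H1 and H2.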

Do big brains call for special milk? [Yann Klimentidis' Weblog]

Posted: 25 Nov 2008 11:28 AM CST

Apparently not...
I can't believe no one has already studied this.

Evolutionary modifications of human milk composition: evidence from long-chain polyunsaturated fatty acid composition of anthropoid milks
Lauren A. Milligan and Richard P. Bazinet
Journal of Human Evolution 55: 6, December 2008, 1086-1095
Abstract: Brain growth in mammals is associated with increased accretion of long-chain polyunsaturated fatty acids (LCPUFA) in brain phospholipids. The period of maximum accumulation is during the brain growth spurt. Humans have a perinatal brain growth spurt, selectively accumulating docosahexaenoic acid (DHA) and other LCPUFA from the third trimester through the second year of life. The emphasis on rapid postnatal brain growth and LCPUFA transfer during lactation has led to the suggestion that human milk LCPUFA composition may be unique. Our study tests this hypothesis by determining fatty acid composition for 11 species of captive anthropoids (n = 53; Callithrix jacchus, Cebus apella, Gorilla gorilla, Hylobates lar, Leontopithecus rosalia, Macaca mulatta, Pan troglodytes, Pan paniscus, Pongo pygmaeus, Saimiri boliviensis, and Symphalangus syndactylus). Results are compared to previously published data on five species of wild anthropoids (n = 28; Alouatta paliatta, Callithrix jacchus, Gorilla beringei, Leontopithecus rosalia, and Macaca sinica) and human milk fatty acid profiles. Milk LCPUFA profiles of captive anthropoids (consuming diets with a preformed source of DHA) are similar to milk from women on a Western diet, and those of wild anthropoids are similar to milk from vegan women. Collectively, the range of DHA percent composition values from nonhuman anthropoid milks (0.03–1.1) is nearly identical to that from a cross-cultural analysis of human milk (0.06–1.4). Humans do not appear to be unique in their ability to secrete LCPUFA in milk but may be unique in their access to dietary LCPUFA.

Death of the MicroArray/oaCGH [The Gene Sherpa: Personalized Medicine and You]

Posted: 25 Nov 2008 11:12 AM CST

I just got out of a lecture given by Allen Bale, M.D., a researcher and clinician here in the Department of Genetics at Yale... I have come to the conclusion... MicroArrays are going...


DIY Electrocompetent E. coli [Bitesize Bio]

Posted: 25 Nov 2008 08:34 AM CST

If you buy competent E. coli regularly, you’ll know that they are pretty expensive.

So the cost of screwing up a cloning or transformation experiment is pretty high in terms of money, as well as your time and sanity!

But you don’t need this extra worry because, despite what their high commercial cost would suggest, making good quality competent E. coli is very easy. One morning’s work (with a bit of work ahead of time) is all it takes to make a great electrocompetent E. coli prep.

In this article, I’ll describe a protocol for making electrocompetent E. coli that contains a variety of tricks and tweaks that make it possible to routinely get competencies of 1×10^10, with a little practice.

I’ll also describe a couple of quality control checks that you can do to validate each prep you make.

This protocol is for making a fairly large batch of cells but can be scaled down easily without loss of quality.

The tips and tweaks are as follows:

1. Keep everything fresh and chilled at all times
2. Wash the cells extensively in glycerol
3. Start with a high volume of cells so that the final competent cell aliquots are very concentrated.
4. Hand-wash the glassware before autoclaving to ensure that no detergent is present.

These are all included in this protocol and the original references are listed at the bottom of this article.

The Protocol

1. Streak the strain you wish to make competent onto an LB plate and incubate overnight at the appropriate temperature.

2. The next afternoon, pick a single colony into 10 mL of LB in a sterile bottle and grow overnight in a shaking incubator at 37°C.

At this point chill the following in the freezer:
-falcon tubes or centrifuge pots (see step 5)
-1L of sterile 10% glycerol
-35 sterile cryovials, labelled with the strain name

3. In the morning, inoculate 800 mL of LB in a 2 L baffled flask with 8 mL of the overnight culture and grow at 37°C in a shaking incubator.

4. Grow the culture to an OD of between 0.7 and 1.0 at 37°C. This should take around 2-3 hours.

5. Transfer 400 mL of the culture to 8 pre-chilled 50 mL Falcon tubes (or a suitably sized sterile centrifuge pot). Chill the tubes, and the remaining 400 mL, on ice for 1 hour.

6. Centrifuge for 10 minutes at 4500 rpm and 4°C, then very carefully remove the supernatant.

7. Pour the remaining 400 mL of culture into the tubes and repeat step 6.

8. Add 5-10 mL of chilled 10% glycerol to each Falcon tube and gently resuspend the cells. Then make up the volume in each tube to 25 mL with 10% glycerol.

9. Centrifuge for 10 minutes at 4500 rpm and 4°C, then remove the supernatant.

10. Repeat steps 8 and 9 twice. On the final repeat, pool all of the cells into one Falcon tube, centrifuge as before, then resuspend in a final volume of 6 mL of 10% glycerol.

11. Leave the cells on ice for 10 minutes, then pipette 180 µl into each cryovial and transfer immediately to the -80°C freezer.

12. Keep the remaining cell suspension for quality control checks.
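
As a rough bookkeeping aid, the volumes in the protocol can be checked with a few lines of Python. This is only a sketch: it uses the 800 mL culture, 6 mL final volume and 180 µl aliquots from the steps above, reserves the volumes used in the two quality control checks (35 µl for the phage check and 2 x 50 µl for the competency check), and ignores pipetting losses, which is an assumption.

```python
culture_ml = 800         # starting culture volume (step 3)
final_ml = 6             # final resuspension volume (step 10)
aliquot_ul = 180         # volume per cryovial (step 11)
qc_ul = 35 + 2 * 50      # phage check plus two competency transformations

# How much more concentrated the cells are than in the original culture.
concentration_factor = culture_ml / final_ml

# Volume left for aliquots once the quality control volume is set aside.
usable_ul = final_ml * 1000 - qc_ul
n_aliquots = usable_ul // aliquot_ul
leftover_ul = usable_ul - n_aliquots * aliquot_ul

print(f"cells are concentrated about {concentration_factor:.0f}-fold")
print(f"{n_aliquots} full aliquots, {leftover_ul} µl spare after quality control")
```

Under these assumptions the 6 mL suspension gives slightly fewer full aliquots than the 35 cryovials chilled in step 2, so a few vials act as spares.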

Quality control checks

A batch of competent cells is only useful if you know how good it is, so it is worth performing a couple of simple quality control checks.

1. Phage check

Streak 35 µl of cell suspension onto an LB plate and grow overnight at 37°C. If there is no phage contamination, the cells will grow to form a thick, healthy lawn.

But if phage is present, circular clearings will appear or, if there is a very high amount of phage, there will be no visible growth at all.

2. Competency check

Transform 2 x 50 µl of the cell suspension with 1 µl of an empty plasmid (preferably pUC18) at 0.1 ng/µl. Plate 5 µl and 50 µl on separate plates with the appropriate antibiotic selection and grow overnight.

Count the number of colonies on the plates and calculate the number of colonies formed per µg of DNA (e.g., if you obtain 50 colonies on the 5 µl plate, the efficiency is 1×10^8 cfu/µg).

Normally, 1×10^8 to 1×10^10 cfu/µg DNA for standard 3-5 kb plasmids should be easily achievable with this protocol.
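
The efficiency calculation can be sketched in Python as follows. The protocol does not state the total volume of the transformation mix after recovery, so the 1000 µl used here is an assumption (it is the value that reproduces the worked example of 50 colonies on the 5 µl plate); the function and parameter names are also just illustrative.

```python
def transformation_efficiency(colonies, dna_ng, plated_ul, total_ul):
    """Return colony-forming units per µg of DNA.

    colonies  : colonies counted on the plate
    dna_ng    : total DNA added to the transformation, in ng
    plated_ul : volume spread on the plate, in µl
    total_ul  : total volume of the transformation after recovery, in µl
    """
    dna_ug = dna_ng / 1000.0                 # convert ng to µg
    fraction_plated = plated_ul / total_ul   # share of the DNA that reached the plate
    return colonies / (dna_ug * fraction_plated)


# 1 µl of plasmid at 0.1 ng/µl gives 0.1 ng of DNA in the transformation.
efficiency = transformation_efficiency(
    colonies=50,
    dna_ng=0.1,
    plated_ul=5,
    total_ul=1000,  # assumed recovery volume, not given in the protocol
)
print(f"{efficiency:.1e} cfu/µg")
```

If your recovery volume differs, change `total_ul` accordingly, since the result scales directly with it.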


  • Dower, W. J., Miller, J. F., and Ragsdale, C. W. (1988) High efficiency transformation of E. coli by high voltage electroporation. Nucleic Acids Research, 16, 6127-6145.
  • Chuang, S. E., Chen, A. L., and Chao, C. C. (1995) Growth of E. coli at low temperature dramatically increases the transformation frequency by electroporation. Nucleic Acids Research, 23(9), 1641.
  • Sheng, Y., Mancino, V., and Birren, B. (1995) Transformation of Escherichia coli with large DNA molecules by electroporation. Nucleic Acids Research, 23(1), 1990-1996.
  • Engberg, J., Andersen, P. S., Nielsen, L. K., Dziegiel, M., Johansen, L. K., and Albrechtsen, B. (1996) Phage display libraries of murine and human antibody fragments. Molecular Biotechnology, 6, 287-310.

Wikis with students: what I've learned about managing files and folders [Discovering Biology in a Digital World]

Posted: 25 Nov 2008 08:00 AM CST

It's funny but even though I work with data on a regular basis, I can't always predict the best way to manage data until I have my own data to manage.

My classroom wiki site is no exception.

Now that I've been seriously using a wiki with my class, I've found that I should have set a few things up a bit differently.

Family Tree DNA Facts & Genes Newsletter [The Genetic Genealogist]

Posted: 25 Nov 2008 06:52 AM CST

Family Tree DNA has a new issue of Facts & Genes available on their website.  If you didn’t receive this newsletter but would like to receive it in the future, you can register here.

I especially like the “Case Study in Genetic Genealogy”, which is reprinted in full below.  I, like others, sometimes jump too quickly to the conclusion that there has been a non-paternal event in a line.

Case Study in Genetic Genealogy

When I [“I” being a hypothetical someone who has tested through a genetic genealogy company] first tested, I had no matches with my surname, and a match with another surname. I was told that there was an event in the past, breaking the link of the Y chromosome and the surname - an illegitimacy.

Several years later, I now have matches with 3 other surnames, and no match with my surname. I am thoroughly confused. How can you explain this?


A conclusion of an illegitimate event, or other events that can break the link between the Y chromosome and surname, such as adoption, infidelity, and name change, should only be made after significant research, and testing all or at least the majority of the family trees that exist for the surname.

A surname distribution map of your surname in the ancestral country shows approximately 150 origins for the surname. This is not unusual for an occupational surname.

In your Surname Project, there are only 15 different groups of results - representing approximately 10% of the possible results for the surname.

Most likely you will find a match with your surname as more people test.

Keep in mind that for any surname, some Y-DNA results will ramify, while others will have a smaller population. This may mean that the Surname Project has one large group who match and other smaller groups. This situation could also depend on where recruiting has taken place.

In addition, some Y-DNA results for the surname may not be represented in your country beyond your family tree, but may be found in the ancestral country or another destination country.

I realize that one of the surnames you match is found in the same county as your surname in the 1800s. This is not sufficient evidence that an event occurred, such as illegitimacy.

Unless there is documented evidence, and until a majority of the family trees for a surname and variants are tested, it is recommended that where a result has no matches yet with the surname, a conclusion is not made until sufficient testing occurs.

Reprinted courtesy of Family Tree DNA (Copyright 2008, Family Tree DNA).

Open Microbial Diversity: PLoS papers on using 454-Roche pyrosequencing for rRNA studies [The Tree of Life]

Posted: 24 Nov 2008 01:29 PM CST

Two new papers that just came out in PLoS journals are definitely worth checking out.

Of course I am a bit biased, I suppose, as I am heavily involved in PLoS and also served as Academic Editor for these papers. But with that being said, I encourage people to check them out. The PLoS Genetics paper, from the labs of Mitch Sogin and David Relman, discusses continued development of the use of 454-Roche pyrosequencing technology to carry out deep rRNA sampling. Anybody interested in deeply characterizing which organisms are present in a microbial community should consider this approach.

And in the second paper, the same two labs present an in-depth study using 454-Roche rRNA sequencing to characterize the response of microbes in the human gut to antibiotic treatment. Though there have been a few other such studies, this is the one with the deepest characterization of the microbes present.

Note - one thing I find kind of humorous is that one of the authors is listed as Susan M. Huse in one of the papers (she is the first author on the PLoS Genetics paper) and Sue Huse in the other.

Call for Mendel's Garden #26 Submissions [evolgen]

Posted: 24 Nov 2008 08:00 AM CST

The 26th edition of Mendel's Garden will be hosted by A Free Man on December 7. If you have written a blog post about any topics in Genetics in the past month or so, send a link to Chris (chris[at]afreeman[dot]org) to be included in the carnival.

We're also looking for hosts for upcoming editions. If you would like to host the original genetics blog carnival, send me an email (evolgen-at-yahoo-dot-com). Every month from February onward is available.
