Wednesday, June 16, 2010

Genome complexity and the number of genes

This month marks the 10th anniversary of (one of) the announcement(s) of the completion of the human genome. Several writers have taken this occasion to comment on the successes of genome-based biomedical research, or the lack thereof.

At "The Loom," Carl Zimmer has a neat graphic depicting estimates of the number of genes in the human genome. This number has fallen more or less steadily as sequencing progressed and the remaining gaps in the reference genome sequence were filled in. My comments on that blog entry were:

The reduction in the estimates of the number of protein-coding genes in the human genome parallels our increased understanding of the complexities involved in regulating gene activity. For example, many types of non-coding RNA have been described, along with their roles in modulating the flow of information from DNA to protein.

At the same time, I believe that the human genome’s reduced “tool kit” (in terms of number of protein-coding genes) shows a certain level of our genome’s sophistication. Think of the many different ways one can use a screwdriver – say, to open a can of paint, or with its handle as a hammer. In other words, the same protein can join different networks in a tissue-specific or developmental stage-specific manner. Alongside this are the alternatively spliced mRNAs, which often lead to different protein isoforms (proteins that are mostly the same, but with perhaps one different functional subdomain). Think of a Phillips vs. a flat-head screwdriver.

Thus, fewer genes has not meant fewer protein isoforms, or less complex protein-protein or protein-small molecule interaction networks. On the contrary, the complexity is greater, and that is one reason it has been difficult to define all the players in a particular human affliction such as type 2 diabetes or cancer.
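To put a rough number on the splicing point, here is a minimal Python sketch. It assumes a made-up gene model (the exon names E1 to E6 and the choice of which exons are "cassette" exons are purely hypothetical) and assumes every cassette exon is included or skipped independently of the others. Real genes are messier, so the count is best read as an upper bound for this toy model, not as a prediction for any actual gene.

```python
from itertools import product

def enumerate_isoforms(exons):
    """List every exon combination for a toy gene model.

    exons: list of (name, is_cassette) pairs in transcript order.
    Constitutive exons are always kept; each cassette exon may be
    kept or skipped independently (a simplifying assumption).
    """
    # One tuple of allowed choices per exon: keep-only, or keep/skip.
    choices = [(True, False) if is_cassette else (True,)
               for _, is_cassette in exons]
    isoforms = []
    for keep_flags in product(*choices):
        isoforms.append(tuple(name
                              for (name, _), keep in zip(exons, keep_flags)
                              if keep))
    return isoforms

# Hypothetical gene: six exons, three of them cassette exons.
toy_gene = [("E1", False), ("E2", True), ("E3", True),
            ("E4", False), ("E5", True), ("E6", False)]

all_isoforms = enumerate_isoforms(toy_gene)
print(len(all_isoforms))  # 8, i.e. 2**3 for three cassette exons
```

One gene with three optional exons already gives eight possible transcripts in this model, and the count doubles with each additional cassette exon, which is how a modest gene count can still sit underneath a much larger set of protein isoforms.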

Here, I would like to add a couple of other points:

1. Genetic variation, whether common, moderately rare or even unique to an individual or family, no doubt adds to the complexity of interactions among the (relatively) small number of genes and small molecules. For example, our research is considering transcription factor binding sites and seed sites for miRNA-mRNA interactions that are created by minor alleles of SNPs.

2. Much of the above leads toward differential pathway dynamics. The number of protein-coding genes is low, yet the human organism is complex, as are its responses to a range of situations (consider how long tobacco smoking or a poor diet must be endured, on average, before life-threatening phenotypes emerge) and its great capacity to develop, survive and even thrive with numerous genetic aberrations. As a result, many of the answers we researchers seek simply remain to be discovered. We just do not know all the players - proteins, RNAs, genome state (e.g., methylation) and small molecules - and so cannot fully describe a type 2 diabetes pathway in a series of affected tissues, or a given cancer for that matter. Progress is being made - sequencing the genomes of a tumor and of healthy tissue from the same individual has uncovered common pathways and perhaps drug targets. Much more, however, remains to be described before the full potential of human genomics research is realized.

1 comment:

  1. Nice post Larry - plenty of interaction possibilities in the "reduced" number of genes. The complexity is too much for old style research based on cases/controls, cohorts, prospective RCTs etc - as we are clearly seeing. GWAS --> weak effects, therefore try whole genome sequencing for rare variants...that will probably be unenlightening too. We can only get a certain amount of info from genetic data, we could probably get to know each base on an individual basis and still not find what we are looking for! As long as the geneticists look just at the genes we'll miss the rest. The hope is maybe in discovery software. Pattern recognition matching genetic variation with everything else...that's your domain!