IMPDH duplication and CBS domain

From genomewiki
Revision as of 20:35, 27 January 2008 by Tomemerald (talk | contribs)

Introduction to the IMPDH Gene Family

Inosine 5'-monophosphate dehydrogenase (IMPDH) is a highly conserved, ubiquitous enzyme that catalyzes the first step unique to GTP synthesis. It plays a key role in de novo synthesis of nucleic acids and in maintaining the intracellular balance of A and G nucleotides, and hence in regulation of the cell cycle, cell growth, differentiation, apoptosis and signalling (via cGMP).

Vertebrates in general have two very close paralogs: constitutively expressed IMPDH1 and inducible IMPDH2, which are otherwise indistinguishable in catalytic activity, substrate affinities, tetrameric structure, and interaction with inhibitors. IMPDH2 is greatly up-regulated in proliferating cells, notably activated leukocytes and tumor cells. The two genes lie on different autosomes but share 12 identically placed and phased introns, suggesting segmental duplication. That duplication took place after the divergence of lamprey but before that of Chondrichthyes; this deuterostome event should not be confused with unrelated duplications in other species (e.g. yeast).

IMPDH genes are unusually conserved, ranking in the 95th percentile of all human genes. No drift of the amino or carboxy termini, nor any indel of any length, is tolerated in any species. That conservation extends to both members of the IMPDH duplication; for example human IMPDH1 is still 84% and 91% identical to IMPDH2 some 500 million years after the duplication. That is quite odd, because usually one gene copy subfunctionalizes or neofunctionalizes after a duplication and so diverges considerably more rapidly. Here divergence may have been concentrated in upstream regulatory regions relevant to constitutive vs inducible expression. Although most drugs affect the paralogs more or less equally, one drug, mycophenolic acid (MPA), has a higher affinity for IMPDH2.
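Percent identity figures like those quoted above are computed column by column over an alignment. The following is a minimal sketch of that calculation, not the procedure actually used for this page; the function name and the toy peptide fragments are illustrative assumptions, not real IMPDH sequence. Because no indels are tolerated in this family, gap handling is nearly moot here, but gap columns are skipped for generality.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy fragments (hypothetical, for illustration only): 9 of 10 columns match.
toy_a = "MADYLISGGT"
toy_b = "MADYLISGGS"
print(percent_identity(toy_a, toy_b))
```

The denominator matters when comparing published numbers: identity over ungapped columns, over the full alignment length, or over the shorter sequence can differ, although in a gapless family such as this one the three definitions coincide.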

Curiously, IMPDH has two further remote paralogs, GMPR and GMPR2. These 9-exon guanosine monophosphate reductases catalyze the preceding reaction in the pathway, one of very few instances supporting Horowitz's 1945 retrograde theory of the origin of metabolic pathways. Their intron pattern could illuminate the relative timing of the gene duplications involved. IMPDH1 in human has 11 processed pseudogenes, which are not considered further here.

CBS domain

CBS is an ancient non-catalytic paired domain found in a wide range of otherwise non-homologous genes. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. In IMPDH, the domain comprises parts of exons 5, 6, and 7 but does not correspond cleanly to exon boundaries. That is unsurprising, since the domain appears to have been appended in a common ancestor with prokaryotes, prior to the establishment of introns. GMPR and GMPR2 lack the CBS domain.

The CBS region can be deleted in its entirety without affecting catalytic properties or the ability to form homotetramers. However, the R224P and D226N mutations at the end of the CBS domain affect deeply invariant residues and result in retinitis pigmentosa; the complex chain of events is not understood. IMPDH1 may control the rate-limiting regulated step in regeneration of cyclic GMP, which is needed so intensively in GPCR photoreceptor signalling. IMPDH1 is found predominantly in the retinal inner segment and synaptic termini. However, this effect would be relevant solely to ciliary opsins, meaning that rhabdomeric opsins in Drosophila (and other arthropods) could not serve as model systems for retinitis pigmentosa.

[Figure: IMPDH.png]


Timing the IMPDH duplication

Using released genomes, WGS contigs, trace archives, cDNA, and ad hoc GenBank sequences, the evolutionary history of IMPDH can be traced. It emerges that echinoderms, hemichordates, cephalochordates, and urochordates have but a single copy of the gene. Since it is implausible that a second copy was lost repeatedly in separate clades, by parsimony early deuterostomes did not experience an IMPDH duplication. Consequently, earlier-diverging species with two copies, such as the cnidarian Nematostella and yeast, experienced independent duplications, not necessarily to the same purpose, making them irrelevant as model species for human.
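The parsimony argument can be made explicit by counting the minimum number of gain or loss events needed under each hypothesis about the ancestral deuterostome. The sketch below uses a small unit-cost (Sankoff-style) recursion. The tree topology is a simplified assumption introduced for illustration, with one leaf per clade, and the copy numbers are those stated in the text.

```python
from math import inf

# Assumed, simplified deuterostome topology; leaf names are illustrative labels.
TREE = (("echinoderm", "hemichordate"),
        ("cephalochordate", ("urochordate", "vertebrate")))

# IMPDH copy number per lineage, as described in the text.
COPIES = {"echinoderm": 1, "hemichordate": 1, "cephalochordate": 1,
          "urochordate": 1, "vertebrate": 2}

STATES = (1, 2)  # one copy or two copies


def min_events(node, copies):
    """Return {state: fewest gain/loss events below node if node has that state}."""
    if isinstance(node, str):  # leaf: only the observed state is allowed
        return {s: (0 if copies[node] == s else inf) for s in STATES}
    child_costs = [min_events(child, copies) for child in node]
    # For each child, pick its cheapest state, paying 1 if it differs from ours.
    return {
        s: sum(min(cost[t] + (s != t) for t in STATES) for cost in child_costs)
        for s in STATES
    }


root_costs = min_events(TREE, COPIES)
print("single-copy ancestor:", root_costs[1], "event(s)")
print("two-copy ancestor:   ", root_costs[2], "event(s)")
```

Under these assumptions a single-copy ancestor needs one event (a duplication on the vertebrate stem), whereas a two-copy ancestor needs three, which is the asymmetry the parsimony argument relies on.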

The situation in lamprey is slightly ambiguous: while numerous exons or short blocks of contiguous exons can be located, these are hard to assign unambiguously to IMPDH1 or IMPDH2 because of the high percent identity between the paralogs. However, it appears that lamprey contains a single gene of IMPDH1 character: trace coverage averages 3-4 per exon and the traces are always of the same amino acid sequence (up to inherent trace sequence errors).

The genome of the cartilaginous fish Callorhinchus milii (elephantfish) is also incompletely assembled. Here, however, distinct multi-exon fragments can be recovered, with duplicate coverage at five of the 13 exons. These cluster with IMPDH1 and IMPDH2 respectively. However, the contigs are too short to incorporate neighboring genes, so syntenic correlation with other vertebrates is not yet possible. (The paralogs in elephantfish should have the same neighbors, unless subsequent chromosomal rearrangements have irrevocably scrambled gene order.)

Transcripts for skate, Leucoraja erinacea, support this view: by blastx, EE991359 clusters with IMPDH1 and DT726645 with IMPDH2. The situation is the same in dogfish, Squalus acanthias, for EG027286 and CV720525 respectively. No genome projects are underway in these species, so again a duplication specific to Chondrichthyes is difficult to rule out.

Five teleost fish genomes are available, but teleosts experienced a whole-genome duplication with complex retention patterns here, making them unsuitable for addressing these evolutionary issues. Frog has the expected paralog pair, as does the amniote Anolis. Chicken presents an odd situation, with a clear IMPDH2 ortholog but an assembly gap where IMPDH1 should be. In the trace archives, a diverged and gappy second gene can be located that resembles IMPDH1. It is likely a pseudogene despite the lack of internal stop codons. The second bird genome, zebra finch, also lacks IMPDH1. Thus, from the available data, it appears that IMPDH1 has been lost in Aves.

Synteny is important in this particular gene family for building an accurately labelled set of reference sequences. That is, in a fast-evolving clade, IMPDH1 and IMPDH2 might be confused if neighboring gene information were not available to establish orthology. However, the useful lifespan of synteny, for IMPDH1 especially, is quite short. For example, Xenopus gene order (+SND1 -IMPDH1 -OPN1SW +CALU -SAPS2) already differs from human, either by inversion or misassembly. Where synteny would be really helpful, say in Branchiostoma or Petromyzon, it is either entirely lost or not yet available.

human gene order
IMPDH1 region   IMPDH2 region
+SND1           PH4
+LEP            DALRD3
-RBM28          C3orf60
-IMPDH1         IMPDH2
+HIG2           QARS
+METTL2B        USP19
+CALU
-OPN1SW
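
The comparison between the Xenopus and human IMPDH1 neighborhoods can be sketched as a small signed gene-order check. The gene lists below are copied from the text and the table above. The helper names `parse_order` and `compare_orders` are hypothetical, and the check is only a rough illustration, not a full rearrangement analysis.

```python
def parse_order(text):
    """Parse '+SND1 -IMPDH1 ...' into a list of (gene, strand) tuples."""
    return [(token[1:], token[0]) for token in text.split()]


# Signed gene orders as listed on this page (IMPDH1 neighborhood only).
xenopus = parse_order("+SND1 -IMPDH1 -OPN1SW +CALU -SAPS2")
human = parse_order("+SND1 +LEP -RBM28 -IMPDH1 +HIG2 +METTL2B +CALU -OPN1SW")


def compare_orders(a, b):
    """Return shared genes in a's order, in b's order, and genes whose strand differs."""
    genes_a = {gene for gene, _ in a}
    genes_b = {gene for gene, _ in b}
    shared_a = [gene for gene, _ in a if gene in genes_b]
    shared_b = [gene for gene, _ in b if gene in genes_a]
    strand_a, strand_b = dict(a), dict(b)
    flipped = [gene for gene in shared_a if strand_a[gene] != strand_b[gene]]
    return shared_a, shared_b, flipped


order_x, order_h, flipped = compare_orders(xenopus, human)
print("Xenopus:", order_x)
print("human:  ", order_h)
print("strand differences:", flipped)
```

With these lists, the four shared genes appear with CALU and OPN1SW in exchanged positions, while the listed strands agree between the two species. Either an inversion or a misassembly would still need to be checked against the underlying assemblies.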