Opsin evolution: origins of opsins

From genomewiki

See also: Curated Sequences | Tetrachromatic Ancestral Mammal | Ancestral Introns | Informative Indels | Update Blog

Introduction: the origin of opsins

OpsinOrigins.jpg

The origin of the first opsins is a bit murky. Opsins are operationally defined here as 7-transmembrane proteins structurally and sequentially homologous to GPCR with (Schiff base) lysine in TM7 in alignment with K296 of bovine rhodopsin (or any established opsin).
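The lysine criterion in this operational definition can be sketched as a check on a multiple alignment: find the column that holds residue 296 of the bovine rhodopsin row, then ask whether another row has K in that column. This is only an illustration; the function names and the five-column toy alignment are hypothetical, and a real test would use a full alignment with `residue_number=296`.

```python
def column_of_residue(aligned_ref, residue_number, gap="-"):
    """Return the 0-based alignment column that holds the given 1-based
    residue of the reference row; gap characters are not counted."""
    seen = 0
    for column, char in enumerate(aligned_ref):
        if char != gap:
            seen += 1
            if seen == residue_number:
                return column
    raise ValueError("reference row has fewer residues than requested")


def has_lysine_at(aligned_ref, aligned_query, residue_number=296):
    """True if the query row carries K in the column aligned to the
    reference residue (296 for bovine rhodopsin in the text)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("rows must come from the same alignment")
    return aligned_query[column_of_residue(aligned_ref, residue_number)] == "K"


# Synthetic toy alignment (not real sequence): residue 3 of the reference
# stands in for residue 296 of bovine rhodopsin.
toy_reference = "MA-KT"
print(has_lysine_at(toy_reference, "MAGKT", residue_number=3))  # query has lysine
print(has_lysine_at(toy_reference, "MAGIT", residue_number=3))  # query has isoleucine
```

Counting residues in the reference row rather than alignment columns keeps the check stable when gaps are inserted upstream of the site.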

This section moves forward in time from the parental gene content of the immediate ancestral genome (greatly facilitated by the new Trichoplax and Monosiga assemblies) that gave rise to the first opsin via gene duplication and neofunctionalization of one copy to photoreception. Subsequent sections work backwards in time, first coalescing separate gene trees of ciliary, melanopsic and other opsins to their respective ur-opsins and ultimately deducing properties of the crown group opsin.

The opsin origination event was not necessarily unique -- GPCR always retain many essential properties via their own evolutionary constraints and conceivably could have given rise to opsins at widely scattered intervals from rather different parental genes. In this type of history, the minimal gene tree containing all opsins is not 'monophyletic' but instead contains embedded non-opsin GPCR. Nothing prevents an established opsin from later giving rise to a gene duplicate that 'reverts' to a non-K296 GPCR. Conceivably the lysine could be retained even as the photobiological functionality is lost.

In the case of multiple such opsins surviving to the present day, branches will coalesce first to separate parental non-lysine GPCRs, which in turn eventually coalesce -- as all GPCR must do -- to a master parental gene.

PolyphylOpsins.png

The gene tree in the figure above (PolyphylOpsins.png) illustrates these hypothetical complexities:

-- opsins arose independently from GPCR at nodes 2,3 and 4

-- these opsins initially coalesce to 3 ancestral opsins

-- the first two groups of opsins coalesce to a parental gene at node 1 whose descendants include 7 GPCR

-- at node 5, an opsin has 'reverted' to a new GPCR, also a descendant of these opsins' parent gene

-- the full set of opsins coalesces at a master parental gene at node 0 with numerous non-separable GPCR descendants
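Whether the opsins in a gene tree form a single branch exclusive of other GPCR can be tested mechanically. The sketch below is a minimal illustration, assuming the tree is held as nested tuples of leaf names; both toy trees and all leaf names are invented and do not reproduce the figure.

```python
def leaf_set(node):
    """Collect the leaf names under a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    names = frozenset()
    for child in node:
        names |= leaf_set(child)
    return names


def is_monophyletic(tree, group):
    """True if some node's leaf set is exactly the given group."""
    group = frozenset(group)
    pending = [tree]
    while pending:
        node = pending.pop()
        if leaf_set(node) == group:
            return True
        if not isinstance(node, str):
            pending.extend(node)
    return False


opsins = {"opsinA", "opsinB", "opsinC"}

# Hypothetical history with independent origins: opsinC nests among GPCR.
scattered = ((("opsinA", "opsinB"), "gpcr1"), (("opsinC", "gpcr2"), "gpcr3"))

# Hypothetical history with a single origin: all opsins share one branch.
single_origin = ((("opsinA", "opsinB"), "opsinC"), ("gpcr1", "gpcr2"))

print(is_monophyletic(scattered, opsins))
print(is_monophyletic(single_origin, opsins))
```

In the scattered tree the smallest branch holding every opsin also holds GPCR leaves, which is the embedded non-opsin situation described in the list.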

This scenario -- the molecular version of whether 'vision' arose once vs multiple times -- can be ruled out for bilateran opsins (provided the relevant GPCR outgroups have left descendent genes) but still must be considered seriously in the case of cnidarian opsins and perhaps ctenophores and sponges as well. It appears today however that the entire bilateran opsin set forms a single branch exclusive of all non-K296 GPCR in the tree generated from the roughly 100,000 known GPCR. (For practical reasons, only near-opsin GPCR can be considered.)

Events 600 million years ago may seem hopelessly inaccessible and indeed many uncertainties will remain even after every relevant genome has been sequenced. However sequencing to date has been phylogenetically lopsided with far too little effort expended on early diverging non-model organisms with strategic tree positions. Yet comparative genomics has already provided substantial insights into certain aspects of opsin evolution:

  • The first opsins were not associated with gross morphological structures (such as stalked eyes) that could possibly leave a fossil record (as in trilobites) -- key events took place strictly at the molecular subcellular level. Genomes of extant species (some more than others) are not exactly living fossils because the evolutionary accrual of mutations never ceases.

Cases exist of opsins demonstrably obliterated both by gradual pseudogenization and large scale deletions, confusing the record. Yet opsin genes and even their regulatory regions, when compared across the entire metazoan tree, can furnish reliable reconstructions of opsin content and even sequence at ancestral species divergence nodes.

  • Opsins are definitely not the 'original' GPCR because these were already widely deployed at much earlier divergence nodes -- yeast, protozoa, choanoflagellates, trichoplax have GPCR but lack opsins. Nor are opsins the prototype for the 'rhodopsin class' R of the GRAFS classification of GPCR which again was established far earlier. Indeed, even the Ralpha subgroup of rhodopsin class GPCR was well established prior to the first metazoan opsin.
  • Opsins are thus latecomers, not pioneers, to a rapidly expanding paralogous gene clade within already full-featured GPCR. Judging by their closest extant blastp relatives among tens of thousands of GPCR at GenBank, opsins specifically arose as a gene duplication within the peptide receptor subgroup PEP. Indeed, certain of these proteins list opsins among their top ten best back-blast matches (ie have better matches than to almost all non-opsin GPCR). Note here that blast scores can be misleading because the 'floor' of percent identity is about 25% just due to universal conserved residues plus accidental matches.
  • Note an 'intermediate' GPCR does not exist: either lysine is present at K296 or it isn't. Reconstructing ancestral states from the best contemporary set of GPCR proteins lacking K296 cannot produce a lysine there by any rational methodology. The 20 encoded amino acids can be clustered into subgroups (eg by polarity or bulk) but ultimately form an unorderable discrete set not furnishing continuum transitional states.
  • Most likely the parental gene had several introns and the original opsins inherited this pattern (ie the duplication was segmental rather than retroprocessional as in some cnidarian opsins). The history of introns within opsins is already complex and becomes quite problematic within the enveloping GPCR gene family. Opsins (with the exception of a fragmentary sea urchin melanopsin) lack the ubiquitous phase 21 intron breaking the DRY motif arginine.
  • Intracellular targeting of early opsins was likely to cytoplasmic or endoplasmic reticulum membranes as isolated monomers, with limited microvillar or especially ciliary specialization (to motile larva) also plausible. These opsins were the first eyes to the world but only in the sense of indicating the intensity (and later directionality) of sunlight striking the cell utilizing already refined GPCR second messenger signal transduction.
  • Opsin creation does not imply saltatory evolution because the basics had been established far earlier -- the 7-transmembrane helical structure with fixed topology, the TM1-TM2 salt bridge N55-D83 that could serve as initial counter-ion, the DRY ionic lock, the GWS.Y..E.....C..DW........SY region of EC2, the NPxxY terminal helix, the conformational shift upon binding of ligand that could trigger signaling, the Galpha protein binding site needed for the signaling cascade, and an arrestin-type mechanism for terminating the signal. The earliest opsins contained and continued all of these features from the get-go, adapting them over the course of time to various photoreceptive functions.
  • Opsins are unique among GPCR in several respects: they catalyze a mild in-situ enzymatic reaction -- cis-trans photoisomerization -- that furnishes the signaling agonist. (This reaction also occurs thermally without enzyme, but so does carbon dioxide dissolution in water, yet humans have 15 carbonic anhydrases.) Cis-retinal, being lipid soluble, does not diffuse through the extracellular milieu to reach its receptor binding site as in all other GPCR. Instead it is covalently bound to a lysine deeply internal to TM7, again unprecedented among GPCR (though other internal charged amino acids can occur, notably the D83 aspartate salt bridge and K90 of ultraviolet opsins).
  • Opsins did not arise from flavin-based cryptochromes, mechanistically different photoreceptors that evolved much earlier to establish circadian rhythm and eventually magneto-sensing. Cryptochromes are homologous to DNA photolyase repair enzymes, not GPCR.
  • Although literature searches turn up scattered assertions about 'opsins' in species such as Chlamydomonas ('chlamyopsin' Z48968) and 'volvoxopsin', not to mention bacterial 'rhodopsins', these amount to abusive terminological metaphors, unwelcome additions to an already complex gene family. These proteins do not have seven transmembrane helices in the same arrangement as GPCR nor possess the slightest sequence homology at deeply conserved GPCR residues, so represent independent evolution of photobiology (along the lines of bat and butterfly wings representing independent origins of flying).
  • Conceivably forerunners of opsins bound a related chromophore non-covalently, perhaps an all-trans retinoid in the manner of peropsins. Retinoic acid is sometimes proposed as ancestral ligand but retinoic acid receptors (RAR and RXR) are non-GPCR nuclear hormone receptors that bind all-trans-RA or 9-cis-RA but not 13-cis-RA. Furthermore, the GPCR receptors inducible by retinoic acid -- RAIG1 proteins (GPRC5C etc) -- belong elsewhere in the GRAFS classification, have no particular affiliation with opsins and again do not bind retinoids themselves. The fact that pseudo-opsin chromophores are similar retinoids may be coincidence arising from the ubiquity of metabolic carotenoids (availability) and the restricted number of biochemicals (isoprenoids but not amino acids) with tunable absorption in the visual range (suitability).
  • A very recent experiment investigated the consequences of K296G in conjunction with replacement of retinal with its ethylamine Schiff base (which mimics the previous situation but with non-covalently bound chromophore). This had no effect on site specificity of photoisomerization nor quantum yield but greatly reduced activation, suggesting the K296 covalent bond transmits structural changes within the protein, with the bond retaining the low-affinity agonist enhancing the duration of activation. This both suggests an intermediate evolutionary stage of inefficient but region-specific photoisomerization prior to the acquisition of K296 and raises the issue of whether some early opsins acquired a distinct enhancement mechanism not involving K296.
  • In principle, GPCRs could continue to spawn new clades of opsins from time to time. However, they did not in bilaterans. That is, no gene tree of a bilateran opsin coalesces with a GPCR gene later than the bilateran common ancestor. All bilateran opsins are descended from one of five opsin classes present in the ur-bilateran. Indeed the gene tree comprising all opsins excludes all GPCR, consistent with a unique K296 origination event. However, it remains possible that some cnidarian or ctenophoran opsins arose from a second wing of GPCR with no representative of this opsin surviving in bilaterans.
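The EC2 signature quoted in the list above (GWS.Y..E.....C..DW........SY) can be searched for directly, assuming its dots stand for unconstrained positions, which is how the sketch below treats them. The helper name and the test sequence are hypothetical; the sequence is synthetic, built by filling the open positions with alanine.

```python
import re

# Signature copied from the text; "." is read as "any residue" (an assumption).
EC2_SIGNATURE = "GWS.Y..E.....C..DW........SY"
EC2_PATTERN = re.compile(EC2_SIGNATURE)


def find_ec2_signature(protein):
    """Return the 0-based start of the first signature match, or None."""
    match = EC2_PATTERN.search(protein)
    return None if match is None else match.start()


# Synthetic sequences (not real proteins): open positions filled with alanine.
filled = EC2_SIGNATURE.replace(".", "A")
with_signature = "MKT" + filled + "LL"
broken_signature = with_signature.replace("DW", "DA")

print(find_ec2_signature(with_signature))
print(find_ec2_signature(broken_signature))
```

An exact pattern of this kind is strict: a single substitution at a fixed position, as in the second sequence, removes the match, so a position weight matrix would be the more forgiving choice for diverged sequences.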

Two genes in separate species are by definition orthologous only when descended vertically from a single gene in their last common ancestor. It appears that all bilateran opsins -- after accounting for later clade-specific expansions and losses -- are orthologous to either a cilopsin, melanopsin, peropsin, rgropsin, or neuropsin at the bilateran common ancestor. ('Rhabdomeric' protostome opsins do not define a separate class but instead coalesce with vertebrate melanopsins.)

These 5 opsin classes appear not fully coalesced even at the last common ancestor of bilaterans with cnidarians -- while sequence data is woefully limited today in early taxa, it seems both the melanopsin and cilopsin classes existed in this ancestor, perhaps along with other opsin classes no longer represented in bilaterans. Conversely, peropsins have been retained in lophotrochozoan, ecdysozoan, and deuterostome lineages but not in any cnidarian sequence to date. Neuropsins survived solely in chordates, whereas rgropsins are even more restricted to vertebrates, even though they could not have originated there. These latter genes are conceptual analogs of cnidarian-only opsin classes.

All opsins are homologous so any given pair is ultimately orthologous at some earlier common ancestor -- but which one? The species tree itself is confused here on sistering vs independent nodes at cnidarian/ctenophore. The single ctenophore opsin available -- regrettably just a distal fragment -- is difficult to classify. The fact that its best blast matches cluster about equally well with melanopsins and cilopsins (to the exclusion of other bilateran classes) suggests that their merger is not far off.

The opsin gene tree can largely be worked out and coordinated with species tree divergences. Despite many efforts at this, some deeper topology remains problematic. It appears from sequence clustering, indel analysis, and especially intron conservation that ((peropsin, rgropsin),neuropsin) is a valid subgroup. Further, this assemblage associates more closely with cilopsins, leaving a final topology to be superimposed on the phylogenetic tree:

gene tree    ((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);
species tree (((((((((echinoderm,acornworm),amphioxus),tunicate),vertebrate),((chelicerate,(crustacean,insect)),(mollusk,annelid))),cnidaria),ctenophore),trichoplax),sponge);
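The two parenthesized trees above are in Newick notation and can be read programmatically. The following is a minimal sketch, assuming topology-only strings (no branch lengths or quoted labels); the parser and helper names are hypothetical, and a production analysis would use an established phylogenetics library instead.

```python
GENE_TREE = "((((cilopsin,((peropsin,rgropsin),neuropsin)),melanopsin),cnidopsin),GPCRpep);"
SPECIES_TREE = ("(((((((((echinoderm,acornworm),amphioxus),tunicate),vertebrate),"
                "((chelicerate,(crustacean,insect)),(mollusk,annelid))),cnidaria),"
                "ctenophore),trichoplax),sponge);")


def parse_newick(text):
    """Parse a topology-only Newick string into nested tuples of leaf names."""
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        return text[start:pos].strip()

    tree = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters")
    return tree


def clades(tree):
    """Return the leaf set of every internal node."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found


gene_clades = clades(parse_newick(GENE_TREE))
print(frozenset({"peropsin", "rgropsin", "neuropsin"}) in gene_clades)
print(frozenset({"cilopsin", "melanopsin"}) in gene_clades)
```

The first query asks whether ((peropsin,rgropsin),neuropsin) is a branch of the stated gene tree; the second asks the same of a cilopsin plus melanopsin pairing, which the stated topology does not contain.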

Nearest neighbors of opsins among GPCR

The immediate outgroup of opsins lies among a vast number of GPCR. The reference collection defines a close-in subset utilizing human GPCR, which have the best prospects for a determined ligand. Note blast score order is not ideal because scores are squeezed between a 'floor' of ~23% identity attributable to universally conserved residues plus accidental matching, and a 'ceiling' of ~30% to remain non-opsin.
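The narrowness of that window is easy to see with a small calculation. The sketch below computes percent identity over the ungapped columns of a pairwise alignment and tests it against the ~23% floor and ~30% ceiling quoted above. The function names and the toy alignment are hypothetical, and the choice of denominator (columns with no gap in either row) is an assumption; blast reports identity over the full alignment length.

```python
FLOOR_PERCENT = 23.0    # approximate floor quoted in the text
CEILING_PERCENT = 30.0  # approximate ceiling quoted in the text


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("rows must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def in_near_opsin_window(identity):
    """True if the identity falls between the floor and the ceiling."""
    return FLOOR_PERCENT <= identity <= CEILING_PERCENT


# Synthetic toy rows: three matches in four compared columns.
print(percent_identity("ACDE-", "ACDF-"))
print(in_near_opsin_window(25.0), in_near_opsin_window(45.0))
```

With only about seven percentage points between floor and ceiling, small alignment differences can reorder candidate outgroups, which is why rank order alone is treated cautiously here.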

None of these GPCR represent the actual parental gene to opsin because they have themselves evolved forward some 600 million years from the putative opsin creation event. Conceivably one or more is also directly descended from it. The consensus line of the alignment below perhaps represents a better approximation to the desired ancestral sequence. It is difficult to reconstruct an ancestral sequence accurately because non-adjacent opsin residues co-evolve, creating algorithmic errors in methods that neglect this. Some co-evolving residues are suggested by structural studies but not all relationships can be described.

OpsinOutgroup.jpg

Opsins are not the 'original' GPCR (which are trackable, barely, to yeast) even for the 'rhodopsin' group R (or even its Ralpha subgroup) within the GRAFS classification but rather form a specialized set that arose later as the rhodopsin gene class (which contains the AMIN cluster [adrenalin, serotonin, dopamine, and histamine receptors], MECA branch [peptide and lipid binding receptors] in addition to opsins) underwent significant expansions.

This expansion of the Ralpha class had largely taken place in the last common metazoan ancestor shared with Monosiga and Trichoplax (which do not contain opsins), implying the ancestral metazoan lacked them as well. The orphan receptors GPR21 and GPR52 form the immediate outgroup (within the 800 human GPCR) in an oft-cited 2003 study. These have isoleucine at K296; their ligands are still not known as of Dec 2009. Conservation is high throughout deuterostomes; blast matches are restricted within opsins to molluscan melanopsins suggesting Gq signaling.

The melatonin receptor MTNR1A emerges as a close relative to opsins. Curiously it plays a key role in circadian rhythms and so needs to coordinate with opsin photosensors. N-acetyl-5-methoxytryptamine, the ligand, bears no obvious relationship to cis-retinal however and K296 is lacking, making an immediate parent gene relationship problematic.

Another clue to the origin of opsins might be provided by examining GPCR intron positions and phases to see if shared with ancient introns in opsins. Many non-olfactory GPCR with sequence similarity to opsins have no introns or just one, suggesting the genes duplicated by retroprocessing, perhaps acquiring an intron at unrelated position later. UROPS2 has an intron but it does not seem to correspond to one in any opsin. Cnidarian opsins are either intronless (Nematostella) or undetermined (just known from processed transcripts).

Closeness in the GRAFS tree does not fully accord with closeness of blastp hit and relatedness of diagnostic regions, suggesting (unsurprisingly) that its topology is slightly wrong at some internal nodes. By average rank in blastp top scores (or by average of the 5 best blast expectation values), as representatives of all opsin classes are aligned with the GPCR below, the highest scoring ones by far are the Trichoplax opsins followed by various peptide receptors:

Rank  Gene          Exp    Exons  Receptor      Ligand

4.2   UROPS2_triAd  e-29   2      orphan        histamine? (HRH2:  best human non-opsin blast match)
5.4   UROPS1_triAd  e-28   1      orphan        peptide?   (SSTR1: best human non-opsin blast match)
5.6   SSTR1_homSap  e-26   1      somatostatin  peptide
7.2   TACR2_homSap  e-25   5      tachykinin    peptide
8.1   GALR1_homSap  e-24   3      galanin       peptide
8.9   MTNR1A_homSa  e-23   2      melatonin     N-acetyl-5-methoxytryptamine 
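The Rank column is described as an average rank over blastp searches seeded with representatives of each opsin class. A minimal sketch of that bookkeeping follows; the function name, query names, subject names, and hit orders are all invented placeholders and are not the data behind the table.

```python
from statistics import mean


def average_ranks(hit_lists):
    """Average the 1-based rank of each subject over all queries.

    hit_lists maps a query name to its subjects ordered best first.
    Only subjects present in every list are ranked (an assumption made
    to keep the sketch simple). Returns (subject, mean rank) pairs,
    best first.
    """
    lists = list(hit_lists.values())
    common = set(lists[0]).intersection(*lists[1:])
    averaged = {
        subject: mean(hits.index(subject) + 1 for hits in lists)
        for subject in common
    }
    return sorted(averaged.items(), key=lambda item: item[1])


# Hypothetical hit orders for three opsin queries.
toy_hits = {
    "query1": ["gpcrA", "gpcrB", "gpcrC"],
    "query2": ["gpcrB", "gpcrA", "gpcrC"],
    "query3": ["gpcrA", "gpcrC", "gpcrB"],
}

for subject, rank in average_ranks(toy_hits):
    print(subject, round(rank, 2))
```

Averaging ranks rather than raw scores reduces the influence of any single query, at the cost of discarding how large the score gaps between neighbors were.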

The biological literature contains various scattered claims about 'opsins' in species such as Chlamydomonas (chlamyopsin Z48968), not to mention bacterial 'rhodopsins'. These do not have the seven transmembrane helices in the same arrangement as GPCR nor significant sequence homology and may represent independent evolution of photobiology (just as bat and butterfly wings represent independent origins of flying).

Trichoplax has two very curious 7-transmembrane proteins that emerge as its best genomic match to opsin queries. While lacking K296 for a Schiff base, their best back-blast to all of GenBank returns almost entirely opsins (rather than nesting within other GPCR receptors). While Trichoplax is 600+ million years removed from the common ancestor with eumetazoa, these genes could still offer clues about the immediate GPCR ancestor to opsins.

These Trichoplax genes retain uncanny similarities to opsins in otherwise rapidly changing regions. These two genes are not plausibly derived from an opsin expansion with subsequent loss of K296 because Trichoplax and other early diverging lineages lack opsins. Perhaps these genes should be considered opsins in spite of lacking K296. Recall here Schiff base formation dramatically redshifts the absorption spectrum, yet non-covalently bound retinal still has significant absorption at optical wavelengths which might be further tuned by Trichoplax binding pocket residues.

Conversely, several cnidarian species exhibit far too many K296-type GPCR for their apparent photoreceptive needs and accompanying lack of overt photobiological anatomical specializations. These may represent divergent gene duplications of valid opsins that have evolved into some other type of GPCR; alternatively they could represent a lineage of pre-opsin GPCR that developed K296 but never acquired an opsin-like light-sensing role nor served as parental gene to bona fide opsins.

Together the Trichoplax pre-opsins lacking K296 and putative cnidarian non-opsins possessing K296 push the opsin-defining envelope to its limits. Given the immense time span separating contemporary genes from ancestral, we can anticipate their computed nesting arrangement within the opsin gene tree relative to a close-in GPCR outgroup with known non-retinal ligands will lack convincing statistical support at the critical nodes. The best way forward is additional sequencing and experimentation with cubomedusae, ctenophores and sponges because these seem to contain conventional opsins that can clarify the positions of the outliers.

In summary, the parental GPCR that gave rise to the first opsin can be localized fairly reliably to the PEP subgroup of R class GPCR within GRAFS but no particular gene there stands out as the definitive pre-opsin. The time span involved is immense and this gene class has experienced much churning through expansion and contraction cycles, as well as moderately rapid pointwise residue change.

An independent approach to opsin origins might compare intron positions and phases of candidate parental GPCR to those of opsins. The ancestral introns of opsins are easily reconstructed, reducing noise and potential coincidence, but that program is quite difficult to extend to GPCR. Too often, GPCR with relevant sequence similarity to opsins have no introns or just one, suggesting gene duplication by retroprocessing followed by a later intron acquisition at non-historic position followed by more rounds of duplication (as seen in sulfatases).

UROPS2 of trichoplax has one intron but unfortunately it does not correspond to any in opsins. Cnidarian opsins to date have been either intronless (Nematostella) or not determined (known only from processed transcripts). Thus the intronic approach to parental GPCR awaits more extensive sequencing of early genomes.
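The intron comparison described above reduces to asking which introns two genes share in both position and phase. The sketch below assumes each intron is recorded as an alignment column plus a phase of 0, 1 or 2 (the conventional numbering, which may differ from the two-digit notation used elsewhere on this page); the type, function name, and coordinates are hypothetical.

```python
from typing import NamedTuple


class Intron(NamedTuple):
    column: int  # column of the interrupted codon in a shared protein alignment
    phase: int   # 0, 1 or 2: bases of the codon lying before the intron


def shared_introns(gene_a, gene_b):
    """Introns present in both genes at the same column and phase."""
    return set(gene_a) & set(gene_b)


# Invented coordinates for illustration only.
candidate_gpcr = {Intron(120, 0), Intron(240, 2)}
opsin_ancestor = {Intron(240, 2), Intron(300, 1)}

print(shared_introns(candidate_gpcr, opsin_ancestor))
print(shared_introns({Intron(120, 0)}, {Intron(120, 1)}))
```

Requiring the phase to match as well as the position guards against counting a coincidental nearby insertion as a conserved intron, which matters when, as with UROPS2, there is only a single intron to compare.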

A third approach to opsin origins considers informative indels and diagnostic residues in the set of all opsins expanded by select GPCR. While perhaps subject to more homoplasy than introns, regions such as transmembrane helix TM2 and extracellular loop EC2 do illuminate issues such as ancestral length and define signature residues of opsin classes.

Origin of contemporary opsin classes

Traceback of opsins can begin by selecting certain 'index sequences'. It ultimately does not matter which or how many, but for historical reasons bovine rhodopsin, frog melanopsin, human peropsin, mouse neuropsin and so forth might be used.

Each index sequence is then built out to a larger class of orthologs in nearby species using flanking gene synteny to confirm best-blast. Lineage-specific gene duplications with close affinities (eg from recent clade-specific paralogous expansions such as teleost fish whole genome duplications) are added. Eventually the set collides with an expanding set of another index sequence and all bilateran opsin sequences fall into one of five clusters.
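The end state of this build-out, where every sequence belongs to exactly one index class, can be approximated by assigning each sequence to the index sequence it scores best against. This is a simplified stand-in for the procedure described, not the procedure itself: it ignores the synteny confirmation step, and the function name, sequence names, and scores are invented.

```python
def assign_to_classes(scores):
    """Group sequences under their best-scoring index sequence.

    scores maps a sequence name to {index name: similarity score}.
    Ties go to whichever index is listed first for that sequence.
    """
    clusters = {}
    for sequence, by_index in scores.items():
        best_index = max(by_index, key=by_index.get)
        clusters.setdefault(best_index, []).append(sequence)
    return {index: sorted(members) for index, members in clusters.items()}


# Hypothetical similarity scores against two index sequences.
toy_scores = {
    "seq1": {"rhodopsin": 300, "melanopsin": 120},
    "seq2": {"rhodopsin": 110, "melanopsin": 280},
    "seq3": {"rhodopsin": 250, "melanopsin": 130},
}

print(assign_to_classes(toy_scores))
```

Because only the best score is used, a sequence nearly equidistant from two index sequences is still forced into one class, which is where synteny and recent duplication history would be needed to decide.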

Ciliary opsins (generated from RHO1) form a cohesive gene clade called here cilopsins that does not coalesce with melanopsins, peropsins, neuropsins, or rgropsins within vertebrates, deuterostomes, or even bilatera. The index gene picks up rod and cone imaging opsins, pinopsin, parapinopsin, parietopsin, very ancient opsin, encephalopsin, teleost multiple tissue, and certain ciliary opsins from protostomes.

Hardly a vertebrate innovation, ciliary opsins appear in early deuterostomes lacking imaging eyes, in both branches of protostomes (initially bee and ragworm), in pre-bilateran cnidarians and quite possibly ctenophores. Sponges are still uncertain because of a 5 year wait on the assembly but the very earliest metazoan genomes (Monosiga and Trichoplax) definitely lack ciliary opsins, indeed any K296 GPCR. If those genomes are representative, then ciliary opsins emerged on the post-Trichoplax stem. Certain cnidarian opsins -- but not all -- already exhibit certain sequence specializations of ciliary opsins.

Ciliary opsins have been totally lost on numerous occasions in numerous lineages, notably 'model' organisms like drosophila and, worse, nematodes, which have lost all opsins. Hemichordates and non-annelid lophotrochozoans have lost ciliary opsins independently. Other explanations (such as multiple re-emergences of ciliary-like opsins from GPCR or distantly related opsins) are manifestly impossible given intron structure alone.

The earliest deuterostome ciliary ur-opsin is best represented by the TMT class of opsins, in particular by the TMT1 subgroup that has retained important ancestral characteristics in the diagnostic TM2 region. Sequential expansion of TMT1 gave rise to all the other ciliary opsins found in vertebrates, including all rod and cone opsins. This fundamental gene, though retained through amphibian and amniote, curiously was eventually lost in birds and mammals. Transcripts are often annotated as testis libraries suggesting a function in gamete release timing. Its immediate descendent gene TMT2, whose subfunctionalization is unknown, is retained in monotremes and marsupials but lost in all placentals. The best experimental organism for studying TMT1 is probably Xenopus.

Melanopsins, discovered in 1998 in frog lateral line dermal melanophores (as well as hypothalamus, iris, and retinal horizontal cells) form another ancient opsin class. Melanopsins include rhabdomeric arthropod opsins (which have an unnecessary dual nomenclature -- they're melanopsins by multiple independent criteria) and lophotrochozoan melanopsins (which other than scallop, squid and octopus genes lie undocumented within genome projects). One cnidarian opsin from coral classifies as a melanopsin yet closely shares other properties with cnidarian opsins that don't.

OpsinLoss.jpg

Peropsins are a third major class of opsins in the sense of broad but not universal retention. Expanded in deuterostomes, they occur rarely in arthropods but are quite important in lophotrochozoa. Peropsins are the only opsin class retained in hemichordates. Nothing resembling them has been retained in cnidaria, reflecting loss in the two genomes available because their coalescence with cilopsins lies much further in the past.

Neuropsins are a much expanded but little studied group of opsins restricted to living deuterostomes though they did not originate there (unless divergence from another opsin class was exceedingly abrupt and then immensely slowed). The neuropsin expansion to 4 genes in the lamprey stem continued unchanged to the amniote ancestor but subsequently contracted to 2 in monotremes and only 1 in marsupials and placentals.

Rgropsins constitute another little-studied group represented today only beyond the tunicate-vertebrate last common ancestor. Again these opsins must have originated far earlier in pre-bilaterans because their ancestral reconstructed sequence is still far from coalescence with other ancestral opsin classes.

Conceivably rgropsins and neuropsins are retained in other bilatera but diverged to the point of unrecognizability. This scenario can be rejected because analytic methods of complete genomes are sufficiently sensitive to locate all GPCR and screen them for K296. This reasoning is applicable to peropsins as well -- they have definitely been lost in all insect and molluscan genomes though fortunately retained in two chelicerate arachnids.

Peropsins, neuropsins and rgropsins are unified by their intronation, sharing three ancestral introns despite numerous differences. This indicates -- given the slow rate of intron gain and loss in most metazoan clades -- that they share deep roots in pre-bilatera, implying near total loss of neuropsins and rgropsins in invertebrates. None of these introns are shared with cilopsins or melanopsins or for that matter known GPCR.

Opsins, plagued again and again by losses on stem lineages, illustrate why ancestral node-based sequencing is a far better strategy than terminal speciation-based sequencing. If sequencing effort is proportional to contemporary species numbers, yielding millions of opsins from insects (respectively ray-finned fish), opsin evolution will never be illuminated. Each major node requires equal sequencing intensity. Thus onychophoran and tardigrade opsins have a far greater priority than more butterflies or cichlid fish.

Cnidopsins are a taxonomically based collection of opsins that do not all classify satisfactorily within the bilateran opsin system. Much more intensive sampling is needed here because neither Hydra nor Nematostella has remotely the cubomedusan repertoire. Ctenophores currently have a single unexpected opsin gene obtained accidentally in a shotgun project -- obviously much greater sequencing and structural effort is warranted given their currently basal position within the opsin-containing species.

In hindsight, large scale loss of opsin classes should not come as a surprise -- humans lost 12 of 20 opsin loci that otherwise persisted from lamprey stem to amniote ancestor to living frogs, lizards and birds. This is characteristic of GPCR evolution overall (notably the olfactory subgenome): collapse of a large gene clade, followed by later massive expansion but retention to contemporary species only in scattered lineages.

This can result in two species having a similar number of GPCR genes but a very poor correspondence between them. This pattern of gene churning (cycles of explosive expansion followed by mass die-offs) differs dramatically from gene histories of ribosomal proteins or catabolic enzymes (eg homogentisate dioxygenase) retained in all species as single copies (ancient birth, never death). Other genes like globins exhibit moderate expansion to several copies accompanying a trend to organismal specializational complexity with little evidence of contraction (occasional births, rarely deaths). Still other gene classes, for example selenoproteins, seem headed systemically for oblivion (births but trending to extinction) in the sense of the cysteine replacement ratchet.

Once over this conceptual hurdle, cycles of expansion and contraction in the GPCR gene family can be repeatedly invoked on various branches of the phylogenetic tree to explain many aspects of opsin classification. After several such cycles, the utility of terms such as ortholog and paralog is stretched to the breaking point -- words become inadequate to describe the gene tree.

Origin of ciliary opsins

CilOpDecline.jpg

The evolutionary history of vertebrate imaging vision has six peculiar aspects: delayed onset relative to ocean oxygenation, delayed onset relative to predatory arthropod vision, a ciliary opsin class rather than melanopsin class basis, very rapid maturation, a long subsequent period of stasis with little additional innovation, and dramatic collapse in mammals.

  • Early deuterostomes -- represented today by living echinoderms, hemichordates, cephalochordates and urochordates -- retained various opsin classes descended from the ur-bilateran (indeed eumetazoan) ancestor but never possessed imaging vision nor subsequently developed it except in one descendent lineage (vertebrates). One tunicate opsin specialized in the direction of parapinopsin but the main expansion and maturation of the opsin gene family took place very rapidly in the lamprey stem.
  • Indeed at the time of divergence with jawed vertebrates, the last common ancestor possessed a full set of modern opsin genetic loci, furnishing four-color imaging ciliary opsin-based cone and rod vision with advanced oil-drop filtration. Intermediary states in chordate imaging vision development are no longer represented among extant species unless hagfish provides an intermediate node or better opsin retention occurs in additional urochordate genomes.
  • It is sometimes claimed that chordate predation on protostomes dramatically rewarded rapid evolution of imaging vision. If so, this was lamprey-specific suction and boring because from the opsin standpoint, vertebrate eyes were fully matured tens of millions of years earlier than jaws that could seize prey.
  • The fossil record establishes that arthropods developed eyes tens of millions of years earlier than vertebrates. Thus any vision-based feasting was arthropod-on-chordate rather than the other way around. Perhaps evading arthropod predation provided the selective advantage. Observe imaging vision is not essential to survival because echinoderms, hemichordates, cephalochordates and urochordates successfully survived for 520 million years without it.
  • The next 500 million years of vertebrate evolution brought refinement but not further innovation and some major clades, notably mammals, backslid to the point where their color vision today is inferior to that of the common ancestor with lamprey; only limited recovery has occurred. The period of stasis in opsin gene number has various narrow exceptions (eg zebrafish) that have drawn too much experimental attention away from the important trends in opsin evolution.

The rapid expansion in opsin gene number in early vertebrates cannot be attributed to supposed 1R and 2R whole genome expansions -- indeed the observed lack of sistering in the ciliary opsin gene tree actively conflicts with such a scenario. If all opsin 'ohnologs' are lost, what relevance would whole genome duplication have to their evolution? (Observe the published amphioxus genome contains more genes than human whereas it should have 1/4 the number under 2R.)

Some inferences must be tempered by the choices of species for genome sequencing and the quality of their current assemblies. Thus Geotria better represents the ancestral lamprey opsin repertoire than the sequenced Petromyzon. Because of a 16 kbp retroposon, the Petromyzon assembly is completely unsatisfactory both in contig size (contigs rarely contain full length genes and almost never allow syntenic determination) and in coverage (genes that surely are present in the genome are missing from the assembly). Assembly coverage is not in fact 6x despite 19 million traces because that calculation erroneously assumed a genome size similar to that of mammals (3 Gbp).
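
The coverage arithmetic can be made explicit. The following is a minimal sketch, assuming the usual estimate of nominal coverage as (number of reads x mean read length) / genome size. The mean read length is back-calculated from the stated 6x figure at 3 Gbp rather than taken from the assembly itself, and the function names are illustrative only.

```python
def coverage(num_reads, mean_read_len_bp, genome_size_bp):
    """Nominal fold coverage: total sequenced bases divided by genome size."""
    return num_reads * mean_read_len_bp / genome_size_bp

NUM_TRACES = 19_000_000             # trace count given in the text
ASSUMED_GENOME_BP = 3_000_000_000   # mammal-like size behind the 6x figure

# Mean read length implied by "6x at 3 Gbp" (back-calculated, not measured).
implied_read_len = 6 * ASSUMED_GENOME_BP / NUM_TRACES

def rescaled_coverage(true_genome_bp):
    """Coverage for the same traces if the genome size is actually different."""
    return coverage(NUM_TRACES, implied_read_len, true_genome_bp)
```

Since coverage is inversely proportional to genome size, any error in the assumed genome size carries over directly into the quoted coverage.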

The amphioxus assembly, while far better, suffers from numerous paired end read misassemblies (resulting in faux tandem gene pairs and unreliable gene models) and lack of community follow-up to the March 2006 release -- less than 0.5% of its proteome has been analyzed in 220 papers published in the subsequent four years. Sequencing of a second species, Branchiostoma belcheri, apparently underway, will not be helpful because opsins there have already been determined from transcripts.

The first two assemblies of tunicates, though deeply flawed due to heterozygosity, established that many genes (notably opsins) are extremely diverged by any measure. Tunicates serve as an unfortunate but essential immediate vertebrate outgroup. Given a coding gene count again comparable to human, one of the great mysteries of the genomic era is how hundreds of journal claims for 1R and 2R survived peer review. No vestige of these alleged events can be found in opsins or related support genes.

Opsin genetic loci have undergone many gains and losses. Examples of gains include sharks (extra copy of LWS), zebrafish (greatly expanded RHO2 repertoire and retained RHO1 retrogene), primates (tandem LWS copy) and so forth, with the most sweeping gain being whole genome duplication at some point after teleost fish divergence. A duplicated gene can only be retained for an adaptive reason.

Examples of losses include dolphin (SWS1), chickens (TMT1), cave fish and blind mole rat (opsin pseudogenes), platypus (SWS1) and so forth, with the most remarkable loss being the massive attrition era in mammals (60% of all opsin loci lost in placentals, foreseen by GL Walls in 1942). Opsin gene loss is generally not adaptive ('less is more') but simply neutral drift ('use it or lose it').

It follows from standard phylogenetic tree analysis and consideration of diagnostic regions that the very earliest ciliary opsins in deuterostomes were of TMT1 class. Only these opsins continue the ancestral pattern in the N D P C iron triangle centered in the second transmembrane helix -- already established in pre-cnidaria, indeed predecessor GPCR -- and still found today in most other opsin classes, including ecdysozoan and cnidaria cilopsins, melanopsins and peropsins.

Encephalopsin is the earliest diverging gene duplicate of TMT, with representatives in sea urchin and amphioxus but not protostomes (whose ciliary opsins more closely resemble TMT). This opsin class contains two key derived features: T/S/N substituted for the immensely conserved diagnostic proline kink in TM2 with consequent significant impacts on signaling and the carboxy terminal VxPx* ciliary targeting motif.

These are features of all other imaging and near-imaging ciliary opsins, unlike basal TMT genes which have a different, undocumented HYxx* motif possibly targeting encoded protein elsewhere in a cell membrane. (Recall 'ciliary' opsins are not necessarily targeted to elaborated ciliary membranes -- some gene loci have never been studied with electron microscopy.) Rather than convergent evolution of these two features, encephalopsin more likely served as immediate parental gene for the cascade of gene duplications giving rise to these specialized opsins. Semi-conserved distal cysteines with potential for palmitoylation (as in RHO1) arose so early in deuterostomes they do not serve as useful characters; distal cysteines are absent in protostomal cilopsins.

Palmitoylation sites following YR motif:      Terminal localization motif VxPx*
ENC_homSap   .........-C.....C..............  ENC_homSap  DDSDKTNGS-KVDVIQVRPL*
ENC_canFam   .........-C.....C..............  ENC_otoGar  DNSDKTNGS-KVDVIQVRPL*
ENC_loxAfr   .........-C.....C..............  ENC_canFam  DDSDKTSVS-KVDVIQVRPL*
ENC_pteVam   .........-C.....C..............  ENC_loxAfr  NNIDKTNGS-KADVIQIRPL*
ENC_musMus   .........-C.....C..............  ENC_pteVam  DDSDKINGS-KADGIQVRPL*
ENC_monDom   ...C.....-C....................  ENC_musMus  EDSDRSSAS-KVDVIQVRPL*
ENC_anoCar   ...C.....-C....................  ENC_monDom  DENDKNSGT-KVNVIQVRPL*
ENC_galGal   ...C.....-C....................  ENC_anoCar  DVSTKCSDT-KINVIQVKPL*
ENC_danRer   ...C.....-C....................  ENC_galGal  DDNSKHNGT-KVNVIQVKPL*
ENC_tetNig   ...C.....-C....................  ENC_danRer  DITSKCNDEPDINVIQVRPL*
ENC_takRub   ...C.....-C....................  ENC_tetNig  DVTSRAGDSADVNAIQVRPL*
ENC_gasAcu   ...C.....-C....C..C............  ENC_takRub  DVTSKSGDSADVNAIQVRPL*
ENC_oryLat   ...C.....-C.......C............  ENC_gasAcu  DVTSNTRESSEANVFQVRPL*
ENC_petMar   ...C....---................C...  ENC_oryLat  DVTSKSRNSSEANVFHVRPL*
ENC_xenTro   ...C.....----..................  ENC_petMar  EKSDALAAPADTCVVHVKPT*
ENC4_braFl   ....---........CC..............  ENC4_braFl  YVQENCKPKAD--SLSTISE*
ENC4_braBe   ....---........CC..............  ENC4_braBe  -VQGN-QLKAD--SLSTISE*
TMT5_braFl   .........CC.............C......  TMT5_braFl  QWIEMQTIAVVVKADEVNNK*
TMT5_braBe   .........CC....................  TMT5_braBe  QWIEMQTIAVVVKAVEVDTS*
TMT_monDom   ...C......C.-..........C.......               Terminal motif HYxx*
TMT_macEug   ...C......C.-..........C.......  TMT_macEug  NVAPSSGHPQEKMEEKPLSE*
TMT_galGal   ...C......C.-..........C.......  TMT_galGal  TSLLHPEPGLEPAAKTVPPM*
TMT_taeGut   ...C......C.-..........C.......  TMT_taeGut  SKMTSLLCPETTSKATPPTS*
TMT_anoCar   ...C......C.-..........C.......  TMT_anoCar  SSLDPVLESTPQLSKENSFL*
TMT_xenTro   ...C......C.-..........C.......  TMT_xenTro  NESPSDQMPQSTTEHLISGT*
TMT_ornAna   ...C......C.-..................  TMT_ornAna  LGPQLHETPSWERSTPVHPE*
TMTa_danRe   ...C.....CC.-..................  TMTa_danRe  HNHDCSTHVTERSNPPEVIP*
TMTb_danRe   ...C......C.-..................  TMT_tetNig  CCSA-----KNASTIEVKLS*
TMT_tetNig   ...C......C.-..................  TMT_takRub  CCSANTISTKNTSTVEGKLS*
TMT_takRub   ...C......C.-..................  TMT_gasAcu  NMIRQENHSHDEAAKNQLDC*
TMT_gasAcu   ...C......C.-..................  TMTa_anoCa  PTDNSPKAKQRVLLVAHYSV*
TMT_oryLat   ...C......C.-..................  TMTa_xenTr  SKTHNGDTKPFKTLVANYVI*
TMTa_anoCa   ...C.C....C.-.........C...-.--.  TMTa1_danR  -PSADNTKPAVLSLVAHYNG*
TMTa_xenTr   ...C......C.-.............-.--.  TMTa_takRu  PPSSDTIKPVVVSLAAHCDG*
TMTa1_danR   ...C......C.-.............----.  TMTa_tetNi  TKAPSSDNHQPVVVSLEAHG*
TMTa_takRu   ...C.....CC.-...............--.  TMTa_gasAc  GPSSDNNKPVIVSLVAQCDG*
TMTa_tetNi   ...C.....CC.-...............--.  TMTb_takRu  KSPAANRSKPKLILVAHYRE*
TMTa_gasAc   ...C......C.-...............--.  TMTb_tetNi  KTPAANSSKPKLILVVHYRE*
TMTb_takRu   ...C......C.-...............--.  TMTb_gasAc  RCAAAGAAKPKRTLVAHYRE*
TMTb_tetNi   ...C......C.-...............--.  TMTb_oryLa  ASDSPDSRKPKVVLVAHYQE*
TMTb_gasAc   ...C......C.-..................  TMTa_pimPr  TFAVASAGHP-TICAPH...*
TMTb_oryLa   ...C.....CC.-..................  TMTc_xenTr  TSSPEMQRKLT--LTVHYRD*
TMTa_pimPr   .C.C......C.-..................  TMTc_danRe  AEGPQKKEQHSLSLVVHYTP*
TMTc_xenTr   ...C......C............------..  TMTc_oncMy  TRGPQR-EKRDLVLVVHYTP*
TMTc_danRe   ...C......C.....------------...  TMTc_oryLa  DRH---------VLFVHYTP*
TMTa_oncMy   ...C......C....................  
TMTc_oryLa   ...C......C....................  
ENC_strPur   ................C..............                  
TMT1_strPu   .........CC.................... 
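
The two terminal motifs in the right-hand column can be checked mechanically. The sketch below assumes that 'x' stands for any residue, that '*' marks the stop codon, and that alignment gaps ('-') should be ignored; the function name is illustrative and the example tails are copied from the table above.

```python
import re

# C-terminal motifs named in the text, anchored at the end of the protein.
VXPX = re.compile(r"V.P.$")
HYXX = re.compile(r"HY..$")

def classify_c_terminus(tail):
    """Return 'VxPx', 'HYxx' or 'other' for a C-terminal peptide string."""
    seq = tail.rstrip("*").replace("-", "")   # drop stop marker and gaps
    if VXPX.search(seq):
        return "VxPx"
    if HYXX.search(seq):
        return "HYxx"
    return "other"

# Tails copied from the table above.
examples = {
    "ENC_homSap": "DDSDKTNGS-KVDVIQVRPL*",
    "ENC_petMar": "EKSDALAAPADTCVVHVKPT*",
    "TMTb_takRu": "KSPAANRSKPKLILVAHYRE*",
    "ENC4_braFl": "YVQENCKPKAD--SLSTISE*",
}
for name, tail in examples.items():
    print(name, classify_c_terminus(tail))
```

A strict match on the final four residues is a deliberately narrow criterion; several TMT tails in the table do not end in HYxx and would be reported as 'other'.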

TMTsurvivors.png

Amphiop4 and Amphiop5 are the amphioxus counterparts to encephalopsin, exhibiting threonine substitution of this proline and close but not definitive blast clustering. The amphioxus proteins lack any sign of the VxPx* motif, as is appropriate to amphioxus divergence preceding the anatomical specializations of imaging vision.

However, support is somewhat ambiguous due to rapid divergence in amphioxus and an inadequate range of data. In particular, no flanking gene synteny occurs between these amphioxus opsins and any vertebrate (or invertebrate) opsin. Vertebrate encephalopsins provide no clue to their parent gene via flanking genes, indels or intron structure. Early diverging taxa such as amphioxus and echinoderms must be sequenced much more intensively to provide satisfactory intermediate nodes, here along 550 myr of long branch -- imagine vertebrate comparative genomics based solely on human and lamprey data.

TMTgeneTree2.jpg

Tunicate genomes have not retained any gene resembling encephalopsin but instead contain a parapinopsin-like gene with valine for proline and a full VAPA* motif in C. savignyi as appropriate to later urochordate divergence. No echinoderm or acornworm opsin bears striking affinity to vertebrate encephalopsins -- again phylogenetic sampling intensity is far too low for divergence times involved. Protostomal ciliary opsins most closely align with TMT and share most of its diagnostic residues, though again no synteny remains in arthropod genomes because of gene order churning.

So when did the encephalopsin locus arise? That might be clarified by sequencing hagfish and additional early deuterostomes, but for now it appears that a distinctive encephalopsin locus was formed by an intron-preserving segmental genome duplication in the cephalochordate stem that subsequently lost the ancestral proline. This gene diverged in quite different ways in amphioxus, tunicates, and vertebrates. In tunicates, the locus duplicated again with one copy specializing to today's parapinopsin and the other being lost (along with all TMT loci). In vertebrates, both copies were retained, one descending to contemporary encephalopsins and the other to parapinopsins and their various imaging and near-imaging ciliary opsins.

Through cascading segmental gene duplications, the TMT1 ciliary ur-opsin gave rise directly and indirectly to all other ciliary opsins observed in living deuterostomes. The ur-opsin likely retained the ancestral ciliary opsin photoreceptive function even as its daughter genes have neofunctionalized to new roles. Structures cannot be directly modeled from bovine RHO1 (the most recently derived of all imaging opsins) because the latter lacks the induced proline kink in TM2.

The current phylogenetic distribution of TMT1 extends from sea urchin (but not acornworm) to amphioxus and tunicate through chondrichthyes and teleost fish to frog and lizard, with orthology mostly validatable by syntenic location. Remarkably birds, platypus, marsupials, and all placental mammals have lost the ciliary ur-opsin, a distribution that requires two independent loss events (one in the bird stem, one in the mammal stem). Lizard flanking gene order is fully preserved in chicken but no pseudogene debris remains at the site.
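
The count of two independent losses can be reproduced with a simple loss-only parsimony tally. This is a minimal sketch: the species tree topology is the conventional one and is an assumption here, as is the choice of representative species, and the model assumes the gene was present at the root and is never regained.

```python
# Species tree as nested tuples (conventional topology, assumed for this sketch).
TREE = ("frog", (("lizard", ("finch", "chicken")),
                 ("platypus", ("opossum", "human"))))

# TMT1 presence as described in the text (True = locus retained).
PRESENT = {"frog": True, "lizard": True, "finch": False, "chicken": False,
           "platypus": False, "opossum": False, "human": False}

def count_losses(node):
    """Return (all_absent, losses) for the subtree rooted at node.

    Each maximal all-absent subtree is explained by one loss on its stem.
    """
    if isinstance(node, str):
        return (not PRESENT[node], 0)
    left_absent, left_losses = count_losses(node[0])
    right_absent, right_losses = count_losses(node[1])
    if left_absent and right_absent:
        return (True, 0)   # defer: the loss may lie further up the tree
    losses = left_losses + right_losses + int(left_absent) + int(right_absent)
    return (False, losses)

def min_losses(tree):
    """Minimum number of loss events under the loss-only assumption."""
    all_absent, losses = count_losses(tree)
    return losses + int(all_absent)   # a wholly absent tree is one root loss

print(min_losses(TREE))
```

Because lizard retains the locus while its sister group (birds) does not, the bird and mammal absences cannot be merged into a single event under this model.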

This is a familiar story in opsins ... an old gene fades out mid-amniote but otherwise continues on for another 310 million years (Walls hypothesis plus birds). The graphic below summarizes the history of gene gain and loss at 24 opsin loci in deuterostomes from the earliest Cambrian to the present day. The genes and species are sorted to illustrate sequential loss in the human lineage. The other sort order would provide a conventional species tree vs gene tree view. Species are clustered by color to indicate subtrees, for example (lizard,(finch,chicken)).

What is the function of the ciliary ur-opsin in the contemporary organisms that retain it? To determine whether it has ever been studied under another name, tblastn of each TMT1 in the curated reference collection can be used against GenBank transcripts and gene deposits (which often provide the necessary PubMed id). Transcript data from non-pooled tissues might at least determine some sites of expression; however no data is available for frog or lizard (ie no data for tetrapods since other species have lost the gene).
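
The tblastn search described above could be scripted along the following lines. This is a sketch only: it assumes a local NCBI BLAST+ installation, and the query file, database name and output file are hypothetical placeholders rather than files from the reference collection.

```python
import subprocess

def tblastn_command(query_fasta, database, out_tsv, evalue="1e-5"):
    """Build a BLAST+ tblastn command (protein query vs translated nucleotides)."""
    return [
        "tblastn",
        "-query", query_fasta,   # protein FASTA, e.g. one TMT1 sequence
        "-db", database,         # nucleotide database of transcripts
        "-evalue", evalue,       # threshold chosen arbitrarily for illustration
        "-outfmt", "6",          # tabular output, one hit per line
        "-out", out_tsv,
    ]

def run_tblastn(query_fasta, database, out_tsv):
    """Run the search; requires BLAST+ on the PATH and a formatted database."""
    subprocess.run(tblastn_command(query_fasta, database, out_tsv), check=True)

# Hypothetical file and database names, for illustration only.
cmd = tblastn_command("TMT1_query.fa", "transcripts_db", "TMT1_hits.tsv")
print(" ".join(cmd))
```

Hits in the tabular output carry subject accessions, which can then be looked up for tissue source and any associated PubMed id.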

Note first that the gene appears to have undergone a segmental duplication in the chondrichthyes stem. Both loci persisted to teleost fish, whereas additional copies presumably created by whole genome duplication were not retained. Only one locus, presumably more fundamental, persisted into frog and lizard. It defines the TMT locus in earlier diverging species via best-blast and synteny and so the second TMT locus by default. These are not to be confused with a later TMT gene duplication that arose in fish and persisted through lizard, birds, platypus and marsupials but not placentals.

One of the three distinct TMT genetic loci was specifically studied in adult eyes and embryonic cell lines of zebrafish. Little has happened since (other than bioinformatics): TMT genes from Tetraodon, Gasterosteus and Oryzias still lack articles and informative transcripts. Zebrafish TMT genes remain crazily annotated at GenBank (as adiponectin, with pipeline analysis of genomic dna mislabeled as mrna). None of the zebrafish genes has ever been studied though two transcripts are available from delimited libraries (developing eggs with support cells). Pimephales promelas has TMT transcripts from brain and testis; Oncorhynchus mykiss has transcripts from testis.

In summary, while thousands of articles address the highly derived RHO1 locus, the much more fundamental ciliary ur-opsin TMT has scarcely been investigated. Until its core function is better understood, the early history of ciliary opsins in vertebrates will remain a mystery. Without this core role for TMT that allowed it to be retained for tens of millions of years in the pre-vision era, no ciliary opsin would have been available for later expansion into imaging ciliary opsins.

Below is a preliminary assessment of the three TMT loci and the related encephalopsin locus. The right column revises nomenclature relative to that of the reference collection (second column). Gene names arose in the pre-genomic era when the full paralog complement, phylogenetic distribution and sites of expression were not understood. With complete genomes now in hand, a final and sensible nomenclature can be envisioned.

The TM2 region aligned proves useful for defining the diagnostic residues and indels of these four gene classes necessary to sort out their evolutionary relationships. As the names suggest, this is (TMT1,(TMT2,(TMT3,ENC))) rooted by protostome ciliary opsins which indicate TMT1 is the original ur-opsin. The final column contains information on synteny that validates the proposed history of gene duplication. Gene order is barely conserved but enough for the duplication history to be unravelled.


The tree above can be generated using the following Newick string:
((((((((((((((ENC_homSap,ENC_otoGar),ENC_musMus),(ENC_pteVam,ENC_canDom)),ENC_loxAfr),ENC_monDom),(ENC_galgal,ENC_anoCar)),ENC_xenTro),(ENC_danRer,((ENC_takRub,ENC_gasAcu),ENC_oryLat))),
ENC_calMil),ENC_petMar),((TMT_braFlo,TMT_braBel),(TMT5_braFlo,TMT5_braBel))),(((((((((TMT2_monDom,TMT2_macEug),TMT2_ornAna),(TMT2_galGal,TMT2_taeGut)),TMT2_anoCar),TMT2_xenTro),
((TMT2a_danRer,TMT2b_danRer),((TMT2_tetNig,TMT2_takRub),(TMT2_gasAcu,TMT2_oryLat)))),(((TMT1a_anoCar,TMT1a_xenTro),((TMT1a2_danRer,TMT1a1_danRer),((TMT1a_tetNig,TMT1a_takRub),
(TMT1a_gasAcu,TMT1a_oryLat)))),((((TMT1b_tetNig,TMT1b_takRub),(TMT1b_gasAcu,TMT1b_oryLat)),TMT1a_pimPin),TMT1a_calMil))),((TMT3_xenTro,(TMT3_danRer,((TMT3_tetNig,TMT3_takRub),
(TMT3_oryLat,TMT3_oncMyk)))),TMT3_calMil)),((TMTx_braFlo,TMTy_braFlo),(TMT1_strPur,TMT2_strPur)))),((((((((TMT1a_anoGam,TMT1b_anoGam),(TMT1_aedAeg,TMT1_culPip)),(TMT1_bomMor,TMT1_helVir)),
TMT1_triCas),TMT1_apiMel),(TMT1_rhoPro,TMT1_acyPis)),(TMT1a_dapPul,TMT1b_dapPul)),(TMT1a_plaDum,TMT1b_plaDum))),(TMT_triCys,TMT_carRas));
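
For readers without a tree viewer at hand, a names-only Newick string such as this one can be read with a few lines of code. The parser below is a minimal sketch that handles only parentheses, commas and leaf names (no branch lengths or internal node labels); it is not the tool used to build the tree.

```python
def parse_newick(text):
    """Parse a names-only Newick string into nested lists of leaf names."""
    text = "".join(text.split())          # the string may be wrapped over lines
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # consume '('
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1                  # consume ','
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                      # consume ')'
            return children
        start = pos
        while text[pos] not in ",();":
            pos += 1
        return text[start:pos]            # a leaf name

    tree = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected text at position {pos}")
    return tree

def leaves(tree):
    """Flatten a parsed tree into its leaf names, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]

# A small clade excerpted from the string above.
subtree = "(((ENC_homSap,ENC_otoGar),ENC_musMus),\n(ENC_pteVam,ENC_canDom));"
print(leaves(parse_newick(subtree)))
```

Joining the wrapped lines of the full string and passing them to parse_newick should work the same way, since whitespace is stripped before parsing.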

ENC_homSap    ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_otoGar    ENCEPH_oto NN LLVLVLYYKFPRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_loxAfr    ENCEPH_lox NN LLVLVLYYKFQRLRTPTHLFLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC ...
ENC_pteVam    ENCEPH_pte NN LLVLVFYYKFQQVRTPFYLFLVNISFS D LLVS-FFGV T FTFVSCLRNGWVWDT VGC ...
ENC_musMus    ENCEPH_mus GN LLVLLLYSKFPRLRTPTHLFLVNLSLG D LLVS-LFGV T FTFASCLRNGWVWDA VGC ...
ENC_canDom    ENCEPH_can CH FCPQKGFLEFQRLRTPTHLLLVNLSLS D LLVS-LFGV T FTFVSCLRNGWVWDS VGC ...
ENC_monDom    ENCEPH_mon NN LLVLVLYYKFQRLRTPTHLFLVNISFN D LLVS-LFGV T FTFVSCLRSGWVWDS VGC syn(-EXO1   -WDR64   +ENC -KMO     +FH +RGS7)
ENC_galgal    ENCEPH_gal NN LLVLVLYYKFKRLRTPTNLFLVNISLS D LLVS-VCGV S LTFMSCLRSRWVWDA AGC syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)
ENC_anoCar    ENCEPH_ano NN LLVLVLYAKFKRLRTPTHLFLVNISLS D LLVS-LFGV S FTFGSCLRHRWVWDA AGC syn(-EXO1   -WDR64   +ENC        -PIGM +RGS7)
ENC_xenTro    ENCEPH_xen NN LLVLILYCKFKRLQTPTNLLFFNTSLC H FVFS-LLAI T FTFMSCVRGSWAFSV EMC syn(-ASAH3L -ACER2   +ENC     -ADFP -DENND4C)
ENC_danRer    ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC syn(-MTRF1L -TMEM63B +ENC -KMO +IDE -MARCH5 +CPEB3 -BTAF1)
ENC_takRub    ENCEPH_tak NN FVVLALYCRFKRLRTPTNLLLVNISLS D LLVS-LFGI N FTFAACVQGRWTWTQ ATC syn(-ABLIM1 -PTK7    +ENC -KMO +IDE         +CPEB4        -CCNJ)
ENC_gasAcu    ENCEPH_gas NN VVVIVLYCKFKRLRTPTNLLVVNISLS D LLVS-VIGI N FTFVSCIRGGWTWSR ATC syn(FAM82A  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4        -CCNJ)
ENC_oryLat    ENCEPH_ory NN LLVILLYCKFKRLRTPTSLLLVNISLS D LLVS-VVGI N FTLASCVKGRWMWSQ ATC syn(CYP1B1  CDC42EP2 +ENC -KMO +IDE -MARCH5 +CPEB4 -BTAF1 -CCNJ)
ENC_calMil    ENCEPH_cal NN ILVLLLYYKFKRLRTPTNLLLVNISVS D LLVS-VFGL S FTFVSCTQGRWGWDS AAC ---
ENC_squAca    ENCEPH_squ NN LLMLVLYCKFKRLRTPTNLFLVNISIS D LLLS-VFGV I FTFVSCVKGRWVWDS AAC ---
ENC_petMar    ENCEPH_pet NN LLLVALFVGFKRLQTPTNLLLVNISLS D LLVS-VFGN T LTLVSCVRRRWVWGN GGC ---
ENC_braFlo    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(     -ZFYVE1 +RTF1 +ENC -CES1 -POMT2)
ENC_braBel    ENCEPH4_br NN FVVILLIGCHRQLRTPFNLLLLNVSVA D LLVS-VCGN T LSFASAVQHRWLWGR PGC ---
ENC_braFlo    TMT5_braFl SN GAVVLLFLKFRQLRTPFNMLLLNMSVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC syn(NKX2 -ZFYVE1 +RTF1 +ENC ERF1 TMED9 LARS2)
ENC_braBel    TMT5_braBe SN GAVVVLFLKFPQLRTPFNLLLLNMAVA D LLVS-VCGN T LSFASAVRHRWLWGR PGC ---
TMT3_monDom   TMT_monDom SN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LSGT T LSFASSIQGRWIGGK HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2 RALY SLC5A7 +SULT1E1)
TMT3_macEug   TMT_macEug NN FIVLVLFCKFKVLRNPVNMLLLNISIS D MLVC-LTGT T LSFASSIRGRWIAGY HGC ---
TMT3_ornAna   TMT_ornAna NN LIVLILFCKFKALRNPVNMIMLNISAS D MLVC-VSGT T LSFASNISGRWIGGD PGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -UCHL3  +TBCID4)
TMT3_galGal   TMT_galGal NN LIVLILFCKFKTLRNPVNMLLLNISIS D MLVC-ISGT T LSFASNIHGKWIGGE HGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7 +SULT1C4)
TMT3_taeGut   TMT_taeGut NN LIVLILFCKFKTLRNPVNMLLLNISVS D MLVC-ISGT T LSFASNIRGKWIGGD HAC ...
TMT3_anoCar   TMT_anoCar NN LVVLILFCKFKTLRNPVNMLLLNISAS D MLVC-ISGT T LSFVSNIYGRWIGGE HGC syn(            +TMT3 -ST6GAL2_overlap +SLC5A7 RANBP2)
TMT3_xenTro   TMT_xenTro NN FVVLILFCKFKTLRTPVNMMLLNISAS D MLVC-VSGT T LSFTSSIKGKWIGGE YGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap +SLC5A7)
TMT3_danRer   TMT_danRer NN LVVLVLFCKFKTLRTPVNMLLLNISIS D MLVC-MFGT T LSFASSVRGRWLLGR HGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap    +GPR89A -PDZK1l)
TMT3_tetNig   TMT_tetNig NN FIVLLLFCKFKKLRTPVNVLLLNISVS D MLVC-LFGT T LSFASSLRGRWLLGR SGC syn(      -UXS1 +TMT3 -ST6GAL2_overlap)
TMT3_takRub   TMT_takRub NN FVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR IGC ...
TMT3_gasAcu   TMT_gasAcu NN LVVLLLFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSLRGKWLLGR SGC syn(+NCK2 -UXS1 +TMT3 -ST6GAL2_overlap -TFDP2 POU2)
TMT3_oryLat   TMT_oryLat NN FVVLILFCKFKKLRTPVNMLLLNISVS D MLVC-LFGT T LSFASSIRGRWLLGR GGC ...
TMT2_anoCar   TMTa_anoC  NN LLVLVLFCRNKVLRSPINLLLMNISLS D LMIC-IVGT P FSFAASTQGKWLIGP AGC syn(VAMP PER2 HES6 TUBA1 GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_xenTro   TMTa_xenT  NN LVVLILFCQYKVLRSPINMLLMNISLS D LMVC-ILGT P FSFAASTQGHWLIGE IGC syn(VAMP PER2 HES6       GPR35 +TMT1 -ST6GAL1 MYEOV2 GPC5)
TMT2_danRer   TMTb_danRe NN TLVLVLFCRYKVLRSPMNCLLISISVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC syn(-PTCHD1 -PHEX -CNKSR2 SH3KBP1 -MAP3K15 TNK2 +TMT2 -MYEOV2 -MAP4K4 PRMT6)
TMT2_tetNig   TMTb_tetNi SN LLVLALFCRFRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLLGR AGC syn(MYO3A GAD2 ARHGAP21 TFR2 +TMT2  MYEOV2 SH3KBP1 MAP3K15 PHEX PTCHD1)
TMT2_takRub   TMTb_takRu SN FLVLALFCRYRALRTPMNLMLVSISAS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT2_gasAcu   TMTb_gasAc SN FLVLALFCRYRALRTPMNLLLVSISAS D LLVS-MVGT P FSFAASTQGRWLIGR AGC syn(PTCHD1 PHEX MAP3K15 SH3KBP1 +TMT2 -MYEOV2 ARHGAP21 MYO3A)
TMT2_oryLat   TMTb_oryLa SN LLVLALFCRYRALRTPMNLLLVSISVS D LLVS-VLGT P FSFAASTQGRWLIGR AGC ...
TMT1_danRer   TMTa1_danR NN LLVLVLFGRYKVLRSPINFLLVNICLS D LLVC-VLGT P FSFAASTQGRWLIGD TGC syn(      RAB25 PBX3 TNK2 +TMT1  WAC +LPPR4 +AGL PTBP1)
TMT1_tetNig   TMTa_tetNi SN LLVLVLFCRFKVLRSPINLLLVNISVS D LLVC-VLGT P FSFAASTQGRWLIGA AGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_takRub   TMTa_takRu NN LLVLVLFCRYKMLRSPINLLLMNISIS D LLVC-VLGT P FSFAASTQGRWLIGE AGC syn(     LRRN3 CALD1 TNK2 +TMT1      RAB18 YME1L1 ABI1 TLK1 EDRNB)
TMT1_gasAcu   TMTa_gasAc NN LLVLVLFCRYKMLRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC syn(TMEM8 RGS11 TFRC TNK2 +TMT1  WAC RAB18 YME1L1 ABI1)
TMT1_oryLat   TMTa_oryLa NN LLVLVLFCRYKILRSPINLLLINISIS D LLVC-VLGT P FSFAASTQGRWLIGE GGC ...
TMT1_pimPin   TMTa_pimPr NN TLVLILFCRYKVLRSPMNYLLVSIAVS D LLVC-VLGT P FSFAASTQGRWLIGR AGC ---
TMT1_oncMyk   TMTa_oncMy SN LFVLLVFARFQVLRTPINLILLNISVS D MLVC-IFGT P FSFAASLYGRWLIGA HGC ---
TMT1_calMil   TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC ---
TMT2_calMil   TMTa2_calM NN LLVLLLFVCFKEIRTPLNMILLNISLS D LSVC-VFGT P FSFAASIYRRWLIGH KGC ---
TMT1_braFlo   TMTx_braFl NN STTLYLVGRYKQLRTPFNILMVNLSVS D LLMC-VLGT P FSFVSSLHGRWMFGH SGC syn(TNPPO2 HECTD3 ABCCA4 PRPRA TMT ATP5D TMT PTPRA FDE4A PTPRA PYRNXN1)
TMT2_braFlo   TMTy_braFl TN LLTVLVFWCFKSLRTPFHLYLGGIALS D LLVA-ALGS P FAVASAVGERWLFGR AVC syn(ZFYVE1 FBXL4 RTF1 TMT CES4 TMTY POMT2 GSTZ1) 
TMT1_strPur   TMTPIN_str NN GIVMILFARFPSLRHPINSFLFNVSLS D LIIS-CLAS P FTFASNFAGRWLFGD LGC syn(ARG2 NEK9 FAM164A ZC3H14 TMT PRPF39 YIPF4 SPATA5)
TMT2_strPur   ENCEPH_str GN SVVLFLFAWDRHLRTPTNMFLLSLTIS D WLVT-VVGI P FVTASIYAHRWLFAH VGC ---
TMT1_apiMel   TMT_apiMel AN LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAATGGWYWGY KMC syn(HEX MAK FASN SPTBN4 PSMA3 TMT LSM11 SEC23A KNSL8)
TMT1a_anoGam  TMT1_anoGa LN IFVIALMYKDVQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWLYGK SIC ...
TMT1b_anoGam  TMT2_anoGa LN LFVIALMCKDMQLWTPMNIILFNLVCS D FSVS-IIGN P LTLTSAISHRWIFGR TLC ...
TMT1_aedAeg   TMT_aedAeg LN LFVIALMCKDVQLWTPINIILFNLVCS D FSVS-IIGN P FTLTSAISRHWIFGR TVC ...
TMT1_culPip   TMT_culPip LN LFVIALMCKEVQLWTPMNIILLNLVCS D FSVS-IVGN P FTLSSAISHRWLFGR KLC ...
TMT1_triCas   TMT_triCas LN LTVIIFMLKERQLWSPLNIILFNLVVS D FLVS-VLGN P WTFFSAINYGWIFGE TGC ...
TMT1_bomMor   TMT_bomMor LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGH TMC ...
TMT1_helVir   TMT_helVir LN LMVILLMFKDRQLWTPLNIILFNLVCS D FSVS-VLGN P FTLISALFHRWIFGK TMC ...
TMT1_rhoPro   TMT_rhoPro GN LIVIIIMCRDKNLWTPVNFILFNVIVS D FSVA-ALGN P FTLASAIAKRWFFGQ SMC ...
TMT1_acyPis   TMT_acyPis FN TCVIFIMIRDTRLWTPQNVIIFNLATS D LAVS-VLGN P VTLAAAITKGWIFGQ TIC ...
TMT1a_dapPul  TMTa_dapPu MN IVVVVIILNDSQKMTPLNWMLLNLACS D GAIA-GFGT P ISAAAALKFTWPFSH ELC ...
TMT1b_dapPul  TMTb_dapPu MN VVVVIVILNDSQRMTPLNWMLLNLACS D GAIA-GFGT P ISTAAALEFGWPFSQ ELC ...
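
The diagnostic TM2 residue can be pulled out of each row of the table above programmatically. This sketch assumes the seventh whitespace-delimited field is the diagnostic position (the single-letter column holding P in TMT1 and T/S/N in encephalopsins); the function names are illustrative, and the example rows are abridged by dropping the synteny column.

```python
def tm2_diagnostic_residue(row):
    """Return (gene name, residue at the TM2 diagnostic position) for one row."""
    fields = row.split()
    return fields[0], fields[6]

def has_ancestral_proline(row):
    """True if the row retains the ancestral proline at the diagnostic position."""
    return tm2_diagnostic_residue(row)[1] == "P"

# Rows copied from the table above, synteny column omitted.
rows = [
    "ENC_homSap    ENCEPH_hom NN LLVLVLYYKFQRLRTPTHLLLVNISLS D LLVS-LFGV T FTFVSCLRNGWVWDT VGC",
    "ENC_danRer    ENCEPH_dan NN IIVIILYSRYKRLRTPTNLLIVNISVS D LLVS-LTGV N FTFVSCVKRRWVFNS ATC",
    "TMT1_calMil   TMTa1_calM NN LLVLVLFCKYKVLRSPMNMLLLNISVS D MLVC-ICGT P FSFAASVQGRWLVGE QGC",
    "TMT1_apiMel   TMT_apiMel AN LLVAIVIVKDAQLWTPVNVILFNLVFG D FLVS-IFGN P VAMVSAATGGWYWGY KMC",
]
for row in rows:
    name, residue = tm2_diagnostic_residue(row)
    print(name, residue, has_ancestral_proline(row))
```

Grouping rows by this residue gives a quick first pass at separating the proline-retaining classes from the encephalopsin-like ones before consulting indels and synteny.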

                        10        20        30        40        50        60        70        80        90       100       110       120       130       140       150
                         |         |         |         |         |         |         |         |         |         |         |         |         |         |         |
ENCEPH_homSap   FSPGTYERLALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYI
ENCEPH_loxAfr   FRSGTYERLALLVGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLFLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSSSLFGIASITTLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYI
ENCEPH_canFam   IPAAVLDIESQAPKDESLYFSICHFCPQKGFLEFQRLRTPTHLLLVNLSLSDLLVSLFGVTFTFVSCLRNGWVWDSVGCVWDGFSSSLFGIVSITTLTVLAYERYIRVV-----HARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYI
ENCEPH_monDom   FSPGTYELLALLIATIGLLGLCNNLLVLVLYYKFQRLRTPTHLFLVNISFNDLLVSLFGVTFTFVSCLRSGWVWDSVGCAWDGFSNTLFGIVSIMTLTVLAYERYNRIV-----HAKVINFSWAWRAITYIWLYSLVWTGAPLLGWNRYT
ENCEPH_galGal   FSAGTYELLALLIATIGTLGVCNNLLVLVLYYKFKRLRTPTNLFLVNISLSDLLVSVCGVSLTFMSCLRSRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVV-----HAKVIDFSWSWRAITYIWLYSLAWTGAPLLGWNRYT
ENCEPH_anoCar   FSAGTYELLALLVAAIGLLGLCNNLLVLVLYAKFKRLRTPTHLFLVNISLSDLLVSLFGVSFTFGSCLRHRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVV-----HARVIDFSWSWRAITYIWLYSLAWTGAPLLGWNHYT
ENCEPH_danRer   FADETYKLLTFTIGSIGVLGFCNNIIVIILYSRYKRLRTPTNLLIVNISVSDLLVSLTGVNFTFVSCVKRRWVFNSATCVWDGFSNSLFGIVSIMTLSGLAYERYIRVV-----HAKVVDFPWAWRAITHIWLYSLAWTGAPLLGWNRYT
ENCEPH_tetNig   FAVHTYRLLAAAIGAIGVLGFCNNLAVAALYWRFRRLRTPTNLLLLNISLSDLLVSLLGVNFTFAACVQGRWTWNQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLAWTGAPLLGWNRYT
ENCEPH_takRub   FSGDTYRVLAFTIGTIGAFGFCNNFVVLALYCRFKRLRTPTNLLLVNISLSDLLVSLFGINFTFAACVQGRWTWTQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYALAWTGAPLLGWNRYT
ENCEPH_gasAcu   FAVGTYKLLAFAIGTIGVFGFCNNVVVIVLYCKFKRLRTPTNLLVVNISLSDLLVSVIGINFTFVSCIRGGWTWSRATCIWDGFSNSLFGIVSIMTLASLAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLVWTGAPLLGWNRYT
ENCEPH_oryLat   FAVGTYKLLTVIIGTIGVFGFCNNLLVILLYCKFKRLRTPTSLLLVNISLSDLLVSVVGINFTLASCVKGRWMWSQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVV-----HAQVVDFPWAWRAIGHIWLYSLAWTGAPLLGWNRYT
ENCEPH_choDri   FSPNTYKLLAVIIGTIGIVGFCNNILVLLLYYKFKRLRTPTNLLLVNISVSDLLVSVFGLSFTFVSCTQGRWGWDSAACVWDGFSHSLFGTVSIVTLTVLAYERYIRVV-----NAKATNFPWAWRAITYTWFYSLAWSGAPLLGWNRYT
ENCEPH_petMar   FSAATFRLLAGVVGTIGVAGFLNNLLLVALFVGFKRLQTPTNLLLVNISLSDLLVSVFGNTLTLVSCVRRRWVWGNGGCVWDGFSNSLFGIVSISTLTALSYERYARLI-----KAQVLDFSWAWRAVTYTWLYSAAWTGAPLLGWSRYV
ENCEPH_xenTro   FTEDTYHFLALIVATVGFLGLVNNLLVLILYCKFKRLQTPTNLLFFNTSLCHFVFSLLAITFTFMSCVRGSWAFSVEMCVFHGFSKNLLGIVSFGTLTVVAYERYARVV-----YGKYVNSSWSKRSITFVWVYSLAWTGFPLIGWNLYT
TMT2_monDom     LSRTGHTIVAVFLGIILIFGSISNFIVLVLFCKFKVLRNPVNMLLLNISISDMLVCLSGTTLSFASSIQGRWIGGKHGCRWYGFANSCFGIVSLISLAILSYERYRTLTLC--PGQ-GADYQKALLAVAGSWLYSLVWTVPPLIGWSSYG
TMT2_macEug     LSRTGHTVTAVFLGLILILGVINNFIVLVLFCKFKVLRNPVNMLLLNISISDMLVCLTGTTLSFASSIRGRWIAGYHGCRWYGFANSCFGIVSLISLAVLSYERYRTLTLC--PRQ-GTDYHKALLAVAGSWLYSLIWTVPPLIGWSSYG
TMT2_ornAna     LSRTGHTMVAVFLGIILVFGFMNNLIVLILFCKFKALRNPVNMIMLNISASDMLVCVSGTTLSFASNISGRWIGGDPGCRWYGFVNSCLGIVSLISLAVLSYERYRTLTLH--PKQ-STDYQKAVLAVGASWIYSLIWTIPPLLGWSSYG
TMT2_galGal     LSRNGHTVVAVFLGFILFFGFLNNLIVLILFCKFKTLRNPVNMLLLNISISDMLVCISGTTLSFASNIHGKWIGGEHGCRWYGFVNSCFGIVSLISLAVLSYERYSTLTLC--NKR-SDDYRKALLAVGGSWVYSLLWTVPPLLGWSSYG
TMT2_taeGut     LSRSGHTVVAVFLGLILFFGFLNNLIVLILFCKFKTLRNPVNMLLLNISVSDMLVCISGTTLSFASNIRGKWIGGDHACRWYGFVNSCFGVVSLISLAVLSYERYNTLTLC--HKR-SDDFRKALLAVAGSWIYSLVWTVPPLLGWSSYG
TMT2_anoCar     LSRMGHNIVAVFLGLILVFGFLNNLVVLILFCKFKTLRNPVNMLLLNISASDMLVCISGTTLSFVSNIYGRWIGGEHGCRWYGFVNSCFGIVSLISLAILSYERYSTLTQT--NKR-GSDYQKALLGVGGSWLYSLIWTVPPLIGWSSYG
TMT2_xenTro     LSRTGHTVVAIFLGFILIFGFLNNFVVLILFCKFKTLRTPVNMMLLNISASDMLVCVSGTTLSFTSSIKGKWIGGEYGCQWYGFVNSCFGIVSLISLAILSYERYSTLTLY--NKG-GPNFKKALLAVASSWLYSLVWTVPPLLGWSSYG
TMT2a_danRer    LSRAGFIALSVFLGFIMTFGFFNNLVVLVLFCKFKTLRTPVNMLLLNISISDMLVCMFGTTLSFASSVRGRWLLGRHGCMWYGFINSCFGIVSLISLVVLSYDRYSTLTVY--HKR-APDYRKPLLAVGGSWLYSLIWTVPPLLGWSSYG
TMT2b_danRer    LSRTGHNVVAVILGSILIFGTLNNLVVLVLFCKFKTLRTPVNMLLLNISVSDMLVCLFGTTLSFAASIRGRWLVGRHGCMWYGFVNSCFGIVSLISLAILSYDRYSTLTVY--NKR-APDYSKPLLAVGGSWLYSLFWTVPPLLGWSSYG
TMT2_tetNig     LSPTGFVVLSVVLGFIITFGFLNNFIVLLLFCKFKKLRTPVNVLLLNISVSDMLVCLFGTTLSFASSLRGRWLLGRSGCNWYGFINSCFGIVSLISLVILSHDRYSTLTVY--NKQ-GINYRKPLLAVGGTWLYSLLWTVPPLLGWSSYG
TMT2_takRub     LSPTGFVVLSVVLGFIMTFGFLNNFVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSIRGRWLLGRIGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKQ-GINYRKPLLAVGGTWLYSLFWTVPPLLGWSSYG
TMT2_oryLat     LSQAGFVVLSVVLGFIMTFGFLNNFVVLILFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSIRGRWLLGRGGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKG-GLNYRKPLLAVGGSWLYSLFWTVPPLLGWSSYG
TMT2_gasAcu     LSPTGFVVLSVMLGFIMTFGFVNNLVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSLRGKWLLGRSGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVY--NKA-GPDYRKPLLAIGGSWLYSLFWTVPPLLGWSSYG
TMT3_xenTro     LSRTAHSVVAVCLGCILVLGSLYNSFVLLIFVKFTAIRTPINMILLNISVSDLLVCIFGTPFSFVSSVSGGWLLGQQGCKWYGFCNSLFGLVSMISLSMLSYERYLTVLKC--TKADMTDYKKSWLCIIVSWLYSLCWTLPPLIGWSSYG
TMT3_calMil     LSQSGHTTVAVFLGIILVLGCVNNLLVLLLFVCFKEIRTPLNMILLNISLSDLSVCVFGTPFSFAASIYRRWLIGHKGCKWYGFANSLFGLVSMISLSMLSYERYLTVLKC--TKADMTDYKKSWLCIIVSWLYSLCWTLPPLIGWSSYG
TMT3_danRer     LSRTGHTVTAVCLGAILLLGCLNNLFVLLVFARFRTLWTPINLILLNISVSDILVCLFGTPFSFASSLYGKWLLGHHGCKWYGFANSLFGIVSLMSLSILSYERYAALLRA--TKADVSDFRRAWLCVAGSWLYSLLWTLPPFLGWSNYG
TMT3_oncMyk     LGRTGHTVVAVFLGVIFLLGFLSNLFVLLVFARFQVLRTPINLILLNISVSDMLVCIFGTPFSFAASLYGRWLIGAHGCKWYGFANSLFGIVSLVSLAILSYERYSTILCY--TKADPSDYKKAWLAIAGAWLYSLVWTVPPFFGWSSYG
TMT3_tetNig     LSRSSHTAVAVLLGVILVAGILSNSLVLLLFVKYRSLWTPINLILLNINLSDILVCVFGTPFSFAASLQGRWLIGEGGCMWYGFANSLFGIVSLVSLSVLSYERCTVVLQP--SQVDVSDFRKARFCVGGSWLYALLWTSPPLLGWSSYG
TMT3_takRub     MSRTGHTVVAVMLGTILLAGVFGNSVVFLVFVKYRSLRTPINLILLNISLSDILVCVFGTPLSFAASLKGRWLLGERGCEWYGFANSLFGIVSLVSLSVLSYERYTVVLQP--TQVDVSYFRKAWFCVGGSWLYALFWTLPPLLGWSRYG
TMT3_oryLat     LSRTGHTAVAVCLGFILVAGILNNFLTLLVFAKFRSLWTPINLILLNISLSDILVCVLGTPFSFAASVRGRWLIGESGCKWYAFANSLFGIVSLVSLSVLSYERYITVLHS--SQADLSNFRKAWFCVGGSWLYSLLWTLPPFLGWSSYG
TMT1a_anoCar    LSPTGHLITAICLGVIGSLGFLNNLLVLVLFCRNKVLRSPINLLLMNISLSDLMICIVGTPFSFAASTQGKWLIGPAGCVWYGFANTFFGTVSLISLAVLSYERYCTMMGT--TEADATNYKKVWMGIFLSWIYSLFWSLPPLFGWSSYG
TMT1a_xenTro    LSPTGHLLVAVFLGVIGSLGFFNNLVVLILFCQYKVLRSPINMLLMNISLSDLMVCILGTPFSFAASTQGHWLIGEIGCIWYGFVNTLFGTVSLVSLAVLSYERYCTMLRS--TEADLTNYKKAWLGILVSWIYSLVWTLPPLFGWSKYG
TMT1a1_danRer   LSPTGHLVVAVCLGFIGTFGFLNNTLVLVLFCRYKVLRSPMNCLLISISVSDLLVCVLGTPFSFAASTQGRWLIGRAGCVWYGFINSFLGVVSLISLAVLSYERYCTMMGS--TQADSTNYRKVVIGIAFSWIYSMVWTLPPLFGWSCYG
TMT1a_pimPro    LSPTGHLVVAVCLGFIGTFGFLNNTLVLILFCRYKVLRSPMNYLLVSIAVSDLLVCVLGTPFSFAASTQGRWLIGRAGCVWYGFINSCLGVVSLISLAVLSYERYCTMMGA--TQADSTNYKKVAMGIAFSWIYSMVWTLPPLFGWSCYG
TMT1a2_danRer   LSPTGHILVAVSLGFIGTFGFLNNLLVLVLFGRYKVLRSPINFLLVNICLSDLLVCVLGTPFSFAASTQGRWLIGDTGCVWYGFANSLLGIVSLISLAVLSYERYCTMMGS--TEADATNYKKVIGGVLMSWIYSLIWTLPPLFGWSRYG
TMT1a_tetNig    LTPTGNLVVSVFLGLIGTSGLVSNLLVLVLFCRFKVLRSPINLLLVNISVSDLLVCVLGTPFSFAASTQGRWLIGAAGCVWYGFVNSLFGIVSLISLAVLSFERYSTMMTP--TEADSSNYCKVCLGIGLSWVYSLLWTVPPLLGWSSYG
TMT1a_takRub    LTPTGNLVVSVFLGFIGTFGLVNNLLVLVLFCRYKMLRSPINLLLMNISISDLLVCVLGTPFSFAASTQGRWLIGEAGCVWYGFANSLFGVVSLISLAVLSFERYSTMMTP--TEADPSNYCKVCLGITLSWVYSLVWTVPPLFGWSSYG
TMT1a_gasAcu    LTPTGHLVVAVCLGFIGTLGLMNNLLVLVLFCRYKMLRSPINLLLINISISDLLVCVLGTPFSFAASTQGRWLIGEGGCVWYGFANSLFGIVSLISLAVLSYERYSTMVAP--TEADSSNYHKISLGITLSWVYSLIWTAPPLFGWSHYG
TMT1a_oryLat    LTPTGHLIVAVCLGFIGTFGLVNNLLVLVLFCRYKILRSPINLLLINISISDLLVCVLGTPFSFAASTQGRWLIGEGGCVWYGFANSLCGIVSLISLAVLSYERYSTMMTP--AEADSSNYRKISLGIILSWGYSLLWTLPPLFGWSHYG
TMT1a_calMil    LSRTGLTVVAVCLGIIMVLGFLNNLLVLVLFCKYKVLRSPMNMLLLNISVSDMLVCICGTPFSFAASVQGRWLVGEQGCKWYGFANSLFGIVSLMSLTILSYDRYITITGT--TEADITNYNKTIVGIALSWIYSLMWTLPPLFGWSNYG
TMT1b_tetNig    LSQRGHLVVAVCLGAIGTVGFLSNLLVLALFCRFRALRTPMNLMLVSISASDLLVSVLGTPFSFAASTQGRWLLGRAGCVWYGFVNACLGIVSLISLAVLSYERYCTMMAS--TMASNRDYRPVLLGICFSWFYSLAWTVPPLLGWSRYG
TMT1b_takRub    LSQRGHLVVAVCLGFIGTVGFLSNFLVLALFCRYRALRTPMNLMLVSISASDLLVSVLGTPFSFAASTQGRWLIGRAGCVWYGFVNACLGIVSLISLAVLSYERYCTMVSS--TIASNRDYRPVLGGICFSWFYSLAWTVPPLLGWSRYG
TMT1b_gasAcu    LSPKGHLVVAVCLGFIGTFGFLSNFLVLALFCRYRALRTPMNLLLVSISASDLLVSMVGTPFSFAASTQGRWLIGRAGCVWYGFVNACLGIVSLISLAVLSFERYSTMVKP--TVADGRDFRPALGGIAFSWLYSVAWTVPPLLGWSEYG
TMT1b_oryLat    LSPTGHLVVAVCLGLIGTCGFLSNLLVLALFCRYRALRTPMNLLLVSISVSDLLVSVLGTPFSFAASTQGRWLIGRAGCVWYGFINACLGIVSLISLAVLSYERYSTVMTP--NMADGRDFRPALGGICFSWLYSVAWTVPPLLGWSRYG
TMT1a_braFlo    LSPTGHLVVAAILALIGVLGIVNNSTTLYLVGRYKQLRTPFNILMVNLSVSDLLMCVLGTPFSFVSSLHGRWMFGHSGCEWYGFICNFLGIVSLITLTVISYERYLLMKRL--PNERILSYRAVALAVVFIWCYSLLWTAPPLVGWSSYG
TMTq_braFlo     VEFSGFDTVAVVIAAIGIAGFLSNGAVVLLFLKFRQLRTPFNMLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISYERYRMVVKPKGPGSSYLTYNKVGLAIIFIYLYCLLWTTLPIVGWSSYQ
TMTq_braBel     VEFFGYDAVAGVIAIIGVVGFVSNGAVVVLFLKFPQLRTPFNLLLLNMAVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISFLRYRMVVKPKGPGSSYLTYTKVGLAILFIYLYCLLWTTLPIAGWSSYQ
TMTp_braFlo     FSDAGYTAIATCLALIGFVGFTNNFVVILLIGCHRQLRTPFNLLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVR----SSDMLTYKSSLVVITFIWLYSLLWTSLPLLGWSSYQ
TMTp_braBel     FSDAGYTAIATGLALIGLVGSMNNFVVILLIGCHRQLRTPFNLLLLNVSVADLLVSVCGNTLSFASAVQHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVR----SSEMLTYKSSLGMIAFIWMYSLLWTSLPLLGWSSYQ
TMT1a_strPur    VSRTTYNYLTVYTGFLTIFGILNNGIVMILFARFPSLRHPINSFLFNVSLSDLIISCLASPFTFASNFAGRWLFGDLGCTLYAFLVFVAGTEQIVILAALSIQRCMLVVRP--FTAQKMTHRWALFFISLTWIYSLIICVPPLFGWNRYT
TMT1a_plaDum    FGPTSYVITAIYLCIVGVIGTLSNGVIMYLYFKDKSLRSPMNLLFVNLAMSDFTVAFFGAMFQFGLTCTRKYMSPGMACDFYGFITFLGGLASEMNLFIISVERYLAVVRP--FDVGNLTNRRVIAGGVFVWLYSLVFAGGPLVGWSSYR
TMT1b_plaDum    FTATDYNICAAYLFFIACLGVSLNVLVLVLFIKDRKLRSPNNFLYVSLALGDLLVAVFGTAFKFIITARKTLLREEDGCKWYGFITYLGGLAALMTLSVIAFVRCLAVLRL--GSFTGLTTRMGVAAMAFIWIYSLAFTLAPLLGWNHYI
TMT1_apiMel     VSPVMYIGAAIALGFIGFFGFTANLLVAIVIVKDAQLWTPVNVILFNLVFGDFLVSIFGNPVAMVSAATGGWYWGYKMCLWYAWFMSTLGFASIGNLTVMAVERWLLVARP--MQALSIRPQHAVILASFVWIYALSLSLPPLFGWGSYG
TMT1p_anoGam    MAPWAYNGAAVTLFFIGFFGFFLNIFVIALMYKDVQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWLYGKSICVAYGFFMSLLGIASITTLTVLSYERFCLISRP--FAAQNRSKQGACLAVLFIWSYSFALTSPPLFGWGAYV
TMT1q_anoGam    MAPWAYNASAVTLFFIGFFGFFLNLFVIALMCKDMQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWIFGRTLCVAYGFFMSLLGITSITTLTVLSYERYCLISRP--FSSRNLTRRGAFLAIFFIWGYSFALTSPPLFGWGAYV
TMT1_aedAeg     MESWAYVASAVTLFFIGFFGFFLNLFVIALMCKDVQLWTPINIILFNLVCSDFSVSIIGNPFTLTSAISRHWIFGRTVCIAYGFFMSLLGITSITTLTVLSYERFCLISHP--FSSRSLSRRGAVFAILFIWSYSFALTSPPLFGWGAYV
TMT1_culPip     MPPWAYVATAVVLFFIGFFGFFLNLFVIALMCKEVQLWTPMNIILLNLVCSDFSVSIVGNPFTLSSAISHRWLFGRKLCVAYGFFMSLLGITSITTLTVLSYERFYLISRP--FSSRSLSRRGALGAVLLIWCYSFALTSPPLFGWGAYV
TMT1_bomMor     MPRWGYVASAFVLFLIGFFGFFLNLMVILLMFKDRQLWTPLNIILFNLVCSDFSVSVLGNPFTLISALFHRWIFGHTMCVLYGFFMALLGITSITTLTVISFERYLMVTRP--LTSRHLSSKGAVLSIMFIWTYSLALTTPPLLGWGNYV
TMT1_rhoPro     MPSAGFLAASIILFLIGFLGFFGNLIVIIIMCRDKNLWTPVNFILFNVIVSDFSVAALGNPFTLASAIAKRWFFGQSMCVAYGFFMALLGITSINSLTVLALERYLIVSQP--VSHGSLSRPTASDIVGSIWLYSFVITIPPLVGWGEYG
TMT1_triCas     IPVEGYIAAAVVLFCIGFFGFSLNLTVIIFMLKERQLWSPLNIILFNLVVSDFLVSVLGNPWTFFSAINYGWIFGETGCTIYGFIMSLLSITSITTLTVLAFERYLLIARP--FRNNALNFHSAALSVFSIWLYSLSLTIPPLIGWGEYV
TMT1_acyPis     ISDAIYLGAAIVLSIIGIVGFIFNTCVIFIMIRDTRLWTPQNVIIFNLATSDLAVSVLGNPVTLAAAITKGWIFGQTICVIYGFFMALFGIASITTLTVLAYDRYLMIRYP--FSSSRLTKETALYAIAGIWIYAFAVTGPPLFGWNRYV
TMTr_strPur     FTTEAHLLAGSFLTLVFIISIIGNSVVLFLFAWDRHLRTPTNMFLLSLTISDWLVTVVGIPFVTASIYAHRWLFAHVGCIIYAFIMTFLGLNSLMSHAVIAVDRYLVITKP--HFGIVVTYPKAFLMISIPWVFSFAWAVFPLAGWGEFT
TMT1c_braFlo    FTTEQHLLMAVWLGFIGSFGFVTNLLTVLVFWCFKSLRTPFHLYLGGIALSDLLVAALGSPFAVASAVGERWLFGRAVCVWYAFVNYFLSIVSIVTMATMSFSRYWVIIRPQ-SAPRLDTVYGACVVNALAWCYSFFWTIMPVLGWSRFT
PPINa_cioInt    ANRSTYSFLCVYMTFVFLLSCSLNILVIVATLKNKVLRQPLNYIIVNLAVVDLLSGFVGGFISIAANGAGYFFWGKTMCQIEGYFVSNFGVTGLLSIAVMAFERYFVICKP--FGPVRFEEKHSIFGIVITWVWSMFWNTPPLIFWDGYD
PPINa_cioSav    ADRSVYSFLAVYMTFICLISCSLNILVITATLKNKVLRQPLNYIIVNLAVVDLLSGLVGGVISIFANGAGYFFWGKFMCQVEGYTVSNFGVTGLLSIAVMAFERYFVICKP--FGPVRFEEKHAVIGIAVTWIWAMFWNTPPLIFWDGYD
PPINb_cioInt    AERHIYTILAVYMTFIFLLAVSLNGFVIIATMKNKKLRQPLNYIIINLSIADFLSGLVGGFIGMISNSAGYFYFGKTVCILEGYIVSVAGVCGLMSISVMAFERYFVVCKP--YGPFTLTNTHAALGIGFTWTWSVLWSTPGLIWLDGYV
PPINb_cioSav    ANRSTYSGLCVFMSFVFVLAVPLNLLVIVATYKNKDLRRPINYIIVNLAVADLTCSVVGGLLGVLNNGAGYYFLGKSVCIFEGYVMSVTGVCGILSITVMAFERYFVVCKP--FGQTNLKWSHAITGIVFTWTWSVIWHTPGLFFWNGYE
Consensus                av l  !g  G   N  V  l  k   LrtP N  l n s sDll!   G  f f s    rW  g  gC wygF  sl GivS   $ vlsy#RY                  a   !   W YSl wt pPL GW  Y 
Prim.cons.      LSPTGYLVVAVFLGFIGTFGFLNNLLVLVLFCKFKRLRTPINLLLLNISVSDLLVSVFGTPFSFASSIRGRWLWGRAGCVWYGFANSLFGIVSLISLAVLSYERYSTVVRPKGTKADVLDYRKA2LAIGFSWLYSLAWTVPPLLGWSSYG

                       160       170       180       190       200       210       220       230       240       250       260       270       280       290       300
                         |         |         |         |         |         |         |         |         |         |         |         |         |         |         |
ENCEPH_homSap   LDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQTIQV--IKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRL
ENCEPH_loxAfr   LDTHGLACTVDWKSNNSSDSSFVLFLFLGCLVVPVGVIAHCYGHILYSIRMLRCVEDLQTIQV--IKILRHEKKLAKMCLFMIFTFLICWMPYIVICFLVVNGYGHLVTPTISIVSYLFAKSSTVYNPVIYTFMIRKFRRSLLQLLCFRL
ENCEPH_canFam   LDVHGLGCTVDWKSKDANDSFFVLFLFLGCLVVPMGVIVHCYGHILYSIRMLRCVEDLQTIQV--IKILRYEKKVAKMCFLMIFIFLIFWMPYIVICFLVVNGYGHLVTPTVSIVSYLFAKSSTVYNPVIYIIMIRKFRRSLLQLLCFRP
ENCEPH_monDom   LEIHGLGCSVDWKSKDPNDSSFVIFLFFGCLMLPVGVMAYCYGHILYAIRMLRCVEELQTIQV--IKILRYEKKVAKMCFLMIAIFLFCWMPYAVICLLVANGYGSLVTPTVAIIASLFAKSSTAYNPIIYIFMSRKFRRCLLQLLCFRL
ENCEPH_galGal   LEIHGLGCSMDWKSKDPNDTSFVLLFFLGCLVAPVVIMAYCYGHILYAVRMLRCVEDFQTSQV--IKLLKYEKKVAKMCFLMISTFLICWMPYAVVSLLVTYGYSNLVTPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRQCLLQLLCFRL
ENCEPH_anoCar   LEIHGLGCSVDWQSKEPSDSSFVLFFFLGCLAAPVGIMAYCYGHILHAIRMLRCVEDLQSIQV--IKILRYEKKVAKMCFLMVTTFLICWMPYAVVSLLIAYGYGHLITPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRRCLVQLFCVQF
ENCEPH_danRer   LEVHQLGCSLDWASKDPNDASFILFFLLGCFFVPVGVMVYCYGNILYTVKMLRSIQDLQTVQT--IKILRYEKKVAVMFLMMISCFLVCWTPYAVVSMLEAFGKKSVVSPTVAIIPSLFAKSSTAYNPVIYAFMSRKFRRCMLQMLCSRL
ENCEPH_tetNig   LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVHMIRSIQDLQTVQI--IKILRYEKKVSVMFFLMISCFLLCWTPYAVVSMMVAFGRKSMVSPTVAIIPSFFAKSSTAYNPVIYVFMSRKFRRCLLQLLCSRL
ENCEPH_takRub   LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVQMIRSIQDLQTVQI--IKILRYEKKVSVMFFLMISCFLLCWTPYAVVSMMVAFGRRSMVSPTMAIIPSFFAKSSTAYNPLIYVFMSRKFRHCLLQLLCSRL
ENCEPH_gasAcu   LEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVQMLRSIQDLQTVQI--IKILRYEKKVAVMFLLMISCFLLCWTPYAVVSMMEAFGRKNMVSPTVAIIPSFFAKSSTAYNPLICVFMSRKFRRCLMQLLCSRV
ENCEPH_oryLat   LEIHQLGCSLDWASKDPNDAAFILLFLLGCFFVPVGIMIYCYGNILYAVRMLRSIEDLQTVQI--IKILRYEKKVAAMFLLMISCFLVCWTPYAVVSMMEAFGKKSMVSPTVAIVPSFFAKSSTAYNPLIYVFMNRKFRRCFLQLLCSKI
ENCEPH_choDri   LEMHRLGCSVTWELEKPSDTSFILFLFLGSLLIPVGVIAYCYGNI-YTIRMLQSIEDFQTARF--AKTLTNEMNSSKMCFFMISVAFSCWLPYAVTSFMVVYGCTDVITPTITIIFSLLAKSSAISYPIIYIFMSRKFRWCLMQLLCFRL
ENCEPH_petMar   LEKHGLGCSIDWASSNPPDAAFVLFFFLGCLAAPLLVMGFCFGRIALAITQFRKLDRLQTPRV--LKARCSERKVSAVCLLMMLLFLLCWSPYAVASLFVASGFEHLVSPPVSIVPSLLAKSNAVCNPLLFLLMSGNFFRCLRTMFFTLR
ENCEPH_xenTro   FETHKLDCSFEWTATDPKDTAFVLLFFLACITLPLSIMAYCYGYILYEIQKLRSVKNIQNFQE--ITILDYEIKMAKMCLLMMLTFLIGWMPYTILSLLVTSGYSKFITPTITVMPSLLAIASAAYNPVIHIFTIKKFRQCLVQLLFHNF
TMT2_monDom     TEGAGTSCSVHWTSKSVESVSYIMCLFIFCLVIPILVMVYFYGRLLYAVKQ----VGKIRKTA----ARKREYHVLFMVVTAVICYLICWVPYGMIALLATFGPPGVVSPVANVVPSILAKSSTVCNPIIYVLMNKQFYKCFLILFHCQP
TMT2_macEug     TEGAGTSCSVHWTSKSVESVSYIMCLFIFCLVIPILFMVYFYGRLLYTVKQ----VGKIRKSA----ARKREYHVLFMVVTAVICYLICWVPYGMIALLATFGPPGVVSPVANVVPSILAKSSTVCNPIIYILMNKQFYKCFLILFHCQP
TMT2_ornAna     TEGAGTSCSVHWSSKSPVSVSYIVCLFIFCLVIPVLVMIYCYGRLLYAVKQ----IGKARKTA----ARKREYHVLFMVITTVICYLVCWMPYGVTALLATFGQPGTVSPEASVIPSILAKSSTVCNPIIYILMNKQFYKCFLILFHCQP
TMT2_galGal     IEGAGTSCSVRWSSETAESTSYIICLFIFCLVIPVMVMMYCYGRLLYAVKQ----VGKIHKNT----ARKREYHVLFMVITTVICYLVCWIPYGVIALLATFGKPGVVTPVASIIPSILAKSSTVCNPIIYILMNKQFYKCFRQLFHCQP
TMT2_taeGut     VEGAGTSCSVRWSSESAESTSYIICLFVFCLVVPVMVMMYCYGRLLYAVKQ----VGKIHKNA----ARKREYHVLFMVIPTVICYLVCWIPYGVIALLATFGKPGAVTPITSIIPSILAKSSTVCNPIIYILMNKQFYKCFRQLFHCQP
TMT2_anoCar     LEGAGTSCSVRWTSETLESVTYIICLFIFCLAIPVLVMIYCYARLFYAVKQ----VGKLRKTS----ARKREFHVLFMIITTIICYLICWMPYGVIALLATFGRPGLVSPVASVIPSILAKSSTVFNPIIYILMNKQFYKCFLMLLHCQP
TMT2_xenTro     REGAGTSCSVRWTSESVESVSYIICLFIFCLALPVFVMLYCYGRLLYAVKQ----VGKIRKIA----ARKREYHVLFMVITTVICYLLCWLPYGVVALLATFGRPGVISPVASVVPSILAKSSTVFNPIIYILMNKQFYKCFLILFHCHP
TMT2a_danRer    LEGAGTSCSVSWTQRTAESHAYIICLFVFCLGLPVLVMVYCYGRLLYAVKQ----VGKIRKTA----ARKREYHVLFMVITTVVCYLLCWMPYGVVAMMATFGRPGIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFRILFCCQR
TMT2b_danRer    LEGAGTSCSVTWTANTPQSHSYIICLFIFCLGIPVLVMVYCYSRLICAVKQ----VGRIRKTA----ARRREYHILFMVITTVVCYLLCWMPYGVVAMMATFGRPGIISPIASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILIHCKH
TMT2_tetNig     IEGAGTSCSVSWTVQTAQSHAYIICLFIFCLGLPVLVMVYCYSRLLWAVKQ----VGKIRKTS----ARKREYHILFMVVTTAACYLVCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYKCFLILFHCSH
TMT2_takRub     IEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGIPILVMIYCYSRLLWAVKQ----VGRIRKTA----ARKREYHILFMVVTTAACYLVCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYKCFLILFHCGH
TMT2_oryLat     LEGAGTSCSVSWTANTAQSHAYIICLFIFCLGLPILVMIYCYSRLLLAVKQ----VGKIRKTA----ARKREYHILFMVLTTAACYLLCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILFHCDH
TMT2_gasAcu     IEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGLPMLVMIYCYSRLLLAVKQ----VGRIRKTA----ARRREYHILFMVLTTAACYMLCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILFHCKH
TMT3_xenTro     LESSGTTCSVVWHSKSSNNISYIVCLFLFCLVLPLFIMIFCYGHIVRVIRG----VCRINMTT----AQKREHRLLFMVVCMVTCYLLCWMPYGLVSLMTAFGKPGMITPTVSIIPSILAKSSTFINPLIYIFMNKQFYRCFIALIKCES
TMT3_calMil     LESSGTTCSVVWHSKSSNNISYIVCLFLFCLVLPLFIMIFCYGHIVRVIRG----VGKINQMT----AQTREHRILLMVISMVTFYLLCWLPYGTVALIGTFGNADLITPTCSVIPSILAKSSTVINPVIYVIMNKQFYRCFIALIKCES
TMT3_danRer     PEGPGTTCSVQWHLRSTSSISYVMCLFIFCLLLPLVLMIFCYGKILLLIKG----VTKINLLT----AQRRENHILLMVVTMVSCYLLCWMPYGVVALLATFGRTGLITPVTSIVPSVLAKSSTVVNPVIYVLFNNQFYRCFVAFLKCQG
TMT3_oncMyk     PEGPGTTCSVQWHQRSSGNISYVTCLFIFCLLLPLLLMMFCYGKILFAIRG----VAKINQSS----AQRRETHVLVMVVSMVSCYLLCWMPYGVVALLATFGQVGLVSPTTSIVPSILAKSSTFLNPVIYGLLNNQFYRCFLAFMSCGS
TMT3_tetNig     PEGAGTTCSVQWQLRSPASVSYVLCLLVFCLLLPFLVMVYSYGRILVAIRR----VGRINQLT----AQRREQHILLMVLSMVSCYMLCWMPYGIMALVATFGKLGLVTPMVSVVPSILAKFSTVVNPIIYMFFNNQFYRCFMAFIRCQK
TMT3_takRub     PEGPGTMCSVQWHLRSPANISYVLCLFIFCLLLPLVVMVYSYGRIWVAVRR----AGRINLLT----AQRREQHILWMVLSMVSCYMLCWMPYGIIALVATLGRLGPISPAVSVVPSILAKFSTVVNPVIYMFFNNQFYRCFMAFVRCQK
TMT3_oryLat     PEGPGTTCSIQWHLRSPTSVSYVLCLFIFCLVLPLVLMVYSYGRILVALRR----VGKINLLA----AQRREQRILVMVFSMVSCYILCWMPYGIVALMATFGRKGLVTPLTSVIPSILAKFSTVVNPVIYVFFNSQFYRCLVAFVRCSG
TMT1a_anoCar    PEGPGTTCSVNWHSRDANNISYIICLFIFCLVIPFIVIVYCYGKLLCAIKK----VSGVTQGM----AQTREQRVLIMVVVMIICFLLCWLPYGIVALIATFGKPGLITPSASIIPSVLAKSSTVYNPVIYIFLNKQFYRCFCALLKCGK
TMT1a_xenTro    PEGPGTTCSVNWHSRDANNISYIVCLFIFCLALPFAVIVYCYGRLLFAIKQ----VSGVSKSS----SRAREQRVLIMVIVMVVCFLLCWLPYGVMALVATFGKPGIISPSASIIPSVLAKSSTVYNPIIYIFLNKQFYRCFTALIHCNK
TMT1a1_danRer   PEGPGTTCSVNWAARTPNNVSYIVCLFVFCLILPFIVIVYSYGRLLQAITQ----VSRINTVV----SRKREQRVLFMVVTMVVCYLLCWLPYGIMALLATFGHPGLVTPAASIVPSLLAKSSTVINPIIYIFMNKQFCRCFHALIMCTT
TMT1a_pimPro    PEGPGTTCSVNWAARTANNVSYIICLFFFCLILPFIVIVYSYGRLLQAITQ----VSRINTVV----SRKREQRVLFMVITMVVCYLLCWLPYGIMALLAAFGRPGLVTPAASIVPSVLAKTSTVINPIIYIFMNKQFCRCFHALIMCTT
TMT1a2_danRer   PEGPGTTCSVDWTTKTANNISYIICLFIFCLIVPFLVIIFCYGKLLHAIKQ----VSSVNTSV----SRKREHRVLLMVITMVVFYLLCWLPYGIMALLATFGAPGLVTAEASIVPSILAKSSTVINPVIYIFMNKQFYRCFRALLNCDK
TMT1a_tetNig    PEGPGTTCSVNWTAKTANSVSYIICLFVFCLILPFLVIVFCYGKLLCAIRQ----VSGVNASM----SRRREQRVLFMVVVMVICYLLCWLPYGVVALLATFGPPGLVTPAASIIPSILAKSSTVINPVIYVFMNKQFSRCFLSLLCCED
TMT1a_takRub    PEGPGTTCSVNWTAKTTNSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQ----VSGINAST----SRKREQRVLCMVVIMVICYLLCWLPYGVVALLATFGPPDLVTPEASIIPSVLAKSSTVINPIIYVFMNKQFYRCFLALLCCQD
TMT1a_gasAcu    PEGPGTTCSVDWTARTANSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQ----VSGINASL----SRKREQRVLFMVVIMVVCYLLCWLPYGIMALMATFGPPGLITPVASIIPSVLAKTSTVINPVIYVFMNKQFYRCFKALLRCEA
TMT1a_oryLat    PEGPGTTCSVDWTAKTANNISYIICLFVFCLIVPFMVIVFCYGKLLYAIKQ----VSGINVSV----SRKREQRVLFMVVIMVICYLLCWLPYGIMALLATFGPPDLVTPEASIIPSVLAKTSTAINPVIYVFMNKQFYRCFKALLRCEA
TMT1a_calMil    PEGPGTTCSVNWQSKEVSSKSYIICLFIFCLLMPFLVIVYCYGKLVLAVRK----VSA-NNSM----GRTRENKLLIMVTFMIICFLLCWLPYGIVALLATFGSPGLITPTASIIPSVLAKTSTVYNPIIYIFMNKQFYRCFKALLRCEA
TMT1b_tetNig    PEGPGTTCSVDWRTQTPNNISYIVCLFAFCLLLPFCVILYSYGKLLHTIRQ----VSSVSSAV----TRRREHRVLVMVVAMVVCYLICWLPYGVTALLATFGPPNLLTPEATITPSLLAKFSTVINPFIYIFMNKQFYRCFRAFLSCSS
TMT1b_takRub    PEGPGTTCSVDWRTQTPNNISYIVCLFTFCLLLPFFVILYSYGKLLHTIRQ----VRRVSSTV----TRRREHRVLVMVVAMVVCYLICWLPYGVTALLATFGPPNLLTPEATITPSLLAKFSTVINPFIYIFMNKQFYRCFRAFLNCST
TMT1b_gasAcu    PEGPGTTCSVDWKTQTANNISYIVCLFVFCLVLPFCVILYSYSRLLQAIRQ----VSVVSSVV----TRHREQRVLAMVVVMVACYLVCWLPYGVAALLATFGPRDLLSPEASITPSLLAKFSTVVNPFIYIFMNKQFYRCFRAFLSCST
TMT1b_oryLat    PEGPGTTCSVDWKTQTPNNISYIICLFTFCLLLPFGVIVYSYGKMLRVIRQ----VRSMSSVV----TRRREQRVLVMVVTMVVCYLVCWLPYGIAALLATFGPRDLLTPAASITPSLLAKFSTVINPLIYIFMNKQFYRCFWAFFCCST
TMT1a_braFlo    PEGYGISCSVNWESRTANDTSYIVAYFVGCLVFPVAIIVISYTRLILYMRQ---QAPSAPMQM----LVRREKRVTKMVVVMIMGFTICWTPYTIVALIVTCGGEGIITPAAATVPALFAKSSVVYNAAIYVAMNNQFRKCFLRSLNCRS
TMTq_braFlo     LEGPKISCSVAWEEHSLSNTSYIVAIFIMCLLLPLLIIIYSYCRLWYKVKK---GSQNLPPAI--RKSSQKEQKIARMVVVMITCFLVCWLPYGAMALVVSFGGESLISPTAAVVPSLLAKSSTCYNPLVYFAMNNQFRRYFQDLLCCGR
TMTq_braBel     LEGPKIGCSVAWEEHSWSNTSYIVVLFITCLFAPLLIIVYSYYRLWHKVKQ---GSRNLPAAM--RKSSQKEQKIAMMVIVMITCFMVCWLPYGAMALVVTFGGERLISHTAAVVPSLLAKSSTCYNPVVYFAMNSQFRRYFQDLLCCGR
TMTp_braFlo     FEGHNVGCSVNWVQHNPDNVSYIVTLMVTCFFVPMVVVCWSYAWIWRTVRM---SSE-AKPEC--GNSQNAGRLVTTMVVVMIICFLVCWTPYAVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLQ---
TMTp_braBel     FEGHSVGCSVNWVKHNVNNVSYIITLMVTCFFVPMVVVCWSYACIWRTVRM---SAE-MKSEF--GNPQNTGRLVTTMVVVMIVCFLVCWTPYTVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLR---
TMT1a_strPur    YEGPGTACSVAWNSPSPGDTSYIIFIFVLVLVIPFGIIIFCYGLLVYAVKK----ISRTQAAL--SSEAKADRKVSKMIFIMILFFLIAWTPYTGFSLYVTFGKNVVITPLAGTFPPFFAKLCTIHNPIIYFLLNKQFKDALIQLFCCGE
TMT1a_plaDum    PEGLGTWCSISWQDRSMNTMSYVTAVFLGCYFFPVSIIIFCYFNVWRKVKE----AADAQGGA--GTAGKAEKSIFRMSVIMVTCYLTAWTPYAIVCLIASYGPPNGLPIYAEVLPSLFAKSSQVYNPIIYVLMNKPYRSALVSLVCRGR
TMT1b_plaDum    PEGLATWCSIDWLSDETSDKSYVFAIFIFCFLVPVLIIVVSYGLIYDKVRK----VAKT---G--GSVAKAEREVLRMTLLMVSLFMLAWSPYAVICMLASFGPKDLLHPVATVIPAMFAKSSTMYNPLIYVFMNKQFRRSLKVLLGMGV
TMT1_apiMel     PEAGNVSCSVSWEDPVTNSDTYIGFLFVLGLIVPVFTIVSSYAAIVLTLKK------VRK-RA--GASGRREAKITKMVALMITAFLLAWSPYAALAIAAQYFNAK-PSATVAVLPALLAKSSICYNPIIYAGLNNQFSRFLKKIFDA--
TMT1p_anoGam    NEAANISCSVNWESQTANATSYIIFLFIFGLILPLAVIIYSYINIVLEMRK------NSA-RV--GRVNRAERRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFWRIRRSNG
TMT1q_anoGam    QEAANISCSVNWESQTKNATTYIIFLFVFGLVVPLIVIVYSYTNIIVNMRE------NSA-RV--GRINRAEQRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFSRVRNK--
TMT1_aedAeg     NEAANISCSVNWESQTLNATSYIIFLFVFGLVVPLVVIVYSYTNIVVNMKR------NAA-RV--GRINRAEKRVTRMVFVMVLAFMIAWTPYAVFALIEQFGPTDIISPALGVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRNN--
TMT1_culPip     NEAANISCSVNWETQTLNATTYIIYLFVFGLVVPLTVIVYSYTNIIVNMKK------NAA-RV--GRINRAEKRVTTMVAVMVIAFMVAWTPYSVFALMEQFGPPDVIGPGLAVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRHD--
TMT1_bomMor     NEAANIQCSVNWHEQSTNTLTYIMFLFAMGQILPLSVITFSYVNIIRTLKR------NSQ-RL--GRVSRAEARATAMVFIMIIAFTVAWTPYSLFALMEQFGGVH-ISPVVSIIPALCAKSSICWNPIIYIGLNTQFRAAFNRVRHD--
TMT1_rhoPro     LEAANISCSINWETRSHSSTSYILFLFTFGFFIPIIVISYSYMNIILTMKK------STM-NA--GRVNKAESRVTWMIFVMIFAFFLAWTPYAILALMIAFFDSN-VSPAIATIPAIFAKTSICYNPFIYAGLNTQFRAAFNRVRHD--
TMT1_triCas     HEAANLSCSVNWEEKSPNSTSYILYLFAFGLFLPLVIITFSYVNIILTMRR------NAAFRV--GQVSKAENKVAYMIFIMIIAFLTAWSPYAIMALIVQFGDAALVTPGMAVIPALLAKSSICYNPVIYIGLNAQVKGAKWVSGLIYL
TMT1_acyPis     NESANISCSIDWESGEHS--NYVIYIFVFGLFLPVTVIIYSYVSLVVTVRK------AEK-II--GQATKAECRVAIMVAVMILAFLTAWMPYSVLALMIAFGGVH-ISPVVSIIPALCAKSSICWNPIIYIGLNTQFRSAWKRFLNIQD
TMTr_strPur     YEGTGAWCSVRWDSDQPQIMSYVLAMMFLTFISSIVIMMYCYICIFLTTRRMPRWATSNSIKTHERNRRRREQKLLKTLIAIAIAFLVAWSPYAITSMIVVFGGSELLSLTATTLPSLFAKSSVMINPIIYAVTSRVFRKSLKKMLTPGC
TMT1c_braFlo    QVAAMTVCSLDWDHHTPLSKSYIPVAFLTCLFLPLGVIIFSVFKTTMHLRRAAEVEDEVPNEV------RAGRKTTRITLVMAGCWLVAWLPYACMALVIAAGGR--VSPTVEVLATKFAKTSYIVNTIIYLVMEKEFRKSLVLLLFDPF
PPINa_cioInt    TEGLGTSCAPNWFVKEKRERLFIILYFVFCFVIPLAVIMICYGKLILTLRQ-------IAKESSLSGGTSPEGEVTKMVVVMVTAFVFCWLPYAAFAMYNVVNPEAQIDYALGAAPAFFAKTATIYNPLIYIGLNRQFRDCVVRMIFNGR
PPINa_cioSav    TEGLGTSCAPNWFVKGNTERLFIILYFVFCFLIPLAIIVLCYGKLILQLRQ-------IAKESSLSGGTSPEGEVTKMVVVMVTAFVICWLPYAAFAMYNVVNPEAQIDYALGAAPAFFAKTATIYNPLIYIGLNRQFRDCVVRMIFNGR
PPINb_cioInt    PEGLGTSCAPNWFSKNKSERIFIFVYFVFCFFIPLLVIIICYGKIVLFLKQ-------ATRQSSASSNRQADNKVTKMVLVMISAFLICWTPYGVLSLYNAINPDKQLDYGLGAVPVFFAKTANIYNPLIYIGLNKQFRDGVIKMVFRGR
PPINb_cioSav    PEGFGTSCAPNWFSQQKSERIFIFAYFAFCFLTPLTIIFACYLKLILFIRK-------VSKKSMVNEADRRDFEVTRMVFVMIAAFLICWLPYGCLSMYNAIHPDNLLSYGIGSVPAFFAKTATIYNPIIYMGLNKKFRDGVIRMLFKGR
Consensus        Eg g  CSv W        s%!  lF  cl  P  !i  cYg                            #  v  Mv  M!  %$ cW PY   a$   fG      p     Ps  AKsSt  NP IY  $n qFr           
Prim.cons.      PEGAGTSCSVDWTSKTPNS2SYIICLF2FCLVLPVLVIVYCYGR2LYAVRQLRSVVGKINKQVSLGKARRREQRVLFMVVVMVICFLLCWLPYGVVALLATFGPPGLVTPTASIIPSLLAKSSTVYNPIIYIFMNKQFRRCFLALLCC22
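
The alignment above is displayed in blocks, with each sequence name repeated at the left of every block. To work with full-length aligned rows, the blocks need to be stitched back together by name. The following is a minimal sketch of that step, not the tool used to produce the alignment. The function name `stitch_blocks` is hypothetical, and the code assumes that every data row is a name followed by whitespace and a single run of letters and gap characters. Ruler lines and the `Consensus` and `Prim.cons.` summary rows are skipped.

```python
import re

# A data row: a name, whitespace, then one unbroken run of residues/gaps.
ROW = re.compile(r"^(\S+)\s+([A-Za-z-]+)\s*$")

# Summary rows that are not individual sequences.
SKIP = {"Consensus", "Prim.cons."}


def stitch_blocks(text):
    """Concatenate the per-block chunks of each named row, in block order."""
    seqs = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m or m.group(1) in SKIP:
            # Ruler lines, blank lines and summary rows end up here.
            continue
        name, chunk = m.groups()
        seqs[name] = seqs.get(name, "") + chunk
    return seqs
```

Because gaps are kept, every stitched row should have the same length; comparing lengths is a quick check that no block was lost when copying the alignment.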
The following groomed set of TMT-encephalopsin class sequences aligns without anomalies and can be re-sorted to input order:
>01RHO1_bosTau
LAAYMFLLIMLGFPINFLTLYVTVQHKKLRTPLNYILLNLAVADLFMVFGGFTTTLYTSLHGYFVFGPTGCNLEGFFATLGGEIALWSLVVLAIERYVVVCKPMSNFRFGENHAIMGVAFTWVMALACAAPPLVGWSRYIPEGMQCSCGIDYYTPHEETNNESFVIYMFVVHFIIPLIVIFFCYGQLVFTVKEAAAQQQESATTQKAEKEVTRMVIIMVIAFLICWLPYAGVAFYIFTHQGSDFGPIFMTIPAFFAKTSAVYNPVIYIMMNKQFRNCMVTTLCCGKNPL

>10ENC_homSap
LALLLGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLLLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSGSLFGIVSIATLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWAGAPLLGWNRYILDVHGLGCTVDWKSKDANDSSFVLFLFLGCLVVPLGVIAHCYGHILYSIRMLRCVEDLQTIQVIKILKYEKKLAKMCFLMIFTFLVCWMPYIVICFLVVNGHGHLVTPTISIVSYLFAKSNTVYNPVIYVFMIRKFRRSLLQLLCLRLLRC

>11ENC_canFam Canis familiaris (dog) Deut.Euth.Laur DN422921 frag early gap G? OPN3 encephalopsin XP_854433 grossly wrong
LALLLGSVGLLGVGNNLLVLVLYSKFQRLRTPTHLLLVNLSLSDLLVSLFGVTFTFVSCLRNGWVWDSVGCVWDGFSSSLFGIVSITTLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYILDVHGLGCTVDWKSKDANDSFFVLFLFLGCLVVPMGVIVHCYGHILYSIRMLRCVEDLQTIQVIKILRYEKKVAKMCFLMIFIFLIFWMPYIVICFLVVNGYGHLVTPTVSIVSYLFAKSSTVYNPVIYIIMIRKFRRSLLQLLCFRPLRC

>12ENC_loxAfr
LALLVGSIGLLGVGNNLLVLVLYYKFQRLRTPTHLFLVNISLSDLLVSLFGVTFTFVSCLRNGWVWDTVGCVWDGFSSSLFGIASITTLTVLAYERYIRVVHARVINFSWAWRAITYIWLYSLAWSGAPLLGWNRYILDTHGLACTVDWKSNNSSDSSFVLFLFLGCLVVPVGVIAHCYGHILYSIRMLRCVEDLQTIQVIKILRHEKKLAKMCLFMIFTFLICWMPYIVICFLVVNGYGHLVTPTISIVSYLFAKSSTVYNPVIYTFMIRKFRRSLLQLLCFRLLRC

>13ENC_monDom
LALLIATIGLLGLCNNLLVLVLYYKFQRLRTPTHLFLVNISFNDLLVSLFGVTFTFVSCLRSGWVWDSVGCAWDGFSNTLFGIVSIMTLTVLAYERYNRIVHAKVINFSWAWRAITYIWLYSLVWTGAPLLGWNRYTLEIHGLGCSVDWKSKDPNDSSFVIFLFFGCLMLPVGVMAYCYGHILYAIRMLRCVEELQTIQVIKILRYEKKVAKMCFLMIAIFLFCWMPYAVICLLVANGYGSLVTPTVAIIASLFAKSSTAYNPIIYIFMSRKFRRCLLQLLCFRLLKF

>14ENC_galGal
LALLIATIGTLGVCNNLLVLVLYYKFKRLRTPTNLFLVNISLSDLLVSVCGVSLTFMSCLRSRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVVHAKVIDFSWSWRAITYIWLYSLAWTGAPLLGWNRYTLEIHGLGCSMDWKSKDPNDTSFVLLFFLGCLVAPVVIMAYCYGHILYAVRMLRCVEDFQTSQVIKLLKYEKKVAKMCFLMISTFLICWMPYAVVSLLVTYGYSNLVTPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRQCLLQLLCFRLMRF

>15ENC_anoCar
LALLVAAIGLLGLCNNLLVLVLYAKFKRLRTPTHLFLVNISLSDLLVSLFGVSFTFGSCLRHRWVWDAAGCVWDGFSNSLFGIVSIMTLTVLAYERYIRVVHARVIDFSWSWRAITYIWLYSLAWTGAPLLGWNHYTLEIHGLGCSVDWQSKEPSDSSFVLFFFLGCLAAPVGIMAYCYGHILHAIRMLRCVEDLQSIQVIKILRYEKKVAKMCFLMVTTFLICWMPYAVVSLLIAYGYGHLITPTVAIIPSFFAKSSTAYNPVIYIFMSRKFRRCLVQLFCVQFLRF

>16ENC_xenTro
LALIVATVGFLGLVNNLLVLILYCKFKRLQTPTNLLFFNTSLCHFVFSLLAITFTFMSCVRGSWAFSVEMCVFHGFSKNLLGIVSFGTLTVVAYERYARVVYGKYVNSSWSKRSITFVWVYSLAWTGFPLIGWNLYTFETHKLDCSFEWTATDPKDTAFVLLFFLACITLPLSIMAYCYGYILYEIQKLRSVKNIQNFQEITILDYEIKMAKMCLLMMLTFLIGWMPYTILSLLVTSGYSKFITPTITVMPSLLAIASAAYNPVIHIFTIKKFRQCLVQLLFHNFWRL

>17ENC_tetNig
LAAAIGAIGVLGFCNNLAVAALYWRFRRLRTPTNLLLLNISLSDLLVSLLGVNFTFAACVQGRWTWNQATCVWDGFSNSLFGIVSIMTLAALAYERYIRVVHAQVVDFPWAWRAIGHIWLYSLAWTGAPLLGWNRYTLEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVHMIRSIQDLQTVQIIKILRYEKKVSVMFFLMISCFLLCWTPYAVVSMMVAFGRKSMVSPTVAIIPSFFAKSSTAYNPVIYVFMSRKFRRCLLQLLCSRLSWL

>18ENC_gasAcu
LAFAIGTIGVFGFCNNVVVIVLYCKFKRLRTPTNLLVVNISLSDLLVSVIGINFTFVSCIRGGWTWSRATCIWDGFSNSLFGIVSIMTLASLAYERYIRVVHAQVVDFPWAWRAIGHIWLYSLVWTGAPLLGWNRYTLEIHRLGCSLDWASKDPNDASFILLFLLACFFVPVGIMIYCYGNILYAVQMLRSIQDLQTVQIIKILRYEKKVAVMFLLMISCFLLCWTPYAVVSMMEAFGRKNMVSPTVAIIPSFFAKSSTAYNPLICVFMSRKFRRCLMQLLCSRVTCL

>19ENC_calMil composite
LAVIIGTIGIVGFCNNILVLLLYYKFKRLRTPTNLLLVNISVSDLLVSVFGLSFTFVSCTQGRWGWDSAACVWDGfSHSLFGISSIMSLTVLAYERYIRVVNATAIDFSWAWRAITYIWLYSLAWTGAPLIGWNSYTLELHRLGCSVNWDSRNPSDTSFILFLFLGSLLIPVGVIAYCYGNIFYTIRMLQSIEDFQTARFAKTLTNEMNSSKMCFFMISVAFSCWLPYAVTSFMVVYGCTDVITPTITIIFSLLAKSSAISYPIIYIFMSRKFRWCLMQLLCFRLVRI

>20ENC_petMar
LAGVVGTIGVAGFLNNLLLVALFVGFKRLQTPTNLLLVNISLSDLLVSVFGNTLTLVSCVRRRWVWGNGGCVWDGFSNSLFGIVSISTLTALSYERYARLIKAQVLDFSWAWRAVTYTWLYSAAWTGAPLLGWSRYVLEKHGLGCSIDWASSNPPDAAFVLFFFLGCLAAPLLVMGFCFGRIALAITQFRKLDRLQTPRVLKARCSERKVSAVCLLMMLLFLLCWSPYAVASLFVASGFEHLVSPPVSIVPSLLAKSNAVCNPLLFLLMSGNFFRCLRTMFFTLRWRV

>21ENC4_braFlo
IATCLALIGFVGFTNNFVVILLIGCHRQLRTPFNLLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVRSSDMLTYKSSLVVITFIWLYSLLWTSLPLLGWSSYQFEGHNVGCSVNWVQHNPDNVSYIVTLMVTCFFVPMVVVCWSYAWIWRTVRMSSEAKPECGNSQNAGRLVTTMVVVMIICFLVCWTPYAVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLQRVC

>22ENC4_braBel
IATGLALIGLVGSMNNFVVILLIGCHRQLRTPFNLLLLNVSVADLLVSVCGNTLSFASAVQHRWLWGRPGCVWYGFANSLFGIVSLVTLSALAFERYCVVVRSSEMLTYKSSLGMIAFIWMYSLLWTSLPLLGWSSYQFEGHSVGCSVNWVKHNVNNVSYIITLMVTCFFVPMVVVCWSYACIWRTVRMSAEMKSEFGNPQNTGRLVTTMVVVMIVCFLVCWTPYTVMALIVTFGADHLVTPTASVIPSLVAKSSTAYNPIIYVLMNNQFREFLLARLRTFC

>24TMT5_braFlo
VAVVIAAIGIAGFLSNGAVVLLFLKFRQLRTPFNMLLLNMSVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISYERYRMVVKPPGSSYLTYNKVGLAIIFIYLYCLLWTTLPIVGWSSYQLEGPKISCSVAWEEHSLSNTSYIVAIFIMCLLLPLLIIIYSYCRLWYKVKKGSQNLPPAIRKSSQKEQKIARMVVVMITCFLVCWLPYGAMALVVSFGGESLISPTAAVVPSLLAKSSTCYNPLVYFAMNNQFRRYFQDLLCCGRRLF

>25TMT5_braBel
VAGVIAIIGVVGFVSNGAVVVLFLKFPQLRTPFNLLLLNMAVADLLVSVCGNTLSFASAVRHRWLWGRPGCVWYGFANHLFGLVSLISLAVISFLRYRMVVKPPGSSYLTYTKVGLAILFIYLYCLLWTTLPIAGWSSYQLEGPKIGCSVAWEEHSWSNTSYIVVLFITCLFAPLLIIVYSYYRLWHKVKQGSRNLPAAMRKSSQKEQKIAMMVIVMITCFMVCWLPYGAMALVVTFGGERLISHTAAVVPSLLAKSSTCYNPVVYFAMNSQFRRYFQDLLCCGRRLF

>26TMT_monDom
VAVFLGIILIFGSISNFIVLVLFCKFKVLRNPVNMLLLNISISDMLVCLSGTTLSFASSIQGRWIGGKHGCRWYGFANSCFGIVSLISLAILSYERYRTLTLCPGQGADYQKALLAVAGSWLYSLVWTVPPLIGWSSYGTEGAGTSCSVHWTSKSVESVSYIMCLFIFCLVIPILVMVYFYGRLLYAVKQVGKIRKTAARKREYHVLFMVVTAVICYLICWVPYGMIALLATFGPPGVVSPVANVVPSILAKSSTVCNPIIYVLMNKQFYKCFLILFHCQPAQS

>27TMT_ornAna
VAVFLGIILVFGFMNNLIVLILFCKFKALRNPVNMIMLNISASDMLVCVSGTTLSFASNISGRWIGGDPGCRWYGFVNSCLGIVSLISLAVLSYERYRTLTLHPKQSTDYQKAVLAVGASWIYSLIWTIPPLLGWSSYGTEGAGTSCSVHWSSKSPVSVSYIVCLFIFCLVIPVLVMIYCYGRLLYAVKQIGKARKTAARKREYHVLFMVITTVICYLVCWMPYGVTALLATFGQPGTVSPEASVIPSILAKSSTVCNPIIYILMNKQFYKCFLILFHCQPPRA

>28TMT_galGal
VAVFLGFILFFGFLNNLIVLILFCKFKTLRNPVNMLLLNISISDMLVCISGTTLSFASNIHGKWIGGEHGCRWYGFVNSCFGIVSLISLAVLSYERYSTLTLCNKRSDDYRKALLAVGGSWVYSLLWTVPPLLGWSSYGIEGAGTSCSVRWSSETAESTSYIICLFIFCLVIPVMVMMYCYGRLLYAVKQVGKIHKNTARKREYHVLFMVITTVICYLVCWIPYGVIALLATFGKPGVVTPVASIIPSILAKSSTVCNPIIYILMNKQFYKCFRQLFHCQPPSS

>29TMT_anoCar
VAVFLGLILVFGFLNNLVVLILFCKFKTLRNPVNMLLLNISASDMLVCISGTTLSFVSNIYGRWIGGEHGCRWYGFVNSCFGIVSLISLAILSYERYSTLTQTNKRGSDYQKALLGVGGSWLYSLIWTVPPLIGWSSYGLEGAGTSCSVRWTSETLESVTYIICLFIFCLAIPVLVMIYCYARLFYAVKQVGKLRKTSARKREFHVLFMIITTIICYLICWMPYGVIALLATFGRPGLVSPVASVIPSILAKSSTVFNPIIYILMNKQFYKCFLMLLHCQPSSV

>30TMT_xenTro
VAIFLGFILIFGFLNNFVVLILFCKFKTLRTPVNMMLLNISASDMLVCVSGTTLSFTSSIKGKWIGGEYGCQWYGFVNSCFGIVSLISLAILSYERYSTLTLYNKGGPNFKKALLAVASSWLYSLVWTVPPLLGWSSYGREGAGTSCSVRWTSESVESVSYIICLFIFCLALPVFVMLYCYGRLLYAVKQVGKIRKIAARKREYHVLFMVITTVICYLLCWLPYGVVALLATFGRPGVISPVASVVPSILAKSSTVFNPIIYILMNKQFYKCFLILFHCHPTSS

>31TMTa_danRer
LSVFLGFIMTFGFFNNLVVLVLFCKFKTLRTPVNMLLLNISISDMLVCMFGTTLSFASSVRGRWLLGRHGCMWYGFINSCFGIVSLISLVVLSYDRYSTLTVYHKRAPDYRKPLLAVGGSWLYSLIWTVPPLLGWSSYGLEGAGTSCSVSWTQRTAESHAYIICLFVFCLGLPVLVMVYCYGRLLYAVKQVGKIRKTAARKREYHVLFMVITTVVCYLLCWMPYGVVAMMATFGRPGIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFRILFCCQRSLL

>32TMT_takRub
LSVVLGFIMTFGFLNNFVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSIRGRWLLGRIGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVYNKQGINYRKPLLAVGGTWLYSLFWTVPPLLGWSSYGIEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGIPILVMIYCYSRLLWAVKQVGRIRKTAARKREYHILFMVVTTAACYLVCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYKCFLILFHCGHWSA

>33TMT_gasAcu
LSVMLGFIMTFGFVNNLVVLLLFCKFKKLRTPVNMLLLNISVSDMLVCLFGTTLSFASSLRGKWLLGRSGCSWYGFINSCFGIVSLISLVILSYDRYSTLTVYNKAGPDYRKPLLAIGGSWLYSLFWTVPPLLGWSSYGIEGAGTSCSVSWTVQTAQSHAYIICLFTFCLGLPMLVMIYCYSRLLLAVKQVGRIRKTAARRREYHILFMVLTTAACYMLCWMPYGVVAMMATFGPPNIISPVASVVPSLLAKSSTVINPLIYILMNKQFYRCFLILFHCKHWSA

>34TMTc_xenTro
VAVCLGCILVLGSLYNSFVLLIFVKFTAIRTPINMILLNISVSDLLVCIFGTPFSFVSSVSGGWLLGQQGCKWYGFCNSLFGLVSMISLSMLSYERYLTVLKCTKADMTDYKKSWLCIIVSWLYSLCWTLPPLIGWSSYGLESSGTTCSVVWHSKSSNNISYIVCLFLFCLVLPLFIMIFCYGHIVRVIRGVCRINMTTAQKREHRLLFMVVCMVTCYLLCWMPYGLVSLMTAFGKPGMITPTVSIIPSILAKSSTFINPLIYIFMNKQFYRCFIALIKCESGPH

>35TMTc_danRer
TAVCLGAILLLGCLNNLFVLLVFARFRTLWTPINLILLNISVSDILVCLFGTPFSFASSLYGKWLLGHHGCKWYGFANSLFGIVSLMSLSILSYERYAALLRATKADVSDFRRAWLCVAGSWLYSLLWTLPPFLGWSNYGPEGPGTTCSVQWHLRSTSSISYVMCLFIFCLLLPLVLMIFCYGKILLLIKGVTKINLLTAQRRENHILLMVVTMVSCYLLCWMPYGVVALLATFGRTGLITPVTSIVPSVLAKSSTVVNPVIYVLFNNQFYRCFVAFLKCQGEPS

>36TMTc_tetNig
VAVLLGVILVAGILSNSLVLLLFVKYRSLWTPINLILLNINLSDILVCVFGTPFSFAASLQGRWLIGEGGCMWYGFANSLFGIVSLVSLSVLSYERCTVVLQPSQVDVSDFRKARFCVGGSWLYALLWTSPPLLGWSSYGPEGAGTTCSVQWQLRSPASVSYVLCLLVFCLLLPFLVMVYSYGRILVAIRRVGRINQLTAQRREQHILLMVLSMVSCYMLCWMPYGIMALVATFGKLGLVTPMVSVVPSILAKFSTVVNPIIYMFFNNQFYRCFMAFIRCQKEPE

>37TMTc_oryLat
VAVCLGFILVAGILNNFLTLLVFAKFRSLWTPINLILLNISLSDILVCVLGTPFSFAASVRGRWLIGESGCKWYAFANSLFGIVSLVSLSVLSYERYITVLHSSQADLSNFRKAWFCVGGSWLYSLLWTLPPFLGWSSYGPEGPGTTCSIQWHLRSPTSVSYVLCLFIFCLVLPLVLMVYSYGRILVALRRVGKINLLAAQRREQRILVMVFSMVSCYILCWMPYGIVALMATFGRKGLVTPLTSVIPSILAKFSTVVNPVIYVFFNSQFYRCLVAFVRCSGDPE

>38TMTa_oncMyk
VAVFLGVIFLLGFLSNLFVLLVFARFQVLRTPINLILLNISVSDMLVCIFGTPFSFAASLYGRWLIGAHGCKWYGFANSLFGIVSLVSLAILSYERYSTILCYTKADPSDYKKAWLAIAGAWLYSLVWTVPPFFGWSSYGPEGPGTTCSVQWHQRSSGNISYVTCLFIFCLLLPLLLMMFCYGKILFAIRGVAKINQSSAQRRETHVLVMVVSMVSCYLLCWMPYGVVALLATFGQVGLVSPTTSIVPSILAKSSTFLNPVIYGLLNNQFYRCFLAFMSCGSEAA

>39TMTa_anoCar
TAICLGVIGSLGFLNNLLVLVLFCRNKVLRSPINLLLMNISLSDLMICIVGTPFSFAASTQGKWLIGPAGCVWYGFANTFFGTVSLISLAVLSYERYCTMMGTTEADATNYKKVWMGIFLSWIYSLFWSLPPLFGWSSYGPEGPGTTCSVNWHSRDANNISYIICLFIFCLVIPFIVIVYCYGKLLCAIKKVSGVTQGMAQTREQRVLIMVVVMIICFLLCWLPYGIVALIATFGKPGLITPSASIIPSVLAKSSTVYNPVIYIFLNKQFYRCFCALLKCGKKSI

>40TMTa_xenTro
VAVFLGVIGSLGFFNNLVVLILFCQYKVLRSPINMLLMNISLSDLMVCILGTPFSFAASTQGHWLIGEIGCIWYGFVNTLFGTVSLVSLAVLSYERYCTMLRSTEADLTNYKKAWLGILVSWIYSLVWTLPPLFGWSKYGPEGPGTTCSVNWHSRDANNISYIVCLFIFCLALPFAVIVYCYGRLLFAIKQVSGVSKSSSRAREQRVLIMVIVMVVCFLLCWLPYGVMALVATFGKPGIISPSASIIPSVLAKSSTVYNPIIYIFLNKQFYRCFTALIHCNKHPQ

>41TMTa_takRub
VSVFLGFIGTFGLVNNLLVLVLFCRYKMLRSPINLLLMNISISDLLVCVLGTPFSFAASTQGRWLIGEAGCVWYGFANSLFGVVSLISLAVLSFERYSTMMTPTEADPSNYCKVCLGITLSWVYSLVWTVPPLFGWSSYGPEGPGTTCSVNWTAKTTNSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQVSGINASTSRKREQRVLCMVVIMVICYLLCWLPYGVVALLATFGPPDLVTPEASIIPSVLAKSSTVINPIIYVFMNKQFYRCFLALLCCQDPRS

>42TMTa_gasAcu
VAVCLGFIGTLGLMNNLLVLVLFCRYKMLRSPINLLLINISISDLLVCVLGTPFSFAASTQGRWLIGEGGCVWYGFANSLFGIVSLISLAVLSYERYSTMVAPTEADSSNYHKISLGITLSWVYSLIWTAPPLFGWSHYGPEGPGTTCSVDWTARTANSISYIICLFVFCLIVPFLVIVFCYGKLLCAIRQVSGINASLSRKREQRVLFMVVIMVVCYLLCWLPYGIMALMATFGPPGLITPVASIIPSVLAKTSTVINPVIYVFMNKQFYRCFKALLRCEAPRP

>43TMTb_danRer
VAVCLGFIGTFGFLNNTLVLVLFCRYKVLRSPMNCLLISISVSDLLVCVLGTPFSFAASTQGRWLIGRAGCVWYGFINSFLGVVSLISLAVLSYERYCTMMGSTQADSTNYRKVVIGIAFSWIYSMVWTLPPLFGWSCYGPEGPGTTCSVNWAARTPNNVSYIVCLFVFCLILPFIVIVYSYGRLLQAITQVSRINTVVSRKREQRVLFMVVTMVVCYLLCWLPYGIMALLATFGHPGLVTPAASIVPSLLAKSSTVINPIIYIFMNKQFCRCFHALIMCTTPER

>44TMTb_tetNig
VAVCLGAIGTVGFLSNLLVLALFCRFRALRTPMNLMLVSISASDLLVSVLGTPFSFAASTQGRWLLGRAGCVWYGFVNACLGIVSLISLAVLSYERYCTMMASTMASNRDYRPVLLGICFSWFYSLAWTVPPLLGWSRYGPEGPGTTCSVDWRTQTPNNISYIVCLFAFCLLLPFCVILYSYGKLLHTIRQVSSVSSAVTRRREHRVLVMVVAMVVCYLICWLPYGVTALLATFGPPNLLTPEATITPSLLAKFSTVINPFIYIFMNKQFYRCFRAFLSCSSPER

>45TMTb_gasAcu
VAVCLGFIGTFGFLSNFLVLALFCRYRALRTPMNLLLVSISASDLLVSMVGTPFSFAASTQGRWLIGRAGCVWYGFVNACLGIVSLISLAVLSFERYSTMVKPTVADGRDFRPALGGIAFSWLYSVAWTVPPLLGWSEYGPEGPGTTCSVDWKTQTANNISYIVCLFVFCLVLPFCVILYSYSRLLQAIRQVSVVSSVVTRHREQRVLAMVVVMVACYLVCWLPYGVAALLATFGPRDLLSPEASITPSLLAKFSTVVNPFIYIFMNKQFYRCFRAFLSCSTPER

>46TMTa1_calMil
VAVCLGIIMVLGFLNNLLVLVLFCKYKVLRSPMNMLLLNISVSDMLVCICGTPFSFAASVQGRWLVGEQGCKWYGFANSLFGIVSLMSLTILSYDRYITITGTTEADITNYNKTIVGIALSWIYSLMWTLPPLFGWSNYGPEGPGTTCSVNWQSKEVSSKSYIICLFIFCLLMPFLVIVYCYGKLVLAVRKVSANNSMGRTRENKLLIMVTFMIICFLLCWLPYGIVALLATFGSPGLITPTASIIPSVLAKTSTVYNPIIYIFMNKQ

>47TMTx_braFlo
VAAILALIGVLGIVNNSTTLYLVGRYKQLRTPFNILMVNLSVSDLLMCVLGTPFSFVSSLHGRWMFGHSGCEWYGFICNFLGIVSLITLTVISYERYLLMKRLPNERILSYRAVALAVVFIWCYSLLWTAPPLVGWSSYGPEGYGISCSVNWESRTANDTSYIVAYFVGCLVFPVAIIVISYTRLILYMRQAPSAPMQMLVRREKRVTKMVVVMIMGFTICWTPYTIVALIVTCGGEGIITPAAATVPALFAKSSVVYNAAIYVAMNNQFRKCFLRSLNCRSQPR

>48TMTy_braFlo
MAVWLGFIGSFGFVTNLLTVLVFWCFKSLRTPFHLYLGGIALSDLLVAALGSPFAVASAVGERWLFGRAVCVWYAFVNYFLSIVSIVTMATMSFSRYWVIIRPSAPRLDTVYGACVVNALAWCYSFFWTIMPVLGWSRFTQVAAMTVCSLDWDHHTPLSKSYIPVAFLTCLFLPLGVIIFSVFKTTMHLRRAAEVEDEVPNEVRAGRKTTRITLVMAGCWLVAWLPYACMALVIAAGGRVSPTVEVLATKFAKTSYIVNTIIYLVMEKEFRKSLVLLLFCGRDPF

>49TMT1_strPur
LTVYTGFLTIFGILNNGIVMILFARFPSLRHPINSFLFNVSLSDLIISCLASPFTFASNFAGRWLFGDLGCTLYAFLVFVAGTEQIVILAALSIQRCMLVVRPFTAQKMTHRWALFFISLTWIYSLIICVPPLFGWNRYTYEGPGTACSVAWNSPSPGDTSYIIFIFVLVLVIPFGIIIFCYGLLVYAVKKISRTQAALSSEAKADRKVSKMIFIMILFFLIAWTPYTGFSLYVTFGKNVVITPLAGTFPPFFAKLCTIHNPIIYFLLNKQFKDALIQLFCCGENPF

>23ENC_strPur
AGSFLTLVFIISIIGNSVVLFLFAWDRHLRTPTNMFLLSLTISDWLVTVVGIPFVTASIYAHRWLFAHVGCISYAFIMTFLGLNSLMSHAVIAVDRYLVITKPHFGIVVTYPKAFLMISIPWVFSFAWAVFPLAGWGEFTYEGTGAWCSVRWDSDQPQIMSYVLAMMFLTFISSIVIMMYCYICIFLTTRRMPATSNSIKTHERNRRRREQKLLKTLIAIAIAFLVAWSPYAITSMIVVFGGSELLSLTATTLPSLFAKSSVMINPIIYAVTSRVFRKSLKKMLTSFFPGC

>52TMT1_anoGam
AAVTLFFIGFFGFFLNIFVIALMYKDVQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWLYGKSICVAYGFFMSLLGIASITTLTVLSYERFCLISRPFAAQNRSKQGACLAVLFIWSYSFALTSPPLFGWGAYVNEAANISCSVNWESQTANATSYIIFLFIFGLILPLAVIIYSYINIVLEMRKNSARVGRVNRAERRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFWRIRRSNGVAG

>53TMT2_anoGam
SAVTLFFIGFFGFFLNLFVIALMCKDMQLWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWIFGRTLCVAYGFFMSLLGITSITTLTVLSYERYCLISRPFSSRNLTRRGAFLAIFFIWGYSFALTSPPLFGWGAYVQEAANISCSVNWESQTKNATTYIIFLFVFGLVVPLIVIVYSYTNIIVNMRENSARVGRINRAEQRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQFRAAFSRVRNKGQQAA

>54TMT_aedAeg
SAVTLFFIGFFGFFLNLFVIALMCKDVQLWTPINIILFNLVCSDFSVSIIGNPFTLTSAISRHWIFGRTVCIAYGFFMSLLGITSITTLTVLSYERFCLISHPFSSRSLSRRGAVFAILFIWSYSFALTSPPLFGWGAYVNEAANISCSVNWESQTLNATSYIIFLFVFGLVVPLVVIVYSYTNIVVNMKRNAARVGRINRAEKRVTRMVFVMVLAFMIAWTPYAVFALIEQFGPTDIISPALGVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRNNESVDN

>55TMT_culPip
TAVVLFFIGFFGFFLNLFVIALMCKEVQLWTPMNIILLNLVCSDFSVSIVGNPFTLSSAISHRWLFGRKLCVAYGFFMSLLGITSITTLTVLSYERFYLISRPFSSRSLSRRGALGAVLLIWCYSFALTSPPLFGWGAYVNEAANISCSVNWETQTLNATTYIIYLFVFGLVVPLTVIVYSYTNIIVNMKKNAARVGRINRAEKRVTTMVAVMVIAFMVAWTPYSVFALMEQFGPPDVIGPGLAVLPALIAKSSICYNPIIYVGMNTQFRAAFNRVRHDPGDMA

>56TMT_triCas
AAVVLFCIGFFGFSLNLTVIIFMLKERQLWSPLNIILFNLVVSDFLVSVLGNPWTFFSAINYGWIFGETGCTIYGFIMSLLSITSITTLTVLAFERYLLIARPFRNNALNFHSAALSVFSIWLYSLSLTIPPLIGWGEYVHEAANLSCSVNWEEKSPNSTSYILYLFAFGLFLPLVIITFSYVNIILTMRRNAAFRVGQVSKAENKVAYMIFIMIIAFLTAWSPYAIMALIVQFGDAALVTPGMAVIPALLAKSSICYNPVIYIGLNAQVKGAKWVSGLIYLFQF

>57TMT_apiMel
AAIALGFIGFFGFTANLLVAIVIVKDAQLWTPVNVILFNLVFGDFLVSIFGNPVAMVSAATGGWYWGYKMCLWYAWFMSTLGFASIGNLTVMAVERWLLVARPMQALSIRHAVILASFVWIYALSLSLPPLFGWGSYGPEAGNVSCSVSWEVHDPNSDTYIGFLFVLGLIVPVFTIVSSYAAIVLTLKKVRKRAGASGRREAKITKMVALMITAFLLAWSPYAALAIAAQYFNAKPSATVAVLPALLAKSSICYNPIIYAGLNNQFSRFLKKIFDARGSRT

>58TMT_rhoPro
ASIILFLIGFLGFFGNLIVIIIMCRDKNLWTPVNFILFNVIVSDFSVAALGNPFTLASAIAKRWFFGQSMCVAYGFFMALLGITSINSLTVLALERYLIVSQPVSHGSLSRPTASDIVGSIWLYSFVITIPPLVGWGEYGLEAANISCSINWETRSHSSTSYILFLFTFGFFIPIIVISYSYMNIILTMKKSTMNAGRVNKAESRVTWMIFVMIFAFFLAWTPYAILALMIAFFDSNVSPAIATIPAIFAKTSICYNPFIYAGLNTQFRQSWRRVLGGKREDS

>59TMT_acyPis
AAIVLSIIGIVGFIFNTCVIFIMIRDTRLWTPQNVIIFNLATSDLAVSVLGNPVTLAAAITKGWIFGQTICVIYGFFMALFGIASITTLTVLAYDRYLMIRYPFSSSRLTKETALYAIAGIWIYAFAVTGPPLFGWNRYVNESANISCSIDWESGEHSNYVIYIFVFGLFLPVTVIIYSYVSLVVTVRKAEKIIGQATKAECRVAIMVAVMILAFLTAWMPYSVLALMIAFGGVHISPVVSIIPALCAKSSICWNPIIYIGLNTQFRSAWKRFLNIQDTLS

>60TMT_bomMor
SAFVLFLIGFFGFFLNLMVILLMFKDRQLWTPLNIILFNLVCSDFSVSVLGNPFTLISALFHRWIFGHTMCVLYGFFMALLGITSITTLTVISFERYLMVTRPLTSRHLSSKGAVLSIMFIWTYSLALTTPPLLGWGNYVNEAANIQCSVNWHEQSTNTLTYIMFLFAMGQILPLSVITFSYVNIIRTLKRNSQRLGRVSRAEARATAMVFIMIIAFTVAWTPYSLFALMEQFATEGIVSPGASVIPALVAKSSICYDPLIYVAHKLSLNNTKFNSETNSSTAT

>61TMTa_dapPul
ASAYLLFISIAGLFMNIVVVVIILNDSQKMTPLNWMLLNLACSDGAIAGFGTPISAAAALKFTWPFSHELCVAYAMIMSTAGIGSITTLTVLALWRCQHVVSNDPNGRLDRRQGALLLTFIWTYTLIVTCPPLFGWGRYDREAAHISCSVNWESKMDNNRSYILYMFAMGLFIPLMAIFVSYISILLFIHKSQQTSNNSDTVEKRVTFMVAVMIGAFLTAWTPYSIMALVGDNVYAGTISPAVATVPSLFAKTSAVLNPLIYGLLNTQFRTAWEKFSSRFLGRK

>62TMTb_dapPul
TAAYLLLISVLGLIMNVVVVIVILNDSQRMTPLNWMLLNLACSDGAIAGFGTPISTAAALEFGWPFSQELCVAYAMIMSTAGIGSITTLTALAIWRCQLVVSANHSGRLGCRQGVILLVIIWIYALAITCPPLFGWGRYDREAAHISCSVNWESKTNNNRSYILYMFCMGLVVPLAVIIISYVRILRVVQKNQQQSGNVHRHRRDAAFMVAVMIGAFLTAWTPYSIMALVEKRVYVGTISPAFATIPSLFAKTSAVLNPLIYGLLNTQFRLAWERFSLRFLGRF

>63TMT_triCys
MGSFIAGSACCSFLLNGLVIAVLIKYIRTITNTNIIVLSMSCANILIPLLGSPLSATSSLMRKWQFGNGGCTWYGFINTLSGISGIYHLTFLSFERFITIVLPLKRDTISTKNIYIGLGILWVAAIGVAGAPVFGWCEYIKE
GVRTSCSVAWSSKENMNVFSYNLFMIFTVFLLPMLVIIYCNYRFIKEVSRARGLQGGDSEMTASASKAEKQLTIMVITMIIAFNIAWLPYTVVSMVFGYGDVVGPMGASVPSVFAKTSVIYNPVIYCLLNRSFRKMLCGNSVEP

>64CUB_carRas Carybdea rastonii (sea_wasp) CnidCuboCary AB435549 18832159 full G? cubop
LSGFLACVVFLSISLNMIVLITFYRLRHKLAFKDALMASMAFSDVVQAIVGYPLEVFTVVDGKWTFGMELCQVAGFFITALGQVSIAHLTALALDRYFTVCRPFVATAISMRNAGMVIFVCWFYASFWAVLPLVGWSNYDVEGDGMRCSINWADDSPKSYSYRVCLFVFIYLIPVLLMVATYVLVQGEMKRAAQLFGSESEAALKNIKAEKRHTRLVFVMILSFIVAWTPYTFVAMWVSFKQLGPIPLYVDTLAAMLAKSSAMFNPIIYCFLHKQFRRAVLRGVCGRIVGG

>50TMT1_plaDum
TAIYLCIVGVIGTLSNGVIMYLYFKDKSLRSPMNLLFVNLAMSDFTVAFFGAMFQFGLTCTRKYMSPGMACDFYGFITFLGGLASEMNLFIISVERYLAVVRPFDVGNLTNRRVIAGGVFVWLYSLVFAGGPLVGWSSYRPEGLGTWCSISWQDRSMNTMSYVTAVFLGCYFFPVSIIIFCYFNVWRKVKEAADAQGGAGTAGKAEKSIFRMSVIMVTCYLTAWTPYAIVCLIASYGPPNGLPIYAEVLPSLFAKSSQVYNPIIYVLMNKPYRSALVSLVCRGRNPF

>51TMT2_plaDum
CAAYLFFIACLGVSLNVLVLVLFIKDRKLRSPNNFLYVSLALGDLLVAVFGTAFKFIITARKTLLREEDGCKWYGFITYLGGLAALMTLSVIAFVRCLAVLRLGSFTGLTTRMGVAAMAFIWIYSLAFTLAPLLGWNHYIPEGLATWCSIDWLSDETSDKSYVFAIFIFCFLVPVLIIVVSYGLIYDKVRKVAKTGGSVAKAEREVLRMTLLMVSLFMLAWSPYAVICMLASFGPKDLLHPVATVIPAMFAKSSTMYNPLIYVFMNKQFRRSLKVLLGMGVEDL

Origins of melanopsins

Melanopsins are well represented in all three bilaterian clades -- the only sequenced genome to date lacking a melanopsin is that of the acorn worm Saccoglossus. Many erratically named genes in arthropods and mollusks are actually simple orthologs, at the bilaterian ancestor, of the first described melanopsin locus in Xenopus. A single melanopsin locus existed in ur-Bilateria. It is not currently possible to specify its syntenic relationships.

The evidence for this takes three forms: conventional alignment tree clustering, the presence of shared distinctive introns, and the HPK motif in the NP..Y...HPK terminal helix, which unites all bilaterian melanopsins and distinguishes them from the NP..Y...NKQ pattern of cilopsins in this key region for signaling the conformational shift. Arthropod melanopsins also show remarkable conservation of the HEK motif in cytoplasmic loop 3.
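The terminal helix test can be sketched as a simple pattern search. This is a minimal illustration, not the method used to build the curated set: the regular expression is an assumed literal reading of the two patterns quoted above (each "." standing for any residue), and the function name and the second test fragment are hypothetical.

```python
import re

# Assumed encoding of the two patterns quoted in the text; "." matches any residue.
TERMINAL_HELIX = re.compile(r"NP..Y...(HPK|NKQ)")

def classify_terminal_helix(seq):
    """Return 'melanopsin-like', 'cilopsin-like', or None if neither pattern occurs."""
    match = TERMINAL_HELIX.search(seq)
    if match is None:
        return None
    return "melanopsin-like" if match.group(1) == "HPK" else "cilopsin-like"

# C-terminal fragment of 51TMT2_plaDum, copied from the sequence listing above.
tmt2_tail = "AKSSTMYNPLIYVFMNKQFRRSLKVLLGMGVEDL"
# Hypothetical fragment, made up only to exercise the HPK branch.
made_up_tail = "AKSSAIYNPIIYAITHPKYRQ"

for name, tail in [("51TMT2_plaDum", tmt2_tail), ("made_up", made_up_tail)]:
    print(name, classify_terminal_helix(tail))
```

A motif hit is only one of the three lines of evidence; tree clustering and shared introns would still be needed before assigning a sequence to a class.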

In ecdysozoa, the melanopsin locus duplicated early on, with the copies specializing to ultraviolet and long wavelengths but evidently remaining under the same strong and unusual selection in the third cytoplasmic loop, perhaps attributable to protein-protein interaction with the G protein alpha subunit specialized for Gq signaling. The ultraviolet melanopsin largely retains the ancestral intronation (based on deuterostome and lophotrochozoan outgroup sequences), whereas the longwave form largely lost these introns but acquired others. The duplication was segmental rather than retropositional because a distal intron at EVTR 252 00 is still shared (3' introns are the first to be lost in retropositioning).

These ecdysozoan melanopsin paralogs in turn underwent additional expansion -- in some cases many duplications -- depending on the specific lineage. The resulting copies refine imaging vision and do not have auxiliary functions. These end-leaf specializations and their adaptive significance are best pursued in the original journal articles; the emphasis here is on the broader sweep of opsin evolution.

Melanopsins in lophotrochozoans have a simpler history, though taxonomic sampling leaves something to be desired at this point. As in ecdysozoans, they seem to provide all the opsins used in imaging vision. (Peropsins and, rarely, cilopsins are also found in this clade.)

Melanopsins in deuterostomes do not provide imaging vision in any living species and probably never have. However, gene retention is very high and the rate of evolution has been quite slow (in the core region), indicative of important roles. A single locus duplication occurred after the divergence of chondrichthyes but before that of teleosts, and both copies persist in living fish, frogs, lizards, and birds but not in any mammal. This could not reflect whole genome duplication because that would be restricted to fish, in conflict with the observed continuing synteny and blast clustering. Indeed, two rounds of WGD would have produced 8 copies from the two pre-existing paralogs, but none of the additional copies survived in any of the five available fish genomes.
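The copy count in the WGD argument is simple doubling arithmetic. The sketch below assumes every locus is retained after each round with no loss, an idealization used only to show where the figure of 8 comes from; the function name is hypothetical.

```python
def copies_after_wgd(loci, rounds):
    """Copies expected if every locus is retained after each whole genome duplication."""
    return loci * 2 ** rounds

# Two pre-existing paralogs, two rounds of duplication.
print(copies_after_wgd(2, 2))
```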

Origins of peropsins, neuropsins, rgropsins

This group primarily occurs in deuterostomes, though peropsins have now been located sporadically in ecdysozoa and lophotrochozoa. The first sponge opsin -- known only from a terminal fragment -- also appears to belong to this group, with surprisingly close affinities to Branchiostoma (and, less surprisingly, to Nematostella). This raises the question of whether peropsins are more fundamental to the evolutionary origins of opsins than previously considered. The opsins of amphioxus, while appearing very diverged and homologically isolated, may in fact have retained deep connections to the ancestral condition.

