Marsupial phyloSNPs: Difference between revisions
Tomemerald (talk | contribs) |
Revision as of 13:08, 10 February 2009
Introduction to Marsupial phyloSNPs
In this project, new genomic data from the Tasmanian devil (Sarcophilus harrisii), Tasmanian tiger (Thylacinus cynocephalus), and echidna (Tachyglossus aculeatus) are analyzed for significant changes at the protein coding level. The goal is to find single amino acid changes in one of these species at a highly invariant residue in a well-conserved exon in a gene with known or predictable tertiary structure. Such changes are thought to enrich for genetic changes with significant, adaptive biochemical or phenotypic consequences (1,2,3,4), in contrast to ordinary SNPs at positions of low conservation. Thus phyloSNPs are informative to the distinctive biology of the species carrying them and suggest a focus for subsequent experiment.
Marsupial genomic and cDNA data have so far been quite limited compared to those for placental mammals. Yet as an outgroup, metatherian animals provide important context for placentals and for understanding human protein evolution. The monotremes are inevitably limited by the paucity of extant species (basically platypus and echidna) and dim prospects for fossil DNA. Consequently echidna provides an important adjunct to the existing but incomplete platypus assembly. While extant birds and reptiles -- the preceding divergence node -- are abundant, it must be remembered that a very considerable time elapsed (from 310 myr to 175 myr ago) prior to the divergence of mammals with living representatives. This gap of 135 myr is comparable to the whole evolutionary record of therian mammals.
Assumed vertebrate phylogenetic tree
Marsupial relationships taken from 2009 paper establishing the mitochondrial genome sequence of the Tasmanian tiger (Thylacinus cynocephalus):
Newick tree that generates vertebrate phylogenetic tree used in the analysis here: ((((((((((((((((((homSap,panTro),gorGor),ponPyg),macMul),calJac),tarSyr),(micMur,otoGar)),tupBel), (((((musMus,ratNor),dipOrd),cavPor),speTri),(oryCun,ochPri))), (((((vicPac,susScr),turTru),bosTau),((equCab,(felCat,canFam)),(myoLuc,pteVam))),(eriEur,sorAra))), (((loxAfr,proCap),echTel),(dasNov,choHof))), (monDom,((macEug,triVul),(sarHar,thyCyn)))), (ornAna,tacAcu)), ((galGal,taeGut),anoCar)), xenTro), (((tetNig,takRub),(gasAcu,oryLap)),danRer)), calMil), petMar);
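As a quick sanity check on the tree string above, the following minimal sketch lists its taxa and confirms that the parentheses balance. It assumes, as is the case here, that the Newick string carries no branch lengths or internal node labels, so every label is a leaf; the helper names `leaf_names` and `is_balanced` are illustrative only and not part of any tool used in this project.

```python
import re

# Tree copied from the text above with whitespace removed. The 18 leading
# parentheses are written as a product so that their count is explicit.
NEWICK = (
    "(" * 18
    + "homSap,panTro),gorGor),ponPyg),macMul),calJac),tarSyr),"
    + "(micMur,otoGar)),tupBel),"
    + "(((((musMus,ratNor),dipOrd),cavPor),speTri),(oryCun,ochPri))),"
    + "(((((vicPac,susScr),turTru),bosTau),"
    + "((equCab,(felCat,canFam)),(myoLuc,pteVam))),(eriEur,sorAra))),"
    + "(((loxAfr,proCap),echTel),(dasNov,choHof))),"
    + "(monDom,((macEug,triVul),(sarHar,thyCyn)))),"
    + "(ornAna,tacAcu)),"
    + "((galGal,taeGut),anoCar)),"
    + "xenTro),"
    + "(((tetNig,takRub),(gasAcu,oryLap)),danRer)),"
    + "calMil),"
    + "petMar);"
)


def leaf_names(newick):
    """Return taxon labels in left-to-right order.

    Assumes no branch lengths or internal labels, so any run of characters
    other than parentheses, commas, semicolons and whitespace is a leaf.
    """
    return re.findall(r"[^(),;\s]+", newick)


def is_balanced(newick):
    """Check that every '(' has a matching ')' and depth never goes negative."""
    depth = 0
    for ch in newick:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


leaves = leaf_names(NEWICK)
print(len(leaves), "taxa; balanced:", is_balanced(NEWICK))
```

The same two helpers can be pointed at any edited version of the tree to catch a dropped parenthesis or a duplicated taxon before the tree is used for sorting.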
Phylo-sorting data
- - - - - - - (((((((((((((((((( - - - -
10 26 10 > 27 gene homSap , Homo sapiens (human) hg181
11 38 11 > 40 gene panTro ), Pan troglodytes (chimp) panTro
12 25 12 > 26 gene gorGor ), Gorilla gorilla (gorilla) gorGor
13 40 13 > 42 gene ponPyg ), Pongo pygmaeus (orang) ponAbe
14 28 14 > 30 gene macMul ), Macaca mulatta (rhesus) rheMac
15 12 15 > 12 gene calJac ), Callithrix jacchus (marmoset) calJac
16 48 16 > 53 gene tarSyr ),( Tarsius syrichta (tarsier) tarSyr
17 29 17 > 31 gene micMur , Microcebus murinus (mouse_lemur) micMur
18 37 18 > 39 gene otoGar )), Otolemur garnettii (bushbaby) otoGar
19 50 19 > 57 gene tupBel ),((((( Tupaia belangeri (tree_shrew) tupBel
20 31 20 > 33 gene musMus , Mus musculus (mouse) mm91
21 43 21 > 45 gene ratNor ), Rattus norvegicus (rat) rn41
22 18 22 > 19 gene dipOrd ), Dipodomys ordii (kangaroo_rat) dipOrd
23 14 23 > 15 gene cavPor ), Cavia porcellus (guinea_pig) cavPor
24 45 24 > 48 gene speTri ),( Spermophilus tridecemlineatus (squirrel) speTri
25 35 25 > 37 gene oryCun , Oryctolagus cuniculus (rabbit) oryCun
26 33 26 > 35 gene ochPri ))),((((( Ochotona princeps (pika) ochPri
27 52 27 > 59 gene vicPac , Vicugna pacos (lama) vicPac
54 57 28 > 49 gene susScr ), Sus scrofa (pig)
28 51 29 > 58 gene turTru ), Tursiops truncatus (dolphin) turTru
29 11 30 > 11 gene bosTau ),(( Bos taurus (cow) bosTau
30 20 31 > 21 gene equCab ,( Equus caballus (horse) equCab
31 22 32 > 23 gene felCat , Felis catus (cat) felCat
32 13 33 > 14 gene canFam )),( Canis familiaris (dog) canFam
33 32 34 > 34 gene myoLuc , Myotis lucifugus (microbat) myoLuc
34 42 35 > 44 gene pteVam ))),( Pteropus vampyrus (macrobat) pteVam
35 21 36 > 22 gene eriEur , Erinaceus europaeus (hedgehog) eriEur
36 44 37 > 47 gene sorAra ))),((( Sorex araneus (shrew) sorAra
37 27 38 > 28 gene loxAfr , Loxodonta africana (elephant) loxAfr
38 41 39 > 43 gene proCap ), Procavia capensis (hyrax) proCap
39 19 40 > 20 gene echTel ),( Echinops telfairi (tenrec) echTel
40 17 41 > 18 gene dasNov , Dasypus novemcinctus (armadillo) dasNov
41 15 42 > 16 gene choHof ))),( Choloepus hoffmanni (sloth) choHof
42 30 43 > 32 gene monDom ,(( Monodelphis domestica (opossum) monDom
55 55 44 > 29 gene macEug , Macropus eugenii (wallaby)
56 56 45 > 46 gene sarHar ),( Sarcophilus harrisii (tasmanian_devil)
57 60 46 > 56 gene triVul , Trichosurus vulpecula (bushytail_possum)
58 59 47 > 55 gene thyCyn )))),( Thylacinus cynocephalus (tasmanian_tiger)
43 34 48 > 36 gene ornAna , Ornithorhynchus anatinus (platypus) ornAna
59 58 49 > 50 gene tacAcu )),(( Tachyglossus aculeatus (echidna)
44 23 50 > 24 gene galGal , Gallus gallus (chicken) galGal
45 46 51 > 51 gene taeGut ), Taeniopygia guttata (finch) taeGut
46 10 52 > 10 gene anoCar )), Anolis carolinensis (lizard) anoCar
47 53 53 > 60 gene xenTro ),((( Xenopus tropicalis (frog) xenTro
48 49 54 > 54 gene tetNig , Tetraodon nigroviridis (pufferfish) tetNig
49 47 55 > 52 gene takRub ),( Takifugu rubripes (fugu) fr21
50 24 56 > 25 gene gasAcu , Gasterosteus aculeatus (stickleback) gasAcu
51 36 57 > 38 gene oryLap )), Oryzias latipes (medaka) oryLat
52 16 58 > 17 gene danRer )), Danio rerio (zebrafish) danRer
60 54 59 > 13 gene calMil ), Callorhinchus milii (elephantfish)
53 39 60 > 41 gene petMar ) Petromyzon marinus (lamprey) petMar
44 44 51 f 51 gene fasta tree_syntax genus species common ucsc phy alp phy alp
Candidate analysis
(methods explained here shortly)
Case of ERN2
chr6_5971 ERN2 4
contig00001 length=355 numreads=5
KLPFTIPELVHASPCRSSDGVLYT
.....................F..
^
15 R=3(75) H=2(50
Read data format: the top row gives the project gene name, the HGNC gene name, and the exon number from ENSEMBL monDom5
and human orthology predictions; then come the Monodelphis amino-acid segment, the sequence differences in
the Tasmanian devil (in this case, both individuals differ from Monodelphis by L->F), the differences between the two individuals
(here one individual has R at position 15, the other has H), and finally the number of experimental reads that confirm the nucleotide
difference and the sum of their quality scores. The sequences were assembled by Newbler (the official 454 assembler), which uses
lower-case letters for less confident calls.
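The record layout just described can be read mechanically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, a dot in the difference row is taken to mean "same as Monodelphis", the difference row is taken to be 21 dots, F, 2 dots (matching the 24-residue reference), and the support line is taken to be `position residue=reads(quality sum)`. The support line is copied as it appears above, including its missing closing parenthesis, which the pattern tolerates.

```python
import re

REFERENCE = "KLPFTIPELVHASPCRSSDGVLYT"  # Monodelphis segment from the record
DIFF = "." * 21 + "F.."                 # difference row; '.' = same as reference
SUPPORT = "15 R=3(75) H=2(50"           # support line, copied verbatim


def apply_diff(reference, diff):
    """Rebuild the derived sequence from a reference and a dotted difference row."""
    if len(reference) != len(diff):
        raise ValueError("reference and difference row must be the same length")
    return "".join(r if d == "." else d for r, d in zip(reference, diff))


def list_changes(reference, diff):
    """Return (0-based position, reference residue, new residue) for each change."""
    return [(i, r, d) for i, (r, d) in enumerate(zip(reference, diff)) if d != "."]


def parse_support(line):
    """Split 'pos AA=reads(quality ...' into the position and per-residue support."""
    position, rest = line.split(maxsplit=1)
    calls = {
        residue: (int(reads), int(quality))
        for residue, reads, quality in re.findall(r"([A-Z])=(\d+)\((\d+)", rest)
    }
    return int(position), calls


derived = apply_diff(REFERENCE, DIFF)
changes = list_changes(REFERENCE, DIFF)
position, calls = parse_support(SUPPORT)
print(derived, changes, position, calls)
```

Read this way, position 15 of the reference (counting from 0) is the arginine and position 21 is the leucine, which matches the numbering used in the discussion of this exon.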
Paralog and pseudogene issues: ERN2 has not generated potentially confusing recent pseudogenes (Blat queries of the human and opossum genomes with ERN2 return no such matches). GeneSorter shows a single remote full-length paralog, ERN1. However, this particular exon is a close match (3 differences out of 24 residues against Monodelphis ERN2), so short reads from the two genes could be difficult to tell apart experimentally. At positions 15 and 21 (counting from 0, as in the read summary above), however, ERN1 is identical at the amino acid level to Monodelphis ERN2.
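The degree of paralog similarity can be checked directly from the two exon sequences given in the alignment on this page. This is a minimal sketch; positions are taken to be 0-based, as the "15" reported for the arginine in the read summary suggests, and `mismatches` is an illustrative helper name.

```python
# Exon segments copied from the alignment on this page.
ERN2_MONDOM = "KLPFTIPELVHASPCRSSDGVLYT"
ERN1_HOMSAP = "KLPFTIPELVQASPCRSSDGILYM"


def mismatches(a, b):
    """Return (0-based position, residue in a, residue in b) wherever a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = mismatches(ERN2_MONDOM, ERN1_HOMSAP)
print(len(diffs), "differences in", len(ERN2_MONDOM), "residues:", diffs)

# The two positions of interest are the same residue in both paralogs.
same_at_15 = ERN2_MONDOM[15] == ERN1_HOMSAP[15]
same_at_21 = ERN2_MONDOM[21] == ERN1_HOMSAP[21]
```

Because the paralogs agree at both positions of interest, a read carrying H at 15 or F at 21 differs from ERN1 as well as from Monodelphis ERN2. The three mismatching positions are therefore the ones that would have to assign a short read to one gene or the other.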
Homoplasy (recurrent mutation) issues: This exon is very conserved and does not exhibit repetitive sequence, compositional simplicity, or indels in any species in either paralog that could foster experimental error or alignment ambiguity. At position 15, the ancestral value is arginine in both paralogs. The G-->A transition to histidine in one individual is conservative under most circumstances (still basic) and arises from an arginine codon CpG hotspot conserved back to lamprey in 30 of 32 species with available data, yet histidine is not observed as part of a reduced alphabet (i.e. R/H) at this position over many billions of years of branch length. Consequently R-->H is a significant change in this individual Tasmanian devil.
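To make the codon arithmetic behind the R-->H change explicit, the sketch below applies a G-->A transition at the second base of each of the four CGN arginine codons and translates the result with the standard genetic code. The third base of the devil codon is not given on this page, so all four possibilities are enumerated rather than assumed; only the eight codons needed are included in the lookup table.

```python
# Subset of the standard genetic code: the four CGN arginine codons and
# the four CAN codons they become after a G->A change at the second base.
CODON_TABLE = {
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R",
    "CAT": "H", "CAC": "H", "CAA": "Q", "CAG": "Q",
}


def g_to_a_second_base(codon):
    """Return the codon produced by a G->A transition at the middle base."""
    if codon[1] != "G":
        raise ValueError("middle base is not G")
    return codon[0] + "A" + codon[2]


outcomes = {}
for codon in ("CGT", "CGC", "CGA", "CGG"):
    mutated = g_to_a_second_base(codon)
    outcomes[codon] = (mutated, CODON_TABLE[mutated])
    print(codon, CODON_TABLE[codon], "->", mutated, CODON_TABLE[mutated])
```

Only CGT and CGC give histidine; CGA and CGG would give glutamine instead. An observed R-->H through a single G-->A transition is therefore consistent with a CGT or CGC arginine codon.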
Side issues: a very ancient conserved leucine at position 21 appears to be transitioning to phenylalanine at the marsupial node but has not been fixed, so it settles out as L or F on each terminal marsupial leaf depending on lineage sorting, whereas placentals have all changed to phenylalanine (a phyloSNP caught in mid-air). While L and F might seem about the 'same' as amino acids, the branch length conservation totals say both are important but for different reasons: this is neither a waffle codon nor a reduced alphabet situation. This raises the question -- given the extreme conservation of this exon otherwise -- of whether the L-->F change at position 21 in both individuals has 'enabled' (made neutral or adaptive) an otherwise unfavorable R-->H change at position 15 in one individual.
Structural significance: By good fortune, the crystal structure of ERN1 (alternatively called IRE1) has been published (http://www.pnas.org/cgi/pmidlookup?view=long&pmid=16973740). The PDB structure 2HZ6 (http://www.proteopedia.org/wiki/index.php/2HZ6) has good coverage of this particular exon. Consequently the marsupial ERN2 could be modelled quite accurately, and the structural effects of L-->F with or without R-->H computed, by submission to the online SWISS-MODEL homology-modelling service.
Alignment of Monodelphis ERN2 (key exon replaced by that of sarHar2) with the crystallographic human ERN1 alpha luminal domain
Expect = 5.8e-65 Identities = 109/180 (60%), Positives = 141/180 (78%)
ERN2_monDom 1 PESLLFISTLDGSLHAVSKKTGDIQWTLKDDPIIQGPVYATEPAFLPDPSDGSLYILGEE 60
PE+LLF+STLDGSLHAVSK+TG I+WTLK+DP++Q P + EPAFLPDP+DGSLY LG +
ERN1_homSap 8 PETLLFVSTLDGSLHAVSKRTGSIKWTLKEDPVLQVPTHVEEPAFLPDPNDGSLYTLGSK 67
ERN2_monDom 61 SKQGLMKLPFTIPELVHASPCHSSDGVFYTGRKQDTWFMVDPKSGKKQTMLSTETWDGLY 120
+ +GL KLPFTIPELV ASPCRSSDG+LY G+KQD W+++D +G+KQ LS+ D L
ERN1_homSap 68 NNEGLTKLPFTIPELVQASPCRSSDGILYMGKKQDIWYVIDLLTGEKQQTLSSAFADSLS 127
ERN2_monDom 121 PSAPLLYIGRTQYTVTMYDPRSQALRWNTTYRGYSAPLLDHLPGYQVGHFTCSGEGLVVT 180
PS LLY+GRT+YT+TMYD +++ LRWN TY Y+A L + Y++ HF +G+GLVVT
ERN1_homSap 128 PSTSLLYLGRTEYTITMYDTKTRELRWNATYFDYAASLPEDDVDYKMSHFVSNGDGLVVT 187
Functional significance: A considerable amount is known about the paralog ERN1. Annotation transfer may be applicable to ERN2.
"The unfolded protein response (UPR) is an evolutionarily conserved mechanism by which all eukaryotic cells adapt to the accumulation of unfolded proteins in the endoplasmic reticulum (ER). Inositol-requiring kinase 1 (IRE1) and PKR-related ER kinase (PERK) are two type I transmembrane ER-localized protein kinase receptors that signal the UPR through a process that involves homodimerization and autophosphorylation. To elucidate the molecular basis of the ER transmembrane signaling event, we determined the x-ray crystal structure of the luminal domain of human IRE1alpha. The monomer of the luminal domain comprises a unique fold of a triangular assembly of beta-sheet clusters. Structural analysis identified an extensive dimerization interface stabilized by hydrogen bonds and hydrophobic interactions. Dimerization creates an MHC-like groove at the interface. However, because this groove is too narrow for peptide binding and the purified luminal domain forms high-affinity dimers in vitro, peptide binding to this groove is not required for dimerization. Consistent with our structural observations, mutations that disrupt the dimerization interface produced IRE1alpha molecules that failed to either dimerize or activate the UPR upon ER stress. In addition, mutations in a structurally homologous region within PERK also prevented dimerization. Our structural, biochemical, and functional studies in vivo altogether demonstrate that IRE1 and PERK have conserved a common molecular interface necessary and sufficient for dimerization and UPR signaling."
^ *
ERN2_homSap KLPFTIPELVHASPCRSSDGVFYT
ERN2_panTro KLPFTIPELVHASPCRSSDGVFYT
ERN2_ponAbe KLPFTIPELVHASPCRSSDGVFYT
ERN2_rheMac KLPFTIPELVHASPCRSSDGVFYT
ERN2_calJac KLPFTIPELVHASPCRSSDGVFYT
ERN2_tarSyr KLPFTIPELVHASPCRSSDGVFYT
ERN2_micMur KLPFTIPELVHASPCRSSDGVFYT
ERN2_tupBel KLPFTIPELVHASPCRSSDGVFYT
ERN2_musMus KLPFTIPELVHASPCRSSDGVFYT
ERN2_ratNor KLPFTIPELVHASPCRSSDGVFYT
ERN2_cavPor KLPFTIPELVHTSPCRSSDGVFYT
ERN2_speTri KLPFTIPELVHASPCRSSDGVFYT
ERN2_oryCun KLPFTIPELVHASPCRSSDGVFYT
ERN2_ochPri KLPFSIPELVHASPCRSSDGVFYT
ERN2_turTru RLPFTIPELVHASPCRSSDGVFYT
ERN2_bosTau RLPFTIPELVHASPCRSSDGVFYT
ERN2_equCab KLPFTIPELVHASPCRSSDGVFYT
ERN2_felCat RLPFTIPELVHASPCRSSDGVFYT
ERN2_canFam KLPFTIPELVHASPCRSSDGVFYT
ERN2_myoLuc KLPFTIPELVHASPCRSSDGVFYT
ERN2_eriEur KLPFTVPELVHTSPCRSSDGVFYT
ERN2_sorAra KLPFTIPELVHASPCRSSDGVFYT
ERN2_loxAfr KLPFTIPELVHAS-----------
ERN2_proCap ---------------------FYT
ERN2_echTel KLPFTIPELVLASPCRSSDGVFYT
ERN2_dasNov KLPFTIPELVHTSPCRSSDGIFYT
ERN2_monDom KLPFTIPELVHASPCRSSDGVLYT
ERN2_macEug KLPFTIPELVQASPCRSSDGILYM
ERN2_sarHar1 KLPFTIPELVQASPCRSSDGIFYM
ERN2_sarHar2 KLPFTIPELVQASPCHSSDGIFYM
ERN2_ornAna KLPFTIPELVQSSPCRSSDGILYT
ERN2_anoCar KLPFTIPELVQSSPCRSSDGIIYT
ERN2_taeGut KLPFTIPELVQSSPCRSSDGVLYT
ERN2_galGal KLPFTIPELVQASPCRSSDGILYM
ERN2_xenTro KLPFTIPELVQSSPCRSSDGILYT
ERN2_xenLae KLPFTIPELVQSSPCRSSDGILYT
ERN2_tetNig KLPFTIPELVQASPCRSSDGVLYM
ERN2_takRub KLPFTIPELVQASPCRSSDGVLYM
ERN2_gasAcu KLPFTIPDLVQSAPCRSSDGILYT
ERN2_oryLat KLPFTIPELVQSAPCRSSDGILYT
ERN2_petMar KLPFTIPELVHASPCRTSDGVLYT
ERN1_homSap KLPFTIPELVQASPCRSSDGILYM
ERN1_panTro KLPFTIPELVQASPCRSSDGILYM
ERN1_ponAbe KLPFTIPELVQASPCRSSDGILYM
ERN1_rheMac KLPFTIPELVQASPCRSSDGILYM
ERN1_calJac KLPFTIPELVQASPCRSSDGILYM
ERN1_tarSyr KLPFTIPELVQASPCRSSDGILYM
ERN1_micMur KLPFTIPELVQASPCRSTDGILYM
ERN1_otoGar KLPFTIPELVQASPCRSSDGILYM
ERN1_tupBel KLPFTIPELVQASPCRSSDGILYM
ERN1_musMus KLPFTIPELVQASPCRSSDGILYM
ERN1_ratNor KLPFTIPELVQASPCRSSDGILYM
ERN1_dipOrd KLPFTIPELVQASPCRSSDGILYM
ERN1_cavPor KLPFTIPELVQASPCRSSDGILYM
ERN1_speTri KLPFTIPELVQASPCRSSDGILYM
ERN1_oryCun KLPFTIPELVQASPCRSSDGILYM
ERN1_vicPac KLPFTIPELVQASPCRSSDGILYM
ERN1_turTru KLPFTIPELVQASPCRSSDGILYM
ERN1_bosTau KLPFTIPELVQASPCRSSDGILYM
ERN1_equCab KLPFTIPELVQASPCRSSDGILYM
ERN1_canFam KLPFTIPELVQASPCRSSDGILYM
ERN1_myoLuc KLPFTIPELVQASPCRSSDGILYM
ERN1_pteVam KLPFTIPELVQASPCRSSDGILYM
ERN1_eriEur KLPFTIPELVQASPCRSSDGILYM
ERN1_sorAra KLPFTIPELVQASPCRSSDGILYM
ERN1_loxAfr KLPFTIPELVQASPCRSSDGILYM
ERN1_proCap KLPFTIPELVQASPCRSSDGILYM
ERN1_echTel KLPFTIPELVQASPCRSSDGILYM
ERN1_dasNov KLPFTIPELVQASPCRSSDGILYM
ERN1_choHof KLPFTIPELVQASPCRSSDGILYM
ERN1_monDom KLPFTIPELVQASPCRSSDGILYM
ERN1_ornAna KLPFTIPELVHASPCRSSDGILYM
ERN1_galGal KLPFTIPELVQASPCRSSDGILYM
ERN1_taeGut KLPFTIPELVQASPCRSSDGILYM
ERN1_anoCar KLPFTIPELVQASPCRSSDGILYM
ERN1_xenTro KLPFTIPELVQSSPCRSSDGILYT
ERN1_tetNig KLPFTIPELVQASPCRSSDGVLYM
ERN1_takRub KLPFTIPELVQASPCRSSDGVLYM
ERN1_gasAcu KLPFTIPELVQASPCRSSDGVLYM
ERN1_oryLat KLPFTIPELVQASPCRSSDGVLYM
ERN1_danRer KLPFTIPELVQASPCRSSDGILYM
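The column-by-column reading of this alignment can be reproduced with a few lines of code. The sketch below tallies residues at positions 15 and 21 (0-based) for a hand-picked subset of the ERN2 rows above; the subset is chosen only to keep the example short, and the same function would work unchanged on the full set of rows. Gap characters are skipped.

```python
from collections import Counter

# A subset of ERN2 rows copied from the alignment above.
ERN2 = {
    "homSap":  "KLPFTIPELVHASPCRSSDGVFYT",
    "monDom":  "KLPFTIPELVHASPCRSSDGVLYT",
    "macEug":  "KLPFTIPELVQASPCRSSDGILYM",
    "sarHar1": "KLPFTIPELVQASPCRSSDGIFYM",
    "sarHar2": "KLPFTIPELVQASPCHSSDGIFYM",
    "ornAna":  "KLPFTIPELVQSSPCRSSDGILYT",
    "galGal":  "KLPFTIPELVQASPCRSSDGILYM",
    "petMar":  "KLPFTIPELVHASPCRTSDGVLYT",
}


def column_counts(alignment, index):
    """Count residues in one alignment column, ignoring gap characters."""
    return Counter(seq[index] for seq in alignment.values() if seq[index] != "-")


col15 = column_counts(ERN2, 15)
col21 = column_counts(ERN2, 21)
print("position 15:", dict(col15))
print("position 21:", dict(col21))

# Species in the subset whose residue at position 15 is not arginine.
non_arg = [name for name, seq in ERN2.items() if seq[15] != "R"]
```

In this subset only sarHar2 departs from arginine at position 15, while position 21 splits between L and F, in line with the two observations made in the text.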
Ancient CpG in ERN2 homSap chr16:23625855-23625856
Human CG
Chimp CG
Gorilla --
Orangutan CG
Rhesus CG
Marmoset CG
Tarsier CG
Mouse lemur CG
Bushbaby --
TreeShrew CG
Mouse CG
Rat CG
Kangaroo rat --
Guinea Pig CG
Squirrel CG
Rabbit CG
Pika CG
Alpaca --
Dolphin CG
Cow CG
Horse CG
Cat CG
Dog CG
Microbat CG
Megabat --
Hedgehog CG
Shrew CG
Elephant --
Rock hyrax --
Tenrec CG
Armadillo CG
Opossum CG
Platypus CG
Lizard CG
Tetraodon CG
Fugu CG
Stickleback CT
Medaka CT
Lamprey CG
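The "30 of 32 species" figure quoted for the CpG can be recomputed from the table above. This is a minimal sketch that treats `--` as "no data" and counts the remaining dinucleotides; the table is copied verbatim, and species names containing a space are handled by splitting on the last whitespace only.

```python
from collections import Counter

TABLE = """
Human CG
Chimp CG
Gorilla --
Orangutan CG
Rhesus CG
Marmoset CG
Tarsier CG
Mouse lemur CG
Bushbaby --
TreeShrew CG
Mouse CG
Rat CG
Kangaroo rat --
Guinea Pig CG
Squirrel CG
Rabbit CG
Pika CG
Alpaca --
Dolphin CG
Cow CG
Horse CG
Cat CG
Dog CG
Microbat CG
Megabat --
Hedgehog CG
Shrew CG
Elephant --
Rock hyrax --
Tenrec CG
Armadillo CG
Opossum CG
Platypus CG
Lizard CG
Tetraodon CG
Fugu CG
Stickleback CT
Medaka CT
Lamprey CG
"""

# Map each species to its dinucleotide; rsplit keeps multi-word names intact.
calls = dict(line.rsplit(maxsplit=1) for line in TABLE.strip().splitlines())
counts = Counter(calls.values())

# Species with an actual dinucleotide call (everything except '--').
with_data = sum(n for dinuc, n in counts.items() if dinuc != "--")
print(f"CG in {counts['CG']} of {with_data} species with data; "
      f"{counts['--']} species without coverage")
```

The two exceptions, stickleback and medaka, both read CT, and seven species have no coverage at this site.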
Case of XXXX
(more shortly)
Case of YYYY
(more shortly)
Case of ZZZZ
(more shortly)

