Iron sulfur clusters: Difference between revisions

From genomewiki
A surprisingly large number of nuclear proteins contain 4Fe-4S clusters. They are made from their respective apoproteins in the cytoplasm during the final stages of an assembly process that begins within mitochondria. The process ends with a cluster embedded in polymerases, helicases, primases, telomerases, and photolyases, proteins with no explained need for a cofactor otherwise associated with oxidation and reduction.


These 4Fe-4S clusters do not spontaneously associate with their target proteins because they do not occur in free solution, being highly susceptible to unwanted oxidation. Instead, nascent clusters are attached to a series of mediating proteins, carrier scaffolds and conformational chaperones throughout a complex process of maturation. That process and the gene products involved -- which are conserved from yeast to human -- were recently [http://www.ncbi.nlm.nih.gov/pubmed/22609301 reviewed in depth], and new results ([http://www.sciencemag.org/lookup/doi/10.1126/science.1219723 1],[http://www.sciencemag.org/lookup/doi/10.1126/science.1219664 2]) have clarified the roles of the four main protein components that collaborate on the final stage of cytoplasmic assembly.


Not all extra-mitochondrial 4Fe-4S cluster proteins are assembled in this pathway, but the molecular basis for specificity has not yet been determined. Indeed, a surprising number of proteins -- some studied for decades -- were recognized as iron sulfur proteins only in 2011-12.
(coming shortly)
== Curated reference sequences ==
It serves no current purpose to collect all possible full-length MMS19 sequences from GenBank, so only a sample of 20 uniformly distributed over the eukaryotic phylogenetic tree is provided here. MMS19 presents no real homology complications, being present as a single-copy gene. Genes in early diverging eukaryotes are assumed to be single-exon, i.e. each is taken as the largest open reading frame enveloping the match to the ultra-conserved region. MMS19 has been studied experimentally only in yeast and human.
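The single-exon assumption above can be sketched in code. This is a minimal illustration, not the procedure actually used for these sequences: it assumes the match to the ultra-conserved region has already been located (for example by a translated search) as an in-frame interval on the forward strand of a DNA string. The function name `enveloping_orf` and its arguments are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def enveloping_orf(dna, hit_start, hit_end):
    """Return (start, end) of the longest ATG-to-stop ORF containing the hit.

    Assumptions (not from the source): forward strand only, and hit_start
    is the first base of a codon in the correct reading frame.
    Returns None when no complete ORF envelops the hit.
    """
    dna = dna.upper()

    # Walk upstream one codon at a time, remembering the most upstream ATG
    # seen before an in-frame stop codon or the start of the sequence.
    start = None
    i = hit_start
    while i >= 0:
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            start = i
        i -= 3
    if start is None:
        return None

    # Walk downstream to the first in-frame stop codon.
    j = hit_start
    while j + 3 <= len(dna):
        if dna[j:j + 3] in STOP_CODONS:
            if j < hit_end:
                return None  # a stop inside the hit: not a usable ORF
            return (start, j + 3)
        j += 3
    return None  # ran off the end without finding a stop


# Toy example: the hit GGGCCC lies inside an ORF that has two candidate ATGs;
# the more upstream one is chosen.
toy = "CCATGAAAATGGGGCCCTAAGG"
print(enveloping_orf(toy, 11, 17))
```

A real pipeline would also scan the reverse strand and treat the choice of the most upstream ATG as a guess, since the true initiator methionine cannot be determined from sequence alone.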
The [http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi apparent absence] of MMS19 in Giardia and various obligate parasites could be attributable to a reduced genome, extreme sequence divergence relative to available probes, or incomplete assembly -- it is inconceivable that these species lack core iron sulfur proteins of DNA metabolism. Indeed, the conserved cysteine pattern of the primase large subunit is readily located in these species. It remains conceivable, however, that the very earliest diverging eukaryotes retain components of the archaeal iron sulfur cluster formation system.
The yeast gene, sometimes called MET18 in that literature, is unsurprisingly single-exon (only 283 of 6000 yeast genes have introns) and is not located in a [http://www.ncbi.nlm.nih.gov/pubmed/9697097 yeast-type operon]. While some immediate neighbors are involved in DNA processes, none is homologous to iron sulfur cluster assembly components or has a recognized 4Fe-4S cofactor itself.
Gene  Position      Description
MET18  chrIX:113806  DNA repair, TFIIH regulator, nucleotide excision repair, RNA polymerase II, telomere maintenance
RRT14  chrIX:117024  rDNA transcription, localizes to nucleolus, involved in ribosome biogenesis
STH1  chrIX:117992  ATPase component in chromatin remodeling, expression of early meiotic genes, helicase-related protein homologous to Snf2p
KGD1  chrIX:122689  mitochondrial alpha-ketoglutarate dehydrogenase
ASG1  chrIX:102782  zinc cluster transcriptional regulator stress response
CSM2  chrIX:99860  homologous recombination repair, accurate chromosome segregation during meiosis
SIM1  chrIX:128151  may participate in DNA replication
The human gene has 31 coding exons. These do not correspond to natural breaks in the tertiary structure (e.g. HEAT units), and the ultra-conserved region is spread across parts of 3 exons. Thus, despite its modular structure, MMS19 had already completed its internal expansion of domain units prior to the main era of exon formation. It could not expand further today by exon duplication, because duplicated exons would need compatible [http://genomewiki.ucsc.edu/index.php/Opsin_evolution:_ancestral_introns#Intron_location_and_phase_for_dummies intron phasing] and would not correspond cleanly to structural units.
Exon structure of human MMS19: columns show exon number, amino acid size, intron phasing (donor bp overhang), primary sequence, and <font color=blue>ultra-conserved region</font>.
1  37  1  MAAAAAVEAAAPMGALWGLVHDFVVGQQEGPADQVAA
2  17  2  DVKSGNYTVLQVVEALG
3  33  1  SSLENPEPRTRARAIQLLSQVLLHCHTLLLEKE
4  29  0  VVHLILFYENRLKDHHLVIPSVLQGLKAL
5  25  0  SLCVALPPGLAVSVLKAIFQEVHVQ
6  23  1  SLPQVDRHTVYNIITNFMRTREE
7  43  1  ELKSLGADFTFGFIQVM<font color=blue>DGEKDPRNLLVAFRIVHDLISRDYSL</font>
8  21  0  <font color=blue>GPFVEELFEVTSCYFPIDFTP</font>
9  29  0  <font color=blue>PPNDPHGIQREDL</font>ILSLRAVLASTPRFAE
10  25  0  FLLPLLIEKVDSEVLSAKLDSLQTL
11  26  0  NACCAVYGQKELKDFLPSLWASIRRE
12  46  1  VFQTASERVEAEGLAALHSLTACLSRSVLRADAEDLLDSFLSNILQ
13  52  0  DCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNVLPLLLEQFHKHSQ
14  26  1  SSQRRTILEMLLGFLKLQQKWSYEDK
15  42  1  DQRPLNGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQP
16  28  2  DLLSYEDLELAVGHLYRLSFLKEDSQSC
17  33  1  RVAALEASGTLAALYPVAFSSHLVPKLAEELRV
18  50  1  GESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIVKETLPLLLQHLWQVNR
19  52  1  GNMVAQSSDVIAVCQSLRQMAEKCQQDPESCWYFHQTAIPCLLALAVQASMP
20  34  2  EKEPSVLRKVLLEDEVLAAMVSVIGTATTHLSPE
21  34  0  LAAQSVTHIVPLFLDGNVSFLPENSFPSRFQPFQ
22  23  0  DGSSGQRRLIALLMAFVCSLPRN
23  42  1  VEIPQLNQLMRELLELSCCHSCPFSSTAAAKCFAGLLNKHPA
24  34  0  GQQLDEFLQLAVDKVEAGLGSGPCRSQAFTLLLW
25  19  0  VTKALVLRYHPLSSCLTAR
26  62  1  LMGLLSDPELGPAAADGFSLLMSDCTDVLTRAGHAEVRIMFRQRFFTDNVPALVQGFHAAPQ
27  28  0  DVKPNYLKGLSHVLNRLPKPVLLPELPT
28  55  0  LLSLLLEALSCPDCVVQLSTLSCLQPLLLEAPQVMSLHVDTLVTKFLNLSSSPSM
29  20  0  AVRIAALQCMHALTRLPTPV
30  34  2  LLPYKPQVIRALAKPLDDKKRLVRKEAVSARGEW
31  9  0  FLLGSPGS*
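The columns of the exon table above allow two quick checks, sketched here. The first confirms that each stated size equals the length of its sequence once the `<font>` markup is removed. The second lists "symmetric" exons, those whose flanking introns share the same phase; tandem duplication of an exon preserves the reading frame only in that case. The sketch assumes the phase column gives the phase of the intron following each exon, so the intron preceding exon n carries the phase listed for exon n-1. The function names are hypothetical, and only rows 5 to 9 of the table are used as sample input.

```python
import re

SAMPLE_ROWS = [
    "5  25  0  SLCVALPPGLAVSVLKAIFQEVHVQ",
    "6  23  1  SLPQVDRHTVYNIITNFMRTREE",
    "7  43  1  ELKSLGADFTFGFIQVM<font color=blue>DGEKDPRNLLVAFRIVHDLISRDYSL</font>",
    "8  21  0  <font color=blue>GPFVEELFEVTSCYFPIDFTP</font>",
    "9  29  0  <font color=blue>PPNDPHGIQREDL</font>ILSLRAVLASTPRFAE",
]

def parse_exon_rows(rows):
    """Parse 'number size phase sequence' rows into tuples, dropping font tags."""
    exons = []
    for row in rows:
        number, size, phase, seq = row.split(None, 3)
        seq = re.sub(r"</?font[^>]*>", "", seq).strip()
        exons.append((int(number), int(size), int(phase), seq))
    return exons

def size_mismatches(exons):
    """Exon numbers whose stated size differs from the sequence length."""
    return [number for number, size, _, seq in exons if size != len(seq)]

def symmetric_exons(exons):
    """Exon numbers whose upstream and downstream intron phases agree.

    The first row serves only as the upstream neighbour and the last row is
    skipped, because the supplied rows give no phase on one side of them.
    """
    result = []
    for previous, current in zip(exons, exons[1:-1]):
        if previous[2] == current[2]:
            result.append(current[0])
    return result

exons = parse_exon_rows(SAMPLE_ROWS)
print("size mismatches:", size_mismatches(exons))
print("symmetric exons in sample:", symmetric_exons(exons))
```

Phase symmetry is only a necessary condition for frame-preserving duplication; it says nothing about whether an exon matches a structural unit such as a HEAT repeat.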
Alignment of ultra-conserved region:
MMS19_homSap  FTFGFIQVMDGEKDPRNLLVAFRIVHDLI-------SRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHG-IQREDLILSLR
MMS19_musMus  FTFGFIQVMDGEKDPRNLLLAFRIVHDLI-------SKDYSLGPFVEELFEVTSCYFPIDFTPPPNDPYG-IQREDLILSLR
MMS19_cioInt  FLYQYIQVIDGEQDPRNLLTIFQLTKNLI-------ESSFPLFDLVEELFDVSSCYFPIDFNPAAAGKKSTITNLDLVSSLR
MMS19_braFlo  FVWGFIQAMDGEKDPRNLIIAFSIAR-IV-------AQAFPIGTFTEELFEVISCYFPIDFTPPADDPHG-VTREDLVLGLR
MMS19_strPur  FVLGLLHAMDGEKDPRNLILLFNILP-TV-------INNFKIDMFIEETFEVVACYYPVDFHPPPNDPYG-ISREKLALSLK
MMS19_sacKow  FVFGYIQCMDGEKDPRNLTMIFRCVP-II-------IHNFPIDVFIEELFEVVSCYFPIDFTPPPNDPYK-VTQEELVLGLR
MMS19_dapPul  FVFGLIQLADQERDPRNLLILFSIFPVVA--------RYFRFEPFTEEFFEVFSCYFPIDFTPPANDPYA-VTKEQLCDGLR
MMS19_droMel  FVYGLINSIDGERDPRNLDIIFSFMPEFL--------STYPLLHLAEEMFEIFACYFPIDFNPSKQDPAA-ITRDELSKKLT
MMS19_nemVec  LVFGFLQAMDGEKDPRNLVVAFKLAR-II-------IKNFPIGLFAEDLFEVTSCYFPIDFTP-------------------
MMS19_triAdh  FVFGYIQVMDGEKDPRNLLLALKIAKFIV--------QNFNIDLFLDDFFEIISCYFPIDFTPPPNLP----SNENVTK---
MMS19_sacCer  FIETFLHVANGEKDPRNLLLSFALNKSIT-------SSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYK-ISNQDLKTALR
MMS19_schPom  FFSGICSTFAGEKDPRNLMLVFSMLK-KI-------LSTFPIDGFEQQFFDITYCYFPITFRAPPDATNLAITSDDLKIALR
MMS19_araTha  LVYAMCEAIDGEKDPQCLMIVFHLVELLAPLFP---SPSGPLASDASDLFEVIGCYFPLHFTHTKDDEAN-IRREDLSRGLL
MMS19_dicDis  FMVGYLQFIDNEKDPRNLIFSFKLLPKVIYNIP---EHKHFLES----LFEIISCYFPISFNPKGNDPNS-ITKDDLSNSLL
MMS19_pytUlt  FAQAFLNAMEGEKDPRNLLLCLQIARELLAKLE---VVFDRHDAVLQQYFDVVSCYFPITFTPPPNDPYG-ITSEELILSLR
MMS19_sapPar  LMDGFLRAMSGEKDPRNLLFCLRFAAELLTTYA---NVVDAD--VAKGFFDATSCYFPITFRPPPNDPYG-ITSEDLVLALR
MMS19_polPal  FMAGFLQFIDGEKDPRNIIYTFRLIPRVILYIP---EYKNFADS----LFEILSCYFPISFNPKPGDPNS-ITKDDLVSSLL
MMS19_entHis  -IDTCVQLIELERDPECLKEVFDLIKLVS-------QKNEIDADSAPLLFDCASAYFPILYPPKGDEA----LRIDLTNKIL
MMS19_naeGru  FLNGFIQSLEGERDPGNLLYCFNLIPKVIAIFDDSELSSKILSAVSDDLFDITSCYFPITYTPPANDTRG-ITREDLSRSLK
MMS19_phyInf  LAQTFLSAMEGEKDPRNLLLCMQVARTLLSKLE---PVFSRSDTLLQQYFDVVSCYFPIIFTPPPNDPYG-ITSEGLILSLR
MMS19_albLai  FIRSFLNAMTGEKDPRNLKHCFQIAQTMMQKLE---MVFQEAE-LSEQYFRVISCYFPITFTPPPNDPYG-VTTEELIRSLR
Consensus      f  g  q  dgEkDPrnL  f                          lF#v sCYFPI ftpp ndp  it edl  lr
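A consensus line of this kind can be derived column by column. The rule that produced the Consensus row above is not stated, so the sketch below uses an assumed convention: uppercase when every sequence agrees, lowercase when one residue occurs in more than half of the sequences, and a space otherwise. The function name `consensus` and the `majority` parameter are hypothetical, and the input is a ten-column excerpt (the DGEKDPRNLL segment) from four of the aligned rows.

```python
from collections import Counter

def consensus(rows, majority=0.5):
    """Column-wise consensus of equal-length aligned rows ('-' marks a gap).

    Assumed convention: uppercase = invariant, lowercase = present in more
    than `majority` of the rows, space = no residue reaches that share.
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    out = []
    for column in zip(*rows):
        counts = Counter(c.upper() for c in column if c != "-")
        if not counts:
            out.append(" ")  # column made up entirely of gaps
            continue
        residue, n = counts.most_common(1)[0]
        if n == len(rows):
            out.append(residue)
        elif n / len(rows) > majority:
            out.append(residue.lower())
        else:
            out.append(" ")
    return "".join(out)

excerpt = [
    "DGEKDPRNLL",  # MMS19_homSap
    "DGERDPRNLD",  # MMS19_droMel
    "ELERDPECLK",  # MMS19_entHis
    "DGEKDPQCLM",  # MMS19_araTha
]
print(repr(consensus(excerpt)))
```

With only four rows the thresholds are coarse; applied to all rows of the alignment the same function gives a profile that can be compared with the Consensus row.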
Summary of MMS19 reference sequences:
MMS19_homSap Homo sapiens mammal Q96T76 1030 aa 7 HEAT 100%
MMS19_musMus Mus musculus mammal Q9D071 1031 aa 9 HEAT 89%
MMS19_cioInt Ciona intestinalis urochordate XP_002128657 1026 aa x HEAT 32%
MMS19_braFlo Branchiostoma floridae cephalochordate XP_002588594 1027 aa x HEAT 42%
MMS19_strPur Strongylocentrotus purpuratus echinoderm XP_001194909 975 aa x HEAT 36%
MMS19_sacKow Saccoglossus kowalevskii hemichordate XP_002735310 1007 aa x HEAT 40%
MMS19_dapPul Daphnia pulex crustacean EFX86854 961 aa x HEAT 38%
MMS19_droMel Drosophila melanogaster insect NP_649519 959 aa x HEAT 30%
MMS19_nemVec Nematostella vectensis cnidarian XM_001629116 897 aa x HEAT 32%
MMS19_triAdh Trichoplax adhaerens placozoan XP_002114595 959 aa x HEAT 39%
MMS19_sacCer Saccharomyces cerevisiae budding yeast P40469 MET18 1032 aa 13 HEAT 29%
MMS19_schPom Schizosaccharomyces pombe fission yeast Q9UTR1 1018 aa 14 HEAT 25%
MMS19_araTha Arabidopsis thaliana plant NM_124186 1134 aa x HEAT 28% armadillo/beta-catenin-like
MMS19_dicDis Dictyostelium discoideum slime mold Q54J88 1115 aa 18 HEAT 31%
MMS19_pytUlt Pythium ultimum stramenopiles ADOS01001616 957 aa 4 ARM units 31%
MMS19_sapPar Saprolegnia parasitica stramenopiles ADCG01000470 804 aa 1 large ARM repeat 31%
MMS19_polPal Polysphondylium pallidum amoeba EFA86574 994 aa x HEAT 28%
MMS19_entHis Entamoeba histolytica amoeba XP_651925 868 aa x HEAT 25%
MMS19_naeGru Naegleria gruberi early eukaryote: heterolobosea XP_002678884 1070 aa x HEAT 27%
MMS19_phyInf Phytophthora infestans early eukaryote: stramenopiles 1114 aa x HEAT 33%
MMS19_albLai Albugo laibachii early eukaryote: stramenopiles 1077 aa x HEAT 27%
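Because each summary line states a length in amino acids, the FASTA records can be checked against their own headers. The sketch below is one hedged way to do that. It assumes the header carries the length as "NNN aa" and that a trailing "*" marks the stop codon and is not counted. The function names are hypothetical and the input shown is a toy record, not one of the reference sequences.

```python
import re

def read_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    parts = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            parts[header] = []
        elif header is not None:
            parts[header].append(line)
    return {h: "".join(chunks) for h, chunks in parts.items()}

def length_mismatches(records):
    """List (name, stated, observed) where the header length is not met."""
    bad = []
    for header, seq in records.items():
        match = re.search(r"\b(\d+) aa\b", header)
        if match is None:
            continue  # header states no length
        stated = int(match.group(1))
        observed = len(seq.rstrip("*"))  # the stop marker is not a residue
        if observed != stated:
            bad.append((header.split()[0], stated, observed))
    return bad

toy = """>toy_ok demo 5 aa
MAAAA*
>toy_bad demo 4 aa
MAA
AAA*
"""
print(length_mismatches(read_fasta(toy)))
```

Any record reported this way deserves a second look at either the header or the sequence before it is used in an alignment.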
<pre>
>MMS19_homSap Homo sapiens mammal Q96T76 1030 aa 7 HEAT 100%
MAAAAAVEAAAPMGALWGLVHDFVVGQQEGPADQVAADVKSGNYTVLQVVEALGSSLENPEPRTRARAIQLLSQVLLHCHTLLLEKEVVHLILFYENRLK
DHHLVIPSVLQGLKALSLCVALPPGLAVSVLKAIFQEVHVQSLPQVDRHTVYNIITNFMRTREEELKSLGADFTFGFIQVMDGEKDPRNLLVAFRIVHDL
ISRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHGIQREDLILSLRAVLASTPRFAEFLLPLLIEKVDSEVLSAKLDSLQTLNACCAVYGQKELKDFLPS
LWASIRREVFQTASERVEAEGLAALHSLTACLSRSVLRADAEDLLDSFLSNILQDCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNVLPLLLEQ
FHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPLNGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPDLLSYEDLELAVGHLYRLSFLKEDSQ
SCRVAALEASGTLAALYPVAFSSHLVPKLAEELRVGESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIVKETLPLLLQHLWQVNRGNMVAQSSDVIAVCQ
SLRQMAEKCQQDPESCWYFHQTAIPCLLALAVQASMPEKEPSVLRKVLLEDEVLAAMVSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSR
FQPFQDGSSGQRRLIALLMAFVCSLPRNVEIPQLNQLMRELLELSCCHSCPFSSTAAAKCFAGLLNKHPAGQQLDEFLQLAVDKVEAGLGSGPCRSQAFT
LLLWVTKALVLRYHPLSSCLTARLMGLLSDPELGPAAADGFSLLMSDCTDVLTRAGHAEVRIMFRQRFFTDNVPALVQGFHAAPQDVKPNYLKGLSHVLN
RLPKPVLLPELPTLLSLLLEALSCPDCVVQLSTLSCLQPLLLEAPQVMSLHVDTLVTKFLNLSSSPSMAVRIAALQCMHALTRLPTPVLLPYKPQVIRAL
AKPLDDKKRLVRKEAVSARGEWFLLGSPGS*
>MMS19_musMus Mus musculus mammal Q9D071 1031 aa 9 HEAT 89%
MAAATGLEEAVAPMGALCGLVQDFVMGQQEGPADQVAADVKSGGYTVLQVVEALGSSLENAEPRTRARGAQLLSQVLLQCHSLLSEKEVVHLILFYENRL
KDHHLVVPSVLQGLRALSMSVALPPGLAVSVLKAIFQEVHVQSLLQVDRHTVFSIITNFMRSREEELKGLGADFTFGFIQVMDGEKDPRNLLLAFRIVHD
LISKDYSLGPFVEELFEVTSCYFPIDFTPPPNDPYGIQREDLILSLRAVLASTPRFAEFLLPLLIEKVDSEILSAKLDSLQTLNACCAVYGQKELKDFLP
SLWASIRREVFQTASERVEAEGLAALHSLTACLSCSVLRADAEDLLGSFLSNILQDCRHHLCEPDMKLVWPSAKLLQAAAGASARACEHLTSNVLPLLLE
QFHKHSQSNQRRTILEMILGFLKLQQKWSYEDRDERPLSSFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPGLLSAEDLELAVGHLYRLTFLEEDS
QSCRVAALEASGTLATLYPGAFSRHLLPKLAEELHKGESDVARADGPTKCSRHFRCLQALSAVSTHPSIVKETLPLLLQHLCQANKGNMVTESSEVVAVC
QSLQQVAEKCQQDPESYWYFHKTAVPCLFALAVQASMPEKESSVLRKVLLEDEVLAALASVIGTATTHLSPELAAQSVTCIVPLFLDGNTSFLPENSFPD
QFQPFQDGSSGQRRLVALLTAFVCSLPRNVEIPQLNRLMRELLKQSCGHSCPFSSTAATKCFAGLLNKQPPGQQLEEFLQLAVGTVEAGLASESSRDQAF
TLLLWVTKALVLRYHPLSACLTTRLMGLLSDPELGCAAADGFSLLMSDCTDVLTRAGHADVRIMFRQRFFTDNVPALVQGFHAAPQDVKPNYLKGLSHVL
NRLPKPVLLPELPTLLSLLLEALSCPDSVVQLSTLSCLQPLLLEAPQIMSLHVDTLVTKFLNLSSSYSMAVRIAALQCMHALTRLPTSVLLPYKSQVIRA
LAKPLDDKKRLVRKEAVSARGEWFLLGSPGS*
>MMS19_cioInt Ciona intestinalis urochordate XP_002128657 1026 aa x HEAT 32%
MEKVKFEMEEMIQLWLRDKNDDHKILKCAQQIENREQTIGDLVTALGPHLTNKDTKIRIDACTLLSNVIHKLPKDCLNQGELESLVQFLCSRLEDHYTLQ
PVALSLLLQLSSADNLTGENACSIITSVFKEVHIQTCMQHDRLKIFQILGTLLDIHTKDVITMGRDFLYQYIQVIDGEQDPRNLLTIFQLTKNLIESSFP
LFDLVEELFDVSSCYFPIDFNPAAAGKKSTITNLDLVSSLRGVLASTKQFAQYCIPLMLEKLESDVESAKIDSLETLTACLGCYGKQELEKYLSSLWSDV
KREINQSSSEQIEKCCLTFLTSLLSNLSSWPVDQKSEKATDLKSFLDDVLEDCVPRLQAQSDDRSKWMAGHVVLACAKSSKKACSQIVTTVLPILLQNAQ
SKSASTTLAGQSVQQSALDNLVKLTAVCGQFNFENHPVLKKKEEFFTILNELALKSEIEEQLKCIAVAGFASLLKLEILSNVELTEIASLLLKMIKLKPE
SHLRGEVLSVAGYLSSQHPDVAKSHLIPCVMRRMEEGDDSCFDVLASVCTHFDVLKLVLGFIMERIVNTQVDETSEPLLHACLESLQKMTSSSWVGNTEI
EYMALNLVLPLLKRCIEVTLELSVPEQCCANCHIFEDVSKECASLPILKSAAIVIRNVCQKLKPGKSTDLVIQLIASLYNNSKLSSLDIKSDVHFTPFHP
KASPLQTRTLCFLPATICALHPNIEIPELAELETKLLNTCLHCTDQPSYVFAAKALSGLVNKYKKPSIPILEKLKSHFDTDPNWSLKSEEEKMMILTLLI
WICKALVLSNHPDSLIFIKNLLYWMGDDSVGEVAAAGFDIILRESNEVLSPSSHSTIRLMHKQRFFLLIIPEIVSSFKTSENKTQQTNILTALSHLIGHL
PKQVLMQHFTELLPLLTQALHTDNTQLLKSVLSTLFCFIQDTTEAMTAHLENLMKHFLRLSKFKQDIDVRVKAVQCIGVVTLLPPIVILPFKNDIVRHLV
SVLDDRKRDVRTEASKARSEWFLVGT*
>MMS19_braFlo Branchiostoma floridae cephalochordate XP_002588594 1027 aa x HEAT 42%
MAALSGNVQENVLEFVQGQQDSALQSVAKAVFDGETSLLQLVESLGSSLTSTEVTTRARATQLLAEVLHRSPSNRLTEKEAEVLSAFFCDRLLDHHSVQP
HVLHGLLALSAAPQLPQGEEVKIVQVIFKEVYVQSLVQTDRRAIYNILANFLDTRLEALQALGADFVWGFIQAMDGEKDPRNLIIAFSIARIVAQAFPIG
TFTEELFEVISCYFPIDFTPPADDPHGVTREDLVLGLRQVLAATSKFAQFCVPMLLEKLSSDVTSAKLDSLHTLAACAEVYGADSMKSFLDLLLSAISKE
VYSSIHQDVENAALAGLTAVVATLSHAVTETRSVFSLHHFLDSLLKGCKHHLCEPELKLMWPSAKLLQAAARASDPACVHVLDTAVPLLVEQFQVHPYPQ
HRHTILEVTIAFIHVAHASTSGTDAPNPVVPHSDNFLTLFYSVLEDADAGLRSSGVGGMAAMIGITDVVKGKHLDLCANHLGRLVLHDADPTVQRRSTEA
LAAMATAHPDVVREEVLSKLLQVLENNNPNAMDTNQSEQVCAKHVTNQYVLNTLAAVSTHPTIVRCTIPKLLSHLQALIESCPDQATQEAIATLDCVYKV
VEKTVINDANVEYFVDTIVLNLMSMALSAAVNTSDENLLHDTSLLEIVAKVLRAVARSLPNSTGKGIVNSTVQAFLQGNLAAISLNTSASFEPLDVSSPW
QQTQTVQLLSAIVCSVARNVDIPSISELAQKLLTLSCASDHEPTSLAAAKSLSGVVNKWDQGEQLQTFLQETRDCLEQILSKTEDEKARCRAVAVLVWLT
KALVIRGHPSGSQFTKTLMALFEDEAIGRRAAEGFYVILSDSPDVLSKESHANIRLMYKQRFFMENLPALVDGFNQADDGRKQSFLCAVSHLLTFIPRQV
LLGALPPLVPLLVQSLLGEDPSLQVSTLEMFSSLVQEAPQVISKNIDALIPQLLELSKNGPTMKVRMAALKSVGSMTSLPHAVVYPYRNRVVRELAVAVG
DKKRMVRKEAVAARGEWFLLGSPGGK*
>MMS19_strPur Strongylocentrotus purpuratus echinoderm XP_001194909 975 aa x HEAT 36%
MGTYLMSTETRIRAKGVELMSEVLTSLPRAFLNQQQIQVLIEFLCARLLDHHSITQHTLKGLLAMSSQSSFPPSSSVQVMTAIFKEVQVQTMLQVDRRTV
YNIVVNLLRISLTELQGMGSEFVLGLLHAMDGEKDPRNLILLFNILPTVINNFKIDMFIEETFEVVACYYPVDFHPPPNDPYGISREKLALSLKTCLSST
PKFAQFCLPMVMEKLSSDLQTARLDSYQLLQACAPVYSQGDMMSYIEAIWSYCRKELMVGASVELDQEAVKTLGAVVKAVSTGIQSTSGGGGGDGDLNSF
LRNILTECRQHLCKPDHRMIHPCSKLLVAVATASYPACIAILKYSVPILLDQFHIFDQTRERMVLLDIIQRLLHSGHDHIKEADDWRAIYAHHLDTVVTT
VLSTLAKDQGVPDLRMAGLGTLGELVQVPVLMDQSRLELVGQELTRILLEEANEDVRCECIETVSCFASRHTEFVKSTILTTLWTTVQKGESGYQRIVVD
MATIVTDTSDDLSRSLTSELMVEIAKTELNSEQHLVYLATLNTLTAHLSSAPSNLESLLSSVVHPLMKMVVSATLQSSVEAGNNPHCCGEVLVAMAEVFR
TVIPKLDSSMGSKLCQCAVDVFLHGNLTSLELTNPNTSVPFSPLDPRAPVHQTQLVTVLQPIVCSLRRDIHIPSSKQLMSSLLHIAAHSRAWLASTWAAK
GLAGGVNKHPAGSDLDEVLVEAESLLGQAMSSGQEGSVKQQALMAWVLLTKALVMRSHPKATAFLTTLLRLLEDAELGAQVPQTLGMLLEDMRDVLSEGL
HADVKIMYKQRVFLQALPFVMALFNKDDLRTKAITALCHLLPSIPRPVLLAELPPIIPRLVQSLRVTDPRPTLPILDILESLLEETLPSLVDQADTLLPT
LLELSAYQASMKVRIASLKCVGAITSFPHHLVYPHQETVVRSLAPRLDDKKRLVRQEAGKARTKWILLQQDTKG*
>MMS19_sacKow Saccoglossus kowalevskii hemichordate XP_002735310 1007 aa x HEAT 40%
MATSMCIEIVENYVRGEDESAIHAAEIKILELVENLGTYLTSTEKNIRCRGTRLLSEVLNRLPKNFLSSDEVRALVIFYCDRLSDHYSVTPHTLLGMLAL
STYDNLPKGCEVQLVQAIYKEVHVQSMIQVDRRSVYAILSNLLDTRIKDLQSLGRDFVFGYIQCMDGEKDPRNLTMIFRCVPIIIHNFPIDVFIEELFEV
VSCYFPIDFTPPPNDPYKVTQEELVLGLRKCLAATPKFAEHCLPLLMEKLSSDVQRAKIDSFLTLAECCEVYGEDDLMEFLPAMWSTIRREVFQAFSHEV
EKSALTCLCSIVKTLSNAVSNANKAAGGLDEFLDLVLKDCSKHLRDPGLRLMLPTSKLLQSAASASDPACYKIISAVVPILLEQFHKCKQVNERVSLLHA
ALDFIKVCKSFTFGDDTPSPVIPFKDSLASLFLSLLSDHSSQLRCIGITGLVGLMSLNAIMNINEKKLAAMHFTNIVLTDQDNKVCSEAVTALAFMSMEF
SLLVKEEVLPQLIKELDSRATGTRHRFIVNTLAGISMHSDIVLTTIPVMLQHLGTLSEDNTAESLETAVNTIQSIDIVVNSNISDEQCLDFFHSKLLPQL
LRITVDQALQVNNYILCKEDVVSSIATVCRNIAKVLDDRVASNLVSNTISLFLDGNLENIGLKQSSQHFRPLEISSPWQHTQLVSLLTSIICSMKTFELS
SQCLELMEKLLKLSLSSEHHLTCVSAAKCYAGLVNKHKQGTDLDSSLETVVESTCRMLQDEISDQNIYNRQKALTLWLWVTKALVLRAHFKSTQFTTKLI
SLFEDHQLSQMAADGFYIILSESQDVLNKDMHCDIKLMYRQRLFMQTLPRILAGFEKANEDKKQYYLSALSHLLQFIPKQVLLSELPPLMPMLVQSLYCQ
DVGLYVSTSDTLSMLIQDAPTVISLYVDTLLPQLLTLSTYQQSMKVRIAALKCIGLFVTLPTHVIYPRQKEVVRRLASVLDDRKRLVRQQAVTARGLWFL
LEAPKK*
>MMS19_dapPul Daphnia pulex crustacean EFX86854 961 aa x HEAT 38%
MAISTAIQKLRDSFNSEESANESIRCISQSIASKELTILKLVEDLQPDLLNQQNTHRCKAVSTIGTILEQLGPELKGLNEKEVELVTEFFCSKLKDHHSI
LPAALQGLHALSTAPKLSPGLARLISQSIFQDVHCQSQLQHDRRAIYKTLKNLLAFHLKELQDLGQDFVFGLIQLADQERDPRNLLILFSIFPVVARYFR
FEPFTEEFFEVFSCYFPIDFTPPANDPYAVTKEQLCDGLRQCLAGSPHFAEYCLPLLQEKLESDLVSAKVEALKTLELCCQTYQAGQLEKWVDSFWTGIR
REVLINVNTDDLEHASLDALAALSRAFTTDGEFNSPAFTKLLKNVLTECQGHLCEPERRLMTPSSYILLAICSGSAPACALIVSQVIPLLMDQYRIRPQS
NPRQFILNSLNKMVHAGLYGFTEENVAQSGLASLIPKLLELYLEVLKEDDAVLRNLSLQGLSHLIGTCLNHQDLEKVNGTLLDLLQKSTATDSVIAEIGH
FFCKSAEKNENLFLEQVLVKLLDIAVSGSIPTDGCARTIRPGITTGSTQSFDSFRTKGRTRNIPAIAPLGIENIGRRRSSGVTPSHLCLIEKSGFQFIFR
VILLDQNVGRNVFVTFSALYRKATEFINEQTEQYVSQHLARSPWTLSIMEATLGSLDATPSGHSLERLVNTLEPLTVCHPKADVRLSACRLMAALVNKLP
EGHELEAILDSLRRKWQDPSTDRCNSVCLFVWITKALLMRSYSKLNQYIQELVDSLNDPTHGYQVAEGFKTILCDTEECLNFNCHANIRLMYRQRFFQEV
VPRLLKLYRESESCNKAACFAAIANQLAFIPEGVLIAHITTLIPLLIQCLSTDQPAQLIISTINAFMGLMSDNVSAIEEYISSLVPRLLTLAKDGITMDV
RRLALQCLSELRKAQSIVLLPLRSEVILRLVPCLSDKKRLVRREAALARQKWIMLGQPGCN*
>MMS19_droMel Drosophila melanogaster insect NP_649519 959 aa x HEAT 30%
MTTPTRATLEKALKSDQKLVNSATQIAKDLTAKAYDISALAEALGFALSSPDMEERVAGTNLLSAVLIALPQDLLQERQLEFLSTFYMDRLRDHHNVMPA
IIDGIDALVHMKALPRAQIPQILQSFFEHTTCQSQTRSDRTKLFHIFQYLTENFQDELQAMAGDFVYGLINSIDGERDPRNLDIIFSFMPEFLSTYPLLH
LAEEMFEIFACYFPIDFNPSKQDPAAITRDELSKKLTNCLVANNEFAEGTVVLAIEKLESELLVAKLDSIELLHQAAVKFPPSVLEPHFDQIWQALKTET
FPGNDNEEILKASLKALSALLERAAHIPDISHSYQSSILGVILPHLSDVNQRLFHPATGIALVCVSGDAPYAADKILNSFLLKLQAADASSEQRIKIYYI
VSQVYKLSALRGSLQKLDTTIRESVQDDVIASLRLIEQEEFDAKKEDLELQKAALSVLNESAPLLNEKQRALIYKALVQLVSHPSIDIDFTTLTVSLGAL
QPVEVQSNFIDVCVRNFEIFSTFVKRKIYTNLLPLMPQIAFTQRILDLVMTQTFNDTTAEPVRLLALEALNKLLLLADQRFIVDVQQESNLLHKLIELGQ
KTEGLSMQSLEQIAGALSRITQQLPLSEQSAIVSEYLPGLNLSQSADLYITKGLLGYLHKDITLDDHFERLLTDLTQLSLNSDNEQLRVIAHHLLCSMVN
KMESNPANLRKVKKITEQLKVAIKKGDVRAVEILAWVGKGLVVAGFDEAADVVGDLSDLLKHPSLSTAAALGFDIIAAEYPELDLPVVKFLYKQKLFHTI
MGKMGSKLANYCVHHLKAFVYVLKATPQAVIKLNIEQLGPLLFKSLEEHNEAQSLCIALGICEKFVAQQDTYFQGHLAHLIPSCLELSKYKAQHTMQVRI
AALQLLYDVTKYPTFVLLPHKVDVTLALAAALDDPKRLVRNTAVKARNAWYLVGAPSPN*
>MMS19_nemVec Nematostella vectensis cnidarian XM_001629116 897 aa x HEAT 32%
MAALGQEEYPSLATLLQDVYQRRKNLLQVVELLGPSLTSTDTDKRCSAVQLLSSLLQKLVNYKLTDREDLKPVGSDLVFGFLQAMDGEKDPRNLVVAFKL
ARIIIKNFPIGLFAEDLFEVTSCYFPIDFTPFCLPLLMEKLSSDVINAKIDSLLTLVFQTVSSELEDAAFKALSSIIKNLESSSPGQEPFLSRIFINFYT
ISCYVTQCHPDVVEFKTPFLDCVIKECCANIEGADLRKVKPSGQLLQAAFVTDTTYNEITSTAVPLILSKYNDEATQGLVKKLLLDVLLGLLTASKPYYK
RKGSVLASHTSALVDVLFSALVSDSPSLCRAAIAGLVSMVTLPGLLLEQKVGMFVEHLTSFVLNTKDLTVRQESNAALAFLAMEFPELIKTKLVSVLAEQ
LQKEDGSAMDEENISHLQSDKSHPQYDQMLNTLSAVCTEEGVVRHVVPIILDHGEYLVTGKDLERGVLHGKISETLKCLNSIVKGTLQSSTVEPNYYTEV
VIYRIIDLCTQSALQESPDCPMATPEALALVCSIVRQVISHLAVNEAEDVLHIIVSNFIEGKTPLSARAEQKFAPLEPSSPWQQSQLVTVLMAAVCSARR
EVRIPRQKELVPRLQVLASGCNHRKTTVAASKCLGGIINKMAQGDDLTADLHSLKGQLQNHMDGNEEQRWRAVITWLWLTRALVTRSHPMAQEFVQKVLH
LLDDVSVGRVAADGFYVIVSDCDDVMNQAMHADIKMMYKQRFFMETLPLLLKGFHDTRPECKYLYLCALSHLLQWIPKQVLLTEIPTLMPMLIQALSRDE
PSLLLSTLQTLYSLVFDAPEVISRQVTSLIPNFLELAKCKASMKVRMEAIKCLGAMTTLEHHVVYVYKARVIKELACTLDDPKRLVRAEAVKCRNEW*
>MMS19_triAdh Trichoplax adhaerens single-celled metazoan XP_002114595 959 aa x HEAT 39%
MEKDSSAKSLQQLMDEFILGNSSAINEIIKGIYDGHIKLSTIVELLGPYLTSVEHEKRLQGMKLLSEVLQMLSMYKMQATEVQLLVAFYSDRLQDHFSIL
PETLRGILALVQHQIISEEDAVTIVKGIFKEVQNQALLQADRNKVYAILAGLLDKHYEGIKIMDADFVFGYIQVMDGEKDPRNLLLALKIAKFIVQNFNI
DLFLDDFFEIISCYFPIDFTPPPNLPSNENVTKEDLIIVLRESLTSTRKFAGISCAKIYTATDFQEYLQPIWTAIRQEVFLSMDDQVQELSLEALKHVVV
TISSNSLQQPDQDPLNDFINMIVTETQQYLQDPELKLANPCGNVLNAVASASDRSCYSILTPIIPRLVNLYSTDKTVIFRCKVLDILIKLLNAAANCQLS
EQFIAPMDWHEIVKLLQLAMDTSEEDIRLRVTASFSILIQIKDALPADEIERISNDILKRALEDPSSIVRHGSISTLATIASVLPDVIITTVIPYIRTSV
TNLQLLLQCLANVKNRIENCLYLYHYLFDDILWLCVYNSLEESINSFEFKTIKIIASIGQLIYLNLDESSQKKFIDNLLELFMNGQVSVLKPMTVIDELP
LKQFYPLNVASSQRQVQLIEILCKILGAIKFRDGILSPNDMITNLLDISCKSVHQPSATSAAQLLSSIINKMEEGDQLENYIKSITNTICNVLYSKNVET
EMKNAVNTWIWMFAILCRYSCSLFHYSNFDIKTSFDASFQLMKALIMRSHPYSNEALIQVLKFFKLPNVGHVASAGFKIIIGDEENILCESTNAIVKFMY
KNRFFMMASEKLMENYRIASKGIKHHYLTALSHLLNGVPKQMLLNHLQMLMPLLVESVSCDEESLRLSSLQTLRPLITEAPDIISNYVASMLPELLKLCN
FPSSMKIRISALQSVNDLASLPIHLVVPYKSKVINELGNTVNDKKRLVFTVINPKKQQ*
>MMS19_sacCer Saccharomyces cerevisiae budding yeast P40469 MET18 1032 aa 13 HEAT 29%
MTPDELNSAVVTFMANLNIDDSKANETASTVTDSIVHRSIKLLEVVVALKDYFLSENEVERKKALTCLTTILAKTPKDHLSKNECSVIFQFYQSKLDDQA
LAKEVLEGFAALAPMKYVSINEIAQLLRLLLDNYQQGQHLASTRLWPFKILRKIFDRFFVNGSSTEQVKRINDLFIETFLHVANGEKDPRNLLLSFALNK
SITSSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYKISNQDLKTALRSAITATPLFAEDAYSNLLDKLTASSPVVKNDTLLTLLECVRKFGGSSILENW
TLLWNALKFEIMQNSEGNENTLLNPYNKDQQSDDVGQYTNYDACLKIINLMALQLYNFDKVSFEKFFTHVLDELKPNFKYEKDLKQTCQILSAIGSGNVE
IFNKVISSTFPLFLINTSEVAKLKLLIMNFSFFVDSYIDLFGRTSKESLGTPVPNNKMAEYKDEIIMILSMALTRSSKAEVTIRTLSVIQFTKMIKMKGF
LTPEEVSLIIQYFTEEILTDNNKNIYYACLEGLKTISEIYEDLVFEISLKKLLDLLPDCFEEKIRVNDEENIHIETILKIILDFTTSRHILVKESITFLA
TKLNRVAKISKSREYCFLLISTIYSLFNNNNQNENVLNEEDALALKNAIEPKLFEIITQESAIVSDNYNLTLLSNVLFFTNLKIPQAAHQEELDRYNELF
ISEGKIRILDTPNVLAISYAKILSALNKNCQFPQKFTVLFGTVQLLKKHAPRMTETEKLGYLELLLVLSNKFVSEKDVIGLFDWKDLSVINLEVMVWLTK
GLIMQNSLESSEIAKKFIDLLSNEEIGSLVSKLFEVFVMDISSLKKFKGISWNNNVKILYKQKFFGDIFQTLVSNYKNTVDMTIKCNYLTALSLVLKHTP
SQSVGPFINDLFPLLLQALDMPDPEVRVSALETLKDTTDKHHTLITEHVSTIVPLLLSLSLPHKYNSVSVRLIALQLLEMITTVVPLNYCLSYQDDVLSA
LIPVLSDKKRIIRKQCVDTRQVYYELGQIPFE*
>MMS19_schPom Schizosaccharomyces pombe fission yeast Q9UTR1 1018 aa 14 HEAT 25%
MSSNLVALYLFSIDRSQDEANDVVDRIVEEIVTDRMGIVDLVTSIGEYLTDNNISVRAKAVLLLSQTLGELPKDRLPAKHVSVLLQFYLSRLDDEVTMKE
NALGIGALLNMQNFPAQKIVDVCKALFSSTDMPKYAQATRLNILKVFETIIDNYLFFISSQTRDAFFSGICSTFAGEKDPRNLMLVFSMLKKILSTFPID
GFEQQFFDITYCYFPITFRAPPDATNLAITSDDLKIALRETLVANDAFSKLLLPALFERLKASTVRIKIDALNIYIEACKTWRVGAYLWSAKDFWESIKQ
EILNSTDAELQNLALGALNTLASKFYKEEGFSSSFTEFVDMILIQLSQRLLEDVNVKSCGSCAAVFASLASISVETFNYCSCNFLPSVLDLPMVNEPLEK
QKGMLVFLEYVYKCLVLLYGKWRSKNQADIDNPLLVYKDKQLSFVSGSLMGTAKDETEIRMLALKVIFLMASIKNFLTESELTMVLQFLDDIAFDFSDPI
KKKATECLKDLGLLKPDFLLLTSFPFAFSKLTDDVTAKSSSEETFKQYLSVLVSISEERSLFKALVIRLVEMLKDQFKSKEMSVDLVESIVQSLSVAFKE
RNDRNEQEIPFFFEELLKQLFTLCFANCESMNVRCLIYVSQTINEIVRVNHFEFQEKFVGQLWKLYMENSNSDLIETEGCEKAAERFTLAASLSDQKFLN
LVVLLQGGLNGLSKKLHFIEKLNIELLNLLINVVFVTESPGVKISALRLISSLINKCEKDEDISSFISSKGVTSLWDKVYTGTPKESEAALDVLAWVDKA
LVSRKHSEGIPLAFKLLDTLNLQNVGDSSVKALSIIIKDDPALSKENSYVEKLLYKQRFYASVSPKILEHISTATGGEKSLYLMLLSNVIGNVPKEIVIP
DMPSILPLLLQCLSLSDISVKLSTLNVIHTSVKELTSLLTEYLDTLIPSLLAIPKDMNNPTVVRLLALKCLGSLPEFTPTTNLQLFRDKVIRGLIPCLDD
PKRVVRTEASRTRHKWYI*
>MMS19_araTha Arabidopsis thaliana plant NM_124186 1134 aa x HEAT 28% armadillo/beta-catenin-like
MMVEPNQLVQHLETFVDTNRSSSQQDDSLKAIASSLENDSLSITQLVREMEMYLTTTDNLVRARGILLLAEILDCLKAKPLNDTIVHTLVGFFSEKLADW
RAMCGALVGCLALLKRKDVAGVVTDIDVQAMAKSMIQNVQVQALALHERKLAFELLECLLQQHSEAILTMGDLLVYAMCEAIDGEKDPQCLMIVFHLVEL
LAPLFPSPSGPLASDASDLFEVIGCYFPLHFTHTKDDEANIRREDLSRGLLLAISSTPFFEPYAIPLLLEKLSSSLPVAKVDSLKCLKDCALKYGVDRMK
KHYGALWSALKDTFYSSTGTHLSFAIESLTSPGFEMNEIHRDAVSLLQRLVKQDISFLGFVVDDTRINTVFDTIYRYPQYKEMPDPSKLEVLVISQILSV
SAKASVQSCNIIFEAIFFRLMNTLGIVEKTSTGDVVQNGNSTVSTRLYHGGLHLCIELLAASKDLILGFEECSPTSGCANSGCSMVKSFSVPLIQVFTSA
VCRSNDDSVVDVYLGVKGLLTMGMFRGGSSPVSRTEFENILVTLTSIITAKSGKTVVWELALKALVCIGSFIDRYHESDKAMSYMSIVVDNLVSLACSSH
CGLPYQMILEATSEVCSTGPKYVEKMVQGLEEAFCSSLSDFYVNGNFESIDNCSQLLKCLTNKLLPRVAEIDGLEQLLVHFAISMWKQIEFCGVFSCDFN
GREFVEAAMTTMRQVVGIALVDSQNSIIQKAYSVVSSCTLPAMESIPLTFVALEGLQRDLSSRDELILSLFASVIIAASPSASIPDAKSLIHLLLVTLLK
GYIPAAQALGSMVNKLGSGSGGTNTSRDCSLEEACAIIFHADFASGKKISSNGSAKIIVGSETTMSKICLGYCGSLDLQTRAITGLAWIGKGLLMRGNER
VNEIALVLVECLKSNNCSGHALHPSAMKHAADAFSIIMSDSEVCLNRKFHAVIRPLYKQRCFSTIVPILESLIMNSQTSLSRTMLHVALAHVISNVPVTV
ILDNTKKLQPLILEGLSVLSLDSVEKETLFSLLLVLSGTLTDTKGQQSASDNAHIIIECLIKLTSYPHLMVVRETSIQCLVALLELPHRRIYPFRREVLQ
AIEKSLDDPKRKVREEAIRCRQAWASITSGSNIF*
>MMS19_dicDis Dictyostelium discoideum slime mold Q54J88 1115 aa 18 HEAT 31%
MTSNITELNKWIEGYVNPQSEESVKTNAINMVLLYMKSNKIDLQDVVQGLGDYLKSNDSILRARGTLLLSEVLCRLPDLPLNQDQVHFLAMFYCDRLQDY
ACSSEVVKGITGLITNHTPDYPDNQKLLRNIFSEVHPTSLTQAHRKMVLQVIDIMFNKCLSEIQELKNDFMVGYLQFIDNEKDPRNLIFSFKLLPKVIYN
IPEHKHFLESLFEIISCYFPISFNPKGNDPNSITKDDLSNSLLNCFSCTPLLAEHSIPFLIDKICSNLIETKIEALQTLVYCCDRYGGFAVQPFLEEIWS
TLRTLILTHKNTTVIEESKKTIFYLTRSFTKERKVLESFLSIMIKECLHHIKSSQDSKIAIYCASILYQSVSASLLSSKIILIHIFPNLFNFLSELQKQD
TVQKVNEQNSVIALFNDLLKANSIAFEMYSNENKEPNPLEPFVDQLFKLFSDLLLLNSSSSIRSNSIECLSNLYISKKVHTTEQDDDDSEQITNEFLLDL
EKRQFIIKSLVSLLNSSDNTLRHKSLDSLFTIASNEDPSVLNLYVIPTLLQMINHSSCNINTTNNKINNNNNNNNIVIKNNKCQDEHCNEDHSNKNENNN
NSNENSNGNSTSGSDDDLKHYLEAFTKLCTHQPLLESVIPQIQVLLQHNIKETYQSNEDFEKSILILQSISFILEKSTNIKSMTICSKSILFPLIKGLYK
QELISSSNDNNNNNNNNSNRFNQILTPTLKMIHSIFENISIESQKPLLEKLIKLFLNGDTLVINYQLPTTTTTIIKPFEKSSPYKYLIPIFTTIISQSKL
DLSENNELKQSLYQMSLDVNVDDSIAISCSKAYSSIINKQQQQQQQDQINFNFFNDNLLKVINDTTTPLPLKIRHLDLFTWCTKALLTNGNSINIKLGSC
LADIISNENVELSYHASKSFGILLSETDVLNEKSGSIIKILFEQKFFTLMFPILLESFKVSKNKELQTISSHYLIAISNLLKHVPKEILLAELNEILPIV
MQSLKSSDNNDQVQLLDSSLQTLTMLINETPSSFISYLDSLIPSLIKISTKSTKYNLKRSALEILTLLSKSIPFVNLFPYKTQVVTDIIPCLDDKKRIVR
REAQKCRNSWYILQK*
>MMS19_pytUlt Pythium ultimum stramenopiles ADOS01001616 957 aa 4 ARM units 31%
MFSLDAPLAPAIDAFVNPENDDNVHKTSLNTVVMQVHRKVSMEALIQALGLHLTSTDDKVRARAMQLLAEVLSRLPELPLTPNAVQLLVDFFADRLADYP
SASACLQALLALESNHAKKIASPTVTIILIQKMVKVLHVPQLGQAMRKQCFELMQLALGQKVVVDVLVTAPESSSIDHGLLFAQAFLNAMEGEKDPRNLL
LCLQIARELLAKLEVVFDRHDAVLQQYFDVVSCYFPITFTPPPNDPYGITSEELILSLRKAFAASDLLAKHVLPFLLEKLSSTVVEAKLDSLQTLVFCCE
AYSINVALLHMLSIANALYHEVVKGEKKEVIEASLRAISRFSSVIGLAKTKAAGGAAYAWNKFVVELTTRAMSDLTGHATDSLVSVSAGQVLAALGKDSV
LGFTHVLETSVPLLIQQFNESSTSTESKCEASLARLLLIVNTIDREVDQSASAQPMRPHALVLIDALVAFLSNNEALSTPTAKCSAIEALSHLVTYPPSP
IVEIAQVKALVELFINFLLFDASPEVRRECLQSLRAISTIKQKATVKNYASLVMEIALTQLMDAVQLSAQNTKVAAVLASSGRDHPEFFNDVLDSITQLS
QEASLFQATIVRLVDFCVVENQDSNKITFVANSSANGTQAHVDGILNAVAKIVELNADDKASMEFCVTSGGDNSIVFRLLKAVTTTAADAAAQNALLDDA
KLASCARIFRTPMQNVSTETQQLLANAAISAFLTTQSTGASASHPAYLQLVPLFSAVINSANRNLNLPETSRVINTLLELAQSSTAVYHTTASTQQIEQI
SSEAALSAAKSLASIVNKMSDGEEFDALIVLLLDQKLSQIIANEQKDVSVRVAALQIYVWIAKALVIRGHREHAPACLFFLCKFLTPETSDARSQIAMHV
AKSFKLLVTEFPDVLNRKCGAFITVRQHKKKYAGILGNADLTFYFVCVVPVPSANV*
>MMS19_sapPar Saprolegnia parasitica stramenopiles ADCG01000470 804 aa 1 large ARM repeat 31%
MFSLDAPLQPAIDGFVDPENGEQQHTTHLNNVVMSVHRKTPIEQVIQGLGAHLTHVQDKRRARATLLLAEVLTRLPDLRLSSDTAHLLLTFFLERLKDGP
SMAACLKALVALISLHAALLPANDAWTVCATCHAWCERAVVETLLNLPTPIASLSQSMRKQSFELLQLIVRRGALGDHEGRVLMDGFLRAMSGEKDPRNL
LFCLRFAAELLTTYANVVDADVAKGFFDATSCYFPITFRPPPNDPYGITSEDLVLALRSVFVGHDSLAKHVLPMVLDKLSRTTVVEMTKDILETLAFCCA
KYPLNRLLLHFTPVAAAVYHHVLHGDNTAVIAVAIDALKTITRAVSPPSKLPGMQALAWNKCIVYLVNQAVEDLAHQAPDSMVSTGAGHVLCAIASVGVA
GFSHVLSSALALLLEQCAAQAGSPAEAATARLVQLLGCIDAEVDHSAPPLVPYVSAIQTTLVHGLETATSSRQQKLCLQGLRCLVLRPPSPLLDDASLEV
LLQGWTSTVLSNPFPDVRDEATSTLQAIALKSPGLAQIVLTRCVPSFLQVLEQPAVLFFASWCGDMDDGLGQCSVWAVDRGHGARHPRGPHAALARPRHL
SAPPAAVPDQLDAPVCDRDDGGRGRHCPRQQGLGRVHGLRHPRHSHFIVGALPPRRRARRARPDAVDRRPSDRAGDGQHERACHAVRVPPGANNAAHVDL
AAGLDVGHFALAWPRPATATIERSTLLVRRGVDGARTRRAAAPLAAAVYARRAEQRRRREHSQGACGALQRPPRGHAAVRCVPKVDHVARCPPRRHGVAG
GRHS*
>MMS19_polPal Polysphondylium pallidum amoeba EFA86574 994 aa x HEAT 28%
MSKANIDSYININNNDQTKQTSLNILLLEINANKLSIHQLVEYLGDYLQNTDSILRARGTLLLSEVLCRLPDLKLNEAQVEFLAAFYHDRLQDYACASEV
VKGVYGLCVNHKVPYPHNQKMIRAIFQEVHPSTLVQTHRKMVLQLIEHLLEHNLTEIQELKGDFMAGFLQFIDGEKDPRNIIYTFRLIPRVILYIPEYKN
FADSLFEILSCYFPISFNPKPGDPNSITKDDLVSSLLNCFGASTYFAEHCIPFLIDKICSNVVDTKIESLKTLLFCCSKYGPVALRPHLDDIWGTLRTQI
LTQKSATVIDESKKTMFYLTRVLAADQETLQSFLSMVDKECLHHIKTSQDSKLAVSCASILFQTVSASVKSSRIVLSHILPTIIDFFKELSLHLSDDPIH
KANEQLSIIGLFNDLLKANNISFQYNNENIDKEINPLEEYKDKLYDLFIGLLSNSSALVRTLAVDCLANLYVTRHIKTSVPITFVLDQEKRQSIIKDLGV
WLLIQIFRNKSLEALMSITKLEQVEQMNLFAIPTLLQMINANQSKNVSESKHYLEAFSQLCTHQPLLQSVIPQIKTLLEHSIKKKYINNDEFENSLLVLQ
SLENTFSNSIDEQTMTICYREILLPLVKELFEQVFSLDVNSQEQKDQVLGIMKPAISMIHSNNKKEAIELFINIYLNGDLSALQINKEFKPFSSDATEQA
KLLIPIFTSVISQSKFELSTNKLLKEMLMSRALDSNVEESISNACAICYGSIINKQTDQTDLPLDHLEQLISSSSTNKTQALNLLIWIEKGLVTNGNPQS
IKVGELLAQLITSENTEISQKAAKSFYILLSDHDTFDHKSGAIVKRQKNETVSSQFLVAITNLLRNVPKEVLLGELQEVLPIVLHSLHSNQRDLLNSSLQ
TLMMLVDEASTSISSHLDSLIPTLIKISVNGESLTFRQSSLEILTRISRAIPYPKIYPFRNQVINGIVPALDDKKRLVRREATKCRNSWFILQ*
>MMS19_entHis Entamoeba histolytica amoeba XP_651925 868 aa x HEAT 25%
MSTPAQQLNEFIESPKVIKEGYEIIDQLMKNNYNVNSLVTDLGDTLPSEDERIRFRATSLLTYCLIKYPIKEESKDVFVDYLASRLVDAVCLEPILTALL
QLVTKKPSDEIINEIAMAYSCMRTQLYTKEVRILVYQFYKVFINYYQATEVIDTCVQLIELERDPECLKEVFDLIKLVSQKNEIDADSAPLLFDCASAYF
PILYPPKGDEALRIDLTNKILDAFVSAPIYAQFALPFLLDKLDADLSSIKLEALKAIYFCIQRFELKYVYAYFTHIWESIEQNISTVGVVEVNEFAFAIA
SYFCSLDDFHSKNLMESIKMFCLRMMSETDEIIINFVNGLLEELTKKSEKFFKVFVPVFIQCFHDQLQDADDRPKEQFERELFIVRLIYQRIIEGMPLLD
CVKGQVAWDLHRLATPLHPCFVSLLDIDVSLALLNLLGEQRMVPFQNAIELSENKCHDAIPILQRLYEKEEDVMISLLPANKIITNLELVSGIALHSPKL
FEQLLKLIPTLQSNEYVPVFQSILSDALPFNCLDVYVNHCIPVFIVITNGVLSPLFNTLMNRLSRLHSILSSKKISELTEGVLSQLKEHSRLLLILPSLL
QFYQPENLIVYLNEVQEVDKDTIAIYSLLISKLTNIIPHVLEQNKEYFNGYKTIQELDSHESNKQATPIFIEELCRMNNKAIECLKEMIVFDSINKKNEL
HWKEELFNLVYERFIESHQVTTVEESHIMILLFSLLPTEKLLTYESTVLKIFNIICVPTSHLNEIDSVVVLLFNILPTVSQYPMSLIESELDSIITKLFN
VLYINGTTIKYRCDIIDLLTRIRVVYGIDAIRPYQKNVIKKLLVPLDDNKRLVRRSAAICRNIWETTA*
>MMS19_naeGru Naegleria gruberi early eukaryote: heterolobosea XP_002678884 1070 aa x HEAT 27%
MQTSSNSNGEQELISLIDSLCNPTLPNTNKESLKSKLIEFVVNGTLTINEGIKLLGEYLNHATDDRIRGAAYAVLDLILENIPNNVGSETDETKQTTQLK
LVASLLRFIGDRFYDFDCLATLLPCLFSLFKKWSSYISSEQAINVVLQFFENVNIQSIGSTSGVAHATKTRSLCFEWFSLLADRFPSIVRTIDFLNGFIQ
SLEGERDPGNLLYCFNLIPKVIAIFDDSELSSKILSAVSDDLFDITSCYFPITYTPPANDTRGITREDLSRSLKLCFGCNKFFAPTLFPFLLEKLSSDLV
DTKLETLDYLCYCIEKFGEVNSREYLTEIWSYIKAESVKTNSMDVMKKCYESITKIARIVIIPNDPSNKPFDIPNIEAILRTALLELKSKEPKFAAQYAR
MIYACAVPTFEISMMVFNRVMPELVATLSESDTKDKLYGSLLMITQLLQAVAEQKGENQLPEVVFNLISQVQTVFLSIYEEEFSKNDKEMILVMVETISR
IAIFRIPSTLLRDIYVSRILLKSYGEENKSFSLLHATSLEEYKERVIKDIAWIYKYAPDIVSEDILVPLFGALYGCENKSEHINRILSNISAVGKVCPSM
TPSITHRLFERIESIPISESHYEHERVKVFETITSLDVSLIPAHDKVSYIQRIVKMSVTDSSSQMVDDSDTMDCSDSECAHVHHNQGNFSFLTLLLGRSL
ENELQQVVLDSVLQYANSVPSTGLKNFISVLSAIVIACRPTVGMGNLITMTDSLLQMALKGEQPSQVTKCIAQLVGSVLNKLPLDSTEFQQLITICNATV
FDAFSQMLTVYNGDSESAERYIEMVSWILKGLVMRGAYVPHADRYSSLLCGSLVFEYNSSKVNKKVAEGFLIAIGEDETSIHKENHAIIQVLYKQRFFAT
NVRKLMDSINTVTQPHIIGSILLALSNLIHNVPTKVILSEVKNIFPIVLKFLEMRQILIEQDNNSEDLLYAAIKTTLTLLSDAKEEMSVHLSSIVPILLD
TCKFKKSQAVRILSVEALLELTNGYKYYEIYPLKKDIIKGLEACLDDKKRKVRKAAVKCRNSYFVLSNNQ*
>MMS19_phyInf Phytophthora infestans early eukaryote: stramenopiles 1114 aa x HEAT 33%
MVSYEQLGSLPQKGSQNPVVNQKLEAIAMFSLDAPLAPAIDAFVDPENDDAAQKTGLNTVVMQVHRHVSMEALIQALGAYLTNGDDKVRSRATLLLAEVL
TRLPELQLTPSAVQLLMTFFADRLADFPSASACLRALLALETHHAAQVQSPRTTVALIPKLGKTLHIPQLGQAMRKMCFDLMQLALMQSTVVELLLDSVP
ASKDAQDASVDDAEQSEDLGRQLAQTFLSAMEGEKDPRNLLLCMQVARTLLSKLEPVFSRSDTLLQQYFDVVSCYFPIIFTPPPNDPYGITSEGLILSLR
HAFAASDLLAPLVLPFLLKKLASTVVEAKLDAIQTLVFCGERYSVNALLLQMHAVATALYDEVLDGEKQEVIAEARQAISRFSGVVARAKAQDTPGAAYA
WSKFVVDMTARAAGELRENAADSMVSVSAGQVLAALGRESSMSFAHVLKIAVPLLVEQLNNESSGSDSVPSKCEAALARILLLIDTIDREIDQSGQGQPM
RPHAAALIDALVNFLSSDHDNQTKPGSSPTARCVAVEALCHLLTFPPSPIVAPAQVKALINLFTRMLLLDPVAEVRTACLQSLKEISTVSTASEGSTNSG
EHPVTGGYAAFVVEISLARLMAAVSEGSDQEDDDDEEGTGVAAVLTASNRNFDSFFEEALLAITELCRESSIFQATIFLLIDLCVEKGDGKQSAIGFCEA
EGDATRQRHVDCILDAVAKIVEINAGDRTSMEFCVKASSSASIIFRLLTAVETLAARATASSGYKSGLVDEVKLSACVRIFRAVMQNVSSATQQQLVDAV
VPAFLRTNTSEPASLQFVPLFAAVINSAARDVALPDSSLVINRLLELAQSGATAVSESPPRQLQLVYTDAALSAAKSLASIVNKMSDGAEFDALIDLLLS
RKLAVVISNSAESFTVRVAALQIYAWIAKALVIRGHKVHAPVCLRFLCSFLTPDGDVNMEQEGDDQHAAALRMEVAKTFKLLVSEYLDVLNRKCGAFITF
LYRQRLFDLVFPVLLEYIRARIDEESSVAALVAFAQVIAHSPKAIYLPHLAQIFPLMVQALNTDDRELGSAAIQTFKPLLLESVESAKPFLKDVFPGLLK
QAQFGYVVSCSDS*
>MMS19_albLai Albugo laibachii early eukaryote: stramenopiles 1077 aa x HEAT 27%
MFQLDAPLSPAIKKFIDSGASNDEETGQKTSLNAVVMHTHRIGSIETLIQELEPYLTDDCNDFARARATLLIAEVLTRLPDLPLSGNRIQVLNNFFCARL
DDPPSIPASFQALLALQKHHSTEIPDSENMELVIRISDTLHVPQLNQPMRKRYYELVYLVIQQERMQKALSRSQQAQVFIRSFLNAMTGEKDPRNLKHCF
QIAQTMMQKLEMVFQEAELSEQYFRVISCYFPITFTPPPNDPYGVTTEELIRSLRNVLTASDVLIHQMVPFLLEKLSGSMSEEAKVDALDTLGHCVETFS
LKNLLLHIRSIGQVFYHEILNGERARVIETASNVLSRVSSVIGRAKVQGSSGSGFAWNAVVVTITNQAVEKLHENSVDSMSSASAGKVLASMSRESLVVS
THVLNTSMPLLIEQVKHSFEASSSQCEAALDRLMLFVDTIDEEVEQISTIHPIHSHASPILEALVKFLEEDTPTSTPNAKRLSIRIISHLVIYPSTPVVR
PSDVERIVRLFTRGFLSDASKHVRSEFLSSLKALSGAIKTPSTLQSVHCKREKTLQLYGTLLKEHCIAQLLALVQDGKSPEAETFQKSSCRTRKDFEQDT
LAAITELSHDPVIFKEAVVHLLQSCFIDQDGLLIFRSFEVEHTLQFFQAVATIIELNASNASNMEFCASIDDQNGIAFKLLDAFVSMAMSNGQSKEQKFL
PPNAIAFSTRILRTIMQNICFDTQQKLLDRAISRFHPILQTEESTPSQHLYQIVSAFSTVINSANRSLAFPKAYCVIDSLMAVSRSITTESHGYTNEIVL
LISQSIGSILNKVRDKHFEAKVESLLTGLSQSIHNDQEQAQWHTSIEVYIWITKGLLLCGHPKYSSQSVAFLTQLLIHHSDKGVRGQVAEGVRVILTEFP
NVLNRKCGASCNMLFRQRLFELVGPNLLAFISKHSEETTEALTGFCYIVAFSPKAAFISLISTIMPLVLRGLSSDHVELGAAAIKAYKIVSDTSIEHVKP
FLKDVFHGLLQQAQHSANALDRKDALECIGMLTTLPYELIHSYKDRVLRQLLFCLDDRKRFVRYTAVRVRNKWSVL*</pre>


[[Category:Comparative Genomics]]

Revision as of 14:11, 15 June 2012

Introduction

The surprisingly numerous nuclear proteins containing 4Fe-4S clusters are made from their respective apoproteins in the cytoplasm during the final stages of an assembly process that begins within mitochondria and ends with an embedded cluster in polymerases, helicases, primases, telomerases, and photolyases with no explained need for a cofactor otherwise associated with oxidation and reduction.

These 4Fe-4S clusters do not spontaneously associate with their target protein because they do not occur in free solution, being quite unstable to unwanted oxidation. Instead, nascent clusters are attached to a series of mediating proteins, carrier scaffolds and conformational chaperones throughout a complex process of maturation. That process and the gene products involved -- which are conserved from yeast to human -- have been recently reviewed in depth and new results (1,2) have clarified the roles of the four main protein components that collaborate on the final stage of cytoplasmic assembly.

Not all extra-mitochondrial 4Fe-4S cluster proteins are assembled in this pathway, but the molecular basis for specificity has not yet been determined. Indeed, a surprising number of proteins -- some studied for decades -- have only been recognized as iron sulfur proteins in 2011-12.

The list of proteins is still incomplete because many unrelated homology classes have Fe-S clusters, meaning no single diagnostic pattern can be used to scan the entire proteome. Often four conserved cysteines coordinate the cubane complex, but their spacing within the primary sequence is not uniform and is difficult to distinguish from cysteine patterns that bind intrinsic zinc. Further confusing matters, seemingly artifactual zinc can replace bona fide 4Fe-4S clusters in proteins purified for crystallography in the presence of oxygen (1,2,3).
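The difficulty with cysteine spacing can be made concrete with a minimal Python sketch. It only lists runs of four consecutive cysteines that fall within a chosen span and reports the gaps between them; it is not a validated Fe-S predictor, and a zinc-binding site would produce exactly the same kind of output. The function names, the 60-residue default span, and the toy sequence are assumptions made for illustration.

```python
def cysteine_positions(seq):
    """Return 1-based positions of every cysteine in a protein sequence."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]


def candidate_windows(seq, n_cys=4, max_span=60):
    """Yield (first, last, gaps) for each run of n_cys consecutive cysteines
    whose first and last members are fewer than max_span positions apart."""
    pos = cysteine_positions(seq)
    for i in range(len(pos) - n_cys + 1):
        run = pos[i:i + n_cys]
        if run[-1] - run[0] < max_span:
            # Gaps between neighbouring cysteines; these vary between protein
            # families, which is why no single fixed pattern works.
            gaps = [b - a for a, b in zip(run, run[1:])]
            yield run[0], run[-1], gaps


# Hypothetical toy sequence, not a real protein.
toy = "MKCAACGGHCLLLLLLCPEEK"
for first, last, gaps in candidate_windows(toy):
    print(first, last, gaps)
```

Comparing the reported gaps across orthologs is one way to see how variable the spacing is; distinguishing iron sulfur from zinc sites still requires experimental evidence.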

The early and middle stages of intra-mitochondrial iron sulfur cluster assembly are carried out by gene products of bacterial origin, relics of alphaproteobacterial endosymbiosis transferred long ago to the nuclear genome. However, not all components of final cytoplasmic assembly have such a clear origin, whereas most targeted apoproteins (such as primase large subunit PRIM2) are clearly those of the archaeal parent. Thus the final stage of assembly presents two worlds in collision: bacterial proteins assemble iron sulfur clusters in unfamiliar archaeal proteins.

Bioinformatics, while a poor substitute for experimentation, is fast and easy, so it is best to exhaust the possibilities there first. Nothing is proven but sometimes it can suggest interesting directions.

MMS19: a large all-scaffold protein

MMS19 is a large protein involved in cytoplasmic iron sulfur assembly first studied with bioinformatic tools 12 years ago (1,2). Revisiting that with modern comparative genomics methods, MMS19 emerges as a modular scaffolding protein over its entire length, conserved in its features -- though not particularly in amino acid sequence -- from the earliest diverging eukaryotes to human.

The C-terminus of MMS19 was initially classified as HEAT repeats. Today we know these are not found as individual units but instead work together to form a long twisted spiral of consecutive modules called an ARM domain. An individual HEAT unit consists of a small 3-helix bundle, a generic super-secondary structure analogous to a beta-alpha-beta Rossmann fold unit, meaning most occurrences of HEAT in the eukaryotic proteome are not truly homologous despite structural similarity but instead represent convergent evolution analogous to Rossmann-like fold units forming many unrelated beta propellers or TIM barrels.

Since these domains are catalytically inert and lack conserved cysteines or other conserved motifs, MMS19 can contribute as an organizing principle to the cytoplasmic iron assembly complex (and other nuclear complexes) but not to the actual business of forming 4Fe-4S on target apoproteins.

The size of MMS19 -- over a thousand residues -- makes it a difficult target for structure determination. As of June 2012, no deposited structure at PDB provides a template upon which MMS19 can be threaded. In the interim, MMS19 might be structurally modeled using the known beta-catenin structure -- despite its lack of authentic homology, it too is composed almost entirely of HEAT units.

The number of HEAT repeats in an ARM domain is subject to expansion and contraction over evolutionary time. The individual units often align poorly with each other and generally lack conserved residue signatures despite initial reports, yet this level of variation does not necessarily affect the overall fold. However, this lack of diagnostic features makes it difficult to reliably identify remote homologs of HEAT repeats because the primary sequences can be diverged beyond recognition and homological alignments go out of register when the number of repeats differs.

An accurate count of the number of individual HEAT domains in MMS19 from a given species is also difficult because some domains are more accurately represented in the HMMer profile than others, here giving different counts between mouse and human at UniProt despite 90% sequence identity over their entire length. Indeed the five species with manually reviewed UniProt entries are all in conflict, with the nominal number of HEAT units ranging from 7 in human to 18 in slime mold, despite similar lengths and overall alignment implying the same actual domain structure.

MMS19 is a single-copy gene without paralogs in all eukaryotes, implying simple orthology (no retained duplications or losses). Below, 20 full length fasta sequences from GenBank were chosen for uniform distribution over the eukaryotic phylogenetic tree. Superfamily proved to be the most consistent, sensitive and selective online tool for ARM domain detection. The figure at bottom establishes that MMS19 consists entirely of HEAT units and spacers, which in effect form a single ARM.

It emerges upon alignment with MultAlin that conservation is mediocre overall except for one previously noted region of exceptional conservation containing two blocks of invariant residues from human to yeast to amoeba. This region must have already been established in the last common ancestor of all eukaryotes and must play a very special role in MMS19 even today to account for its conservation over trillions of years of cumulative branch length. However, that role remains a complete mystery. Sequence conservation in MMS19 is otherwise not exceptional: typically 27-34% identity relative to human, of which some portion is accidental.

ultra-conserved region in MMS19:

human: 182 DGEKDPRNLLVAFRIVHDLISRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHGIQREDL 241
yeast: 184 NGEKDPRNLLLSFALNKSITSSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYKISNQDL 24 
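
For comparison with the full-length 27-34% figure, the identity of the ungapped human/yeast stretch shown above can be computed directly. The following Python sketch counts identical residues over columns where neither sequence has a gap; the function name and the choice to skip gapped columns are assumptions for illustration, and other identity conventions (for example dividing by the shorter sequence length) give different numbers.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gapped columns are skipped, not counted as mismatches
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# The two 60-residue segments from the alignment above.
human = "DGEKDPRNLLVAFRIVHDLISRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHGIQREDL"
yeast = "NGEKDPRNLLLSFALNKSITSSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYKISNQDL"
print(f"{percent_identity(human, yeast):.1f}% identity over {len(human)} columns")
```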

Regardless of Blast query -- full-length MMS19, this ultra-conserved region, or reconstructed ancestral sequence -- no counterpart to MMS19 occurs among 2,500 complete bacterial and archaeal genomes, even though unambiguous orthologs to human MMS19 are readily found in the earliest diverging eukaryotes. MMS19 may thus represent a eukaryotic innovation needed to organize more complex cytoplasmic iron assembly, or be too simplified and diverged (or just lost) in prokaryotes. As a method of last resort, prokaryotic operons containing other cytoplasmic iron sulfur assembly proteins could be scanned for adjacent HEAT-like domains or comparable scaffolding proteins.

There are no matches to MMS19 at PDB using Blastp. Since the fold is widespread and generic, structural matches in DALI do not imply homology. On the other hand, this allows the crystallographic structure of a non-homologous ARM protein (beta-catenin, PDB: 1LUJ) to serve as a provisional structural template. Structures of bound E-cadherin, ICAT, and XTCF3 complexes have also been determined, which may suggest a binding mode for cytoplasmic iron sulfur and helicase-type proteins on HEAT repeats of MMS19.

MMS19 could determine selectivity among the overall set of iron sulfur apoproteins if only those interacting with DNA (or comparable nucleotide) have binding propensity for HEAT units. This specificity would then vary by organism, as would the effects of knock-in replacement. Only those apoproteins that align along the linear scaffolding structure in close enough proximity to CIA effector proteins receive an iron sulfur complex. Not every protein arising in this context need directly bind a HEAT domain -- they could bind another protein with that capacity.

This scaffolding scenario requires multiple non-homologous proteins to have HEAT binding sites, which seemingly requires convergent evolution on a significant scale since a shared mobile binding domain can be ruled out. If MMS19 is truly a eukaryotic innovation, then the cytoplasmic iron assembly complex initially functioned without the scaffolding until MMS19 and these binding sites evolved.

That seems implausible, so some other common ground must account for HEAT binding. One option, based on the super-helical configuration and major groove of the overall ARM domain, supposes that MMS19 spoofs a DNA helix or nucleotide base in shape and charge (along the lines of W536 in CRY1B photolyase). This would explain most of the specificity -- each archaeal apoprotein of DNA metabolism needing an iron sulfur cluster already has a DNA binding site, and so already has an appropriate MMS19 HEAT binding site.

In early endosymbiosis, retained bacterial cluster assembly machinery collides with nuclear-encoded archaeal iron-sulfur protein motifs previously matured by a different system, a conflict that had to be seamlessly resolved without ever a gap in continued functionality.

MMS19.jpg

CIAO1

(coming shortly)

FAM98B

(coming shortly)

IOP1

(coming shortly)

XPD

(coming shortly)

Curated reference sequences

It serves no current purpose to collect all possible full length MMS19 sequences from GenBank, so only a sample of 20 uniformly distributed over the eukaryotic phylogenetic tree is provided here. MMS19 presents no real homology complications, being present as a single-copy gene. Genes in early diverging eukaryotes are assumed single-exon, i.e. taken as the largest open reading frame enveloping the match to the ultra-conserved region. MMS19 has been studied experimentally only in yeast and human.
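
The single-exon rule described above (largest open reading frame enveloping the match) can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: it assumes the genomic DNA has already been translated in the relevant frame with stop codons written as `*`, and it locates the ultra-conserved region with a short regular expression (`E[KRQ]DP`, a hypothetical probe based on the aligned core) where a Blast match would normally be used. The function name and toy translation are likewise hypothetical.

```python
import re


def orf_enclosing_match(translation, motif):
    """Return the largest ORF in one translated frame that contains the first
    motif match, or None if the motif is absent. Stops are written as '*'."""
    m = re.search(motif, translation)
    if m is None:
        return None
    # The nearest stops flanking the match bound the open stretch.
    start = translation.rfind("*", 0, m.start()) + 1  # 0 if no upstream stop
    end = translation.find("*", m.end())
    if end == -1:
        end = len(translation)  # no downstream stop: sequence runs off the end
    # The largest ORF begins at the most upstream methionine before the match;
    # if there is none, the open stretch is returned as a partial ORF.
    met = translation.find("M", start, m.start())
    if met != -1:
        start = met
    return translation[start:end]


# Hypothetical toy translation, not real genomic data.
toy = "LLK*AQMSTMFIQAMDGEKDPRNLLV*GGT"
print(orf_enclosing_match(toy, r"E[KRQ]DP"))
```

Any gene model produced this way is provisional: an unrecognized intron upstream of the conserved region would truncate the predicted protein.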

The apparent absence in Giardia and various obligate parasites could be attributable to a reduced genome, extreme sequence divergence relative to available probes, or incomplete assembly -- it is inconceivable that these species lack core iron sulfur proteins of DNA metabolism. Indeed, the conserved cysteine pattern of primase large subunit is readily located in these species. It remains conceivable however that the very earliest diverging eukaryotes retain components of the archaeal iron sulfur cluster formation system.

The yeast gene, sometimes called MET18 in that literature, is unsurprisingly single-exon (only 283 of 6000 yeast genes have introns) and not located in a yeast-type operon. While some immediate neighbors are involved in DNA processes, none are homologous to iron sulfur cluster assembly components or have recognized 4Fe-4S cofactors themselves.

Gene   Position      Description

MET18  chrIX:113806  DNA repair, TFIIH regulator, nucleotide excision repair, RNA polymerase II, telomere maintenance
RRT14  chrIX:117024  rDNA transcription, localizes to nucleolus, involved in ribosome biogenesis
STH1   chrIX:117992  ATPase component in chromatin remodeling, expression of early meiotic genes, helicase-related protein homologous to Snf2p
KGD1   chrIX:122689  mitochondrial alpha-ketoglutarate dehydrogenase
ASG1   chrIX:102782  zinc cluster transcriptional regulator stress response
CSM2   chrIX:99860   homologous recombination repair, accurate chromosome segregation during meiosis
SIM1   chrIX:128151  may participate in DNA replication

The human gene has 31 coding exons. These do not correspond to natural structural breaks in the tertiary structure (e.g. HEAT units) and the ultra-conserved region is spread across parts of 3 exons. Thus despite its modular structure, MMS19 had already completed its internal expansion of domain units prior to the main era of exon formation and could not today expand further by exon duplication, because duplicated exons would present issues of compatible phasing and would not correspond cleanly to structural units.

Exon structure of human MMS19: columns show exon number, amino acid size, intron phasing (donor bp overhang), primary sequence, and ultra-conserved region.

1   37  1  MAAAAAVEAAAPMGALWGLVHDFVVGQQEGPADQVAA
2   17  2  DVKSGNYTVLQVVEALG
3   33  1  SSLENPEPRTRARAIQLLSQVLLHCHTLLLEKE
4   29  0  VVHLILFYENRLKDHHLVIPSVLQGLKAL
5   25  0  SLCVALPPGLAVSVLKAIFQEVHVQ
6   23  1  SLPQVDRHTVYNIITNFMRTREE
7   43  1  ELKSLGADFTFGFIQVMDGEKDPRNLLVAFRIVHDLISRDYSL
8   21  0  GPFVEELFEVTSCYFPIDFTP
9   29  0  PPNDPHGIQREDLILSLRAVLASTPRFAE
10  25  0  FLLPLLIEKVDSEVLSAKLDSLQTL
11  26  0  NACCAVYGQKELKDFLPSLWASIRRE
12  46  1  VFQTASERVEAEGLAALHSLTACLSRSVLRADAEDLLDSFLSNILQ
13  52  0  DCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNVLPLLLEQFHKHSQ
14  26  1  SSQRRTILEMLLGFLKLQQKWSYEDK
15  42  1  DQRPLNGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQP
16  28  2  DLLSYEDLELAVGHLYRLSFLKEDSQSC
17  33  1  RVAALEASGTLAALYPVAFSSHLVPKLAEELRV
18  50  1  GESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIVKETLPLLLQHLWQVNR
19  52  1  GNMVAQSSDVIAVCQSLRQMAEKCQQDPESCWYFHQTAIPCLLALAVQASMP
20  34  2  EKEPSVLRKVLLEDEVLAAMVSVIGTATTHLSPE
21  34  0  LAAQSVTHIVPLFLDGNVSFLPENSFPSRFQPFQ
22  23  0  DGSSGQRRLIALLMAFVCSLPRN
23  42  1  VEIPQLNQLMRELLELSCCHSCPFSSTAAAKCFAGLLNKHPA
24  34  0  GQQLDEFLQLAVDKVEAGLGSGPCRSQAFTLLLW
25  19  0  VTKALVLRYHPLSSCLTAR
26  62  1  LMGLLSDPELGPAAADGFSLLMSDCTDVLTRAGHAEVRIMFRQRFFTDNVPALVQGFHAAPQ
27  28  0  DVKPNYLKGLSHVLNRLPKPVLLPELPT
28  55  0  LLSLLLEALSCPDCVVQLSTLSCLQPLLLEAPQVMSLHVDTLVTKFLNLSSSPSM
29  20  0  AVRIAALQCMHALTRLPTPV
30  34  2  LLPYKPQVIRALAKPLDDKKRLVRKEAVSARGEW
31   9  0  FLLGSPGS*
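
The phasing constraint can be checked mechanically from the table above. The Python sketch below assumes that the third column is the phase of the intron following each exon, so an internal exon is flanked by the phase of the previous row and its own. Under that assumption an exon can be duplicated in tandem without shifting the reading frame only if both flanking phases agree. This is a necessary condition only, and says nothing about whether the exon matches a structural unit. The sizes and phases are transcribed from the table; the function name is hypothetical.

```python
# (size in residues, phase of following intron) for exons 1-31, from the table.
EXONS = [
    (37, 1), (17, 2), (33, 1), (29, 0), (25, 0), (23, 1), (43, 1), (21, 0),
    (29, 0), (25, 0), (26, 0), (46, 1), (52, 0), (26, 1), (42, 1), (28, 2),
    (33, 1), (50, 1), (52, 1), (34, 2), (34, 0), (23, 0), (42, 1), (34, 0),
    (19, 0), (62, 1), (28, 0), (55, 0), (20, 0), (34, 2), (9, 0),
]


def symmetric_exons(exons):
    """1-based numbers of internal exons whose upstream and downstream
    introns share the same phase."""
    out = []
    for i in range(1, len(exons) - 1):       # skip the first and last exon
        upstream_phase = exons[i - 1][1]     # intron before this exon
        downstream_phase = exons[i][1]       # intron after this exon
        if upstream_phase == downstream_phase:
            out.append(i + 1)
    return out


# Sanity check on the transcription: the sizes should add up to the protein
# length plus one, because the last exon's count includes the stop ('*').
total = sum(size for size, _ in EXONS)
print("total (residues + stop):", total)
print("phase-symmetric internal exons:", symmetric_exons(EXONS))
```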
Alignment of ultra-conserved region:

MMS19_homSap   FTFGFIQVMDGEKDPRNLLVAFRIVHDLI-------SRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHG-IQREDLILSLR
MMS19_musMus   FTFGFIQVMDGEKDPRNLLLAFRIVHDLI-------SKDYSLGPFVEELFEVTSCYFPIDFTPPPNDPYG-IQREDLILSLR
MMS19_cioInt   FLYQYIQVIDGEQDPRNLLTIFQLTKNLI-------ESSFPLFDLVEELFDVSSCYFPIDFNPAAAGKKSTITNLDLVSSLR
MMS19_braFlo   FVWGFIQAMDGEKDPRNLIIAFSIAR-IV-------AQAFPIGTFTEELFEVISCYFPIDFTPPADDPHG-VTREDLVLGLR
MMS19_strPur   FVLGLLHAMDGEKDPRNLILLFNILP-TV-------INNFKIDMFIEETFEVVACYYPVDFHPPPNDPYG-ISREKLALSLK
MMS19_sacKow   FVFGYIQCMDGEKDPRNLTMIFRCVP-II-------IHNFPIDVFIEELFEVVSCYFPIDFTPPPNDPYK-VTQEELVLGLR
MMS19_dapPul   FVFGLIQLADQERDPRNLLILFSIFPVVA--------RYFRFEPFTEEFFEVFSCYFPIDFTPPANDPYA-VTKEQLCDGLR
MMS19_droMel   FVYGLINSIDGERDPRNLDIIFSFMPEFL--------STYPLLHLAEEMFEIFACYFPIDFNPSKQDPAA-ITRDELSKKLT
MMS19_nemVec   LVFGFLQAMDGEKDPRNLVVAFKLAR-II-------IKNFPIGLFAEDLFEVTSCYFPIDFTP-------------------
MMS19_triAdh   FVFGYIQVMDGEKDPRNLLLALKIAKFIV--------QNFNIDLFLDDFFEIISCYFPIDFTPPPNLP----SNENVTK---
MMS19_sacCer   FIETFLHVANGEKDPRNLLLSFALNKSIT-------SSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYK-ISNQDLKTALR
MMS19_schPom   FFSGICSTFAGEKDPRNLMLVFSMLK-KI-------LSTFPIDGFEQQFFDITYCYFPITFRAPPDATNLAITSDDLKIALR
MMS19_araTha   LVYAMCEAIDGEKDPQCLMIVFHLVELLAPLFP---SPSGPLASDASDLFEVIGCYFPLHFTHTKDDEAN-IRREDLSRGLL
MMS19_dicDis   FMVGYLQFIDNEKDPRNLIFSFKLLPKVIYNIP---EHKHFLES----LFEIISCYFPISFNPKGNDPNS-ITKDDLSNSLL
MMS19_pytUlt   FAQAFLNAMEGEKDPRNLLLCLQIARELLAKLE---VVFDRHDAVLQQYFDVVSCYFPITFTPPPNDPYG-ITSEELILSLR
MMS19_sapPar   LMDGFLRAMSGEKDPRNLLFCLRFAAELLTTYA---NVVDAD--VAKGFFDATSCYFPITFRPPPNDPYG-ITSEDLVLALR
MMS19_polPal   FMAGFLQFIDGEKDPRNIIYTFRLIPRVILYIP---EYKNFADS----LFEILSCYFPISFNPKPGDPNS-ITKDDLVSSLL
MMS19_entHis   -IDTCVQLIELERDPECLKEVFDLIKLVS-------QKNEIDADSAPLLFDCASAYFPILYPPKGDEA----LRIDLTNKIL
MMS19_naeGru   FLNGFIQSLEGERDPGNLLYCFNLIPKVIAIFDDSELSSKILSAVSDDLFDITSCYFPITYTPPANDTRG-ITREDLSRSLK
MMS19_phyInf   LAQTFLSAMEGEKDPRNLLLCMQVARTLLSKLE---PVFSRSDTLLQQYFDVVSCYFPIIFTPPPNDPYG-ITSEGLILSLR
MMS19_albLai   FIRSFLNAMTGEKDPRNLKHCFQIAQTMMQKLE---MVFQEAE-LSEQYFRVISCYFPITFTPPPNDPYG-VTTEELIRSLR
Consensus      f  g  q  dgEkDPrnL   f                          lF#v sCYFPI ftpp ndp   it edl   lr
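
The invariant columns behind the consensus line can be recovered from the aligned rows themselves. The Python sketch below uses four of the rows above (human, budding yeast, Entamoeba and Naegleria) and reports the columns where all of them carry the same residue. The function name is hypothetical, and using a four-row subset is an illustrative choice, so it will mark more columns as invariant than the full 21-row alignment would.

```python
# Four rows copied from the alignment above (82 columns each).
ROWS = {
    "homSap": "FTFGFIQVMDGEKDPRNLLVAFRIVHDLI-------SRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHG-IQREDLILSLR",
    "sacCer": "FIETFLHVANGEKDPRNLLLSFALNKSIT-------SSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYK-ISNQDLKTALR",
    "entHis": "-IDTCVQLIELERDPECLKEVFDLIKLVS-------QKNEIDADSAPLLFDCASAYFPILYPPKGDEA----LRIDLTNKIL",
    "naeGru": "FLNGFIQSLEGERDPGNLLYCFNLIPKVIAIFDDSELSSKILSAVSDDLFDITSCYFPITYTPPANDTRG-ITREDLSRSLK",
}


def invariant_columns(rows, gap="-"):
    """0-based indices of columns where every row has the same non-gap residue."""
    seqs = list(rows.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("alignment rows differ in length")
    return [i for i, col in enumerate(zip(*seqs))
            if len(set(col)) == 1 and col[0] != gap]


cols = invariant_columns(ROWS)
ref = ROWS["homSap"]
# Print the human row with every non-invariant column masked by a dot.
print("".join(ref[i] if i in cols else "." for i in range(len(ref))))
```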
Summary of MMS19 reference sequences:

MMS19_homSap Homo sapiens mammal Q96T76 1030 aa 7 HEAT 100%
MMS19_musMus Mus musculus mammal Q9D071 1031 aa 9 HEAT 89%
MMS19_cioInt Ciona intestinalis urochordate XP_002128657 1026 aa x HEAT 32%
MMS19_braFlo Branchiostoma floridae cephalochordate XP_002588594 1027 aa x HEAT 42%
MMS19_strPur Strongylocentrotus purpuratus echinoderm XP_001194909 975 aa x HEAT 36%
MMS19_sacKow Saccoglossus kowalevskii hemichordate XP_002735310 1007 aa x HEAT 40%
MMS19_dapPul Daphnia pulex crustacean EFX86854 961 aa x HEAT 38%
MMS19_droMel Drosophila melanogaster insect NP_649519 959 aa x HEAT 30%
MMS19_nemVec Nematostella vectensis cnidarian XM_001629116 897 aa x HEAT 32%
MMS19_triAdh Trichoplax adhaerens placozoan XP_002114595 959 aa x HEAT 39%
MMS19_sacCer Saccharomyces cerevisiae budding yeast P40469 MET18 1032 aa 13 HEAT 29%
MMS19_schPom Schizosaccharomyces pombe fission yeast Q9UTR1 1018 aa 14 HEAT 25%
MMS19_araTha Arabidopsis thaliana plant NM_124186 1134 aa x HEAT 28% armadillo/beta-catenin-like
MMS19_dicDis Dictyostelium discoideum slime mold Q54J88 1115 aa 18 HEAT 31%
MMS19_pytUlt Pythium ultimum stramenopiles ADOS01001616 957 aa 31%
MMS19_sapPar Saprolegnia parasitica stramenopiles ADCG01000470 804 aa 31%
MMS19_polPal Polysphondylium pallidum amoeba EFA86574 994 aa x HEAT 28%
MMS19_entHis Entamoeba histolytica amoeba XP_651925 868 aa x HEAT 25%
MMS19_naeGru Naegleria gruberi early eukaryote: heterolobosea XP_002678884 1070 aa x HEAT 27%
MMS19_phyInf Phytophthora infestans early eukaryote: stramenopiles 1114 aa x HEAT 33%
MMS19_albLai Albugo laibachii early eukaryote: stramenopiles 1077 aa x HEAT 27%
>MMS19_homSap Homo sapiens mammal Q96T76 1030 aa 7 HEAT 100%
MAAAAAVEAAAPMGALWGLVHDFVVGQQEGPADQVAADVKSGNYTVLQVVEALGSSLENPEPRTRARAIQLLSQVLLHCHTLLLEKEVVHLILFYENRLK
DHHLVIPSVLQGLKALSLCVALPPGLAVSVLKAIFQEVHVQSLPQVDRHTVYNIITNFMRTREEELKSLGADFTFGFIQVMDGEKDPRNLLVAFRIVHDL
ISRDYSLGPFVEELFEVTSCYFPIDFTPPPNDPHGIQREDLILSLRAVLASTPRFAEFLLPLLIEKVDSEVLSAKLDSLQTLNACCAVYGQKELKDFLPS
LWASIRREVFQTASERVEAEGLAALHSLTACLSRSVLRADAEDLLDSFLSNILQDCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNVLPLLLEQ
FHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPLNGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPDLLSYEDLELAVGHLYRLSFLKEDSQ
SCRVAALEASGTLAALYPVAFSSHLVPKLAEELRVGESNLTNGDEPTQCSRHLCCLQALSAVSTHPSIVKETLPLLLQHLWQVNRGNMVAQSSDVIAVCQ
SLRQMAEKCQQDPESCWYFHQTAIPCLLALAVQASMPEKEPSVLRKVLLEDEVLAAMVSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSR
FQPFQDGSSGQRRLIALLMAFVCSLPRNVEIPQLNQLMRELLELSCCHSCPFSSTAAAKCFAGLLNKHPAGQQLDEFLQLAVDKVEAGLGSGPCRSQAFT
LLLWVTKALVLRYHPLSSCLTARLMGLLSDPELGPAAADGFSLLMSDCTDVLTRAGHAEVRIMFRQRFFTDNVPALVQGFHAAPQDVKPNYLKGLSHVLN
RLPKPVLLPELPTLLSLLLEALSCPDCVVQLSTLSCLQPLLLEAPQVMSLHVDTLVTKFLNLSSSPSMAVRIAALQCMHALTRLPTPVLLPYKPQVIRAL
AKPLDDKKRLVRKEAVSARGEWFLLGSPGS*

>MMS19_musMus Mus musculus mammal Q9D071 1031 aa 9 HEAT 89%
MAAATGLEEAVAPMGALCGLVQDFVMGQQEGPADQVAADVKSGGYTVLQVVEALGSSLENAEPRTRARGAQLLSQVLLQCHSLLSEKEVVHLILFYENRL
KDHHLVVPSVLQGLRALSMSVALPPGLAVSVLKAIFQEVHVQSLLQVDRHTVFSIITNFMRSREEELKGLGADFTFGFIQVMDGEKDPRNLLLAFRIVHD
LISKDYSLGPFVEELFEVTSCYFPIDFTPPPNDPYGIQREDLILSLRAVLASTPRFAEFLLPLLIEKVDSEILSAKLDSLQTLNACCAVYGQKELKDFLP
SLWASIRREVFQTASERVEAEGLAALHSLTACLSCSVLRADAEDLLGSFLSNILQDCRHHLCEPDMKLVWPSAKLLQAAAGASARACEHLTSNVLPLLLE
QFHKHSQSNQRRTILEMILGFLKLQQKWSYEDRDERPLSSFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPGLLSAEDLELAVGHLYRLTFLEEDS
QSCRVAALEASGTLATLYPGAFSRHLLPKLAEELHKGESDVARADGPTKCSRHFRCLQALSAVSTHPSIVKETLPLLLQHLCQANKGNMVTESSEVVAVC
QSLQQVAEKCQQDPESYWYFHKTAVPCLFALAVQASMPEKESSVLRKVLLEDEVLAALASVIGTATTHLSPELAAQSVTCIVPLFLDGNTSFLPENSFPD
QFQPFQDGSSGQRRLVALLTAFVCSLPRNVEIPQLNRLMRELLKQSCGHSCPFSSTAATKCFAGLLNKQPPGQQLEEFLQLAVGTVEAGLASESSRDQAF
TLLLWVTKALVLRYHPLSACLTTRLMGLLSDPELGCAAADGFSLLMSDCTDVLTRAGHADVRIMFRQRFFTDNVPALVQGFHAAPQDVKPNYLKGLSHVL
NRLPKPVLLPELPTLLSLLLEALSCPDSVVQLSTLSCLQPLLLEAPQIMSLHVDTLVTKFLNLSSSYSMAVRIAALQCMHALTRLPTSVLLPYKSQVIRA
LAKPLDDKKRLVRKEAVSARGEWFLLGSPGS*

>MMS19_cioInt Ciona intestinalis urochordate XP_002128657 1026 aa x HEAT 32%
MEKVKFEMEEMIQLWLRDKNDDHKILKCAQQIENREQTIGDLVTALGPHLTNKDTKIRIDACTLLSNVIHKLPKDCLNQGELESLVQFLCSRLEDHYTLQ
PVALSLLLQLSSADNLTGENACSIITSVFKEVHIQTCMQHDRLKIFQILGTLLDIHTKDVITMGRDFLYQYIQVIDGEQDPRNLLTIFQLTKNLIESSFP
LFDLVEELFDVSSCYFPIDFNPAAAGKKSTITNLDLVSSLRGVLASTKQFAQYCIPLMLEKLESDVESAKIDSLETLTACLGCYGKQELEKYLSSLWSDV
KREINQSSSEQIEKCCLTFLTSLLSNLSSWPVDQKSEKATDLKSFLDDVLEDCVPRLQAQSDDRSKWMAGHVVLACAKSSKKACSQIVTTVLPILLQNAQ
SKSASTTLAGQSVQQSALDNLVKLTAVCGQFNFENHPVLKKKEEFFTILNELALKSEIEEQLKCIAVAGFASLLKLEILSNVELTEIASLLLKMIKLKPE
SHLRGEVLSVAGYLSSQHPDVAKSHLIPCVMRRMEEGDDSCFDVLASVCTHFDVLKLVLGFIMERIVNTQVDETSEPLLHACLESLQKMTSSSWVGNTEI
EYMALNLVLPLLKRCIEVTLELSVPEQCCANCHIFEDVSKECASLPILKSAAIVIRNVCQKLKPGKSTDLVIQLIASLYNNSKLSSLDIKSDVHFTPFHP
KASPLQTRTLCFLPATICALHPNIEIPELAELETKLLNTCLHCTDQPSYVFAAKALSGLVNKYKKPSIPILEKLKSHFDTDPNWSLKSEEEKMMILTLLI
WICKALVLSNHPDSLIFIKNLLYWMGDDSVGEVAAAGFDIILRESNEVLSPSSHSTIRLMHKQRFFLLIIPEIVSSFKTSENKTQQTNILTALSHLIGHL
PKQVLMQHFTELLPLLTQALHTDNTQLLKSVLSTLFCFIQDTTEAMTAHLENLMKHFLRLSKFKQDIDVRVKAVQCIGVVTLLPPIVILPFKNDIVRHLV
SVLDDRKRDVRTEASKARSEWFLVGT*

>MMS19_braFlo Branchiostoma floridae cephalochordate XP_002588594 1027 aa x HEAT 42%
MAALSGNVQENVLEFVQGQQDSALQSVAKAVFDGETSLLQLVESLGSSLTSTEVTTRARATQLLAEVLHRSPSNRLTEKEAEVLSAFFCDRLLDHHSVQP
HVLHGLLALSAAPQLPQGEEVKIVQVIFKEVYVQSLVQTDRRAIYNILANFLDTRLEALQALGADFVWGFIQAMDGEKDPRNLIIAFSIARIVAQAFPIG
TFTEELFEVISCYFPIDFTPPADDPHGVTREDLVLGLRQVLAATSKFAQFCVPMLLEKLSSDVTSAKLDSLHTLAACAEVYGADSMKSFLDLLLSAISKE
VYSSIHQDVENAALAGLTAVVATLSHAVTETRSVFSLHHFLDSLLKGCKHHLCEPELKLMWPSAKLLQAAARASDPACVHVLDTAVPLLVEQFQVHPYPQ
HRHTILEVTIAFIHVAHASTSGTDAPNPVVPHSDNFLTLFYSVLEDADAGLRSSGVGGMAAMIGITDVVKGKHLDLCANHLGRLVLHDADPTVQRRSTEA
LAAMATAHPDVVREEVLSKLLQVLENNNPNAMDTNQSEQVCAKHVTNQYVLNTLAAVSTHPTIVRCTIPKLLSHLQALIESCPDQATQEAIATLDCVYKV
VEKTVINDANVEYFVDTIVLNLMSMALSAAVNTSDENLLHDTSLLEIVAKVLRAVARSLPNSTGKGIVNSTVQAFLQGNLAAISLNTSASFEPLDVSSPW
QQTQTVQLLSAIVCSVARNVDIPSISELAQKLLTLSCASDHEPTSLAAAKSLSGVVNKWDQGEQLQTFLQETRDCLEQILSKTEDEKARCRAVAVLVWLT
KALVIRGHPSGSQFTKTLMALFEDEAIGRRAAEGFYVILSDSPDVLSKESHANIRLMYKQRFFMENLPALVDGFNQADDGRKQSFLCAVSHLLTFIPRQV
LLGALPPLVPLLVQSLLGEDPSLQVSTLEMFSSLVQEAPQVISKNIDALIPQLLELSKNGPTMKVRMAALKSVGSMTSLPHAVVYPYRNRVVRELAVAVG
DKKRMVRKEAVAARGEWFLLGSPGGK*

>MMS19_strPur Strongylocentrotus purpuratus echinoderm XP_001194909 975 aa x HEAT 36%
MGTYLMSTETRIRAKGVELMSEVLTSLPRAFLNQQQIQVLIEFLCARLLDHHSITQHTLKGLLAMSSQSSFPPSSSVQVMTAIFKEVQVQTMLQVDRRTV
YNIVVNLLRISLTELQGMGSEFVLGLLHAMDGEKDPRNLILLFNILPTVINNFKIDMFIEETFEVVACYYPVDFHPPPNDPYGISREKLALSLKTCLSST
PKFAQFCLPMVMEKLSSDLQTARLDSYQLLQACAPVYSQGDMMSYIEAIWSYCRKELMVGASVELDQEAVKTLGAVVKAVSTGIQSTSGGGGGDGDLNSF
LRNILTECRQHLCKPDHRMIHPCSKLLVAVATASYPACIAILKYSVPILLDQFHIFDQTRERMVLLDIIQRLLHSGHDHIKEADDWRAIYAHHLDTVVTT
VLSTLAKDQGVPDLRMAGLGTLGELVQVPVLMDQSRLELVGQELTRILLEEANEDVRCECIETVSCFASRHTEFVKSTILTTLWTTVQKGESGYQRIVVD
MATIVTDTSDDLSRSLTSELMVEIAKTELNSEQHLVYLATLNTLTAHLSSAPSNLESLLSSVVHPLMKMVVSATLQSSVEAGNNPHCCGEVLVAMAEVFR
TVIPKLDSSMGSKLCQCAVDVFLHGNLTSLELTNPNTSVPFSPLDPRAPVHQTQLVTVLQPIVCSLRRDIHIPSSKQLMSSLLHIAAHSRAWLASTWAAK
GLAGGVNKHPAGSDLDEVLVEAESLLGQAMSSGQEGSVKQQALMAWVLLTKALVMRSHPKATAFLTTLLRLLEDAELGAQVPQTLGMLLEDMRDVLSEGL
HADVKIMYKQRVFLQALPFVMALFNKDDLRTKAITALCHLLPSIPRPVLLAELPPIIPRLVQSLRVTDPRPTLPILDILESLLEETLPSLVDQADTLLPT
LLELSAYQASMKVRIASLKCVGAITSFPHHLVYPHQETVVRSLAPRLDDKKRLVRQEAGKARTKWILLQQDTKG*

>MMS19_sacKow Saccoglossus kowalevskii hemichordate XP_002735310 1007 aa x HEAT 40%
MATSMCIEIVENYVRGEDESAIHAAEIKILELVENLGTYLTSTEKNIRCRGTRLLSEVLNRLPKNFLSSDEVRALVIFYCDRLSDHYSVTPHTLLGMLAL
STYDNLPKGCEVQLVQAIYKEVHVQSMIQVDRRSVYAILSNLLDTRIKDLQSLGRDFVFGYIQCMDGEKDPRNLTMIFRCVPIIIHNFPIDVFIEELFEV
VSCYFPIDFTPPPNDPYKVTQEELVLGLRKCLAATPKFAEHCLPLLMEKLSSDVQRAKIDSFLTLAECCEVYGEDDLMEFLPAMWSTIRREVFQAFSHEV
EKSALTCLCSIVKTLSNAVSNANKAAGGLDEFLDLVLKDCSKHLRDPGLRLMLPTSKLLQSAASASDPACYKIISAVVPILLEQFHKCKQVNERVSLLHA
ALDFIKVCKSFTFGDDTPSPVIPFKDSLASLFLSLLSDHSSQLRCIGITGLVGLMSLNAIMNINEKKLAAMHFTNIVLTDQDNKVCSEAVTALAFMSMEF
SLLVKEEVLPQLIKELDSRATGTRHRFIVNTLAGISMHSDIVLTTIPVMLQHLGTLSEDNTAESLETAVNTIQSIDIVVNSNISDEQCLDFFHSKLLPQL
LRITVDQALQVNNYILCKEDVVSSIATVCRNIAKVLDDRVASNLVSNTISLFLDGNLENIGLKQSSQHFRPLEISSPWQHTQLVSLLTSIICSMKTFELS
SQCLELMEKLLKLSLSSEHHLTCVSAAKCYAGLVNKHKQGTDLDSSLETVVESTCRMLQDEISDQNIYNRQKALTLWLWVTKALVLRAHFKSTQFTTKLI
SLFEDHQLSQMAADGFYIILSESQDVLNKDMHCDIKLMYRQRLFMQTLPRILAGFEKANEDKKQYYLSALSHLLQFIPKQVLLSELPPLMPMLVQSLYCQ
DVGLYVSTSDTLSMLIQDAPTVISLYVDTLLPQLLTLSTYQQSMKVRIAALKCIGLFVTLPTHVIYPRQKEVVRRLASVLDDRKRLVRQQAVTARGLWFL
LEAPKK*

>MMS19_dapPul Daphnia pulex crustacean EFX86854 961 aa x HEAT 38%
MAISTAIQKLRDSFNSEESANESIRCISQSIASKELTILKLVEDLQPDLLNQQNTHRCKAVSTIGTILEQLGPELKGLNEKEVELVTEFFCSKLKDHHSI
LPAALQGLHALSTAPKLSPGLARLISQSIFQDVHCQSQLQHDRRAIYKTLKNLLAFHLKELQDLGQDFVFGLIQLADQERDPRNLLILFSIFPVVARYFR
FEPFTEEFFEVFSCYFPIDFTPPANDPYAVTKEQLCDGLRQCLAGSPHFAEYCLPLLQEKLESDLVSAKVEALKTLELCCQTYQAGQLEKWVDSFWTGIR
REVLINVNTDDLEHASLDALAALSRAFTTDGEFNSPAFTKLLKNVLTECQGHLCEPERRLMTPSSYILLAICSGSAPACALIVSQVIPLLMDQYRIRPQS
NPRQFILNSLNKMVHAGLYGFTEENVAQSGLASLIPKLLELYLEVLKEDDAVLRNLSLQGLSHLIGTCLNHQDLEKVNGTLLDLLQKSTATDSVIAEIGH
FFCKSAEKNENLFLEQVLVKLLDIAVSGSIPTDGCARTIRPGITTGSTQSFDSFRTKGRTRNIPAIAPLGIENIGRRRSSGVTPSHLCLIEKSGFQFIFR
VILLDQNVGRNVFVTFSALYRKATEFINEQTEQYVSQHLARSPWTLSIMEATLGSLDATPSGHSLERLVNTLEPLTVCHPKADVRLSACRLMAALVNKLP
EGHELEAILDSLRRKWQDPSTDRCNSVCLFVWITKALLMRSYSKLNQYIQELVDSLNDPTHGYQVAEGFKTILCDTEECLNFNCHANIRLMYRQRFFQEV
VPRLLKLYRESESCNKAACFAAIANQLAFIPEGVLIAHITTLIPLLIQCLSTDQPAQLIISTINAFMGLMSDNVSAIEEYISSLVPRLLTLAKDGITMDV
RRLALQCLSELRKAQSIVLLPLRSEVILRLVPCLSDKKRLVRREAALARQKWIMLGQPGCN*

>MMS19_droMel Drosophila melanogaster insect NP_649519 959 aa x HEAT 30%
MTTPTRATLEKALKSDQKLVNSATQIAKDLTAKAYDISALAEALGFALSSPDMEERVAGTNLLSAVLIALPQDLLQERQLEFLSTFYMDRLRDHHNVMPA
IIDGIDALVHMKALPRAQIPQILQSFFEHTTCQSQTRSDRTKLFHIFQYLTENFQDELQAMAGDFVYGLINSIDGERDPRNLDIIFSFMPEFLSTYPLLH
LAEEMFEIFACYFPIDFNPSKQDPAAITRDELSKKLTNCLVANNEFAEGTVVLAIEKLESELLVAKLDSIELLHQAAVKFPPSVLEPHFDQIWQALKTET
FPGNDNEEILKASLKALSALLERAAHIPDISHSYQSSILGVILPHLSDVNQRLFHPATGIALVCVSGDAPYAADKILNSFLLKLQAADASSEQRIKIYYI
VSQVYKLSALRGSLQKLDTTIRESVQDDVIASLRLIEQEEFDAKKEDLELQKAALSVLNESAPLLNEKQRALIYKALVQLVSHPSIDIDFTTLTVSLGAL
QPVEVQSNFIDVCVRNFEIFSTFVKRKIYTNLLPLMPQIAFTQRILDLVMTQTFNDTTAEPVRLLALEALNKLLLLADQRFIVDVQQESNLLHKLIELGQ
KTEGLSMQSLEQIAGALSRITQQLPLSEQSAIVSEYLPGLNLSQSADLYITKGLLGYLHKDITLDDHFERLLTDLTQLSLNSDNEQLRVIAHHLLCSMVN
KMESNPANLRKVKKITEQLKVAIKKGDVRAVEILAWVGKGLVVAGFDEAADVVGDLSDLLKHPSLSTAAALGFDIIAAEYPELDLPVVKFLYKQKLFHTI
MGKMGSKLANYCVHHLKAFVYVLKATPQAVIKLNIEQLGPLLFKSLEEHNEAQSLCIALGICEKFVAQQDTYFQGHLAHLIPSCLELSKYKAQHTMQVRI
AALQLLYDVTKYPTFVLLPHKVDVTLALAAALDDPKRLVRNTAVKARNAWYLVGAPSPN*

>MMS19_nemVec Nematostella vectensis cnidarian XM_001629116 897 aa x HEAT 32%
MAALGQEEYPSLATLLQDVYQRRKNLLQVVELLGPSLTSTDTDKRCSAVQLLSSLLQKLVNYKLTDREDLKPVGSDLVFGFLQAMDGEKDPRNLVVAFKL
ARIIIKNFPIGLFAEDLFEVTSCYFPIDFTPFCLPLLMEKLSSDVINAKIDSLLTLVFQTVSSELEDAAFKALSSIIKNLESSSPGQEPFLSRIFINFYT
ISCYVTQCHPDVVEFKTPFLDCVIKECCANIEGADLRKVKPSGQLLQAAFVTDTTYNEITSTAVPLILSKYNDEATQGLVKKLLLDVLLGLLTASKPYYK
RKGSVLASHTSALVDVLFSALVSDSPSLCRAAIAGLVSMVTLPGLLLEQKVGMFVEHLTSFVLNTKDLTVRQESNAALAFLAMEFPELIKTKLVSVLAEQ
LQKEDGSAMDEENISHLQSDKSHPQYDQMLNTLSAVCTEEGVVRHVVPIILDHGEYLVTGKDLERGVLHGKISETLKCLNSIVKGTLQSSTVEPNYYTEV
VIYRIIDLCTQSALQESPDCPMATPEALALVCSIVRQVISHLAVNEAEDVLHIIVSNFIEGKTPLSARAEQKFAPLEPSSPWQQSQLVTVLMAAVCSARR
EVRIPRQKELVPRLQVLASGCNHRKTTVAASKCLGGIINKMAQGDDLTADLHSLKGQLQNHMDGNEEQRWRAVITWLWLTRALVTRSHPMAQEFVQKVLH
LLDDVSVGRVAADGFYVIVSDCDDVMNQAMHADIKMMYKQRFFMETLPLLLKGFHDTRPECKYLYLCALSHLLQWIPKQVLLTEIPTLMPMLIQALSRDE
PSLLLSTLQTLYSLVFDAPEVISRQVTSLIPNFLELAKCKASMKVRMEAIKCLGAMTTLEHHVVYVYKARVIKELACTLDDPKRLVRAEAVKCRNEW*

>MMS19_triAdh Trichoplax adhaerens placozoan XP_002114595 959 aa x HEAT 39%
MEKDSSAKSLQQLMDEFILGNSSAINEIIKGIYDGHIKLSTIVELLGPYLTSVEHEKRLQGMKLLSEVLQMLSMYKMQATEVQLLVAFYSDRLQDHFSIL
PETLRGILALVQHQIISEEDAVTIVKGIFKEVQNQALLQADRNKVYAILAGLLDKHYEGIKIMDADFVFGYIQVMDGEKDPRNLLLALKIAKFIVQNFNI
DLFLDDFFEIISCYFPIDFTPPPNLPSNENVTKEDLIIVLRESLTSTRKFAGISCAKIYTATDFQEYLQPIWTAIRQEVFLSMDDQVQELSLEALKHVVV
TISSNSLQQPDQDPLNDFINMIVTETQQYLQDPELKLANPCGNVLNAVASASDRSCYSILTPIIPRLVNLYSTDKTVIFRCKVLDILIKLLNAAANCQLS
EQFIAPMDWHEIVKLLQLAMDTSEEDIRLRVTASFSILIQIKDALPADEIERISNDILKRALEDPSSIVRHGSISTLATIASVLPDVIITTVIPYIRTSV
TNLQLLLQCLANVKNRIENCLYLYHYLFDDILWLCVYNSLEESINSFEFKTIKIIASIGQLIYLNLDESSQKKFIDNLLELFMNGQVSVLKPMTVIDELP
LKQFYPLNVASSQRQVQLIEILCKILGAIKFRDGILSPNDMITNLLDISCKSVHQPSATSAAQLLSSIINKMEEGDQLENYIKSITNTICNVLYSKNVET
EMKNAVNTWIWMFAILCRYSCSLFHYSNFDIKTSFDASFQLMKALIMRSHPYSNEALIQVLKFFKLPNVGHVASAGFKIIIGDEENILCESTNAIVKFMY
KNRFFMMASEKLMENYRIASKGIKHHYLTALSHLLNGVPKQMLLNHLQMLMPLLVESVSCDEESLRLSSLQTLRPLITEAPDIISNYVASMLPELLKLCN
FPSSMKIRISALQSVNDLASLPIHLVVPYKSKVINELGNTVNDKKRLVFTVINPKKQQ*

>MMS19_sacCer Saccharomyces cerevisiae budding yeast P40469 MET18 1032 aa 13 HEAT 29%
MTPDELNSAVVTFMANLNIDDSKANETASTVTDSIVHRSIKLLEVVVALKDYFLSENEVERKKALTCLTTILAKTPKDHLSKNECSVIFQFYQSKLDDQA
LAKEVLEGFAALAPMKYVSINEIAQLLRLLLDNYQQGQHLASTRLWPFKILRKIFDRFFVNGSSTEQVKRINDLFIETFLHVANGEKDPRNLLLSFALNK
SITSSLQNVENFKEDLFDVLFCYFPITFKPPKHDPYKISNQDLKTALRSAITATPLFAEDAYSNLLDKLTASSPVVKNDTLLTLLECVRKFGGSSILENW
TLLWNALKFEIMQNSEGNENTLLNPYNKDQQSDDVGQYTNYDACLKIINLMALQLYNFDKVSFEKFFTHVLDELKPNFKYEKDLKQTCQILSAIGSGNVE
IFNKVISSTFPLFLINTSEVAKLKLLIMNFSFFVDSYIDLFGRTSKESLGTPVPNNKMAEYKDEIIMILSMALTRSSKAEVTIRTLSVIQFTKMIKMKGF
LTPEEVSLIIQYFTEEILTDNNKNIYYACLEGLKTISEIYEDLVFEISLKKLLDLLPDCFEEKIRVNDEENIHIETILKIILDFTTSRHILVKESITFLA
TKLNRVAKISKSREYCFLLISTIYSLFNNNNQNENVLNEEDALALKNAIEPKLFEIITQESAIVSDNYNLTLLSNVLFFTNLKIPQAAHQEELDRYNELF
ISEGKIRILDTPNVLAISYAKILSALNKNCQFPQKFTVLFGTVQLLKKHAPRMTETEKLGYLELLLVLSNKFVSEKDVIGLFDWKDLSVINLEVMVWLTK
GLIMQNSLESSEIAKKFIDLLSNEEIGSLVSKLFEVFVMDISSLKKFKGISWNNNVKILYKQKFFGDIFQTLVSNYKNTVDMTIKCNYLTALSLVLKHTP
SQSVGPFINDLFPLLLQALDMPDPEVRVSALETLKDTTDKHHTLITEHVSTIVPLLLSLSLPHKYNSVSVRLIALQLLEMITTVVPLNYCLSYQDDVLSA
LIPVLSDKKRIIRKQCVDTRQVYYELGQIPFE*

>MMS19_schPom Schizosaccharomyces pombe fission yeast Q9UTR1 1018 aa 14 HEAT 25%
MSSNLVALYLFSIDRSQDEANDVVDRIVEEIVTDRMGIVDLVTSIGEYLTDNNISVRAKAVLLLSQTLGELPKDRLPAKHVSVLLQFYLSRLDDEVTMKE
NALGIGALLNMQNFPAQKIVDVCKALFSSTDMPKYAQATRLNILKVFETIIDNYLFFISSQTRDAFFSGICSTFAGEKDPRNLMLVFSMLKKILSTFPID
GFEQQFFDITYCYFPITFRAPPDATNLAITSDDLKIALRETLVANDAFSKLLLPALFERLKASTVRIKIDALNIYIEACKTWRVGAYLWSAKDFWESIKQ
EILNSTDAELQNLALGALNTLASKFYKEEGFSSSFTEFVDMILIQLSQRLLEDVNVKSCGSCAAVFASLASISVETFNYCSCNFLPSVLDLPMVNEPLEK
QKGMLVFLEYVYKCLVLLYGKWRSKNQADIDNPLLVYKDKQLSFVSGSLMGTAKDETEIRMLALKVIFLMASIKNFLTESELTMVLQFLDDIAFDFSDPI
KKKATECLKDLGLLKPDFLLLTSFPFAFSKLTDDVTAKSSSEETFKQYLSVLVSISEERSLFKALVIRLVEMLKDQFKSKEMSVDLVESIVQSLSVAFKE
RNDRNEQEIPFFFEELLKQLFTLCFANCESMNVRCLIYVSQTINEIVRVNHFEFQEKFVGQLWKLYMENSNSDLIETEGCEKAAERFTLAASLSDQKFLN
LVVLLQGGLNGLSKKLHFIEKLNIELLNLLINVVFVTESPGVKISALRLISSLINKCEKDEDISSFISSKGVTSLWDKVYTGTPKESEAALDVLAWVDKA
LVSRKHSEGIPLAFKLLDTLNLQNVGDSSVKALSIIIKDDPALSKENSYVEKLLYKQRFYASVSPKILEHISTATGGEKSLYLMLLSNVIGNVPKEIVIP
DMPSILPLLLQCLSLSDISVKLSTLNVIHTSVKELTSLLTEYLDTLIPSLLAIPKDMNNPTVVRLLALKCLGSLPEFTPTTNLQLFRDKVIRGLIPCLDD
PKRVVRTEASRTRHKWYI*

>MMS19_araTha Arabidopsis thaliana plant NM_124186 1134 aa x HEAT 28% armadillo/beta-catenin-like
MMVEPNQLVQHLETFVDTNRSSSQQDDSLKAIASSLENDSLSITQLVREMEMYLTTTDNLVRARGILLLAEILDCLKAKPLNDTIVHTLVGFFSEKLADW
RAMCGALVGCLALLKRKDVAGVVTDIDVQAMAKSMIQNVQVQALALHERKLAFELLECLLQQHSEAILTMGDLLVYAMCEAIDGEKDPQCLMIVFHLVEL
LAPLFPSPSGPLASDASDLFEVIGCYFPLHFTHTKDDEANIRREDLSRGLLLAISSTPFFEPYAIPLLLEKLSSSLPVAKVDSLKCLKDCALKYGVDRMK
KHYGALWSALKDTFYSSTGTHLSFAIESLTSPGFEMNEIHRDAVSLLQRLVKQDISFLGFVVDDTRINTVFDTIYRYPQYKEMPDPSKLEVLVISQILSV
SAKASVQSCNIIFEAIFFRLMNTLGIVEKTSTGDVVQNGNSTVSTRLYHGGLHLCIELLAASKDLILGFEECSPTSGCANSGCSMVKSFSVPLIQVFTSA
VCRSNDDSVVDVYLGVKGLLTMGMFRGGSSPVSRTEFENILVTLTSIITAKSGKTVVWELALKALVCIGSFIDRYHESDKAMSYMSIVVDNLVSLACSSH
CGLPYQMILEATSEVCSTGPKYVEKMVQGLEEAFCSSLSDFYVNGNFESIDNCSQLLKCLTNKLLPRVAEIDGLEQLLVHFAISMWKQIEFCGVFSCDFN
GREFVEAAMTTMRQVVGIALVDSQNSIIQKAYSVVSSCTLPAMESIPLTFVALEGLQRDLSSRDELILSLFASVIIAASPSASIPDAKSLIHLLLVTLLK
GYIPAAQALGSMVNKLGSGSGGTNTSRDCSLEEACAIIFHADFASGKKISSNGSAKIIVGSETTMSKICLGYCGSLDLQTRAITGLAWIGKGLLMRGNER
VNEIALVLVECLKSNNCSGHALHPSAMKHAADAFSIIMSDSEVCLNRKFHAVIRPLYKQRCFSTIVPILESLIMNSQTSLSRTMLHVALAHVISNVPVTV
ILDNTKKLQPLILEGLSVLSLDSVEKETLFSLLLVLSGTLTDTKGQQSASDNAHIIIECLIKLTSYPHLMVVRETSIQCLVALLELPHRRIYPFRREVLQ
AIEKSLDDPKRKVREEAIRCRQAWASITSGSNIF*

>MMS19_dicDis Dictyostelium discoideum slime mold Q54J88 1115 aa 18 HEAT 31%
MTSNITELNKWIEGYVNPQSEESVKTNAINMVLLYMKSNKIDLQDVVQGLGDYLKSNDSILRARGTLLLSEVLCRLPDLPLNQDQVHFLAMFYCDRLQDY
ACSSEVVKGITGLITNHTPDYPDNQKLLRNIFSEVHPTSLTQAHRKMVLQVIDIMFNKCLSEIQELKNDFMVGYLQFIDNEKDPRNLIFSFKLLPKVIYN
IPEHKHFLESLFEIISCYFPISFNPKGNDPNSITKDDLSNSLLNCFSCTPLLAEHSIPFLIDKICSNLIETKIEALQTLVYCCDRYGGFAVQPFLEEIWS
TLRTLILTHKNTTVIEESKKTIFYLTRSFTKERKVLESFLSIMIKECLHHIKSSQDSKIAIYCASILYQSVSASLLSSKIILIHIFPNLFNFLSELQKQD
TVQKVNEQNSVIALFNDLLKANSIAFEMYSNENKEPNPLEPFVDQLFKLFSDLLLLNSSSSIRSNSIECLSNLYISKKVHTTEQDDDDSEQITNEFLLDL
EKRQFIIKSLVSLLNSSDNTLRHKSLDSLFTIASNEDPSVLNLYVIPTLLQMINHSSCNINTTNNKINNNNNNNNIVIKNNKCQDEHCNEDHSNKNENNN
NSNENSNGNSTSGSDDDLKHYLEAFTKLCTHQPLLESVIPQIQVLLQHNIKETYQSNEDFEKSILILQSISFILEKSTNIKSMTICSKSILFPLIKGLYK
QELISSSNDNNNNNNNNSNRFNQILTPTLKMIHSIFENISIESQKPLLEKLIKLFLNGDTLVINYQLPTTTTTIIKPFEKSSPYKYLIPIFTTIISQSKL
DLSENNELKQSLYQMSLDVNVDDSIAISCSKAYSSIINKQQQQQQQDQINFNFFNDNLLKVINDTTTPLPLKIRHLDLFTWCTKALLTNGNSINIKLGSC
LADIISNENVELSYHASKSFGILLSETDVLNEKSGSIIKILFEQKFFTLMFPILLESFKVSKNKELQTISSHYLIAISNLLKHVPKEILLAELNEILPIV
MQSLKSSDNNDQVQLLDSSLQTLTMLINETPSSFISYLDSLIPSLIKISTKSTKYNLKRSALEILTLLSKSIPFVNLFPYKTQVVTDIIPCLDDKKRIVR
REAQKCRNSWYILQK*

>MMS19_pytUlt Pythium ultimum stramenopiles ADOS01001616 957 aa 4 ARM units 31%
MFSLDAPLAPAIDAFVNPENDDNVHKTSLNTVVMQVHRKVSMEALIQALGLHLTSTDDKVRARAMQLLAEVLSRLPELPLTPNAVQLLVDFFADRLADYP
SASACLQALLALESNHAKKIASPTVTIILIQKMVKVLHVPQLGQAMRKQCFELMQLALGQKVVVDVLVTAPESSSIDHGLLFAQAFLNAMEGEKDPRNLL
LCLQIARELLAKLEVVFDRHDAVLQQYFDVVSCYFPITFTPPPNDPYGITSEELILSLRKAFAASDLLAKHVLPFLLEKLSSTVVEAKLDSLQTLVFCCE
AYSINVALLHMLSIANALYHEVVKGEKKEVIEASLRAISRFSSVIGLAKTKAAGGAAYAWNKFVVELTTRAMSDLTGHATDSLVSVSAGQVLAALGKDSV
LGFTHVLETSVPLLIQQFNESSTSTESKCEASLARLLLIVNTIDREVDQSASAQPMRPHALVLIDALVAFLSNNEALSTPTAKCSAIEALSHLVTYPPSP
IVEIAQVKALVELFINFLLFDASPEVRRECLQSLRAISTIKQKATVKNYASLVMEIALTQLMDAVQLSAQNTKVAAVLASSGRDHPEFFNDVLDSITQLS
QEASLFQATIVRLVDFCVVENQDSNKITFVANSSANGTQAHVDGILNAVAKIVELNADDKASMEFCVTSGGDNSIVFRLLKAVTTTAADAAAQNALLDDA
KLASCARIFRTPMQNVSTETQQLLANAAISAFLTTQSTGASASHPAYLQLVPLFSAVINSANRNLNLPETSRVINTLLELAQSSTAVYHTTASTQQIEQI
SSEAALSAAKSLASIVNKMSDGEEFDALIVLLLDQKLSQIIANEQKDVSVRVAALQIYVWIAKALVIRGHREHAPACLFFLCKFLTPETSDARSQIAMHV
AKSFKLLVTEFPDVLNRKCGAFITVRQHKKKYAGILGNADLTFYFVCVVPVPSANV*

>MMS19_sapPar Saprolegnia parasitica stramenopiles ADCG01000470 804 aa 1 large ARM repeat 31%
MFSLDAPLQPAIDGFVDPENGEQQHTTHLNNVVMSVHRKTPIEQVIQGLGAHLTHVQDKRRARATLLLAEVLTRLPDLRLSSDTAHLLLTFFLERLKDGP
SMAACLKALVALISLHAALLPANDAWTVCATCHAWCERAVVETLLNLPTPIASLSQSMRKQSFELLQLIVRRGALGDHEGRVLMDGFLRAMSGEKDPRNL
LFCLRFAAELLTTYANVVDADVAKGFFDATSCYFPITFRPPPNDPYGITSEDLVLALRSVFVGHDSLAKHVLPMVLDKLSRTTVVEMTKDILETLAFCCA
KYPLNRLLLHFTPVAAAVYHHVLHGDNTAVIAVAIDALKTITRAVSPPSKLPGMQALAWNKCIVYLVNQAVEDLAHQAPDSMVSTGAGHVLCAIASVGVA
GFSHVLSSALALLLEQCAAQAGSPAEAATARLVQLLGCIDAEVDHSAPPLVPYVSAIQTTLVHGLETATSSRQQKLCLQGLRCLVLRPPSPLLDDASLEV
LLQGWTSTVLSNPFPDVRDEATSTLQAIALKSPGLAQIVLTRCVPSFLQVLEQPAVLFFASWCGDMDDGLGQCSVWAVDRGHGARHPRGPHAALARPRHL
SAPPAAVPDQLDAPVCDRDDGGRGRHCPRQQGLGRVHGLRHPRHSHFIVGALPPRRRARRARPDAVDRRPSDRAGDGQHERACHAVRVPPGANNAAHVDL
AAGLDVGHFALAWPRPATATIERSTLLVRRGVDGARTRRAAAPLAAAVYARRAEQRRRREHSQGACGALQRPPRGHAAVRCVPKVDHVARCPPRRHGVAG
GRHS*

>MMS19_polPal Polysphondylium pallidum amoeba EFA86574 994 aa x HEAT 28%
MSKANIDSYININNNDQTKQTSLNILLLEINANKLSIHQLVEYLGDYLQNTDSILRARGTLLLSEVLCRLPDLKLNEAQVEFLAAFYHDRLQDYACASEV
VKGVYGLCVNHKVPYPHNQKMIRAIFQEVHPSTLVQTHRKMVLQLIEHLLEHNLTEIQELKGDFMAGFLQFIDGEKDPRNIIYTFRLIPRVILYIPEYKN
FADSLFEILSCYFPISFNPKPGDPNSITKDDLVSSLLNCFGASTYFAEHCIPFLIDKICSNVVDTKIESLKTLLFCCSKYGPVALRPHLDDIWGTLRTQI
LTQKSATVIDESKKTMFYLTRVLAADQETLQSFLSMVDKECLHHIKTSQDSKLAVSCASILFQTVSASVKSSRIVLSHILPTIIDFFKELSLHLSDDPIH
KANEQLSIIGLFNDLLKANNISFQYNNENIDKEINPLEEYKDKLYDLFIGLLSNSSALVRTLAVDCLANLYVTRHIKTSVPITFVLDQEKRQSIIKDLGV
WLLIQIFRNKSLEALMSITKLEQVEQMNLFAIPTLLQMINANQSKNVSESKHYLEAFSQLCTHQPLLQSVIPQIKTLLEHSIKKKYINNDEFENSLLVLQ
SLENTFSNSIDEQTMTICYREILLPLVKELFEQVFSLDVNSQEQKDQVLGIMKPAISMIHSNNKKEAIELFINIYLNGDLSALQINKEFKPFSSDATEQA
KLLIPIFTSVISQSKFELSTNKLLKEMLMSRALDSNVEESISNACAICYGSIINKQTDQTDLPLDHLEQLISSSSTNKTQALNLLIWIEKGLVTNGNPQS
IKVGELLAQLITSENTEISQKAAKSFYILLSDHDTFDHKSGAIVKRQKNETVSSQFLVAITNLLRNVPKEVLLGELQEVLPIVLHSLHSNQRDLLNSSLQ
TLMMLVDEASTSISSHLDSLIPTLIKISVNGESLTFRQSSLEILTRISRAIPYPKIYPFRNQVINGIVPALDDKKRLVRREATKCRNSWFILQ*

>MMS19_entHis Entamoeba histolytica amoeba XP_651925 868 aa x HEAT 25%
MSTPAQQLNEFIESPKVIKEGYEIIDQLMKNNYNVNSLVTDLGDTLPSEDERIRFRATSLLTYCLIKYPIKEESKDVFVDYLASRLVDAVCLEPILTALL
QLVTKKPSDEIINEIAMAYSCMRTQLYTKEVRILVYQFYKVFINYYQATEVIDTCVQLIELERDPECLKEVFDLIKLVSQKNEIDADSAPLLFDCASAYF
PILYPPKGDEALRIDLTNKILDAFVSAPIYAQFALPFLLDKLDADLSSIKLEALKAIYFCIQRFELKYVYAYFTHIWESIEQNISTVGVVEVNEFAFAIA
SYFCSLDDFHSKNLMESIKMFCLRMMSETDEIIINFVNGLLEELTKKSEKFFKVFVPVFIQCFHDQLQDADDRPKEQFERELFIVRLIYQRIIEGMPLLD
CVKGQVAWDLHRLATPLHPCFVSLLDIDVSLALLNLLGEQRMVPFQNAIELSENKCHDAIPILQRLYEKEEDVMISLLPANKIITNLELVSGIALHSPKL
FEQLLKLIPTLQSNEYVPVFQSILSDALPFNCLDVYVNHCIPVFIVITNGVLSPLFNTLMNRLSRLHSILSSKKISELTEGVLSQLKEHSRLLLILPSLL
QFYQPENLIVYLNEVQEVDKDTIAIYSLLISKLTNIIPHVLEQNKEYFNGYKTIQELDSHESNKQATPIFIEELCRMNNKAIECLKEMIVFDSINKKNEL
HWKEELFNLVYERFIESHQVTTVEESHIMILLFSLLPTEKLLTYESTVLKIFNIICVPTSHLNEIDSVVVLLFNILPTVSQYPMSLIESELDSIITKLFN
VLYINGTTIKYRCDIIDLLTRIRVVYGIDAIRPYQKNVIKKLLVPLDDNKRLVRRSAAICRNIWETTA*

>MMS19_naeGru Naegleria gruberi early eukaryote: heterolobosea XP_002678884 1070 aa x HEAT 27%
MQTSSNSNGEQELISLIDSLCNPTLPNTNKESLKSKLIEFVVNGTLTINEGIKLLGEYLNHATDDRIRGAAYAVLDLILENIPNNVGSETDETKQTTQLK
LVASLLRFIGDRFYDFDCLATLLPCLFSLFKKWSSYISSEQAINVVLQFFENVNIQSIGSTSGVAHATKTRSLCFEWFSLLADRFPSIVRTIDFLNGFIQ
SLEGERDPGNLLYCFNLIPKVIAIFDDSELSSKILSAVSDDLFDITSCYFPITYTPPANDTRGITREDLSRSLKLCFGCNKFFAPTLFPFLLEKLSSDLV
DTKLETLDYLCYCIEKFGEVNSREYLTEIWSYIKAESVKTNSMDVMKKCYESITKIARIVIIPNDPSNKPFDIPNIEAILRTALLELKSKEPKFAAQYAR
MIYACAVPTFEISMMVFNRVMPELVATLSESDTKDKLYGSLLMITQLLQAVAEQKGENQLPEVVFNLISQVQTVFLSIYEEEFSKNDKEMILVMVETISR
IAIFRIPSTLLRDIYVSRILLKSYGEENKSFSLLHATSLEEYKERVIKDIAWIYKYAPDIVSEDILVPLFGALYGCENKSEHINRILSNISAVGKVCPSM
TPSITHRLFERIESIPISESHYEHERVKVFETITSLDVSLIPAHDKVSYIQRIVKMSVTDSSSQMVDDSDTMDCSDSECAHVHHNQGNFSFLTLLLGRSL
ENELQQVVLDSVLQYANSVPSTGLKNFISVLSAIVIACRPTVGMGNLITMTDSLLQMALKGEQPSQVTKCIAQLVGSVLNKLPLDSTEFQQLITICNATV
FDAFSQMLTVYNGDSESAERYIEMVSWILKGLVMRGAYVPHADRYSSLLCGSLVFEYNSSKVNKKVAEGFLIAIGEDETSIHKENHAIIQVLYKQRFFAT
NVRKLMDSINTVTQPHIIGSILLALSNLIHNVPTKVILSEVKNIFPIVLKFLEMRQILIEQDNNSEDLLYAAIKTTLTLLSDAKEEMSVHLSSIVPILLD
TCKFKKSQAVRILSVEALLELTNGYKYYEIYPLKKDIIKGLEACLDDKKRKVRKAAVKCRNSYFVLSNNQ*

>MMS19_phyInf Phytophthora infestans early eukaryote: stramenopiles 1114 aa x HEAT 33%
MVSYEQLGSLPQKGSQNPVVNQKLEAIAMFSLDAPLAPAIDAFVDPENDDAAQKTGLNTVVMQVHRHVSMEALIQALGAYLTNGDDKVRSRATLLLAEVL
TRLPELQLTPSAVQLLMTFFADRLADFPSASACLRALLALETHHAAQVQSPRTTVALIPKLGKTLHIPQLGQAMRKMCFDLMQLALMQSTVVELLLDSVP
ASKDAQDASVDDAEQSEDLGRQLAQTFLSAMEGEKDPRNLLLCMQVARTLLSKLEPVFSRSDTLLQQYFDVVSCYFPIIFTPPPNDPYGITSEGLILSLR
HAFAASDLLAPLVLPFLLKKLASTVVEAKLDAIQTLVFCGERYSVNALLLQMHAVATALYDEVLDGEKQEVIAEARQAISRFSGVVARAKAQDTPGAAYA
WSKFVVDMTARAAGELRENAADSMVSVSAGQVLAALGRESSMSFAHVLKIAVPLLVEQLNNESSGSDSVPSKCEAALARILLLIDTIDREIDQSGQGQPM
RPHAAALIDALVNFLSSDHDNQTKPGSSPTARCVAVEALCHLLTFPPSPIVAPAQVKALINLFTRMLLLDPVAEVRTACLQSLKEISTVSTASEGSTNSG
EHPVTGGYAAFVVEISLARLMAAVSEGSDQEDDDDEEGTGVAAVLTASNRNFDSFFEEALLAITELCRESSIFQATIFLLIDLCVEKGDGKQSAIGFCEA
EGDATRQRHVDCILDAVAKIVEINAGDRTSMEFCVKASSSASIIFRLLTAVETLAARATASSGYKSGLVDEVKLSACVRIFRAVMQNVSSATQQQLVDAV
VPAFLRTNTSEPASLQFVPLFAAVINSAARDVALPDSSLVINRLLELAQSGATAVSESPPRQLQLVYTDAALSAAKSLASIVNKMSDGAEFDALIDLLLS
RKLAVVISNSAESFTVRVAALQIYAWIAKALVIRGHKVHAPVCLRFLCSFLTPDGDVNMEQEGDDQHAAALRMEVAKTFKLLVSEYLDVLNRKCGAFITF
LYRQRLFDLVFPVLLEYIRARIDEESSVAALVAFAQVIAHSPKAIYLPHLAQIFPLMVQALNTDDRELGSAAIQTFKPLLLESVESAKPFLKDVFPGLLK
QAQFGYVVSCSDS*

>MMS19_albLai Albugo laibachii early eukaryote: stramenopiles 1077 aa x HEAT 27%
MFQLDAPLSPAIKKFIDSGASNDEETGQKTSLNAVVMHTHRIGSIETLIQELEPYLTDDCNDFARARATLLIAEVLTRLPDLPLSGNRIQVLNNFFCARL
DDPPSIPASFQALLALQKHHSTEIPDSENMELVIRISDTLHVPQLNQPMRKRYYELVYLVIQQERMQKALSRSQQAQVFIRSFLNAMTGEKDPRNLKHCF
QIAQTMMQKLEMVFQEAELSEQYFRVISCYFPITFTPPPNDPYGVTTEELIRSLRNVLTASDVLIHQMVPFLLEKLSGSMSEEAKVDALDTLGHCVETFS
LKNLLLHIRSIGQVFYHEILNGERARVIETASNVLSRVSSVIGRAKVQGSSGSGFAWNAVVVTITNQAVEKLHENSVDSMSSASAGKVLASMSRESLVVS
THVLNTSMPLLIEQVKHSFEASSSQCEAALDRLMLFVDTIDEEVEQISTIHPIHSHASPILEALVKFLEEDTPTSTPNAKRLSIRIISHLVIYPSTPVVR
PSDVERIVRLFTRGFLSDASKHVRSEFLSSLKALSGAIKTPSTLQSVHCKREKTLQLYGTLLKEHCIAQLLALVQDGKSPEAETFQKSSCRTRKDFEQDT
LAAITELSHDPVIFKEAVVHLLQSCFIDQDGLLIFRSFEVEHTLQFFQAVATIIELNASNASNMEFCASIDDQNGIAFKLLDAFVSMAMSNGQSKEQKFL
PPNAIAFSTRILRTIMQNICFDTQQKLLDRAISRFHPILQTEESTPSQHLYQIVSAFSTVINSANRSLAFPKAYCVIDSLMAVSRSITTESHGYTNEIVL
LISQSIGSILNKVRDKHFEAKVESLLTGLSQSIHNDQEQAQWHTSIEVYIWITKGLLLCGHPKYSSQSVAFLTQLLIHHSDKGVRGQVAEGVRVILTEFP
NVLNRKCGASCNMLFRQRLFELVGPNLLAFISKHSEETTEALTGFCYIVAFSPKAAFISLISTIMPLVLRGLSSDHVELGAAAIKAYKIVSDTSIEHVKP
FLKDVFHGLLQQAQHSANALDRKDALECIGMLTTLPYELIHSYKDRVLRQLLFCLDDRKRFVRYTAVRVRNKWSVL*