Opsin evolution: RBP3 (IRBP): Difference between revisions
Tomemerald (talk | contribs) (New page: == RPB3 (IRBP) (96 marsupials): introduction == Interphotoreceptor retinol-binding protein, poorly named by IGNC as RBP3 despite its complete lack of paralogs, is a 4 exon 1247 residue gl...) |
Tomemerald (talk | contribs) mNo edit summary |
||
Line 113: | Line 113: | ||
equCab EALQDYYTLVDRVPALLHHLASMDFSSVVSEDDLVAKLNAGLQAVSEDPRLLVWVVRSK | equCab EALQDYYTLVDRVPALLHHLASMDFSSVVSEDDLVAKLNAGLQAVSEDPRLLVWVVRSK | ||
== RBP3 | == Reference sequences == | ||
These are organized in three ways, as intronated genes showing position and phase of intron breaks, as parsed into functional modules (signal peptide and spacer residues are dropped), and as modules organized by type 1-4. All three sets are in phylogenetic order with respect to the canonical deuterostome tree. | |||
=== RBP3 from human to amphioxus === | |||
<pre> | <pre> | ||
>RPB3_homSap human | >RPB3_homSap human | ||
Line 358: | Line 362: | ||
2 GVTPDIIVPSEKALTVALRKIQGSEDTKMAASSGNIEPPRWTVYLVFICTSIAILTYPTFM* 0 | 2 GVTPDIIVPSEKALTVALRKIQGSEDTKMAASSGNIEPPRWTVYLVFICTSIAILTYPTFM* 0 | ||
</pre> | </pre> | ||
== RBP3 proteins parsed into constituent modules == | |||
=== RBP3 proteins parsed into constituent modules === | |||
<pre> | <pre> | ||
> | >M1_homSap | ||
GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG | GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG | ||
ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL | ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL | ||
Line 374: | Line 379: | ||
DITVPMSEALSIAQDIV | DITVPMSEALSIAQDIV | ||
>M4_homSap | >M4_homSap | ||
ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ | ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ | ||
IPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQVSRLLVEHIWKKIMHTDAMIIDMR | |||
FNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVV | |||
GERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTNLYLTIPTARSVGASDGSSWEGVGVTPHVVVPAEEALARAKEML | |||
> | >M1_bosTau | ||
LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP | LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP | ||
RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY | RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY | ||
Line 417: | Line 422: | ||
>M4_monDom | >M4_monDom | ||
LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI | LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI | ||
PEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR | PEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR | ||
FNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVA | |||
GERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTDLYITIPTARSVGSGDKPSWEGVGVAPHVEVPADQALSKAKEM | |||
>M1_ornAna genome rife with frameshifts, dels, misassembly | >M1_ornAna genome rife with frameshifts, dels, misassembly | ||
Line 426: | Line 431: | ||
DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG | DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG | ||
VLPCVAVPADRALDEALDIL | VLPCVAVPADRALDEALDIL | ||
> | >M2_ornAna | ||
LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEE | LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEE | ||
EEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGA | EEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGA | ||
LEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFLTLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVL | LEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFLTLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVL | ||
> | >M3_ornAna frag | ||
FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA | FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA | ||
>M4_ornAna | >M4_ornAna | ||
LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ | LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ | ||
IPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQVSELMVEHVWKKIVHTDGLIIDMR | |||
NIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIA | |||
GERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHLYITIPTSRSVGSEDGSSWEGVGVTPHLVVPADVALSRAKDL | |||
>M1_taeGut Taeniopygia guttata | >M1_taeGut Taeniopygia guttata | ||
Line 443: | Line 448: | ||
RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV | RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV | ||
MPCVATEAEQALQKSLDIL | MPCVATEAEQALQKSLDIL | ||
> | >M2_taeGut | ||
VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE | VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE | ||
KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH | KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH | ||
Line 455: | Line 460: | ||
>M4_taeGut | >M4_taeGut | ||
ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY | ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY | ||
IPEHAKDSIPGIMPK | IPEHAKDSIPGIMPK | ||
QIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSELLTQLSDLMIEHVWKKIFHTDALIIDLR | |||
YNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLK | |||
GERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQVDETNFYVVIPTSRSVTSADSTSWEGKGVSPHIETPAETALIKAKEM | |||
> | >M1_galGal | ||
IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ | IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ | ||
EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR | EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR | ||
PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM | PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM | ||
PCVASEAEQALKKSLDIL | PCVASEAEQALKKSLDIL | ||
> | >M2_galGal | ||
AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM | AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM | ||
PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL | PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL | ||
FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG | FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG | ||
ESWMGGGVVPDAIVLAEDALEKAEEVL | ESWMGGGVVPDAIVLAEDALEKAEEVL | ||
> | >M3_galGal | ||
LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH | LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH | ||
SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP | SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP | ||
RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK | RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK | ||
VWSVSGAEPHITIQASEALAAAKHI | VWSVSGAEPHITIQASEALAAAKHI | ||
> | >M4_galGal | ||
ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI | ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI | ||
PEHAKDSIPGILPK | PEHAKDSIPGILPK | ||
QIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCELLTQVSDLLVEHVWKKIVHTDALIIDMR | |||
YNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLR | |||
GERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQVDDTNFYIIIPTARSVISAESASWEGKGVPPHMETPAVTALIKAKEVL | |||
>M1_anoCar | >M1_anoCar | ||
Line 499: | Line 504: | ||
>M4_anoCar | >M4_anoCar | ||
LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP | LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP | ||
ERSKGRILGLVPMQ | ERSKGRILGLVPMQ | ||
QIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCELMSQVSELLVQHVWNKIVNTDALIIDMR | |||
YNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLA | |||
GKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHVDGTNLYITTPVSRSVFSVNDSWEGVGVSPHLDVSTDVALIKAKEML | |||
>M1_xenLae Xenopus laevis | >M1_xenLae Xenopus laevis | ||
Line 520: | Line 525: | ||
EAMNIAHRII | EAMNIAHRII | ||
>M4_xenLae | >M4_xenLae | ||
KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM | KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM | ||
QIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLLNQVSDLLVEHVWKKVVDQDALIIDMR | |||
FNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLV | |||
GKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAEPGESWEGKGVLPDLEISSETALLKAKEIL | |||
>M1_xenTro Xenopus tropicalis 89% xenLae | >M1_xenTro Xenopus tropicalis 89% xenLae | ||
Line 541: | Line 546: | ||
EAMNVAHHII | EAMNVAHHII | ||
>M4_xenTro | >M4_xenTro | ||
KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ | KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ | ||
IPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLLNQVSDLLVEHVWKKVVNQDALIIDMr | |||
FNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLV | |||
GNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAKPGESWEGKGVLPDLEITSETALMKAKEIL | |||
>M1_tetNig | >M1_tetNig | ||
AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG | AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG | ||
PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD | PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD | ||
RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP | RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP | ||
HVEVNAEDALATAIKIV | HVEVNAEDALATAIKIV | ||
> | >M4_tetNig | ||
LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM | LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM | ||
DYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLr | |||
NNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELT | |||
GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIPTVHSDTGAGPAWEGAGIAPHIPASAEAALGTAR | |||
>M1_takRub two domains: | >M1_takRub two domains: 3-324,326-61plus upstream dup | ||
AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV | AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV | ||
PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD | PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD | ||
Line 563: | Line 568: | ||
DVEVNAEDALATAIKIV | DVEVNAEDALATAIKIV | ||
>M4_takRub | >M4_takRub | ||
LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM | LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM | ||
DYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLR | |||
NNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELT | |||
GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIPTVHSDTAAGPAWEGTGITPHIPVSAEAALGTAK | |||
>M1_gasAcu two domains: | >M1_gasAcu two domains:7-317,323-61no upstream dup | ||
FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV | FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV | ||
PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD | PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD | ||
Line 574: | Line 579: | ||
NVEVNAEDALATAIKIV | NVEVNAEDALATAIKIV | ||
DALATAIKIV | DALATAIKIV | ||
> | >M4_gasAcu | ||
TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM | TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM | ||
NYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDAMIVDLR | |||
NNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELT | |||
GTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSIPTVHSDTAAGPAWEGAGVAPHIPVPADAALETAKGIFKKHFAGQK | |||
>M1_oryLat two domains: | >M1_oryLat two domains:8-314,320-605 no upstream dup | ||
SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV | SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV | ||
PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL | PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL | ||
Line 586: | Line 591: | ||
VNAEEALATALKII | VNAEEALATALKII | ||
>M4_oryLat | >M4_oryLat | ||
LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM | LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM | ||
EYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVEHVWNKVLHTDAMIIDLR | |||
NNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELT | |||
GERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIPTVHSDTSAGPGWEGAGVAPHIPVAAGAALETAK | |||
>M1_danRer upstream frag as well two domains: | >M1_danRer upstream frag as well two domains:2-322,324-609 | ||
FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP | FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP | ||
AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT | AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT | ||
Line 597: | Line 602: | ||
DVAAEDALDAAIAII | DVAAEDALDAAIAII | ||
>M4_danRer | >M4_danRer | ||
KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM | KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM | ||
NPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIVEHVWNKVVDTDALIIDLr | |||
NNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLT | |||
GRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSIPISHSTAQGPSWEGAGIAPHIPVPAGAALDTAK | |||
>M1x_takRub single upstream exon 42% frameshift no transcripts three domains: | >M1x_takRub single upstream exon 42% frameshift no transcripts three domains:3-323,325-615,618-907 | ||
TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN | TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN | ||
SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK | SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK | ||
Line 609: | Line 614: | ||
AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV | AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV | ||
LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER | LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER | ||
GVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFITVPFINFLDNNGEYWLGGGVVPDAIVLAEEALE | |||
>M3x_takRub | >M3x_takRub | ||
FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV | FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV | ||
Line 616: | Line 621: | ||
WSISGVEPDVFAQARDALPVAQRII | WSISGVEPDVFAQARDALPVAQRII | ||
>M1x_danRer | >M1x_danRer | ||
FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA | FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA | ||
LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT | LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT | ||
Line 629: | Line 634: | ||
GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKII | GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKII | ||
>M2_calMil frag | >M2_calMil frag domains 6-243,334-531 | ||
VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP | VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP | ||
PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW | PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW | ||
Line 637: | Line 642: | ||
VQANEAMTVALGIIN | VQANEAMTVALGIIN | ||
>M4_calMil frag | >M4_calMil frag | ||
LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ | LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ | ||
QMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMITQVSELMAKHVWNVIASTSSLIVDLR | |||
YNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI | |||
>M1_petMar | >M1_petMar exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217 | ||
KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV | KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV | ||
SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY | SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY | ||
RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP | RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP | ||
SERALDKALEIL | SERALDKALEIL | ||
> | >M2_petMar | ||
ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP | ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP | ||
DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA | DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA | ||
EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEII | EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEII | ||
> | >M3_petMar | ||
FHAKMASL | FHAKMASL | ||
LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV | LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV | ||
GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG | GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG | ||
TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMV | TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMV | ||
> | >M4_petMar | ||
ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQ | ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQ | ||
MPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEALEHVAHLLVEHVWKKICDTEILIIDLR | |||
YNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSA | |||
>M3/4_braFlo Branchiostoma floridae Region: 9 | >M3/4_braFlo Branchiostoma floridae Region: 9 exons | ||
VVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQ | |||
SWSQDDMSDSLRMEEGLTSELRNITGDETIK | |||
VWDFGVYENTTQEPVPREFYNFSTFVDNFK | |||
KNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESK | |||
AIILDLRYNLGGDREGVVHWASFFFNATPSVPLSDVYYRDGVNQYWTLLE | |||
VPGGIRFPDMPLYLLTSNRTSREAEEFAYAMQVVNRTTIIGETT | |||
AGEEFTGMWFPIDQTDVHLLTRTNVVRNPITQDSWSGK | |||
GVTPDIIVPSEKALTVALRK | |||
</pre> | </pre> | ||
== RBP3 proteins parsed by module class == | === RBP3 proteins parsed by module class === | ||
<font color="blue"> | <font color="blue"> | ||
>M1_homSap | >M1_homSap |
Revision as of 21:54, 3 March 2009
RBP3 (IRBP) (96 marsupials): introduction
Interphotoreceptor retinol-binding protein, poorly named by HGNC as RBP3 despite its complete lack of paralogs, is a 4-exon, 1247-residue glycoprotein that shuttles retinoids between the photoreceptor cells and the retinal pigment epithelium (RPE). The protein's size results from four ancient internal tandem duplications that became established prior to the intronation era (that is, the gene structure does not reflect the repeat structure; the repeats happened first, and introns were later inserted randomly within the fourth repeat), though lamprey lacks the final intron. Across species, each repeat clusters best with the same-numbered repeat, consistent with this.
Thus the origin of RBP3 seems to have preceded the origin of the RPE by an immense time span, suggesting an earlier non-visual function, possibly related to beta-carotene metabolism. However, the gene cannot really be traced back earlier than amphioxus (where the match is strong but the genomic situation is quite confused). There is no significant match in cnidarians or protostomes. X-ray crystallography establishes a curious fold relationship to crotonase/tail-specific proteases in plants, suggesting recruitment of a pre-existing protein. It does not appear that all four domains separately bind retinol.
The first three homology domains and part of the fourth are all encoded by the first large exon of 1090 amino acids. This exon has been much used in marsupial phylogeny (along with the first intron of transthyretin). Indeed, the 96 marsupial species in 51 genera with IRBP sequences at GenBank include a Dec 2008 partial sequence for Thylacinus cynocephalus, as well as one for Sarcophilus harrisii.
The closest matches to the thylacine IRBP are shown in the difference alignment of the first 60 residues below. These species all lie within the Dasyuromorphia. The indicated E-->K may be one of several phyloSNPs breaking this group into blue and green subclades.
The numbat Myrmecobius fits implausibly (its amino-terminal sequence EF028750 needs verification); its affinities seem to lie with the Didelphimorphia. On IRBP evidence, Thylacinus is not basal within Dasyuromorphia relative to Myrmecobius. However, this may be a case of mis-comparison of genes.
* * *
STSKAPQHDSKFTNATQEELLALFQQIIKYQVLEGNVGYLRVDYIPGREMIEEVGEFLVN EU091365 0 Thylacinus cynocephalus
.........P..A..................I............................ AY532676 3 Myoictis wallacei
........NP..A............................................... AY532687 3 Neophascogale lorentzii
........NP..A........T...................................... AY532686 4 Phascolosorex dorsalis
.........P..V............................................... AY532670 2 Parantechinus apicalis
....V....P..A..................I.....................L...... AY532675 5 Myoictis melas
.........P..A...................................D........... AY532679 3 Dasyurus hallucatus
...E.....P..A............K........D.............D........... AY532685 6 Sarcophilus harrisii
...E.......RA..........L............................Q..K.... EF028748 6 Sminthopsis crassicaudata
.......R.P.LA.........SL.......................Q....Q....... EF028749 8 Planigale ingrami
..A......P.LA.V.....................................K....... EF028736 6 Antechinus stuartii
..A......P.L..V.....................................K....... EF028743 5 Micromurexia habbema
..A......P.LA.V.....................................K....... EF028744 6 Murexchinus melanurus
..A......P.L..V....V................................K....... EF028746 6 Paramurexia rothschildi
..A......P.LA.V.....................................K....... EF028747 6 Phascogale calura
..A......P.LA.V.....................................K....... EF028745 6 Phascomurexia naso
.SA......P.LA.V.....................................K....... AY532667 7 Murexia longicaudata
......K..PNLA........T.L..R....................Q.VV.K....... EF028750 12 Myrmecobius fasciatus
..PET...VP..A.V........L..M....................Q.VV.K....... AY233765 13 Caluromys philander
..PET...VP.LA.V.......QL..M....................Q.VV.K....... AF257675 15 Caluromysiops irrupta
..PET...VP.LA.V......T.L..M....................Q.VV.K....... AF257688 15 Glironia venusta
.IPET...VP..A.V.R....T.L..M....................Q.VV.K....... AF257683 16 Didelphis albiventris
.IPE....VP.LA.I......T.L..M....................Q.VV.K....... AF257686 15 Gracilinanus microtarsus
.IPET...VP..A.V......T.L..M....................Q.VV.K....... AF257676 15 Marmosops noctivagus
.IPET...VP.LA.V........L..M....................Q.VV.K....... AY233788 15 Philander opossum
.IPET...VP.LA.I......T.L..M....................Q.VV.K....... AF257689 16 Thylamys pallidior
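The dot notation of the difference alignment above (identical residues shown as ".", substitutions spelled out, followed by a difference count) can be reproduced with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and it assumes the sequences are already aligned to equal length with no gaps, which is not necessarily how the original table was produced.

```python
def diff_row(reference: str, sequence: str) -> str:
    """Show `sequence` against `reference`: '.' where identical, the residue where different."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be pre-aligned to the same length")
    return "".join("." if r == s else s for r, s in zip(reference, sequence))


def count_differences(row: str) -> int:
    """Number of positions in a difference row that are not dots."""
    return sum(1 for c in row if c != ".")


def expand_row(reference: str, row: str) -> str:
    """Inverse of diff_row: rebuild the full sequence from a dotted row."""
    if len(reference) != len(row):
        raise ValueError("row must have the same length as the reference")
    return "".join(r if c == "." else c for r, c in zip(reference, row))
```

For example, applying `diff_row` to the first 20 thylacine residues, `STSKAPQHDSKFTNATQEEL`, and the corresponding Myoictis wallacei residues, `STSKAPQHDPKFANATQEEL`, gives a row with two spelled-out residues (P and A), and `expand_row` recovers the full sequence from that row.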
Using Sarcophilus as probe in a different region, residues 721-900, we find a peculiar outcome: what appears to be a second, very odd gene, or else an XY difference, pseudogene, unusual balanced polymorphism, nonhomologous recombination, sequence submission error, frameshift, or systemic experimental error (e.g. Dasyurus maculatus AY532680 is identical to AY243439 outside the 15 amino acid block). However, the genomic reads from the individual Sarcophilus used in this project show no sign of this gene despite excellent coverage of the second type of gene.
Macropus and Monodelphis genomes contain only the second type of gene. All Didelphimorphia and Diprotodontia are of this type, as are platypus and all placentals. The Sarcophilus genome should resolve this, as it should contain both types and would be the first such genome. Perhaps the alignment above is a mixture of type 1 and type 2 genes (or alleles). The Myrmecobius anomaly makes it more likely that two distinct genes are present.
A definite peculiarity seen in blast searches is the occurrence earlier in the sequence of a segment highly homologous to this very block, likely the corresponding part of another of the internal tandem repeats. It is seen in both types of genes. Possibly internal non-homologous recombination or gene conversion has inserted first-repeat sequence again in this distal block in place of what was relatively diverged sequence. Internal gene conversion would make IRBP extremely difficult to use in alignment-based phylogeny. As a rare genomic event, it unites the species that have it, but species that lack it would have to be re-examined to exclude the possibility that only the type 2 gene happened to be sequenced.
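One simple way to look for such an internal copy is to slide the block along the rest of the same protein and score ungapped identity at every offset. The sketch below is an illustration only, not the blast procedure used here: the function name is hypothetical, it ignores gaps and substitution scoring, and it skips windows that overlap the block itself.

```python
from typing import Tuple


def best_internal_match(seq: str, start: int, end: int) -> Tuple[int, float]:
    """Find the non-overlapping window of `seq` most similar to seq[start:end].

    Returns (offset, fraction of identical residues) for the best ungapped match.
    Coordinates are 0-based, end-exclusive. Ties go to the earliest offset.
    """
    block = seq[start:end]
    width = len(block)
    if width == 0:
        raise ValueError("block must not be empty")
    best_offset, best_matches = -1, -1
    for i in range(len(seq) - width + 1):
        # Skip any window that overlaps the query block itself.
        if i < end and i + width > start:
            continue
        matches = sum(1 for a, b in zip(block, seq[i:i + width]) if a == b)
        if matches > best_matches:
            best_offset, best_matches = i, matches
    if best_offset < 0:
        raise ValueError("sequence too short for a non-overlapping window")
    return best_offset, best_matches / width
```

A high identity at an offset inside an earlier repeat would be in line with the gene conversion scenario, while a low best score would argue against it. A real analysis would need a gapped, scored alignment.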
It emerges from direct tblastn that the Sarcophilus individual sequenced was female. That is, ATRX is well represented but ATRY is not (though the situation is somewhat confused by additional paralogs). Marsupial X and Y chromosomes are quite different from those of placentals:
"Many or most genes on the mammal Y chromosome evolved a testis-specific function after diverging from an X-borne copy with a general function in both sexes. In marsupial but not eutherian mammals, a testis-specific orthologue (ATRY) of the widely expressed X-borne ATRX gene lies on the Y chromosome. Since mutations in human ATRX cause sex reversal, it is possible that one function of ATRY in marsupials is testicular differentiation. We report here the isolation and sequencing of the tammar wallaby (Macropus eugenii) ATRY cDNA, and comparison of its sequence with that of tammar ATRX. The evolution of a testis-specific function for the ATRY protein distinct from the general role of ATRX in both sexes has been accompanied by sequence changes in many protein domains that would alter protein binding partners. A large open reading frame encodes a 1771 amino acid ATRY protein that has diverged extensively from ATRX. The conservation and loss of particular motifs identify those required for testicular function (ATRY) and function in other tissues (ATRX)."
AY532685 MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE Sarcophilus harrisii AY532684 ....E................................S....................P. Dasyurus geoffroii AY532681 ....E................................S....................P. Dasyurus albopunctatus AY532683 ....E................................S....................P. Dasyurus viverrinus AY532682 ....E........................P.......SE...................P. Dasyurus spartacus AY532680 ....E..............R.................SR...................P. Dasyurus maculatus AY532678 ..V..................................S....................P. Dasycercus cristicauda AY532669 ..V..................................S....................P. Dasykaluta rosamondae AY532676 ..V..................S...............S....................P. Myoictis wallacei AY532675 ..V..................S...............S....................P. Myoictis melas AY532687 ..V........N.L.......................S....................P. Neophascogale lorentzii AY532671 ..V..................................S....................P. Parantechinus bilarni AY532670 ..V.................................TS.........RG.........P. Parantechinus apicalis AY532686 ..V..................................S........P...........p. Phascolosorex dorsalis AY532674 ..V.......................................................P. Pseudantechinus ningbing AY532672 ..V..................................S....................P. Pseudantechinus woolleyae AY532673 ..V........N..R......................S...................SP. Pseudantechinus roryi 454 read MEILQKYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAMLQAVSEDP Sarcophilus harrisii EF028739 ............................V.TEEDLAAKLNAMLQA.............P. Antechinus minimus AY243439 ....E..............R........V.TEEDLAAKLNAMLQA.............P. Dasyurus maculatus EF028750 ....K................KT.....I.TEEDLAAKLNAILQA.............P. Myrmecobius fasciatus EF028737 ..V.........................V.TEEDLAAKINAMLQA.............P. 
Antechinus flavipes EF028748 ..V.........................V.TEEDLAAKLNA.LQA.............P. Sminthopsis crassicaudata AY243438 ..V.........................V.TEEDLAAKLNA.LQA.............P. Planigale sp. EF028749 ..V.........................V.TEEDLAAKLNA.LQA.............P. Planigale ingrami AY532679 ..V.........................V.TEEDLAAKLNAMLQA............... Dasyurus hallucatus AF025382 ..V.........................V.TEEDLAAKLNAMLQA.............P. Phascogale tapoatafa EF028741 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus godmani AY532666 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus swainsonii EF028736 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus stuartii EF028742 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus agilis EF028738 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus bellus EF028740 ..V.........................V.TEEDLAAKLNAMLQA.............P. Antechinus leo EF028747 ..V.........................V.TEEDLAAKLNAMLQA.............P. Phascogale calura EF028744 ..V.........................V.TEEDLAAKLNAMLQA.............P. Murexchinus melanurus EF028743 ..V.........................V.TEEDLAAKLNAMLQA.............P. Micromurexia habbema EU086688 ..V.........................V.TEEDLAAKLNAMLQA.............P. Pseudantechinus macdonnellensis EU086689 ..V.........................V.TEEDLAAKLNAMLQA.............P. Pseudantechinus roryi EU086686 ..V.........................V.TEEDLAAKLNAMLQA............SP. Pseudantechinus macdonnellensis EU086687 ..V.........................V.TEEDLAAKLNAMLQA..........G..P. Pseudantechinus mimulus AY532667 ..V.........................V.TEEDLAAKLNAMLQA.............P. Murexia longicaudata EF028746 ..V.........................V.TEEDLAAKLNAMLQA.............P. Paramurexia rothschildi AY532677 ..V.........................V.TEEDLAAKLNAMLQA.............P. 
Dasyuroides byrnei EF028745 ..V..........I..............V.TEEDLAAKLNAMLQA.............P. Phascomurexia naso Macropus eugenii assembly sacHar MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE ME+LQ YYTLVDRVPALLHHLTAIDYSS L + ++ VSEDPRLLVRVLR E macEug MEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPE Monodelphis domestica assembly TSSLVLDLQHSSGGEISG sacHar MEILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE ME+LQ YYTLVDRVPALLHHLTAIDYSS L + ++ VSEDPRLLVRVLR E monDom MEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPE Ornithorhynchus anatinus assembly sacHar EILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE ++L+ YY LVDRVPALL HL A+D SS L + SR SEDPRLLVR L E ornAna DLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPE Equus caballus assembly sacHar EILQKYYTLVDRVPALLHHLTAIDYSSSLVLDLQHSRGGEVSGTVSEDPRLLVRVLRSE E LQ YYTLVDRVPALLHHL ++D+SS + D ++ VSEDPRLLV V+RS+ equCab EALQDYYTLVDRVPALLHHLASMDFSSVVSEDDLVAKLNAGLQAVSEDPRLLVWVVRSK
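In the marsupial alignment above, rows after the first in each group use "." for a residue identical to the reference row in the same column. The following is a minimal sketch of how such a row could be expanded back to a full sequence. It assumes the rows have already been separated into equal-length aligned strings; the function names `expand_dots` and `differences` are hypothetical and not part of any tool used to build this page.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in an aligned row with the reference residue in that column."""
    if len(row) != len(reference):
        raise ValueError("row and reference must have the same aligned length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def differences(reference: str, row: str) -> list[tuple[int, str, str]]:
    """Return (1-based column, reference residue, row residue) for every non-dot column."""
    if len(row) != len(reference):
        raise ValueError("row and reference must have the same aligned length")
    return [
        (i + 1, ref, ch)
        for i, (ref, ch) in enumerate(zip(reference, row))
        if ch != "."
    ]


if __name__ == "__main__":
    # First ten columns of the Sarcophilus reference and of the AY532684 row.
    ref10 = "MEILQKYYTL"
    row10 = "....E....."
    print(expand_dots(ref10, row10))
    print(differences(ref10, row10))
```

Applied to the first ten columns shown in the example, the only non-dot column is column 5, where the reference Q is replaced by E.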
Reference sequences
The reference sequences are organized in three ways: as intronated genes showing the position and phase of each intron break; as proteins parsed into functional modules (signal peptide and spacer residues are dropped); and as modules grouped by type 1-4. All three sets are in phylogenetic order with respect to the canonical deuterostome tree.
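As an illustration of the intronated layout, the sketch below splits one record body into exons, each flanked by a leading and a trailing phase digit. Two points are assumptions of the sketch rather than statements from this page: that the body is passed without its ">" header line, and that the digits on either side of an intron (seen in the records as 0/0, 2/1 and 1/2) should sum to a multiple of three. The names `parse_exons`, `junctions_consistent` and `protein` are hypothetical. Fused-exon tokens such as the `1^2` in the lamprey record are not handled.

```python
import re

PHASE = re.compile(r"^[012]$")  # a lone 0, 1 or 2 is treated as a phase marker


def parse_exons(body: str) -> list[tuple[int, str, int]]:
    """Split a record body into (leading phase, exon residues, trailing phase) tuples."""
    exons = []
    lead = None   # leading phase of the exon currently being read
    chunks = []   # wrapped sequence pieces belonging to that exon
    for token in body.split():
        if PHASE.match(token):
            if lead is None:
                lead = int(token)          # opens a new exon
            else:
                exons.append((lead, "".join(chunks), int(token)))  # closes it
                lead, chunks = None, []
        else:
            if lead is None:
                raise ValueError(f"sequence chunk {token!r} outside any exon")
            chunks.append(token)
    if lead is not None:
        raise ValueError("last exon has no trailing phase")
    return exons


def junctions_consistent(exons: list[tuple[int, str, int]]) -> bool:
    """Assumed rule: trailing phase plus next leading phase is a multiple of 3."""
    return all((a[2] + b[0]) % 3 == 0 for a, b in zip(exons, exons[1:]))


def protein(exons: list[tuple[int, str, int]]) -> str:
    """Concatenate exon residues and drop a terminal stop marker."""
    return "".join(seq for _, seq, _ in exons).rstrip("*")


if __name__ == "__main__":
    # Shortened toy body, not a real record.
    toy = "0 MMREW VLLMS 0 0 IPSPE 2 1 FNIGG 1 2 GERYG* 0"
    exons = parse_exons(toy)
    print(exons)
    print(junctions_consistent(exons), protein(exons))
```

Keeping the phases alongside each exon, rather than discarding them during parsing, makes it possible to compare intron positions between species after the proteins are aligned.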
RBP3 from human to amphioxus
>RPB3_homSap human 0 MMREWVLLMSVLLCGLAGPTHLFQPSLVLDMAKVLLDNYCFPENLLGMQEAIQQAIKSHEILSISDPQTLASVLTAGVQSSLNDPRLVISYEPSTPEPPPQV PALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAILTLRSALPGVVHCLQ EVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSFADA SVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVG EITAGNLLHTRTVPLLDTPEGSLALTVPVLTFIDNHGEAWLGGGVVPDAIVLAEEALDKAQEVLEFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESL ASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCS YFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVGSSPLYASMPTQMAMSATTGKAWDLAGVEP DITVPMSEALSIAQDIVALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ 0 0 IPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQVSRLLVEHIWKKIMHTDAMIIDMR 2 1 FNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVV 1 2 GERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTNLYLTIPTARSVGASDGSSWEGVGVTPHVVVPAEEALARAKEMLQHNQLRVKRSPGLQDHL* 0 >RBP3_bosTau cow run-on terminal exon 0 MVRKWALLLPMLLCGLTGPAHLFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY DRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDFFLTVPVSRSLGPLGEGSQTWEGSG VLPCVGTPAEQALEKALAVLMLRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPE EEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADASVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDAS PVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHTVSLLETPEGGLALTVPVLTFIDNH 
GECWLGGGVVPDAIVLAEEALDRAQEVLEFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVF HSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFE AEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVGSSALYASMPTQMAMSAS TGEAWDLAGVEPDITVPMSVALSTARDIVTLRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAH IPEDAKDRIPGIVPMQ 0 0 IPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQVSELLVEHVWKKIVHTDALIVDMR 2 1 FNIGGPTSSISALCSYFFDEGPPILLDKIYNRPNDSVSELWTLSQLE 1 2 GERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTDLYLTIPTARSVGAADGSSWEGVGVVPDVAVPAEAALTRAQEMLQHTPLRARRSPRLHGRRKGHHRQSQGRAGSLGRNQGVgRPEVLTEAPSGQKRGLLQCG* 0 >RBP3_monDom opossum 0 MTSQCLLLFSALLFSLAHAEQIFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQ HVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVGEFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVY DRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDFFITVPVSRSLSPLGGGSQTWEGSG VLPCVGIPAEQALGKALAILTLRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEA EEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSSVLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGP IRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTFPLLQPPNGNLVLTVPILTFIDNNG ECWLGGGVVPDAIVLAEEALDKAKEVLEFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFH SPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETVRAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEA EPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVENSPLYASMPTQVAISPVT GKAWDMAGVEPDVSVLSSEALMTTQGIVALRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI PEDAKDRIPGIVPMQ 0 0 LPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR 2 1 FNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVA 1 2 GERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTDLYITIPTARSVGSGDKPSWEGVGVAPHVEVPADQALSKAKEMFNHHLQRAK* 0 
>RBP3_ornAna platypus genome rife with frameshifts, dels, misassembly frag 0 MGVCLPLLLVAQFSLTGHVEPVSQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQ QPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGAVLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVY DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG VLPCVAVPADRALDEALDILALRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPP RKEEEQKEEEEEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYF QDPAAGPIRLFTTYNRPADVTREYASRAGALEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFL TLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVLAFHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAALRSKVP TVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ 0 0 IPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQVSELMVEHVWKKIVHTDGLIIDMR 2 1 NIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIA 1 2 GERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHLYITIPTSRSVGSEDGSSWEGVGVTPHLVVPADVALSRAKDLFRAHLEHRD* 0 >RBP3_taeGut Taeniopygia guttata 0 MIRTHFLLLSALIMCSIPAEEIFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPK QEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGAFLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYN RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV MPCVATEAEQALQKSLDILAVRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH LFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFPLLQPGPGITRGLTITVPVITFIDNH GESWMGGGVVPDAIVLAEDALEKAEEVLAFHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVF HSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPE 
PRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSSLYASIPSQVVLSPVTG KVWSVSGVEPHITIQASEAMAAAQHIANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY IPEHAKDSIPGIMPK 0 0 QIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSELLTQLSDLMIEHVWKKIFHTDALIIDLR 2 1 YNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLK 1 2 GERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQVDETNFYVVIPTSRSVTSADSTSWEGKGVSPHIETPAETALIKAKEMLNAHLHSSR* 0 >RBP3_galGal Gallus gallus 1236 aa N-terminal 21 aa signal peptide 5 glyc (3 unique) two W per repeat 0 MRTYFFLFSVLIVCSISAEEIFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM PCVASEAEQALKKSLDILAVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG ESWMGGGVVPDAIVLAEDALEKAEEVLTFHRKMGILLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK VWSVSGAEPHITIQASEALAAAKHIASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI PEHAKDSIPGILPK 0 0 QIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCELLTQVSDLLVEHVWKKIVHTDALIIDMR 2 1 YNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLR 1 2 GERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQVDDTNFYIIIPTARSVISAESASWEGKGVPPHMETPAVTALIKAKEVLSAHLHSSR* 0 >RBP3_anoCar lizard 0 MLRKCLWLSIVLVCCSSYADSVLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPK 
QRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGSFLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYN RPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFYMTVPLSCSVSPLSGSGQSWEISGV TPCVVISAEQALDKALAILSLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTD EKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATVLVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVH FFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTFCILELGGGCGLLINVPVITLIDNHGE YWLGGGVVPDSIVLADEALEKAREVLEFHKGMGSLIERVGQLLEAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHS HVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETSAMIVDMRYNTGSYSTAVPMFCSYFFDAEPQ QHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSGKV WSLSGIQPHVTTQSNEALASAQNIILFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP ERSKGRILGLVPMQ 0 0 QIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCELMSQVSELLVQHVWNKIVNTDALIIDMR 2 1 YNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLA 1 2 GKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHVDGTNLYITTPVSRSVFSVNDSWEGVGVSPHLDVSTDVALIKAKEMLKAHLH* 0 >RBP3_xenLae Xenopus laevis 0 MPPLFQALTTALFFCGIASNPLFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQT EQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPHLVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPS NTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITVPVSRSLSPLTGQSWEVSGVSPCVV VNAKDALDKAQAILAVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENI PEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKLAPFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNS TDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQLSVTVPIINFIDSNGDYWLGGGVVPDAIVLA DEALDKAKEIIAFHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP SPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGK 
DIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSNLYVTIPNQVVISSVTGKVWSVSGVEPHVIIQAN EAMNIAHRIIKLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM 0 0 QIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLLNQVSDLLVEHVWKKVVDQDALIIDMR 2 1 FNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLV 1 2 GKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAEPGESWEGKGVLPDLEISSETALLKAKEILESQLEGRR* 0 >RBP3_xenTro Xenopus tropicalis 89% xenLae 0 MSPLFKALTTVLFFCIVASNPVFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRK EQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPLLVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPS NTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITVPVSRSISPLTGQSWEVAGVFPCVV VNANNALNKAQGILAVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENL PDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKLGPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNS TNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTRLSVTVPIMNFIDNNGDYWLGGGVVPDAIVLA DEALDKAKEIIAFHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP SAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGT DIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESNLYVSIPNQVVISSVTGKVWSVSGVEPHVIAQAS EAMNVAHHIIKLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ 0 0 IPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLLNQVSDLLVEHVWKKVVNQDALIIDMr 2 1 FNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLV 1 2 GNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAKPGESWEGKGVLPDLEITSETALMKAKEILVSQLEGR* 0 >RBP3_tetNig frameshifts in genome two domains: 23-324,326-612 no upstream dup 0 MAKALFTVASLLLLANGFFVGAAFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG 
PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP HVEVNAEDALATAIKIVNLRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM 0 0 DYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLr 2 1 NNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELT 1 2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIPTVHSDTGAGPAWEGAGIAPHIPASAEAALGTARAILNKHFAGQK* 0 >RBP3_takRub fugu two domains: 23-324,326-612 plus upstream dup 0 MAKALFLVASLLLLANDVLVRAAFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD RPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYITVPTAKSINPVTGSSWEITGVKP DVEVNAEDALATAIKIVSLRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM 0 0 DYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLR 2 1 NNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELT 1 2 GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIPTVHSDTAAGPAWEGTGITPHIPVSAEAALGTAKGILNKHFGGQK* 0 >RBP3_gasAcu sticklebck two domains: 27-317,323-612 no upstream dup 0 MAKLIFLVAPLLVLGNIAFIHAGFAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD RPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVTVPTAKSVNPITGSTWEVTGVTP NVEVNAEDALATAIKIVTLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM 0 0 NYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDAMIVDLR 2 1 NNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELT 1 2 GTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSIPTVHSDTAAGPAWEGAGVAPHIPVPADAALETAKGIFKKHFAGQK* 0 >RBP3_oryLat medaka two domains: 28-314,320-605 
no upstream dup 0 MAKTLFLVASLLVLGNVVFLHASFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL NTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLPTAKSINPITGSSWEVTGVKPNVE VNAEEALATALKIINLRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM 0 0 EYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVEHVWNKVLHTDAMIIDLR 2 1 NNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELT 1 2 GERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIPTVHSDTSAGPGWEGAGVAPHIPVAAGAALETAKAILNKHIGGQQHAAS* 0 >RBP3_danRer zebrafish upstream frag as well two domains: 22-322,324-609 0 MAQALVLLVSLLFFSNVAHCNFSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT ADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVTVPVAKSINPITGKSWEINGVAPDV DVAAEDALDAAIAIIKLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM 0 0 NPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIVEHVWNKVVDTDALIIDLr 2 1 NNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLT 1 2 GRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSIPISHSTAQGPSWEGAGIAPHIPVPAGAALDTAKGMLNKHFSGQK* 0 >RBP3x_takRub fugu single upstream exon 42% frameshift no transcripts three domains: 23-323,325-615,618-907 MAPRTPVLLLVLLFCALPVRSFYQHTLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK TFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPVARSVSPITGKSWEVSGISPTVNVAAREALAKAQTFLA VRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER 
GVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFITVPFINFLDNNGEYWLGGGVVPDAIVLAEEALEHVNRTATFHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV LLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAAPRLVKSVWNKLVDTDSLIIDMRYNA GGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVLYASIPNQVVTSAATGKL WSISGVEPDVFAQARDALPVAQRIISARLLKREKGR* 0 >RBP3x_danRer zebrafish single upstream exon 55%/41% transcript DN857398 3 domains: 21-321,324-609,612-901 expressed: inner nuclear layer and ganglion cell layer MAGVFVFILVTYRVLLVNASFQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT NITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYITVPVARSVNPVTGQSWEVSGVAPSVT VNPKESIAKAKSLISVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSD DLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKIWQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLN NISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPIINFIDVNGECWLGGGVVPDAIVLAEEALERAH EIIAFHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIV EALFKVDVMSGNIGYLRFDMMEDIKVLQAINPEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVL GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKI IATKQQKKNSGK* 0 >RBP3x_salSal Salmo salar transcript frag DY725143 EETAAKLGPLLRENIWTKVTHASSLIFDLRYSTAGELSGVPFIISYFSDPEPLIHIDTVFDRPSNTTKELWTMSSIMGERYGKRKDLIVLTSKRTMGAAEAIAYTLKHLNRAIIVGERSA GGSVKVQKIRIGDSGFYITVPVARSVNPITGQSWEVSGVSPSVNINAKEAVANAKNLLAVRSAIPNAVQSVSDIIRQYYSFTDRVPALLQHLESTDFFSVISEEDLANKFNNELQSVSEDPRLMIKL >RBP3_calMil elephantfish frag 2 domains 6-243,334-531 PPVTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP 
PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW LGGGVVPDSIVLAEDTLERTKEIIGFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVI VQANEAMTVALGIINLRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ 0 0 QMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMITQVSELMAKHVWNVIASTSSLIVDLR 2 1 YNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI 1 2 * 0 >RBP3_petMar lamprey exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217 0 MAGSREQRTAFSTRLLLLLLLPLATCPSQAPYKFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP SERALDKALEILNARGVARKAVEAAGELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEIIEFHAKMASL LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMVALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHL KAVCVPEHATDRMPGIVPMQ 0 0 MPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEALEHVAHLLVEHVWKKICDTEILIIDLR 2 1 YNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVL 1^2 GQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMVALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQVNVVRTRI* 0 >RBP3_braFlo Branchiostoma floridae Region: 9 exons 1 domain: 83-381 ClpP/crotonase e-38 
419-630; misfused to PAPS sulfotransferase 0 MTRPSKVDIVFPIKPFTIPTAHEQVKGEGPVDINKNALCKSADEGHTHP 1 2 VSIAMAPTAYIVFVALVPTVLSVDWLDVVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQ 0 0 SWSQDDMSDSLRMEEGLTSELRNITGDETIK 0 0 VWDFGVYENTTQEPVPREFYNFSTFVDNFK 2 1 KNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESK 0 0 AIILDLRYNLGGDREGVVHWASFFFNATPSVPLSDVYYRDGVNQYWTLLE 0 0 VPGGIRFPDMPLYLLTSNRTSREAEEFAYAMQVVNRTTIIGETT 1 2 AGEEFTGMWFPIDQTDVHLLTRTNVVRNPITQDSWSGK 1 2 GVTPDIIVPSEKALTVALRKIQGSEDTKMAASSGNIEPPRWTVYLVFICTSIAILTYPTFM* 0
RBP3 proteins parsed into constituent modules
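The module records in this subsection carry headers of the form `>M1_homSap`, with an `x` after the module number for the second fish paralog (for example `>M1x_takRub`). The sketch below shows one way such records could be regrouped by module type 1-4. It assumes one header per line with the sequence on the following lines, as in a preformatted block; the function name `group_modules` and the choice to append `x` to the genome code are hypothetical.

```python
import re
from collections import defaultdict

# Module number 1-4, optional paralog marker 'x', then the genome code.
HEADER = re.compile(r"^>M([1-4])(x?)_([A-Za-z]+)")


def group_modules(fasta_text: str) -> dict[int, dict[str, str]]:
    """Group module sequences by type; inner keys are genome codes (+ 'x' for the paralog)."""
    groups: dict[int, dict[str, str]] = defaultdict(dict)
    current = None  # (module type, record name) of the record being read
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            m = HEADER.match(line)
            if m is None:
                raise ValueError(f"unrecognized module header: {line!r}")
            name = m.group(3) + m.group(2)
            current = (int(m.group(1)), name)
            groups[current[0]][name] = ""
        elif current is None:
            raise ValueError("sequence line before any header")
        else:
            # Join wrapped lines and drop internal spaces.
            groups[current[0]][current[1]] += line.replace(" ", "")
    return {mtype: dict(records) for mtype, records in groups.items()}


if __name__ == "__main__":
    # Toy input with truncated sequences, for illustration only.
    demo = ">M1_homSap\nGPTH\nPALT\n>M2_homSap\nTLRS\n>M1_bosTau frag\nLFQP\n"
    print(group_modules(demo))
```

Any free-text annotation after the genome code in a header (such as "frag") is ignored by the pattern.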
>M1_homSap GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLG ERYGADKDVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL >M2_homSap TLRSALPGVVHCLQ EVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSFADA SVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVG EITAGNLLHTRTVPLLDTPEGSLALTVPVLTFIDNHGEAWLGGGVVPDAIVLAEEALDKAQEVL >M3_homSap EFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESL ASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCS YFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVGSSPLYASMPTQMAMSATTGKAWDLAGVEP DITVPMSEALSIAQDIV >M4_homSap ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQ IPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQVSRLLVEHIWKKIMHTDAMIIDMR FNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVV GERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTNLYLTIPTARSVGASDGSSWEGVGVTPHVVVPAEEALARAKEML >M1_bosTau LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPP RAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLRSFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVY DRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDFFLTVPVSRSLGPLGEGSQTWEGSG VLPCVGTPAEQALEKALAVL >M2_bosTau LRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPE EEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADASVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDAS PVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHTVSLLETPEGGLALTVPVLTFIDNH GECWLGGGVVPDAIVLAEEALDRAQEVL >M3_bosTau EFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVF 
HSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETVKAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFE AEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVGSSALYASMPTQMAMSAS TGEAWDLAGVEPDITVPMSVALSTARDI >M4_bosTau LRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAH IPEDAKDRIPGIVPMQIPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQVSELLVEHVWKKIVHTDALIVDMRFNIGGPTSSISALCSYFFDE GPPILLDKIYNRPNNSVSELWTLSQLEGERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTDLYLTIPTARSVGAAD GSSWEGVGVVPDVAVPAEAALTRAQEML >M1_monDom IFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQ HVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVGEFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVY DRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDFFITVPVSRSLSPLGGGSQTWEGSG VLPCVGIPAEQALGKALAIL >M2_monDom LRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEA EEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSSVLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGP IRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTFPLLQPPNGNLVLTVPILTFIDNNG ECWLGGGVVPDAIVLAEEALDKAKEVL >M3_monDom EFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFH SPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETVRAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEA EPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVENSPLYASMPTQVAISPVT GKAWDMAGVEPDVSVLSSEALMTTQGI >M4_monDom LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHI PEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQVSDLLVEHVWKKVVHTDGMIIDMR FNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVA GERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTDLYITIPTARSVGSGDKPSWEGVGVAPHVEVPADQALSKAKEM >M1_ornAna genome rife with frameshifts, dels, misassembly SQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQ 
QPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGAVLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVY DRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFFITVPVACSLGPLGGGGRSWEGSG VLPCVAVPADRALDEALDIL >M2_ornAna LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEE EEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCFDEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGA LEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGRLLHSRTFPLLRPPWEGLVLTVPFLTLFDPHGEGWLGGGVVPDAIVLAEEALEKAGEVL >M3_ornAna frag FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA >M4_ornAna LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQ IPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQVSELMVEHVWKKIVHTDGLIIDMR NIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIA GERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHLYITIPTSRSVGSEDGSSWEGVGVTPHLVVPADVALSRAKDL >M1_taeGut Taeniopygia guttata IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPK QEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGAFLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYN RPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGV MPCVATEAEQALQKSLDIL >M2_taeGut VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAE KPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVH LFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFPLLQPGPGITRGLTITVPVITFIDNH GESWMGGGVVPDAIVLAEDALEKAEEVLA >M3_taeGut FHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVF HSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPE PRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSSLYASIPSQVVLSPVTG KVWSVSGVEPHITIQASEAMAAAQHI >M4_taeGut ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLY 
IPEHAKDSIPGIMPK QIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSELLTQLSDLMIEHVWKKIFHTDALIIDLR YNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLK GERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQVDETNFYVVIPTSRSVTSADSTSWEGKGVSPHIETPAETALIKAKEM >M1_galGal IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQ EAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGAFLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNR PSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFYMMVPVSRSVSPLSGGGQSWEVSGVM PCVASEAEQALKKSLDIL >M2_galGal AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEM PIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVLVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHL FTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTFPLVQPEQGITRGLTITVPVITFIDNHG ESWMGGGVVPDAIVLAEDALEKAEEVL >M3_galGal LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFH SHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQMVWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEP RQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPSQVVLSPVTGK VWSVSGAEPHITIQASEALAAAKHI >M4_galGal ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYI PEHAKDSIPGILPK QIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCELLTQVSDLLVEHVWKKIVHTDALIIDMR YNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLR GERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQVDDTNFYIIIPTARSVISAESASWEGKGVPPHMETPAVTALIKAKEVL >M1_anoCar VLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPK QRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGSFLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYN RPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFYMTVPLSCSVSPLSGSGQSWEISGV TPCVVISAEQALDKALAIL >M2_anoCar SLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTD EKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATVLVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVH 
FFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTFCILELGGGCGLLINVPVITLIDNHGE YWLGGGVVPDSIVLADEALEKAREVLE >M3_anoCar EAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHS HVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETSAMIVDMRYNTGSYSTAVPMFCSYFFDAEPQ QHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSGKV WSLSGIQPHVTTQSNEALASAQNII >M4_anoCar LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIP ERSKGRILGLVPMQ QIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCELMSQVSELLVQHVWNKIVNTDALIIDMR YNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLA GKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHVDGTNLYITTPVSRSVFSVNDSWEGVGVSPHLDVSTDVALIKAKEML >M1_xenLae Xenopus laevis LFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQT EQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPHLVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPS NTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITVPVSRSLSPLTGQSWEVSGVSPCVV VNAKDALDKAQAIL >M2_xenLae AVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENI PEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKLAPFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNS TDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQLSVTVPIINFIDSNGDYWLGGGVVPDAIVLA DEALDKAKEII >M3_xenLae FHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP SPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGK DIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSNLYVTIPNQVVISSVTGKVWSVSGVEPHVIIQAN EAMNIAHRII >M4_xenLae KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPM QIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLLNQVSDLLVEHVWKKVVDQDALIIDMR FNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLV 
GKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAEPGESWEGKGVLPDLEISSETALLKAKEIL >M1_xenTro Xenopus tropicalis 89% xenLae VFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRK EQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPLLVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPS NTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITVPVSRSISPLTGQSWEVAGVFPCVV VNANNALNKAQGIL >M2_xenTro AVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENL PDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKLGPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNS TNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTRLSVTVPIMNFIDNNGDYWLGGGVVPDAIVLA DEALDKAKEII >M3_xenTro FHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIP SAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAIGPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGT DIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESNLYVSIPNQVVISSVTGKVWSVSGVEPHVIAQAS EAMNVAHHII >M4_xenTro KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQ IPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLLNQVSDLLVEHVWKKVVNQDALIIDMr FNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLV GNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVDDTHLYLTIPTSRSASAKPGESWEGKGVLPDLEITSETALMKAKEIL >M1_tetNig AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAG PHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPALIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYD RPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYITVPTAKSINPITGSSWEIRGVTP HVEVNAEDALATAIKIV >M4_tetNig LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPM DYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLr NNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELT GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIPTVHSDTGAGPAWEGAGIAPHIPASAEAALGTAR 
>M1_takRub two domains: 3-324,326-61plus upstream dup AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAV PPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPSLIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYD RPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYITVPTAKSINPVTGSSWEITGVKP DVEVNAEDALATAIKIV >M4_takRub LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPM DYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDALILDLR NNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELT GVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIPTVHSDTAAGPAWEGTGITPHIPVSAEAALGTAK >M1_gasAcu two domains:7-317,323-61no upstream dup FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVV PPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLLLDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYD RPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVTVPTAKSVNPITGSTWEVTGVTP NVEVNAEDALATAIKIV DALATAIKIV >M4_gasAcu TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPM NYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIVEHVWNKVVNTDAMIVDLR NNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELT GTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSIPTVHSDTAAGPAWEGAGVAPHIPVPADAALETAKGIFKKHFAGQK >M1_oryLat two domains:8-314,320-605 no upstream dup SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVV PPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLELVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPL NTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLPTAKSINPITGSSWEVTGVKPNVE VNAEEALATALKII >M4_oryLat LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPM EYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVEHVWNKVLHTDAMIIDLR NNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELT GERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIPTVHSDTSAGPGWEGAGVAPHIPVAAGAALETAK >M1_danRer 
upstream frag as well two domains:2-322,324-609 FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPP AMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLLLEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRT ADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVTVPVAKSINPITGKSWEINGVAPDV DVAAEDALDAAIAII >M4_danRer KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPM NPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIVEHVWNKVVDTDALIIDLr NNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLT GRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSIPISHSTAQGPSWEGAGIAPHIPVPAGAALDTAK >M1x_takRub single upstream exon 42% frameshift no transcripts three domains:3-323,325-615,618-907 TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRN SIKLDILDSDVGYLRIDRIIDEETLLKFGPLLRENVWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGK TFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPVARSVSPITGKSWEVSGISPTVNVAAREALAKAQTFL >M2x_takRub AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQV LPHNTGYLRLDRFVRCSEGDKLEEIVAEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSER GVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFITVPFINFLDNNGEYWLGGGVVPDAIVLAEEALE >M3x_takRub FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEV LLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAAPRLVKSVWNKLVDTDSLIIDMRYNA GGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVLYASIPNQVVTSAATGKL WSISGVEPDVFAQARDALPVAQRII >M1x_danRer FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPA LHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLLHNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPT NITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYITVPVARSVNPVTGQSWEVSGVAPSVTVNPKESIAKAKSLI >M2x_danRer 
SVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSD DLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKIWQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLN NISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPIINFIDVNGECWLGGGVVPDAIVLAEEALERAHEII >M3x_danRer FHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIV EALFKVDVMSGNIGYLRFDMMEDIKVLQAINPEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVL GQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSILYASIPNQAVLNAVTGKPWSISGVEPHIVAQASDALIVAQKII >M2_calMil frag domains 6-243,334-531 VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEP PVGLFTVYNRLTNTTSHTTLPGVGQHVYGSRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECW LGGGVVPDSIVLAEDTLERTKEII >M3_calMil frag GFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVI VQANEAMTVALGIIN >M4_calMil frag LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQ QMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMITQVSELMAKHVWNVIASTSSLIVDLR YNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI >M1_petMar exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217 KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAV SYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLVDTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYY RPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPADRSWGVFPCVSAP SERALDKALEIL >M2_petMar ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLP DDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIHPTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTA EFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNNDEYWLGGGVVPDAIVLAENALDAAKEII >M3_petMar FHAKMASL 
LELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKHV GPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTG TYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSALRMV >M4_petMar ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQ MPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEALEHVAHLLVEHVWKKICDTEILIIDLR YNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDSRLYVFIPNQAGVSPSGGRTWSVAGVEPHVQTKASEALQSA >M3/4_braFlo Branchiostoma floridae Region: 9 exons VVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQ SWSQDDMSDSLRMEEGLTSELRNITGDETIK VWDFGVYENTTQEPVPREFYNFSTFVDNFK KNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESK AIILDLRYNLGGDREGVVHWASFFFNATPSVPLSDVYYRDGVNQYWTLLE VPGGIRFPDMPLYLLTSNRTSREAEEFAYAMQVVNRTTIIGETT AGEEFTGMWFPIDQTDVHLLTRTNVVRNPITQDSWSGK GVTPDIIVPSEKALTVALRK
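The per-species listing above can be regrouped mechanically by module class. The following is a minimal sketch, not the procedure actually used to build this page. It assumes the records are available as ordinary FASTA text with each `>` header on its own line, and that headers follow the pattern seen here: `>M<class>[x]_<species code>` plus an optional free-text note (for example `>M1_homSap`, `>M1x_takRub ...`, `>M3/4_braFlo ...`). The function names `parse_fasta` and `group_by_module` are hypothetical, and the sequences in `example` are truncated prefixes used only for illustration.

```python
import re
from collections import defaultdict

# Header pattern assumed from the record names on this page, e.g.
# ">M1_homSap", ">M1x_takRub single upstream exon ...", ">M3/4_braFlo ...".
HEADER = re.compile(r"^>(M\d(?:/\d)?)(x?)_(\w+)\s*(.*)$")


def _record(header, chunks):
    """Split one header line into its parts and join the sequence lines."""
    m = HEADER.match(header)
    if m is None:
        raise ValueError(f"unrecognized header: {header!r}")
    module, x, species, note = m.groups()
    return module, bool(x), species, note, "".join(chunks)


def parse_fasta(text):
    """Yield (module, is_x_copy, species, note, sequence) for each record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line, []
        elif header is not None:
            # Drop the spacing blanks used inside sequence lines.
            chunks.append(line.replace(" ", ""))
    if header is not None:
        yield _record(header, chunks)


def group_by_module(text):
    """Return {module class: [(species, sequence), ...]} in input order."""
    groups = defaultdict(list)
    for module, is_x, species, _note, seq in parse_fasta(text):
        key = module + ("x" if is_x else "")
        groups[key].append((species, seq))
    return dict(groups)


# Truncated sequences, for illustration only.
example = """>M1_homSap
GPTHPALTSLSEEELLAWLQRGL
RHEVLEGNVGYLRVD
>M4_homSap
ALRAKVPTVLQTAGKLVADNYAS
>M1_xenLae Xenopus laevis
LFQPSLVMDMAKVLLDNYCF
"""

grouped = group_by_module(example)
```

Because records are appended in input order, the phylogenetic ordering of the source listing is preserved within each module class. Headers that do not match the assumed pattern raise an error rather than being silently skipped.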
=== RBP3 proteins parsed by module class ===
>M1_homSap GPTHPALTSLSEEELLAWLQRGLRHEVLEGNVGYLRVDSVPGQEVLSMMGEFLVAHVWGNLMGTSALVLDLRHCTGGQVSGIPYIISYLHPGNTILHVDTIYNRPSNTTTEIWTLPQVLGERYGADK DVVVLTSSQTRGVAEDIAHILKQMRRAIVVGERTGGGALDLRKLRIGESDFFFTVPVSRSLGPLGGGSQTWEGSGVLPCVGTPAEQALEKALAIL >M1_bosTau LFQPSLVLEMAQVLLDNYCFPENLMGMQGAIEQAIKSQEILSISDPQTLAHVLTAGVQSSLNDPRLVISYEPSTLEAPPRAPAVTNLTLEEIIAGLQDGLRHEILEGNVGYLRVDDIPGQEVMSKLR SFLVANVWRKLVNTSALVLDLRHCTGGHVSGIPYVISYLHPGSTVSHVDTVYDRPSNTTTEIWTLPEALGEKYSADKDVVVLTSSRTGGVAEDIAYILKQMRRAIVVGERTVGGALNLQKLRVGQSDF >M1_monDom IFQPSLVRDMAKILLDNYCFPENLMGMQEVIEQAIKSGEILDISDPQMLASVLTAGVQGALNDPRLVISFEPSIPETPQHVPKLANVTQEELLILLQQMIKYQVLEGNVGYLRVDYIPGQEVVEKVG EFLVNNIWKKLMGTSSLVLDLQHSSGGEISGIPFVISYLHQGDILLHVDTVYDRPSNTTTEIWTLPQVLGERYGGEKDMVVLTSHRTVGVAEDIAYILKKLRRAIVVGEQTLGGALDLRKLRIGQSDF >M1_ornAna genome rife with frameshifts, dels, misassembly SQPSMVLDVAKILLDNYCYPENLMGMQEAIEEAIQRGEILDIADPKRLASVLTAGVQGSLNDPRLVISYEPAPVAVSQQPPEPASLPAEQPLERLRPAVGSEVLEGNVGYLRVDRLPGREEIERVGA VLGRDIWEKLLGTSALVLDLRHSTGGHVSGIPFFISYFYPEGPALHVDTVYDRPSNATRQLWTLPRVLGARYAADKDVVVLTSRLTAGVAEDVAYILQQMRRAIVVGERTAGGPLVFRKLRVGLSDFF >M1_galGal IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPSLHAAPKQEAETYPTREQLLSLIEHVVIYDKLEGNVGYLRIDYIIGQEVVEKVGA FLVDKVWKTLINTSALVIDLRYSTGGQISGIPFIISYLHEADKMLHVETVYNRPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITLGEKTAGGSLDIQKLRIGPSNFY >M1_taeGut Taeniopygia guttata IFQPTLVLDMAKVLLDNYCYPENLVGMQEAIEQAIKSGEILDISDPKMLANVLTAGVQGALNDPRLVISYEPLPHSGPKQEAEGSPTREQLLSLIEHVIMYDKLEGNVGYLRIDYIIGEEVVQKVGA FLVDKVWKTLIETSALVIDLRHSTGGQISGLPFIISYLHEQDKILHVETVYNRPSNTTTEIWTLPKVLGERYSKDKDVIVLISHHTTGVAEDVAYILKHMNRAITVGEKTAGGSLDIQKLRIGPSNFY >M1_anoCar VLQSTLVLDMAKLLLDNYCLPENLVGMREAIEQAIKNGEVLDISDPKLLATVLTAGVQGALNDPRLVISYEPTAPAAPKQRMETSLTPEQLLSLIQHTVKYEVLDDNVGYLRIDYIMGQDIVQKIGS FLVEKVWKTLLGTSALILDLRYTTGGDVSGIPFIISYLYNGDKVLHVDTVYNRPSNTTVEILTLPKVLGVRYSKDKDVILLISKYTTGVAENVAYILKHMHRTIIVGEKSAGGSLDTQKMQIGNSQFY >M1_xenTro Xenopus tropicalis 89% xenLae 
VFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAMKSGEILHISDPETLANVFTSGVQGFLNDPRLVVSYEPNYSGPRKEQSPEPTLEQLKFLLDHSVTYDLLPGNIGYLRIDFIIGQDVVQKVGPL LVNNIWKKLMPSSALILDLRYSTQGKVSGIPFVVSYLTDPQIHIDSIYNRPSNTTTDLWTLSELMGERYGKDKDVVVLTSKYTEGIAEGAAYILKHMSRAIVVGEKTAGGSLDIQKIKIGQSEFYITV >M1_xenLae Xenopus laevis LFQPSLVMDMAKVLLDNYCFPENLVGMQETIEQAVKGGEILHISDPDTLANVFTSGVQGYLNDPRLVVSYEPNYSGPQTEQSLELTPEQLKFLINHSVKYDILPGNIGYLRIDFIIGQDVVQKVGPH LVNNIWKKLMPTSALILDLRYSTQGEVSGIPFVVSYLCDSEIHIDSIYNRPSNTTTDLWTLPELMGERYGKVKDVVVLTSKYTKGVAEDASYILKHMNRAIVVGEKTAGGSLDTQKIKIGQSDFYITV >M1_danRer upstream frag as well two domains: 22-322,324-609 FSPTLIADMAKIFMDNYCSPEKLTGMEEAIDAASSNTEILSISDPTMLANVLTDGVKKTISDSRVKVTYEPDLILAAPPAMPDIPLEHLAAMIKGTVKVEILEGNIGYLKIQHIIGEEMAQKVGPLL LEYIWDKILPTSAMILDFRSTVTGELSGIPYIVSYFTDPEPLIHIDSVYDRTADLTIELWSMPTLLGKRYGTSKPLIILTSKDTLGIAEDVAYCLKNLKRATIVGENTAGGTVKMSKMKVGDTDFYVT >M1_takRub two domains: 23-324,326-612 plus upstream dup AFPPSLITDMAKIVLDNYCSPEKLAGMKEAIEAAGTNTEVLNIPDGESLARVLSAGVQGTVSDSRLMVSYQPDYVPAVPPKMPPLPPEHLVAVLQTSIKLDLLEGNTGYLRIDHIIGEDVAEKVGPS LIDLIWNKILPTSALIFDLRYTSSGEISGIPYIVSYFTQAEPVVHIDSVYDRPSNTTTKLFSLSNLLGERYGITKPLIILTSKNTKGIAEDVAYCLKNLKRATIVGERTAGGSVKLDNFKVGSTDFYI >M1_gasAcu two domains: 27-317,323-612 no upstream dup FAPNVIIDMAKIVIDNYCSPEKLAGMKEAIEAAGSNTEVLSIPDAETLANVLSAGVQTTVSDPRLMISYEPNYVPVVPPKMPPLPPDQVIAVLQTSIKLDILEGNIGYLRIDHILGEDVAEKVGPLL LDLVWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTEAGTPIHIDSIYDRPSNTTTKLFSMSTLLGERYSTSKPLIILTSKNTKGIAEDVAYCLQNLKRATIVGEKTAGGSVKVDKIQVRDTGFYVT >M1_tetNig AFPPSLIADMAKIVLDNYCSPEKLAGMKEAIKAAGTNTEVLNIPDGESLARVLSAGVQGTVSDPRLMVSFQPNYVPAGPHKMPPLPPEHLVAVLQTSVKLDILEGNTGYLRIDHILGEEVADKVGPA LIDLIWNKILPTSALIFDLRYTSSGDISGIPYIVSYFTQAEPVVHIDSVYDRPSNTTTKLLSLPNLLGQRYGVSKPLIVLTSKNTKGIAEDVAYCLKNLKRATIVGEKTAGGSLKLDTFKVGDTDFYI >M1_oryLat two domains: 28-314,320-605 no upstream dup SFPPSLITDLAKIVMDNYCSPEKLSGMKEDIATAGANTDVLNIPDGEALAKVLTDGVQTTVSDPRLRVSYEPNYVPVVPPQLPPEQLIAVLQTSIKLDILEGNIGYLRIDSIIGEEVAEKVGPLLLE 
LVWSKILPTSALIFDLRYTSSGDITGIPYIISYLTDAKSEIHIDTIYDRPLNTTTKLLSMQSTLGQTYGGTKPLLVLTSKNTKDIAEDVAYCLKNLKRATIVGEKTAGGSAKIKKFRVGDTDFYVTLP >M1_petMar exon3/4 fused, exon4 run-on, fixed genomic frameshift; four domains: 34-312,327-615,625-914,916-1217 KFDTAVVLHLAKVLLDNYCIPENLVGMDEAIQRAVDNGELLGVSDPESAASALTEGIQAALNDPRIAVSYVAPPHTFEELLATIPQKTSFAVLDGNVGYLRADEIISEATIKKLGPVIVQRIWNRLV DTDTFVLDLRYNSHGDITGLPYLVSCFCEPRPVVHLDTVYYRPTNESKEIWSLPDLQGARFAKHKDVFVLVSANTEGVAENVAYVLKHLHRATVIGEQTAGGSLEVERFRLGDSRFFVTVPTARSEPA >M1x_danRer FQSALVLDMAKILLDNYCFPENLIGMQEAIQQAINSGEILHISDRKTLASVLTAGVQGALNDPRLTVSYEPNYTLITPPALHSLPTEQLIRLIRSTVKLEVMDNNIGYLRIDRIIGQETVVKLGRLL HNNIWKKVAHTSAMIFDLRFSTAGELSGLPYIVSYFSDSDPLLHIDTIYERPTNITRELWTLPTLLGERFGKRKDLIVLISKRTIGAAEGVAYILKHLKRAVIIGERSAGGSVRVDKLKIGDSGFYIT >M1x_takRub single upstream exon 42% frameshift no transcripts three domains: 23-323,325-615,618-907 TLVLEMAKLLLENYCIPENLVGMQEAIQRAIKSREILQISDRKTLATVLTVGVQGALNDPRLSVSYEPSFSPLPLQALSSLPVEQQLRLLRNSIKLDILDSDVGYLRIDRIIDEETLLKFGPLLREN VWDKAAQTSSLILDLRFSTAGGWSGIPSIVSYFTEPHSLVHIDTVYDRPSNTTTELWTMSSVRGKTFGGKKDMIVLIGRRTAGAAEAVAYTLKHLNRAIVVGERSAGGSLKVRKFRIAESDFYITMPV >M2_homSap TLRSALPGVVHCLQEVLKDYYTLVDRVPTLLQHLASMDFSTVVSEEDLVTKLNAGLQAASEDPRLLVRAIGPTETPSWPAPDAAAEDSPGVAPELPEDEAIRQALVDSVFQVSVLPGNVGYLRFDSF ADASVLGVLAPYVLRQVWEPLQDTEHLIMDLRHNPGGPSSAVPLLLSYFQGPEAGPVHLFTTYDRRTNITQEHFSHMELPGPRYSTQRGVYLLTSHRTATAAEEFAFLMQSLGWATLVGEITAGNLLH >M2_bosTau LRRALPGVIQRLQEALREYYTLVDRVPALLSHLAAMDLSSVVSEDDLVTKLNAGLQAVSEDPRLQVQVVRPKEASSGPEEEAEEPPEAVPEVPEDEAVRRALVDSVFQVSVLPGNVGYLRFDSFADA SVLEVLGPYILHQVWEPLQDTEHLIMDLRQNPGGPSSAVPLLLSYFQSPDASPVRLFSTYDRRTNITREHFSQTELLGRPYGTQRGVYLLTSHRTATAAEELAFLMQSLGWATLVGEITAGSLLHTHT >M2_monDom LRRARPGAIQRLMEVLQNYYTLVDRVPALLHHLTAIDYSSVLTEEDLAAKLNAGLQAVSEDPRLLVRVLRPEEATMGEAEEEDATPAANSLPEDESQRQALVDSVFQVSVLPGNVGYLRFDEFADSS VLGTLAPYVIRQVWEPLQDTNHLIMDLRYNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYDRQTNQTQEHLSRAELLGKPYGAQRGVYLLTSHHTATAAEEFAFLMQSLGRATLVGEITAGSLMHTRTF >M2_ornAna 
LRGAVPGAVAHLADLLRDYYALVDRVPALLRHLAALDLSSVLSEEDLTSRLNAGLQAASEDPRLLVRRLEPEEAERGPPRKEEEQKEEEEEDQPSPGASILPGDGSSREAPLFRVSVLPGNVGYLCF DEFPEASALERLGPLLGRRVWEPLEATDHLMVDLRNNPGGPSSAVPLLLSYFQDPAAGPIRLFTTYNRPADVTREYASRAGALEKPYGARRGVYLLTSHRTATAAEEFAYLMQALGRATLVGEITAGR >M2_galGal AVRRAVPGTLSRLTDILKDYYSLVERVPVLLRHLTTSDFSSVQSAEDLATKLNTEMQTLSEDPRLLVRTMMPGEAAAPPAEMPIAMAANLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASV LVKLGPYIVKKVWEPLQNTENLIMDLRYNPGGPSSSAVPMLISYFQDPTAGPVHLFTTYDRRTNHTQEHNSQAELLAQPYGAQRGIYVLTSRHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTCTF >M2_taeGut VRRAVPGTISHLKNILKDYYSLVERVPALLRRLTTSDFSSVQSSEDLATKLNTELQALSDDPRLMVRVMMPGEAADSPAEKPVGMAADLPDNEQLLHALVDTVFKVSVLPGNVGYMRFDEFADASVL VKLGPYLVHKVWEPLQNTENLIMDLRYNLGGPSSSAVPVLLSYFQDPAAGPVHLFTTYDRRTNHTQEHNSQAELLGQSYGAKRGVYLLTSHHTATAAEEFAYLMQSLGRATLIGEITAGSLSHTRTFP >M2_anoCar SLRKAIPNSMSYLVDIIKNNYSMLEQVPVLLQHLSTFDYSSVLSVKDLASKLNAELQTISEDPRLFLRVPASDEAVTSQTDEKVAMASDLPNNEQLMKALVMTVFKVSVLPGNVGYMRFDEFGDATV LVKLGPYLLQHVWEPLQATDYLIIDLRYNIGGPSSSAVPVLLSYFQDPSAGPVHFFTTYNRLTNQTQAYSSSAEMVGKPYGARRGVYLLTSHNTATAAEEFAYLMQTLGRATLVGEITAGSLSHTHTF >M2_xenTro AVRSSITHILLQLSEILVNNYAFSERIPTLLQHLPNLDYSSVISEEDITAKLNYELQSLTEDPRLVLKSKTDSLVMPEDSTQVENLPDDEATLQALVNTVFKVSILPGNIGYLRFDEFADVSVLAKL GPYIVNTVWDPITVTENLIIDLRYNIGGSSTSIPLLLSYFQEPENRIHLFTIYNRQQNSTNEVYSLPKVLGKPYGSKKGVYVLTSHETATAAEEFAYLMQSLSRATIIGEITSGNLMHSKAFPLDGTR >M2_xenLae AVRSSVTHVLHQLCDILANNYAFSERIPTLLQHLPNLDYSTVISEEDIAAKLNYELQSLTEDPRLVLKSKTDTLVMPGDSIQAENIPEDEAMLQALVNTVFKVSILPGNIGYLRFDQFADVSVIAKL APFIVNTVWEPITITENLIIDLRYNVGGSSTAVPLLLSYFLDPETKIHLFTLHNRQQNSTDEVYSHPKVLGKPYGSKKGVYVLTSHQTATAAEEFAYLMQSLSRATIIGEITSGNLMHSKVFPFDGTQ >M2_calMil frag 2 domains 6-243,334-531 VTRESSPTSDKLPEDPTFLQALVDTVFKVSVLPDNTGYFRFDEFPEISVMSKLVQYIIEKVWLPVKDTDRLIVDLRHNVGGHSSVVPLLLSYFYDPEPPVGLFTVYNRLTNTTSHTTLPGVGQHVYG SRKDIYVLTSHRTATAAEELAYLLQSLNRATIVGEITSGSLLHSRSFQIPSTHLVITIPFINFMDNHGECWLGGGVVPDSIVLAEDTLERTKEII >M2_petMar ELLLSSYTFVERASAIADHLSWSEYGSVVSVEDLTSKLTQDLQSVAEDPRLVVSNREPEWVGAADPPGPPAPLPDDEQMLEAIVDSAFKVEVLEGNIGYLRFDEFGDASAVMKLRKQLVSKVWERIH 
PTDDVIIDLRYNLGGSSTAIPIVLSYFQDVAPVHFYTVYDRLRNVTAEFHTVSNLTSQLYGSKKGVYLLTSQHTATAAEEFTYLMQSLNRATIVGEITSGRLAHSLAFRLSDTGLYMTVPIVNFIDNN >M2x_danRer SVRKTIPKAVRRVSDIIKRYYSFKDKIPALLNQLAKADYFTVVSEEDLAGKLNHEMQSVFEDPRLLIKATQVLTDDASSEDRSSSDDLTDPLFKLEMISGNNGYLRFDRFPTPEVLLRLEDHIKKKI WQPVQETENLVIDLRFNTGGSTEALPILLSYMFDTSSSTYLFSIYDSIKNTTFDFHTLNNISGPSYGSTKGVYVLTSYYTAEAGEEFAYLMQSLHRGTVIGEITSGMLLHSKTFQIEQTSLAITVPII >M2x_takRub AVRSRIPKVLQIVLDIIGRFYAFADRVQALLQQLESADLFSVVSEEDLAARLNHDLQTASEDPRLIIRHKRDNIPRAEEEPELHAANDHDGELVEGFTVQVLPHNTGYLRLDRFVRCSEGDKLEEIV AEKVWGPLKDTQNLIIDLRHNTGGSSTSVALLLSYLRDPLPKRHFFTIYDSVQNTTTEYGSRPHIPGPSYGSERGVYVLTSHYTAGAAEEFAYLIQSLHFGTVVGEITSGTLMHSKTFQVEGTDIFIT >M3_homSap EFHQSLGALVEGTGHLLEAHYARPEVVGQTSALLRAKLAQGAYRTAVDLESLASQLTADLQEVSGDHRLLVFHSPGELVVEEAPPPPPAVPSPEELTYLIEALFKTEVLPGQLGYLRFDAMAELETV KAVGPQLVRLVWQQLVDTAALVIDLRYNPGSYSTAIPLLCSYFFEAEPRQHLYSVFDRATSKVTEVWTLPQVAGQRYGSHKDLYILMSHTSGSAAEAFAHTMQDLQRATVIGEPTAGGALSVGIYQVG >M3_bosTau EFHRSLGELVEGTGRLLEAHYARPEVVGQMGALLRAKLAQGAYRTAVDLESLASQLTADLQEMSGDHRLLVFHSPGEMVAEEAPPPPPVVPSPEELSYLIEALFKTEVLPGQLGYLRFDAMAELETV KAVGPQLVQLVWQKLVDTAALVVDLRYNPGSYSTAVPLLCSYFFEAEPRRHLYSVFDRATSRVTEVWTLPHVTGQRYGSHKDLYVLVSHTSGSAAEAFAHTMQDLQRATIIGEPTAGGALSVGIYQVG >M3_monDom EFHQRLGALVEGTGHLLEAHYALPEVVGQASALLKAKLEHGTYRTAVDFESLASQLTSDLQEVSGDHRLHVFHSPGEPVSEELTPPQKGVPSPEELTYLIEALFKTEVLPGQLGYLRFDMMAEAETV RAIAPQLVELVWEKLVHTEALVVDLRYNPGGYSTAVPLLCSYFFEAEPRRHLYTIFDRAASQLTEVWTLPQVAGERYGSQKDLYILISHTSGSAAEAFVHTMKDQHRATVIGEPTGGGALSVGIYQVE >M3_ornAna frag FHQTLEALVETTGHLLEAHYCFPAGARRAGAQPWPVAGVEPDVMAQAAEALAVAQGIAA >M3_galGal LLESTGQLLEAHYAIPEVAEKASVMLSTKRVQGGYRSAVDFETLASQLTSDLQEASGDHRLHVFHSHVEPTPEEQLPNMIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAIGPQLVQM VWNKLVDTDAMIIDMRYNTGGYSTAVPILCSYFFEPEPRQHLYTVFDRSTSRSTEVWTLPKVTGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVIGEPTVGGSLSVGIYRVGNSSLYRSIPS >M3_taeGut FHKNMGVLLEGTGQLLEDHYAIPEVAAKASAMLSTKRAQGGYRSAIDSETLASQLTSDLQEASGDHRLHVFHSHVEPTPEEQLPNVIPSPEELSYIIEALFKIEVLPGNLGYLRFDMMAEAETVKAI 
GPQLLQMVWNKLVDTDAMIIDMRYNTGGYSTAIPILCSYFFDPEPRKHLYTVFDRSTSRSTEVWTLPQLAGKRYGSLKDIYILTSHMSGSAAEAFTRSMKDLHRATVVGEPTVGGSLSVGIYRVGNSS >M3_anoCar EAHYAIPEMARRVSSMLNSKLAQGGYRTAVDFETLASQLTNDLQETSGDHQLHVFHSHVEPSLEEQSPFKTLTPEELNFIIEALFKVDVLPGNVGYLRFDMMAEFESVKTIEPQILHMVWEKLVETS AMIVDMRYNTGSYSTAVPMFCSYFFDAEPQQHLYTIIDRSTSQSTEVWTSSQVSGKRYGSTKDLYILISHASGSAAEAFTRSLKDLHRATVIGEPTVGGSLSASIYNIGSTPLYASIPSQIVLSPVSG >M3_xenTro FHPSVFALVEGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASQLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIPSAEELNYIIDALFKIEVLQGNVGYLRFDMMADTEIIKAI GPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSSGTDIWTLPEVVGERYGSTKDIYILTSHMTGSAAEVFTRSMKELNRATIIGEPTSGVSLSVGMYKVGESN >M3_xenLae FHPSIFPLVKGTGHLLEVHYAIPEVAYKVSSVLQNKWSEGGYRSVVDLESLASLLTSEMQENSGDHRLHVFYSDTEPEILEDQPPKIPSPEELNYIIDALFKIEVLPGNVGYLRFDMMADTEIIKAI GPQLVSLVWNKLVETNSLIIDMRYNTGGYSTAIPIFCSYFFDPEPLQHLYTVYDRSTSTGKDIWTLPEVFGERYGSTKDIYILTSHMTGSAAEVFTRSLKDLNRATLIGEPTSGVSLSVGMYKVGDSN >M3_calMil frag GFHAQVAELVESTGKLLAVHYAIPEVAAEVSAVLSAKLTQGLYRSVVDWESLASRLTVDLQETSVWSVSGAEPHVIVQANEAMTVALGIIN >M3_petMar FHAKMASLLELAGALVEGYYAMLSDGENATAEILLKYREGWYRSVVDYEALASQLTSDLHEIWGDHRLHAFYSDLQIERMDEDKTPSVPSPEELSVLIDTVFKVDILANNVGYLRFDMMTDAEVLKH VGPQLVEKVWNKISSTRSLVIDVRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRIGDS >M3/4_braFlo Branchiostoma floridae Region: 9 exons MTRPSKVDIVFPIKPFTIPTAHEQVKGEGPVDINKNALCKSADEGHTHPVSIAMAPTAYIVFVALVPTVLSVDWLDVVMGIGDVMADHYLDQDLRALNDQSLLQRWNRTLVHRFQSWSQDDMSDSLR MEEGLTSELRNITGDETIKVWDFGVYENTTQEPVPREFYNFSTFVDNFKKNREKHINVTMLEGNVGYVSIRSMSHIVDIILPDPEMTEFFLSKMAALNESKAIILDLRYNLGGDREGVVHWASFFFNA >M3x_takRub FHQGLRSLIGRTGELLEKHYAIQEVAQKVGEVLLSKWAEGLYRSVVDLESLASQLTADLQEASGDHRLHVFRCDVELESLHGVPKIAAVEEAGFVIDALFKSELLPRNVGYLRFDTMADIEAAKGAA PRLVKSVWNKLVDTDSLIIDMRYNAGGSSTAVPLWCSYFVDGEPLQHLYTVYDRTTKTRVEVMTLPEVSGQRYDPGKDVYILTSHMTGSAAEAFVRAMRDLNRVTIVGEPTAGGSLSSATYQIGESVL >M3x_danRer 
FHKNIQGLVQEAGDLLEKHYSVPEVAAKVSRLLQSKLTEGLYRSVVDYESLASQLTSDLQETSGDQRLHIFYCETEPETLHDTPKIPSPEEAGFIVEALFKVDVMSGNIGYLRFDMMEDIKVLQAIN PEFLKVVWNKLVNTDMLIIDVRYNTGGYSTAIPLLCTYFFDAQPLTHIYTLFDRSTATVTKVTTLPDVLGQKYSSQKDVYILTSHITGSAAEAFTRTMKDLKRATVIGEPTIGGALSSGTYQIGNSIL >M4_homSap ALRAKVPTVLQTAGKLVADNYASAELGAKMATKLSGLQSRYSRVTSEVALAEILGADLQMLSGDPHLKAAHIPENAKDRIPGIVPMQIPSPEVFEELIKFSFHTNVLEDNIGYLRFDMFGDGELLTQ VSRLLVEHIWKKIMHTDAMIIDMRFNIGGPTSSIPILCSYFFDEGPPVLLDKIYSRPDDSVSELWTHAQVVGERYGSKKSMVILTSSVTAGTAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDT >M4_bosTau LRAKVPTVLQTAGKLVADNYASPELGVKMAAELSGLQSRYARVTSEAALAELLQADLQVLSGDPHLKTAHIPEDAKDRIPGIVPMQIPSPEVFEDLIKFSFHTNVLEGNVGYLRFDMFGDCELLTQV SELLVEHVWKKIVHTDALIVDMRFNIGGPTSSISALCSYFFDEGPPILLDKIYNRPNNSVSELWTLSQLEGERYGSKKSMVILTSTLTAGAAEEFTYIMKRLGRALVIGEVTSGGCQPPQTYHVDDTD >M4_monDom LRAKVPTILQTAGKLVADNYASLEVGSRVASKLAKLQTQYRQVTSEGELADMLGADLQTLSGDRHLKTAHIPEDAKDRIPGIVPMQLPSPEAFEDLIKFSFHTNVFEGNIGYLRFDMFGDCELLTQV SDLLVEHVWKKVVHTDGMIIDMRFNIGGPTSSISALCSYFFDEGQEVLLDQIYNRPNDSISEIWTQSQVAGERYGSKKSVIILTSSMTAGAAEEFVYVMQRLGRALVIGEVTSGGCQPPQTYHVDDTD >M4_ornAna LRSKVPTVLRTAAKLVADNYAFRETGAGVAAQMGGLQARCGRVTSEGALAEVLGAHLRALSGDPHLQMVYIPEDAKDRIPGVVPMQIPSAETFEDLIKFSFHTSVMEGNIGYLRFDMFGDCELLTQV SELMVEHVWKKIVHTDGLIIDMRNIGGPTSSISALCSYFFDEDHPVLLDKIYNRPNDSISEIWTHSHIAGERYGSRKSVVILTSNMTAGAAEEFVSIMKRLGRALVVGEVTGGGCHPPQTYHVDDTHL >M4_galGal ASLRTQVPQIVQTVGKLVAENYAFVDIGTDIASNLTKSVNKENYKRINSEKELARKLTAILQALSDDEHLKILYIPEHAKDSIPGILPKQIPSPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDCEL LTQVSDLLVEHVWKKIVHTDALIIDMRYNIGGYTNSIPILCSYFFDEGHQVLLDKVYDRPSDSVKEIWTQPQLRGERYGSQKGLIILTSAVTAGAAEEFVFIMKRLGRALIIGEQTSGGSHSPQTYQV >M4_taeGut ANLRAQVPQILQTVGKLVADNYAFVNTGTVIASNLTKNIHKDNYKRINTEEDLAGKVTAILQALSDDKHLKLLYIPEHAKDSIPGIMPKQIPPPEVFEDLIKFSFHTNVFENNIGYLRFDMFGDSEL LTQLSDLMIEHVWKKIFHTDALIIDLRYNIGGSTTPIAILCSYFFDEGHPVLLDRVYDRPSDSVKEIWTQPQLKGERYGSQKGLVILTSAVTAGAAEEFVYIMKRLSRALIIGEQTSGGCHSPQTYQV >M4_anoCar 
LFRTKLPSVLNTIGKLVADNYAFADIGATVAAKFADYAKKGTYRKINSEIELSGKLAADLKALSGDRHLMISHIPERSKGRILGLVPMQQIPPPEILEDLIKFSLHTNVFENNIGYLRFDMFGDCEL MSQVSELLVQHVWNKIVNTDALIIDMRYNVGGPACSVPLLCSYFFDEGHPILLDKVYNRPNDTTSNIWTVSKLAGKRYGLNKGLIILTSSVTSGAAEEFAHIMKRLGRAFIIGQKTSGGCHPPQTFHV >M4_xenTro KLRTKIPSVIQTAGKLVADNYAFADTGADVASKLIALVDKINYKMIKSEVELAEKLNYDLQSLSKDVHLKAVYIPENSKDRIPGVVPMQIPSPEMFEDLIKFSFHTDVFEKNLGYIRFDMFADSDLL NQVSDLLVEHVWKKVVNQDALIIDMrFNIGGPTSSIPTFCSYFFDEGTPVLLDKIYSRTTNAITDVWTLPHLVGNAFGSKKPVIILTSSLTEGAAEEFVYIMKRLGRAYVIGEVTSGGCHPPQTYHVD >M4_xenLae KLRTKIPTVIQTAAKLVADNYAFADTGANVASKFIALVDKIDYKMIKSEVELAEKINDDLQSLSKDFHLKAVYIPENSKDRIPGVVPMQIPSPELFEELIKFSFHTDVFEKNIGYIRFDMFADSDLL NQVSDLLVEHVWKKVVDQDALIIDMRFNIGGPTSSIPIFCSYFFDEGTPVLLDKIYSRTSNAMTDIWTLPDLVGKTFGSKKPLIILTSSLTEGAAEEFVYIMKRLGRAYVVGEVTSGGCHPPQTYHVD >M4_danRer KLRAEIPALAQAAATLIADNYAFPSIGEHVAEKLEAVVAGGEYNLISTKEDLEERLSEDLLKLSEDKCLKTTSNIPALPPMNPTPEMFIALIKSSFQTDVFENNIGYLRFDMFGDFEHVATIAQIIV EHVWNKVVDTDALIIDLrNNIGGHASSIAGFCSYFFDADKQIVLDHIYDRPSNTTRDLQTLEQLTGRRYGSKKSVVILTSGVTAGAAEEFVFIMKRLGRAMIIGETTHGGCQPPETFAVGESDIFLSI >M4_takRub LRAQIPAIIEGAATLIAKNYAFEATGADVATKLRELLAKGQYNSVVSSESLEVALSADLQRLSGDKSLKATQNAPVLPPMDYSPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVE HVWNKVVNTDALILDLRNNVGGPTTAIAGFCSYFFDADKLIVLDKLHDRPSGTTTELLTLPELTGVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQVFSVGEIGIFLSIP >M4_tetNig LRAQIPAIIEGTAALVANNYAFEATGADVAKELRELQANGQYSSVVSKESLEAALSADLQRLSGDKSLKTTPNTPVLPPMDYTPEMYIELIKVSFHTDVFENNIGYLRFDMFGDFEEVKAIAQIIVE HVWNKVVNTDALILDLrNNVGGPTTAIAGFCSYFFDADKQNRVGQAVRQASGTTTELLTLSELTGVRYGSKKSLIILTSGATAGAAEEFVYIMKKLGRAMIVGETTAGASHPPQTFRVGETDVFLLIP >M4_gasAcu TLLNRVPAIIEGSATLIADNYAFEDIGAAVAEKLKGLLANGEYSKVVSKDSLEMKLSADLRTLSGDKSLKTTSNVPALPPMNYSPEMYIELIKVSFHTDVFEDNIGYLRFDMFGDFEEVKAIAQIIV EHVWNKVVNTDAMIVDLRNNIGGPTTAIAGFCSYFFDSDKQIVLDRLYDRPSGTTTELRTLPELTGTRYGSKKSLVMLTSRATAGAAEEFVYIMKKLGRAMIVGETTAGTSHPPKTFRVGETDIFLSI >M4_oryLat 
LRLQVPAIIEESATLVANNYAFESTAADVAEKLKGHLANGDYNMVVSKESLEAKLSADLQSLSGDKSLTVSSNTGAPPPMEYTPEMYIELIKISFHTDVFENNIGYLRFDMFGDFEEVKAIAQVIVE HVWNKVLHTDAMIIDLRNNVGGPTTAIAGFCSYFFDGDKQILLDKLYDRSTGTTTDLLTLGELTGERYGSKKSLIILASRATAGAAEEFVYIMKRLGRAMIVGETTAGASHPPKVFQVGESDIFLSIP >M4_calMil frag LRAKIPSIFQAAGKLVADNYAFAQTGAGVAETIADLIEGTGYGMINTEGKLAEVLSDTLQQLSGDKHLKAVHIPGDSKHQTPGIAMIQQMPPPEILEDLVKFSYQTKVLENNVGYLRFDMFGDNEMI TQVSELMAKHVWNVIASTSSLIVDLRYNIGGPTSSIPILCSYFFDDDKTVLLDTVYSRPTDTISEMKAIPQVAGNGSTESSVHSYI >M4_petMar ALRADAPSILRTVGKLVADGYSRAEAALGVPSKLAALLEAGEYGALRSEEELAFKLTVHLQLITGDRHLKAVCVPEHATDRMPGIVPMQMPPTESFEDLIKFSFITDVLEGNIGYLRFDLFSDLEAL EHVAHLLVEHVWKKICDTEILIIDLRYNMGGYSTSIPILCSYFFDASPPRHLYTVFDRPSRSSTQVFTVPRVLGQRYGASKDVYILTSHMTGSAGEILTRVMSDLKRATVIGEPTAGGSLSTGTYRI