Coding indels: PRNP
The prion gene PRNP exhibits a 6 bp indel in its amino-terminal signal peptide that contributed historically to establishing the clade Euarchontoglires. Comparison with outgroups shows that the indel is a deletion (reducing signal peptide length from 31 to 29) rather than an insertion. It occurs in all species of rodents, rabbits, treeshrews, flying lemurs and primates sequenced to date but not in any other species of mammal.
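The outgroup argument can be made concrete with a minimal sketch. It assumes the aligned windows given in the species list of this section, with the indel at alignment columns 3-4; the function names and the grouping of species into ingroup and outgroup dictionaries are illustrative choices, not part of the original analysis.

```python
# Aligned amino-terminal PRNP windows copied from the species list in this section.
INGROUP = {  # Euarchontoglires representatives
    "Homo sapiens": "MA--NLGCWMLVLFVATWSDLGLCKKRPKPG",
    "Mus musculus": "MA--NLGYWLLALFVTMWTDVGLCKKRPKPG",
}
OTHER_MAMMALS = {
    "Bos taurus": "MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG",
}
OUTGROUP = {  # a marsupial and a bird, chosen here for illustration
    "Monodelphis domestica": "MGKIHLGYWFLALFIMTWSDLTLCKKPKPRPG",
    "Gallus gallus": "MARLLTTCCLLALLLAACTDVALSKKGKGKPS",
}

GAP_COLUMNS = slice(2, 4)  # alignment columns 3-4, as a 0-based slice


def has_gap(aligned):
    """True if the aligned window carries the two-residue gap."""
    return aligned[GAP_COLUMNS] == "--"


def polarity(ingroup, outgroup):
    """Outgroup comparison: the state shared with the outgroup is taken as ancestral."""
    in_states = {has_gap(s) for s in ingroup.values()}
    out_states = {has_gap(s) for s in outgroup.values()}
    if out_states == {False} and in_states == {True}:
        return "deletion"
    if out_states == {True} and in_states == {False}:
        return "insertion"
    return "ambiguous"


residues_lost = 6 // 3  # a 6 bp in-frame indel spans two codons
human_len = len(INGROUP["Homo sapiens"].replace("-", ""))
cow_len = len(OTHER_MAMMALS["Bos taurus"])

print(polarity(INGROUP, OUTGROUP))        # deletion
print(cow_len, human_len, residues_lost)  # 31 29 2
```

The 31 and 29 here are the lengths of the displayed window with and without the two gapped columns, which matches the figures quoted above.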
Remarkably, this indel distribution has held up even as the number of genera sequenced has come to exceed 100. The billions of years of branch length represented by these data suggest that the deletion was a very rare event not subject to independent recurrence (in effect homoplasy-free). Note that it does not occur in a compositionally simple region (strings of leucines are common in the interior). As of November 2008, a typical mammalian gene can be recovered from only about 40 species, so similar rare genetic events cannot be evaluated as stringently as in PRNP.
Consequently, this data set strongly conflicts with the never-ending computational proposals that place mouse basal relative to dog and human, i.e. (mouse,(dog,human)). Such a topology would require a global revision of the well-established superordinal mammalian tree. In PRNP it would also require highly non-parsimonious multiple events, each improbably located at the base of one of two unrelated divergence stems (very dense phylogenetic sampling has the effect of squeezing the window on homoplasy).
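The parsimony cost of the two topologies can be illustrated with a small Fitch-style count. This is a sketch under stated assumptions: the indel is coded as a binary character (1 = deletion present, 0 = absent) using the states seen in the species list of this section, opossum is used as the outgroup for illustration, and the nested-tuple tree encoding and function name are hypothetical.

```python
def fitch(tree, states):
    """Fitch parsimony on a binary tree given as nested 2-tuples of leaf names.

    Returns (possible_states_at_node, minimum_number_of_changes_below_node).
    """
    if isinstance(tree, str):  # leaf
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:  # children agree: no extra change needed
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# 1 = two-residue deletion present, 0 = absent (read off the aligned windows).
indel = {"human": 1, "mouse": 1, "dog": 0, "opossum": 0}

euarchontoglires_tree = ("opossum", ("dog", ("mouse", "human")))
mouse_basal_tree = ("opossum", ("mouse", ("dog", "human")))

print(fitch(euarchontoglires_tree, indel)[1])  # 1
print(fitch(mouse_basal_tree, indel)[1])       # 2
```

With only four taxa the difference is one extra event. The argument in the text is that dense sampling forces that extra event into a very narrow window at the base of each affected stem, which is what makes it implausible.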
Signal region indels are not especially rare among orthologs of the 4500-odd human genes with signal peptides (of which 595 are experimentally validated), despite the steric requirements of the binding pocket of the signal recognition particle (SRP). In fact, the distribution of signal peptide length is fairly broad. These indels can be rapidly screened in batches of 25 by Blat alignment against the 44 available vertebrate genomes.
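As a rough guide to the size of such a screen, the batching step can be sketched as follows. The gene identifiers are placeholders and 4500 is taken as a round figure for the "4500-odd" genes; the sketch only groups queries and does not call Blat itself.

```python
from itertools import islice


def batches(items, size=25):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical placeholder identifiers, one per signal-peptide gene.
gene_ids = [f"gene{i:04d}" for i in range(4500)]
groups = list(batches(gene_ids))

print(len(groups))      # 180
print(len(groups[-1]))  # 25
```

So a full pass over the signal-peptide gene set amounts to roughly 180 batch submissions.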
However, few of these indels have any phylogenetic depth. The PRNP indel in Euarchontoglires does not appear to have any significant effect on cell targeting by the signal peptide (or on subsequent membrane topology). The point is not that indels in signal peptides are rare, but that indels confined to a narrow window at the base of a large clade are.
Below is data from 96 species:
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Homo sapiens
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Pan troglodytes
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Gorilla gorilla
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Pongo pygmaeus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Nomascus leucogenys
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Hylobates lar
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Symphalangus syndactylus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Macaca arctoides
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Macaca fascicularis
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Macaca fuscata
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Macaca mulatta
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Macaca nemestrina
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Papio hamadryas
MA--NLGCWMLFLFVATWSDLGLCKKRPKPG Callithrix jacchus
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Cebus apella
MA--NLGCWMLVVFVATWSDLGLCKKRPKPG Cercopithecus aethiops
MA--NLGCWMLVVFVATWSDLGLCKKRPKPG Cercopithecus diana
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Colobus guereza
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Presbytis francoisi
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Saimiri sciureus
MA--KLGYWLLVLFVATWSDVGLCKKRPKPG Tarsius syrichta
MA--NLGCWMLVVFVATWSDVGLCKKRPKPG Microcebus murinus
MA--RLGCWMLVLFVATWSDIGLCKKRPKPG Otolemur garnettii
ME--NLGCWMLILFVATWSDIGLCKKRPKPG Cynocephalus variegatus
MA--QLGCWLMVLFVATWSDVGLCKKRPKPG Tupaia belangeri
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG Mus musculus
MA--NLGYWLLALFVTTCTDVGLCKKRPKPG Rattus norvegicus
MA--NLGYWLLALFVTTCTDVGLCKKRPKPG Rattus rattus
MA--NAGCWLLVLFVATWSDTGLCKKRPKPG Cavia porcellus
MA--NLGYWLLALFVTTWTDVGLCKKRPKPG Apodemus sylvaticus
MA--NLGCWLLVLFVATWSDLGLCKKRTKPG Dipodomys ordii
MA--NLSYWLLAFFVTTWTDVGLCKKRPKPG Clethrionomys glareolus
MA--NLSYWLLALFVATWTDVGLCKKRPKPG Cricetulus griseus
MA--NLSYWLLALFVATWTDVGLCKKRPKPG Cricetulus migratorius
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG Meriones unguiculatus
MA--NLSYWLLALFVAMWTDVGLCKKRPKPG Mesocricetus auratus
MA--NLGYWLLALFVATWTDVGLCKKRPKPG Sigmodon fulviventer
MA--NLGYWLLALFVATWTDVGLCKKRPKPG Sigmodon hispidus
MV--NPGCWLLVLFVATLSDVGLCKKRPKPG Spermophilus tridecemlineatus
MV--NPGYWLLVLFVATLSDVGLCKKRPKPG Sciurus vulgaris
MA--HLGYWMLLLFVATWSDVGLCKKRPKPG Oryctolagus cuniculus
MA--HLSYWLLVLFVAAWSDVGLCKKRPKPG Ochotona princeps
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Bos taurus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Bison bison
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Rangifer tarandus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Alces alces
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Capreolus capreolus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Kobus megaceros
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Connochaetes taurinus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Ammotragus lervia
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Hippotragus niger
MVKSHMGSWILVLFVVTWSDVGLCKKRPKPG Camelus dromedarius
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Capra hircus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Cervus elaphus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Cervus elaphus nelsoni
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Dama dama
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Odocoileus hemionus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Odocoileus virginianus
MVKSHIGSWILVLFVAMWSDVALCKKRPKPG Oryx leucoryx
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Ovibos moschatus
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Ovis aries
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Ovis canadensis
MVKSHIGSWILVLFVAMWSDVALCKKRPKPG Tragelaphus strepsiceros
MVKSHIGGWILVLFVAAWSDIGLCKKRPKPG Sus scrofa
MVKSHMGSWILVLFVVTWSDMGLCKKRPKPG Vicugna vicugna
MVKSHVGGWILVLFVATWSDVGLCKKRPKPG Equus caballus
MVRSHVGGWILVLFVATWSDVGLCKKRPKPG Diceros bicornis
MVKSLVGGWILLLFVATWSDVGLCKKRPKPG Myotis lucifugus
MVKNYIGGWILVLFVATWSDVGLCKKRPKPG Pteropus vampyrus
MVKSHIANWILVLFVATWSDMGFCKKRPKPG Tursiops truncatus
MVKSHIGGWILLLFVATWSDVGLCKKRPKPG Canis lupus familiaris
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Felis catus
MVKSHIGSWLLVLFVATWSDIGFCKKRPKPG Mustela putorius
MVKSHIGSWLLVLFVATWSDIGFCKKRPKPG Mustela vison
MVKSHIGSWILVLFVAMWSDVGLCKKRPKPG Ailuropoda melanoleuca
MVKNHVGCWLLVLFVATWSEVGLCKKRPKPG Erinaceus europaeus
MVTGHLGCWLLVLFMATWSDVGLCKKRPKPG Sorex araneus
MVKSHLGCWIMVLFVATWSEVGLCKKRPKPG Cyclopes didactylus
MVRSRVGCWLLLLFVATWSELGLCKKRPKPG Dasypus novemcinctus
MVKGTVSCWLLVLVVAACSDMGLCKKRPKPG Echinops telfairi
MVKSSLGCWILVLFVATWSDMGLCKKRPKPG Elephas maximus
MVKSSLGCWILVLFVATWSDMGLCKKRPKPG Loxodonta africana
MVKSSLGCWMLVLFVATWSDVGLCKKRPKPG Procavia capensis
MMKSGLGCWILVLFVATWSDVGLCKKRPKPG Orycteropus afer
MVKSGLGCWILVLFVATWSDVGVCKKRPKPG Trichechus manatus
MAKIQLGYWILALFIVTWSELGLCKKPKTRPG Macropus eugenii
MGKIHLGYWFLALFIMTWSDLTLCKKPKPRPG Monodelphis domestica
MGKIQLGYWILVLFIVTWSDLGLCKKPKPRPG Trichosurus vulpecula
MARLLTTCCLLALLLAACTDVALSKKGKGKPS Gallus gallus
MAKLPGTSCLLLLLLLLGADLASCKKGKGKPG Taeniopygia guttata
MGKHQMTCWLAIFLLLIQANVSLAKK-KPKPS Anolis carolinensis
MRRFLVTCWIAVFLILLQTDVSLSKKGKNKPG Gekko gecko
MGRYRLTCWIVVLLVVMWSDVSFSKKGKGKGG Trachemys scripta
MGRHLISCWIIVLFVAMWSDVSLAKKGKGKTG Pelodiscus sinensis
MPQSLWTCLVLISLICTLTVSSKKSGGGKSKTG Xenopus laevis
MLRSLWTSLVLISLVCALTVSSKKSGSGKSKTG Xenopus tropicalis
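For readers who want to tabulate the list above, the following is a minimal parsing sketch. It assumes that each entry is an aligned sequence followed by a species name, that sequence tokens are upper-case strings of at least 20 characters possibly containing hyphens, and that the Euarchontoglires deletion sits at alignment columns 3-4. The function names and the regular expression are illustrative. The sample is a handful of rows from the list, deliberately including two entries on one line and the Anolis row, whose single gap lies elsewhere in the alignment.

```python
import re

# Assumption: a sequence token is upper-case letters and '-' only, length >= 20.
SEQ_TOKEN = re.compile(r"[A-Z-]{20,}")


def parse_rows(text):
    """Split whitespace-separated text into (species, aligned_sequence) pairs."""
    rows = []
    for token in text.split():
        if SEQ_TOKEN.fullmatch(token):
            rows.append((token, []))      # start a new entry
        elif rows:
            rows[-1][1].append(token)     # part of the current species name
    return [(" ".join(name), seq) for seq, name in rows]


def has_clade_deletion(aligned):
    """True only for the two-residue gap at alignment columns 3-4."""
    return aligned[2:4] == "--"


sample = """
MA--NLGCWMLVLFVATWSDLGLCKKRPKPG Homo sapiens
MA--NLGYWLLALFVTMWTDVGLCKKRPKPG Mus musculus MVKSHIGGWILLLFVATWSDVGLCKKRPKPG Canis lupus familiaris
MGKHQMTCWLAIFLLLIQANVSLAKK-KPKPS Anolis carolinensis
"""

rows = parse_rows(sample)
gapped = [name for name, seq in rows if has_clade_deletion(seq)]

print([name for name, _ in rows])
print(gapped)  # ['Homo sapiens', 'Mus musculus']
```

Testing the specific columns rather than searching for any hyphen keeps the unrelated Anolis gap from being counted as the clade-defining deletion.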