CDH23 SNPs: Difference between revisions
Note L1122V will test out as benign at PolyPhen and SIFT because it is a common conservative substitution with comparative genomics support in fish that is not de-weighted for phylogenetic remoteness. The comparative genomics of these tools is limited to sequences at SwissProt and does not incorporate phylogenetic relations; consequently they miss the stable L-->V transition at the level of the tetrapod ancestor and all its descendant clades. Much is known about cadherin domains and their stability but that literature is not used. For example with 27 cadherin domains within the same molecule, the mutational record at homologous residues is likely informative. Consequently such initial screening tools do not utilize a significant part of the available information and are not authoritative here.
The evidence discussed above suggests a moderately adverse auditory outcome for the V1122V homozygote without syndromic retinitis pigmentosa, similar to but milder than non-syndromic L0480Q which also occurs 6 residues back from a DxD motif. Offsetting this, weakly aligning linker regions 6, 10 and 15 have valine (as seen in the difference alignment above). The change is unlikely to improve CDH23 performance in hair cell tip links because leucine in this position has been stable for billions of years of branch length despite presumed opportunities for change (indeed residues around it have experienced change and fixation in the population). If valine were neutral, it would be part of the reduced alphabet at this position by now.
However CDH23 is expressed at many other cellular sites including but not limited to the retina. Perhaps gain in functionality elsewhere could offset a slight loss of optimality in tip links. However compensation by an unseen allele in another gene seems unlikely given the position of 1122 in a linker domain. The bottom line is that only clinical observation of older homozygotes and mode of familial inheritance of effects can resolve the impact of this allele.
[[Category:Comparative Genomics]]
Revision as of 15:11, 26 August 2009
CDH23 SNPs
CDH23 (cadherin 23) on 10q22.1 is one of the better understood genes of the Usher disease complex. These genes generally encode structural proteins utilized in both hearing and visual systems, so mutations can affect both. Stop codons within CDH23 cause both deafness and blindness (USH1D) whereas missense alleles can affect hearing only (DFNB12). Both conditions are autosomal recessive. However one bad copy of CDH23 in conjunction with one bad allele of PCDH15, protocadherin 15 on 10q21.1 (17 million bp over, not tandem), can perhaps give rise to digenic disease USH1H. That could have a simple physical explanation in defective hetero-oligomeric binding of the two terminal domains where the respective cSNPs occur.
Many Usher genes function both transiently during development of cochlea and retina and permanently in adult structures. These functions may localize to multiple sites within each organ, for example ribbon synapses and stereocilia. CDH23, like many of these proteins, has different binding partners in its cytoplasmic (USH1C harmonin, MYO7A myosin, USH1G sans) versus extracellular and transmembrane domains. Other unrelated cell types elsewhere in the body may use these gene products (or particular splice variants) though mutant alleles manifest most sensitively in hearing and vision (where mouse serves erratically as human disease model). The role of CDH23 in hair tip links has recently been disentangled from its transient but critical role in hair cell development.
However some coding variants of CDH23 are simply near-normal (or even adaptive) polymorphic variants not giving rise to problems during the carrier's lifespan, though subtle subclinical effects on age-related (or noise-induced) hearing loss or night vision acuity might still occur. In the past, such variations would occasionally be detected within genealogies of affected individuals but not track with their disease; today, coding SNPs are far more likely to emerge -- and in far greater numbers -- simply in the course of genomic screening. That trend will only accelerate with the advent of rapid screening platforms such as Nimblegen that can affordably screen the entire human proteome.
Note these myriad new cSNPs needing interpretation will come with accurate population frequencies further stratified by ethnic group distribution. That can be viewed as 'close-up' comparative genomics that complements the longer view of reduced alphabet afforded currently by CDH23 orthologs in the 50-odd vertebrate genome phylogenetic tree. These considerations, along with accurate 3D models of both the cadherin module affected and its protein binding partner, greatly help in interpreting disease implications of particular observed SNPs (for example E737V), yet uncertainty will remain in many instances.
Here a newly observed cSNP in a Kalahari Bushman, heterozygous L1122V in exon 26, falls just before the boundary of the 11th of 27 cadherin ectodomains of the 3354 residue, 67 exon protein. This would appear unremarkable except for the observation that valine is the ancestral mammalian value here and it is conserved over a vast phylogenetic time scale.
It does not suffice to consider the structural impact of L1122V in isolation (say by modeling the two adjacent cadherin domains and the intervening region). Even though CDH23 is highly extended (meaning other cadherin domains are irrelevant to L1122V), it forms a parallel dimer in hair tip links, implying a side-by-side uncharacterized interaction with a second copy of the relevant L1122V-containing domains. Leucine, valine and isoleucine are hydrophobic residues often important to tight packing of globular fold interiors (where they are often not interchangeable) but can also occur similarly occluded in dimer surface patches.
Calcium plays an important role in connecting consecutive cadherin domains into a stable non-extensible structure. (If the structure extended in response to an auditory vibration, the tip link would not then open the mechanotransduction channel.) Three Ca+2 ions are bound by each domain, so 81 for the 27 domains of CDH23 and 33 for the 11 domains of PCDH15 if all sites are utilized. About a third of the 49 known disease alleles directly alter a calcium chelating residue.
The current view of the placement of this residue in auditory hair cell stereocilia tip links involves the relationship between PCDH15 and CDH23 as shown below. Twisted homodimers of each form before they meet at their special first cadherin domain to form the two domain-swapped dimers because both proteins are anchored by transmembrane regions and cytoplasmic interactions. If PCDH15 continues the twist of CDH23 (say clockwise viewed N to C), then the net result is that neither homodimer can unravel. However if a substitution weakens the initial twisting, the connection might not stably form.
It's currently unclear how two vaguely homologous proteins could abut at their ends (which are actually the amino-termini minus signal peptides) as a non-covalent tetramer that is stable for 7-8 decades -- there is little evidence for repair in mammals. The much-studied domain swap model in other cadherins appears not applicable here (key tryptophan and initial strands lacking) and a lack of cysteines in the first 3 cadherin domains rules out a disulfide. Because the proteins are oriented C<--N|N-->C, PCDH15 cannot continue a Ca+2 binding motif begun in CDH23 with joint binding providing a seamless connection.
Comparative genomics
Orthologs of CDH23 are available from 42 vertebrates in the exon containing L1122V. The following exon is quite short, therefore difficult to obtain by blast and transcripts are uncommon so deep in a gene. An unusual GC-AG phase 0 intron separates these exons and may help identify ancient orthologs. Residue 1122 lies in a linker region between two adjacent cadherin domains but to model this region it will prove necessary to model consecutive globular domains as well.
Observe that while leucine is sometimes found at this position in other species, that occurrence is concentrated strictly in early diverging vertebrates. In all 33 species of tetrapods (where sound is conducted primarily through air), the value here is exclusively valine. Note in particular the four other species of great apes have valine with no indication of heterozygosity.
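The observation above can be checked by tallying the residue in the 1122 column by clade. The following is a minimal sketch, assuming only a subset of the exon 26 sequences from the alignment in this article; the grouping labels, the variable names and the choice of species are illustrative and not part of the original analysis. The column index 48 is the position marked by the caret in the alignment.

```python
from collections import Counter

# Exon 26 segments copied from the alignment in this article (subset of species).
EXON26 = {
    "homSap": "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ",
    "panTro": "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "musMus": "DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "galGal": "DNGPTGNRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPAASSIVQ",
    "xenTro": "DNGPAGNRKTGTATVSVTVLDINDNKPIFLKSSYEASVPENVPFSSSIVQ",
    "danRer": "DNGPAGGRRTGTATVYVEVLDVNDNRPIFLQNSYETSVLENIPRGTSILQ",
    "calMil": "DNGPAGSRRTGTATVYIRVLDVNDNRPIFLQNTYEASVPENITMSTSILQ",
    "petMar": "DHGPAGSRRTGTTTLDVLVLDVNDNRPLFLEGSYZVSVPDNVTRGAIFLQ",
}

# 0-based index of residue 1122 within the 50-residue exon 26 segment
# (the position marked '^' in the alignment).
COLUMN_1122 = 48

# Illustrative grouping; labels are assumptions made for this sketch.
GROUPS = {
    "human reference": ["homSap"],
    "other tetrapods": ["panTro", "musMus", "galGal", "xenTro"],
    "fish and lamprey": ["danRer", "calMil", "petMar"],
}


def tally_column(sequences, groups, column):
    """Count the residues found at one alignment column, per group."""
    return {
        label: Counter(sequences[species][column] for species in members)
        for label, members in groups.items()
    }


tally = tally_column(EXON26, GROUPS, COLUMN_1122)
for label, counts in tally.items():
    print(label, dict(counts))
```

Extending the dictionary to all 42 species in the alignment would reproduce the full tally; the subset here only illustrates the pattern.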
From this perspective, L1122V may reflect retention of the ancestral value in one allele, rather than result from de novo back mutation from a L1122L homozygote. In other words, L1122 could be viewed as a mutation apparently fixed for better or worse in all other human populations -- at a position conserved over billions of years of branch length in phylogenetically related species.
Dozens of disease alleles are known, for example D124G, P240L, E247K, R301Q, A366T, N452S, L480Q, R582Q, H755Y. These often directly disrupt a Ca+2 binding motif (thus in EC1 at 124: DVNDNAPTF --> GVNDNAPTF) so demonstration of residue phylogenetic conservation would support that criterion in the analysis of L1122V.
Posy et al observe cadherin Ca2+ binding domains like DxNDN are located in a linker region so cannot be clearly associated with one of the folded beta-sheet domains, observing further that the domain definition at SMART v.34 omits the N-terminal region critical to EC domains (A*, A and part of B strands) whereas Pfam35 drops the first two. These tools also err in defining calx-beta domains important to vlgr1 binding to usherin.
The domain swap dimer model does not seem applicable to the link between EC1 of PCDH15 and EC1 of CDH23 because the conserved tryptophan is not present. In testing this, it is preferable to associate the Ca2+ binding inter-domain residues with the domain from which they originate but not include residues more naturally part of the following domain. Note the first 3 EC regions of human CDH23 lack any cysteine so a disulfide linkage can be ruled out.
             <1......................exon 26.................0> <0..exon 27.......1> <2...............exon 28........................0>
             <-----------EC10-------------><---interdomain----- -><----------------- -------------------------EC11-------------------->
             ....................Ca+2........................^. ....Ca+2............ ..................................................
CDH23_homSap DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ LKATDADEGEFGRVWYRILH GNHGNNFRIHVSNGLLMRGPRPLDRERNSSHVLIVEAYNHDLGPMRSSVR
CDH23_panTro DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_gorGor dNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_ponAbe DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_nomLeu DNGPVGKRHTGTATVFITVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_rheMac DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_calJac DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILQ
CDH23_tarSyr DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_micMur DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_musMus DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_ratNor DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_cavPor DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_speTri DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ
CDH23_oryCun DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LRATDADEGEFGRVWYRILH
CDH23_ochPri DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIVEGHSIVQ LRATDADEGEFGRVWYRILH
CDH23_bosTau DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVSEDIPEGHSIVQ LKATDADEGEFGRVWYRIVH
CDH23_canFam DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKAADADEGEFGRVWYRILH
CDH23_felCat DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_pteVam DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LMATDADEGEFGRVWYRILH
CDH23_turTru DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LTATDADEGEFGRVWYRILH
CDH23_susScr DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ
CDH23_equCab DNGPVGKRRTGTATVFITVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILQ
CDH23_eriEur dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_loxAfr DNGPVGKRRTGTTTVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGLVWYRILH
CDH23_proCap DNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASIPEDIPEGHSIVQ LKATDADEGEFGRVWYRILR
CDH23_echTel dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ LKATDADEGEFGRVWYRILH
CDH23_choHof dNGPVGKRRTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGRSIVQ
CDH23_monDom DNGPVGKRRTGTATIYVTVLDVNDNRPIFLQSSYEASVPEDIPEGSSIVQ LMATDADEGDNGRVWYRILH
CDH23_macEug DNGPVGKRRTGTATVYVTVLDVNDNRPIFLHSSYEASISEDIPEGSSIVQ LMATDADEGDNGRVWYRILH
CDH23_ornAna DNGPSGKRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPEASSIVQ LKATDADEGEYGRVWYRIIS
CDH23_galGal DNGPTGNRRTGTATVYVTVLDVNDNRPIFLQSSYEASVPEDIPAASSIVQ VKATDADEGVNGRVWYRIVK
CDH23_taeGut DNGPSGNRRTGTATVYVTVLDVNDNRPIFLQSSYEVSVPEDIPAASSIVQ VKATDADEGINGRVWYRIVK
CDH23_anoCar DNGPTGKRRTGTATVHVTVLDVNDNRPYFLQSSYEATVPEDIPDYSSIVQ VKATDADEGINGRVWYRIVK
CDH23_xenTro DNGPAGNRKTGTATVSVTVLDINDNKPIFLKSSYEASVPENVPFSSSIVQ LEATDADEGDNGLVWYRILS
CDH23_oryLat DNGPAGSRRTGTATVFVEVLDVNDNRPIFLQNSYETSVLETVPQGTSILQ VQATDADQGENGRVLYRILS
CDH23_takRub DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETGILESVPQGTSILQ VQATDADQGENGRVLYRILT
CDH23_danRer DNGPAGGRRTGTATVYVEVLDVNDNRPIFLQNSYETSVLENIPRGTSILQ VQATDADQGENGKVLYRILS
CDH23_gasAcu DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETSILESVPQRTSILK VQATDADQGENGKVLYRILT
CDH23_tetNig DNGPAGSRRTGTATVFVEVQDVNDNRPIFLQNSYETSVLESVPQGTSILQ VQATDADQGENGSVLYRILT
CDH23_ictPun DNGPAGDRKTGTATVYVEVLDVNDNRPIFLQNSYETTVLENVPRGSSVLQ
CDH23_calMil DNGPAGSRRTGTATVYIRVLDVNDNRPIFLQNTYEASVPENITMSTSILQ VSATDADTGQNGRLTYQILQ
CDH23_petMar DHGPAGSRRTGTTTLDVLVLDVNDNRPLFLEGSYZVSVPDNVTRGAIFLQ
             ................................................^.
CDH23_homSap DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ
CDH23_panTro ................................................V.
CDH23_gorGor ................................................V.
CDH23_rheMac ................................................V.
CDH23_calJac ................................................V.
CDH23_pteVam ................................................V.
CDH23_ponAbe .....................................I..........V.
CDH23_nomLeu ................I....................I..........V.
CDH23_tarSyr ........R.......................................V.
CDH23_micMur ........R.......................................V.
CDH23_musMus ........R.......................................V.
CDH23_ratNor ........R.......................................V.
CDH23_cavPor ........R.......................................V.
CDH23_speTri ........R.......................................V.
CDH23_oryCun ........R.......................................V.
CDH23_canFam ........R.......................................V.
CDH23_felCat ........R.......................................V.
CDH23_turTru ........R.......................................V.
CDH23_susScr ........R.......................................V.
CDH23_echTel ........R.......................................V.
CDH23_eriEur ........R.......................................V.
CDH23_proCap ........R............................I..........V.
CDH23_equCab ........R.......I...............................V.
CDH23_loxAfr ........R...T...................................V.
CDH23_choHof ........R....................................R..V.
CDH23_bosTau ........R.............................S.........V.
CDH23_ochPri ..........................................V.....V.
CDH23_monDom ........R.....IY.............................S..V.
CDH23_macEug ........R......Y..............H......IS......S..V.
CDH23_ornAna ....S...R......Y............................AS..V.
CDH23_galGal ....T.N.R......Y...........................AAS..V.
CDH23_taeGut ....S.N.R......Y...................V.......AAS..V.
CDH23_anoCar ....T...R......H...........Y........T......DYS..V.
CDH23_xenTro ....A.N.K......S.....I...K....K.........NV.FSS..V.
CDH23_oryLat ....A.S.R........E.............N...T..L.TV.Q.T....
CDH23_takRub ....A.S.R........E.Q...........N...TGIL.SV.Q.T....
CDH23_tetNig ....A.S.R........E.Q...........N...T..L.SV.Q.T....
CDH23_gasAcu ....A.S.R........E.Q...........N...T.IL.SV.QRT...K
CDH23_danRer ....A.G.R......Y.E.............N...T..L.N..R.T....
CDH23_ictPun ....A.D.K......Y.E.............N...TT.L.NV.R.S.V..
CDH23_calMil ....A.S.R......YIR.............NT.......N.TMST....
CDH23_petMar .H..A.S.R...T.LD.L.........L..EG..ZV...DNVTR.AIF..
Consensus    .n..a...r...a.v..t.l.......i..qs..#as!p#.!p.g.siv.
             ................................................^.
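A difference alignment of this kind is simple to regenerate from the full sequences. The sketch below is an assumption about how such rows can be produced, not the tool actually used for this page: each position identical to the human reference is printed as a dot and each difference as the variant residue. The function name `difference_row` is hypothetical. Comparison is case-insensitive because some rows of the full alignment carry a lower-case first residue that is still shown as a dot.

```python
def difference_row(reference: str, sequence: str, same: str = ".") -> str:
    """Return `sequence` with every position matching `reference` replaced by `same`."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        same if ref.upper() == res.upper() else res
        for ref, res in zip(reference, sequence)
    )


# Exon 26 segments copied from the alignment in this article.
HOM_SAP = "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ"
ORTHOLOGS = {
    "CDH23_panTro": "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "CDH23_gorGor": "dNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSIVQ",
    "CDH23_oryLat": "DNGPAGSRRTGTATVFVEVLDVNDNRPIFLQNSYETSVLETVPQGTSILQ",
}

print("CDH23_homSap", HOM_SAP)
for name, seq in ORTHOLOGS.items():
    print(name, difference_row(HOM_SAP, seq))
```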
Comparative anatomy
The remarkable auditory hair cell linker provided by CDH23 and PCDH15 is not a vertebrate innovation. Instead it must date back to the pre-bilaterian ancestor because contemporary cnidarians such as Nematostella have a very similar overall structure that incorporates a strong homolog to CDH23. This cannot be plausibly attributed to convergent evolution given the extent of structural agreement of kinocilium, stereocilia, lateral and tip links. Note in mouse that the link is polarized with PCDH15 attached to the shorter stereocilium and CDH23 to the longer.
The Nematostella protein most resembling CDH23 has 6,074 residues, three transmembrane helices and 44 contiguous cadherin ectodomains with 4x-periodicity. Thus the correspondence at the protein level is imperfect. However antibodies show it is distributed on stereocilia of anemone hair bundles and required for tentacle sensitivity to vibration (prey detection). It provides both lateral and tip linkages. It can be predicted to form a coiled parallel homodimer like mammalian CDH23.
Nematostella also has long but weak matches to PCDH15, namely XM_001638202 and EU289217. However upon back-blast to human, these do not quite pull up PCDH15 as best match but rather its closest paralog FAT4 (unsurprising given its 34 cadherin, 6 EGF and 2 lamG domains), perhaps because of lineage-specific expansion or because the blast score is inflated. Possibly CDH23 had not yet undergone duplication and divergence to protocadherins and it alone may play a double homodimer linker role in anemone stereocilia.
Nematocyst discharge is sensitive to calcium levels and streptomycin (like vertebrate mechanotransduction channels) but is insensitive to the MET channel blocker amiloride. The channel itself has not been identified in anemone either.
Prey capture can result in significant trauma to anemone tentacle hair bundles but this can be repaired using a protein again with similarities to a vertebrate stereocilia repair protein ARL5B which acts on the extracellular face of the plasma membrane along stereocilia in the vicinity of tip links. Human and cnidarian protein XM_001629283 are 77% identical:
homSap ARL5B  MGLIFAKLWSLFCNQEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIVVKNTFLMWDIGGQESLRSSWNTYYSNTEFIILV
              MGL+FAK +S F N+EHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIV KNHF+MWDIGGQESLRS+WNTYY+NTEF+ILV
nemVec repair MGLLFAKFFSWFSNEEHKVIIVGLDNAGKTTILYQFLMNEVVHTSPTIGSNVEEIVWKNIHFIMWDIGGQESLRSAWNTYYTNTEFLIL

homSap ARL5B  HVDSIDRERLAITKEELYRMLAHEDLRKAAVLIFANKQDMKGCMTAAEISKYLTLSSIKDHPWHIQSCCALTGEGLCQGLEWMTSRI
              +DS DRERLAI+K ELY+MLA+E+L++AA+ LI ANKQD+KG M+ AEIS+L L+ IKDH WHIQ+CCALTGEGL QGLEW+T+++
nemVec repair VIDSTDRERLAISKAELYQMLANEELKQAALLILANKQDIKGSMSVAEISEQLNLACIKDHGWHIQACCALTGEGLYQGLEWITTQL
Note 'stereocilia' is an anatomical misnomer. These are instead actin-based membrane protrusions. It is the kinocilium that is a true cilium in both anemone and developing vertebrate hair cells. If parallels to ciliary photoreceptors are sought, these should be with the kinocilium rather than stereocilia. Since no known counterpart to the PCDH15-CDH23 linker occurs in vision, the commonality (Usher syndrome) may reside primarily in ribbon synapses of auditory and photoreceptive neurons.
Pseudogene and paralog issues
No potential exists here for mis-determining orthologous exons, even in remote species such as lamprey with poor assemblies. Exceedingly long genes such as this are not well-represented as retrogenes (which begin 3' and truncate early). Position 1122 is too remote from the 3' terminus. Relevant pseudogenes are not observed by Blat of human, macaque, and dog. Indeed, this exon gives a unique match at this level of sensitivity, even though cadherins are very widespread in the proteome.
The top matches to CDH23 within the human genome are shown as provided by GeneSorter at UCSC. Observe the established binding partner PCDH15 is by no means the best match; the e-value is high but only because of many weak alignments of tandem cadherin domains.
PCDH15 has 11 cadherin domains and a transmembrane region C-terminally. It's problematic to call it a paralog of CDH23 even though all cadherin domains must ultimately coalesce. PCDH15 tandem domains, which cannot be put in 1:1 correspondence with the 27 cadherin domains of CDH23, might have had quite different sources and histories. The evolution of these gene families might be better worked out using interdomain spacer regions and intron position and phasing. Particular regions can be very conserved in themselves while not displaying much conservation between spacers.
Here the spacer region of CDH23 containing L1122 has best match within the human proteome to PCDHB14 (protocadherin beta 14), far down on the list of overall best matches. The interdomain region is shown in blue below. Note the best matches internally to other spacers are quite weak and neither L nor V is conserved in them.
PCDH15  S-TLTLAIKVLDIDDNSPVFTNSTYTVLVEENLPAGTTILQIEAKDVDLG---ANVSYRIRS
        + | |+ + |||++|| |+| |+| | |++| | +|||++| | | | | |||
CDH23   TGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQILKATDADEGEFGRVWYRILH
        +GT V + VLD+NDN P F QS YE VPED P G I + A D D G +G++ Y H
PCDHB14 SGTTLVLIKVLDINDNAPEFPQSLYEVQVPEDRPLGSWIATIISAKDLDAGNYGKISYTFFH

IFLQSSYEASVPEDIPEGHSILQLK 1122
TFQNLPFVAEVLEGIPAGVSIYQVV  928 CDH23 internal spacer
IFSQPLYNISLYENVTVGTSVLTVL  497 CDH23 internal spacer
TFFPAVYNVSVSEDVPREFRVVWLN 1034 CDH23 internal spacer
TFHNQPYSVRIPENTPVGTPIFIVN  169 CDH23 internal spacer
TWKDAPYYINLVEMTPPDSDVTTVV  809 CDH23 internal spacer

Gene     E-value  Position                   Description
CDH23    0        chr10:72826710-73245710    cadherin-like 23
FAT4     0        chr4:126457017-126633537   FAT tumor suppressor homolog 4
DCHS1    0        chr11:6599134-6633650      dachsous 1
FAT3     0        chr11:91724910-92269283    FAT tumor suppressor homolog 3
FAT1     0        chr4:187745931-187881981   FAT tumor suppressor 1
FAT2     0        chr5:150863846-150928698   FAT tumor suppressor 2
DCHS2    0        chr4:155375138-155531899   dachsous 2 isoform 1
CELSR2   1e-115   chr1:109594164-109619901   cadherin EGF LAG seven-pass G-type receptor 2
CELSR1   5e-113   chr22:45135395-45311731    cadherin EGF LAG seven-pass G-type receptor 1
CELSR3   5e-109   chr3:48637835-48684985     cadherin EGF LAG seven-pass G-type receptor 3
PCDH24   3e-87    chr5:175908971-175955375   protocadherin LKC
...
PCDHB14  5e-53    chr5 140,584,653           protocadherin beta 14
Structural significance to normal function
Blastp at PDB of the region around residue L1122 establishes that the best fit to an already determined structure is the 39% identity match to mouse cadherin CDH8, 2A62. Within the 25 residue interdomain region, the percent identity is somewhat higher at nearly 50%. While not ideal, this should still allow accurate modeling of the adjacent cadherin domains and the critical spacer region, although the structural effects of L1122V may be fairly subtle.
CDH23   1 GTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQLKATDADEGEFGRVWYRILHGNHGNNFRI
          GT T+ VT+ DVNDN P F QS Y SVPED+ G +I ++KA D D GE + Y I+ G+ F I
CDH8  174 GTTTLTVTLTDVNDNPPKFAQSLYHFSVPEDVVLGTAIGRVKANDQDIGENAQSSYDIIDGDGTALFEI
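The identity figures quoted above can be recomputed directly from the aligned strings. The following is a minimal sketch, assuming the ungapped CDH23/CDH8 excerpt shown in this article and taking the 25 residue interdomain region to be IFLQSSYEASVPEDIPEGHSILQLK; the function name `identity` and the slice boundaries are illustrative choices. The 39% figure in the text refers to the full Blastp match, which is longer than this excerpt, so the whole-excerpt value here is not expected to equal it.

```python
def identity(a: str, b: str) -> tuple[int, int]:
    """Return (identical positions, compared positions) for two aligned strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches, len(a)


# Aligned excerpt copied from the Blastp result shown in this article.
CDH23 = "GTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQLKATDADEGEFGRVWYRILHGNHGNNFRI"
CDH8 = "GTTTLTVTLTDVNDNPPKFAQSLYHFSVPEDVVLGTAIGRVKANDQDIGENAQSSYDIIDGDGTALFEI"

# Locate the 25-residue interdomain region within the excerpt.
SPACER = "IFLQSSYEASVPEDIPEGHSILQLK"
start = CDH23.index(SPACER)
end = start + len(SPACER)

whole = identity(CDH23, CDH8)
spacer = identity(CDH23[start:end], CDH8[start:end])

for label, (matches, length) in (("excerpt", whole), ("interdomain", spacer)):
    print(f"{label}: {matches}/{length} = {100 * matches / length:.0f}% identity")
```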
Since CDH23 is known to form a helical homodimer in tip links -- and such binding patches can involve hydrophobic residues that otherwise would be buried -- the quaternary structure here is the main unknown. Crystallographic adjacency in the unit cell does not always reflect oligomeric solution structure. Consequently, it may not be possible to fully evaluate L1122V despite the favorable match at PDB.
There is no reason to think L1122V would directly affect the calcium binding motifs (LDRE, D.ND, D.D) of either adjacent cadherin domain in the manner of the E737V salsa mouse mutation in exon 22 or D124G, R1060W, E1595K and D2202N, none of which have syndromic effects on the retina but demonstrably weaken calcium dependent binding to PCDH15 even though they do not lie in the amino-terminal cadherin binding domains.
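The positions of these motifs relative to residue 1122 can be located with a simple pattern scan. The sketch below is an illustration only: it assumes the human exon 26 to 28 segment from the alignment in this article, places L1122 at 0-based index 48 (the caret position in that alignment), and uses a regular expression lookahead so that overlapping motifs are not missed. The names `find_motif` and `RESIDUE_OFFSET` are hypothetical.

```python
import re

# Human CDH23 exons 26, 27 and 28 as given in the alignment in this article.
SEGMENT = (
    "DNGPVGKRHTGTATVFVTVLDVNDNRPIFLQSSYEASVPEDIPEGHSILQ"
    "LKATDADEGEFGRVWYRILH"
    "GNHGNNFRIHVSNGLLMRGPRPLDRERNSSHVLIVEAYNHDLGPMRSSVR"
)

L1122_INDEX = 48                      # 0-based index of L1122 in SEGMENT
RESIDUE_OFFSET = 1122 - L1122_INDEX   # converts an index to a residue number

MOTIFS = ["LDRE", "D.ND", "D.D"]      # '.' matches any residue


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 0-based start indices of every (possibly overlapping) motif match."""
    return [m.start() for m in re.finditer(f"(?={motif})", sequence)]


hits = {motif: find_motif(SEGMENT, motif) for motif in MOTIFS}
for motif, starts in hits.items():
    for start in starts:
        print(
            f"{motif}: residue {start + RESIDUE_OFFSET}, "
            f"{start - L1122_INDEX:+d} relative to L1122"
        )
```

Under these assumptions the nearest D.D match begins 6 residues after position 1122, so L1122V changes a neighbor of the motif rather than a motif residue itself.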
An effect of L1122V on the MET (mechanotransduction) channel is also implausibly indirect because the channel is at the lower (PCDH15) end of the tip link, though that is disputed for larger sound displacement effects.
Similarly an up-link intracellular effect of extracellular L1122V on CDH23 binding to harmonin would be a stretch. That binding is now thought mediated by an autonomously folding region proximal to the harmonin PDZ motif with a short internal peptide of CDH23 that extends from 3180-3211, KPDDDRYLRAAIQEYDNIAKLGQIIREGPIK, over 2,000 residues away.
The diagram below summarizes what is currently known about homotypic and heterotypic binding of proteins within the Usher network. Some of these remain unclear, like those of whirlin USH2D which also localizes at stereocilia tips and has an N-terminal domain like that of harmonin but yet does not bind CDH23. These interactions must be understood before SNPs such as V1122L can be modelled in their quaternary context and assessed with any confidence.
Comparison of CDH23 domains to PCDH15
Given that CDH23 and PCDH15 binding constitutes the critical part of hair links joining a lower stereocilium to the next higher, it's worth investigating the evolutionary relationship between the 27 and 11 cadherin ectodomains (i.e. the extent to which these proteins are paralogs and so naturally dimerize). It emerges that the cadherin domains have little residual sequence identity to each other within or across proteins, even though each individually has very considerable phylogenetic conservation.
Further, the cadherin domains of CDH23 best corresponding to each cadherin domain of PCDH15 (as linearly ordered in its primary sequence) are out of order with respect to the primary sequence of CDH23: 25, 16, 15, 13, 12, 16, 14, 2, 9, 18, 12. This suggests that if a deep relationship ever existed between these two proteins, different rates of cadherin domain evolution have obscured it.
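One way to put a number on "out of order" is the longest strictly increasing subsequence of the best-match list: a fully collinear pair of proteins would give a value equal to the number of PCDH15 domains. This is a sketch of one possible measure, not the method used to build the list above; the function name `longest_increasing_length` is hypothetical.

```python
from bisect import bisect_left

# Best-matching CDH23 domain for PCDH15 domains 1..11, as listed in the text.
BEST_MATCH = [25, 16, 15, 13, 12, 16, 14, 2, 9, 18, 12]


def longest_increasing_length(values: list[int]) -> int:
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails: list[int] = []  # tails[k] = smallest tail of an increasing run of length k+1
    for value in values:
        position = bisect_left(tails, value)
        if position == len(tails):
            tails.append(value)
        else:
            tails[position] = value
    return len(tails)


collinear = longest_increasing_length(BEST_MATCH)
print(f"{collinear} of {len(BEST_MATCH)} best matches can be placed in ascending order")
```

A low value relative to 11 is consistent with the lack of collinearity described above, though it says nothing about the cause.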
However when the 27 spacer domains separating cadherin units are concatenated, SMART detects an internal repeat within CDH23 of the first 11 to the last 11 spacers. Furthermore, these repeats give weak but full length alignments to the 11 concatenated spacer units of PCDH15. (Use of spacers avoids confounding issues of cadherin domain cross-matching.) This suggests the latter is the ancestral length and that CDH23 experienced an internal duplication that lengthened its ectodomain. 'Spacer' is surely a misnomer as the degree of phylogenetic conservation suggests each is under considerable selection along with its associated cadherin domains.
When the cytoplasmic domains are aligned (using uncorrected genomic alignments from the 46 species UCSC data set), it emerges that CDH23 (below, top graphic) is far better conserved to far better phylogenetic depth than PCDH15 (bottom). This conservation does not extend into early deuterostomes, protostomes or cnidarians but rather emerged abruptly in lamprey (i.e. synchronously with ciliary opsins and the retina).
Presumably both are specifically anchored to the internal Usher protein network at one or more domains (with the terminal PDZ binding motifs ITEL* and STSL* respectively), with the internal motif IM KPDDDRYLRAAIQEYDNIAKLGQIIREGPIK serving to anchor CDH23 but not PCDH15 bidentately to the harmonin N-domain. However these domains can explain only a small fraction of observed CDH23 cytoplasmic conservation. Note runs of compositionally simple sequence cannot form ordered secondary or tertiary structure.
Intriguingly, whirlin (USH2D) has a similar domain structure to harmonin, in particular the N-domains align. Although this does not bind the cytoplasmic internal motif of CDH23, it could plausibly bind a comparable region in PCDH15, in effect quasi-paralog binding to quasi-paralog.
Overview of cytoplasmic tail conservation in vertebrate CDH23 and PCDH15: red: 80% conserved residues, blue 50% conserved, black variable:
Difference alignment of human CDH23 cadherin domains relative to L1122V in CDH23.11
Consensus ....y.........p.g...* .v.a.d.d.g.n....y..............f......................dre......l...a.d................v.i.v.#.#.n...f
CDH23.11  LQSSYEASVPED-IPEGHSIL QLKATDADEGEFGRVWYRILH----GNHGNNF-RIHVS----NGLLMRG-PRPLDRERNSSHVLIVEAYNHD--LGPMRSSV---RVIVYVEDINDEAPVF
CDH23.09  QNLPFV.E.L.G-..A.V..Y .VV.I.L...LN.L.S..MPV----.MPRMD.-L.NS.----S.VV-VT-TTE.....IAEYQ.R.V.SDAG--TPTKS.TS---TLTIH.L.V...T.T.
CDH23.20  SPATLTVHLL.N-C.P.F.V. .VT...E.S.LN.ELV...EA----.AQDR-.-L..LV----T.VIRV.-NATI...EQE.YR.T.V.TDRG--TV.LSGTA---I.TILID....SR.E.
CDH23.15  SPFG.NV..N.N-VGG.TAVV .VR...R.I.INSVLS.Y.TE----..KDMA.-.MDRI----S.EIATR-.A.P....Q.FYH.VATVEDEG--TPTLSATT---H.Y.TIV.E..N..M.
CDH23.18  .NLPMNITIS.N-S.VSSFVA HVL.S...S.CNA.LTFN.TA----..RERA.-F.NAT----T.IVTV--N.......IPEYK.TISVKDNP--EN.RIARRDYDLLLIFLS.E..NH.L.
CDH23.14  FT.DSAV.I...-C.V.QRVA TV..W.P.A.SN.Q.VFSLAS----..IAGA.EIVTTN----DSIGEVFVA......ELDHYI.Q.V.SDRG--TP.RKKDH---ILQ.TIL....NP..I
CDH23.17  RDYEGPFE.T.G-Q.-.PRVW TFL.H.R.S.PN.Q.E.S.MD----.DPLGE.-V.SPV----E.V.RVRKDVE....TIAFYN.TIC.RDRG--MP.LS.TM---L.GIR.L....ND..L
CDH23.05  S.PL.NI.LY.N-VTV.T.V. TVL...N.A.T..E.S.FF------SDDPDR.-SLDKD----T..I.L--IAR..Y.LIQRFT.TII.RDGGG-EETTGR------.RIN.L.V..NV.T.
CDH23.23  D.P..QEA.F..-V.V.TI.. TVT.....S.N.ALIE.SL------.DGESK.-A.NPT----T.DIYV--LSS....KKDHYI.TAL.KDNPG-DVASNRRENSVQ.VIQ.L.V..CR.Q.
CDH23.07  SKPA.FV..V.N-.MA.ATV. F.N...L.RSREYGQESI.YS----LEGSTQ.-..NAR----S.EITT--TSL....TK.EYI...R.VDGG---VGHNQKTGIAT.NITLL....NH.TW
CDH23.10  FPAV.NV..S..-V.REFRVV W.NC..N.V.LNAELS.F.TG----..VDGK.-SVGYR----DAVVRT--VVG....TTAAYM..L..IDNG---PVGKRHTGTAT.F.T.L.V..NR.I.
CDH23.02  HNQP.SVRI..N-T.V.TP.F IVN...P.L.AG.S.L.SFQP----PS-----QFFAID----SARGIVTVI.E..Y.TTQAYQ.T.N.TDQ.K-TR.LSTLA---NLAIIIT.VQ.MD.I.
CDH23.25  PPNGTILHIR.E-..LRSNVY EVY...K...LN.A.R.SF.K----TAGNRDWEFFIID----PISGLIQTAQR....SQAVYS..LV.SDLGQ-PV.YETMQ---PLQ.AL...D.NE.L.
CDH23.12  Q.QYSRLGLR.T-AGI.T.VI VVQ...R.S.DG.L.N....S----.AE.K----FEID----ESTGLIITVNY..Y.TKT.YMMN.S.TDQAP-PFNQGFCS---VY.TLLNELDEAVQF
CDH23.03  INLP.STNIY.H-S.P.TTVR IIT.I.Q.K.RPRGIG.T.VS----..TNSI.ALDYI.G----V.TLN.LLDRENPLYSHGFI.T.KGTELND-DRTPSDATVTTTFNIL.I....N..E.
CDH23.08  KDAP.YINLV.M-T.PDSDVT TVV.V.P.L..N.TLV.S.QP------PNKFYSLNSTTG---KIRTTHAMLDRENPDPHEAELMRKIVVSVTDCGR.PLKATSSAT.F.NLL.L..ND.T.
CDH23.06  QKDA.VGALR.N-E.SVTQLV R.R...E.SPPNNQIT.S.VS----AS-AFGSYFDISLYEGYGVISVSRPL-DYEQIS.GLIY.T.M.MDAG---N.PLN.T--VP.TIE.F.E..NP.T.
CDH23.24  SKPQFST..Y.N-E.A.T.VI TMM...Q...PN.ELT.SLEG----PG-VEA.HVDMD.----GLVTTQRPLQSYEKFS-----.T.V.TDGG---E.PLWGT--TMLL.E.I.V..NR...
CDH23.16  Q.PH..VLLD.GPDTLNT.LI TIQ.L.L...PN.T.T.A.VA----..IV.T.RIDRHM----GVITAAKEL-DYE-ISHGRYT...T.TDQCPI.SHRLT.T--TT.L.N.N....NV.T.
CDH23.22  GITY.MERIL.GAT.-.TTLI AVA.V.P.K.LN.L.T.TL.D----LVPPGYVQLEDS.A---GKVIANRTV-DYEEVH--WLNFT.R.SDNG---S.P.AAE--IP.YLEIV....NN.I.
CDH23.27  TKAE.T.G.AT.-AKV.SELI .VL.L...I.NNSL.F.S..AIHYFRALA.DSEDVGQVFTMGSMDGILRTFDLFMAYSPGYF.VDIV.RDLAGHNDTAIIGIYIL.DDQR.KIVIN.I.DR
CDH23.01  FTNHFFDTYLLISEDTPVGSS VTQLLAQ.MDND-PLVFG---VSGEEASR-F.AVEPDTGVVWLRQ-------.....TK.EFTVEFSVSD.QG--------.ITRK.NIQ.G.V..N..T.
CDH23.21  .NPIQTV..L.SAE.GTVIAN IT-.I.H.LNPK-LEYHIVGIVAKDDTDR-LVPNQEDAFAVNINTGSVMVKS.MN..LVATYEVTLSVIDNASD.PERSV..PNAKLT.N.L.V..NT.Q.
CDH23.26  SPQYQLLT...HSPRGTLVGN VTG.V.....PN-AIV.Y--FIAAGNEEK-..HLQPDGCLLVLRD.D.EREAIFSFIVKA.SNRSWTPPRGPSPTLDLVADLTLQE.R.VL.....QP.R.
CDH23.04  NS.E.SVAIT.LAQVGFALP. FIQVV.K..NLG-LNSMFEVYLVGNNS.HFIISPTS.QGKADIRI-----RVAIPLDYETVDRYDFDLFANESV----PDH.GYAK.KITLINE..NR.I.
CDH23.13  SNA....AIL.NLALGTEIVR -VQ.YSI.-NLN-QIT..FNAYTSTQAKA-L.KIDAITGVITVQG-------LV...--KGDFYTLTVVAD.GG----PKVDSTVK.YIT.L.E..NS.R.
CDH23.19  TK.T.Q.E.M.NSPAGTPLTV -.NGPILALDAD-QDI.AVVTYQLL.AQSGL.DINSSTGVVTVRS-----GVII...AF.PPI.ELLLLAE.IG-----LLNSTAHLLITIL.D..NR.T.
Consensus ....y.........p.g.... .v.a.d.d.g.n....y..............f......................dre......l...a.d................v.i.v.#.#.n...f

Difference alignment of human PCDH15 cadherin domains
PCDH15.CA.01 GTAGGPDPTIELSLKDN---VDYWVLMDPVKQMLFLNSTGRVLDRDPPMNIHSIV-----VQVQCINKKVGTIIYHEVRIVVRDRNDNSP
PCDH15.CA.02 NGATDIDD..NGQ..YVIQY.PDDPTSNDTFEIPLMLTGNIVLRKR.NYEDKTRYFV.I----QANDRAQ.LNERRTTTTTLTVD.L.GD.LG.
PCDH15.CA.07 VKATDPDA.INGQVHY..G------NFNN.FRITS--NGSIY.AVK.N.EVRDYYELV.----VATDGAVH---PRHSTLTLA.K.L.ID....
PCDH15.CA.05 LTAVDADE.SNGE.TYEILV-----GAQGDFIIN.T-TG.ITIAPGVEMIVGRTYALT.----QAADNAPPA-ERRNSICT.Y.E.LPP.NQ..
PCDH15.CA.06 LQATDREGDS.TYAIEN----G.PQRVFNLSET-TGILTL.KA...ESTDRYIL--------IITASDGRPDGTSTAT.N...T.V...A.
PCDH15.CA.10 VGVIS.AAINQS.VY.IVS----GNEEDTFGINNI-TGVIYVNGP..YETRTSYVLR.QADSLEV.LANLRVPSKSNTAK.Y.EIQ.E.NHP.
PCDH15.CA.08 IEAKDVDLGANVSYRIRS----PEVKHFFALHPF-TGEL.LL.S..YEAFPDQEASI---TFLVEAFDIYGTMPPGIAT.TVI.K.M..YP.
PCDH15.CA.03 IQAIDQDRNIQPPSDR.G.LY.ILVGTPEDYPRFFHMHPR--TAEL.LLEPVN..FHQKFDLVI-------KAEQDNGHPLPAFAGLH.EIL.E.NQ..
PCDH15.CA.04 IVALDKDIEDTKDPELHLFL------NDYTS.FTVTQTGITRYLTLLQPV..EEQQTYTFSI-------TAFDGVQESEPVI--.N.Q.M.A...T.
PCDH15.CA.09 VYAEDAD-PP.L.ASRVRYRVD.VQFPYPASIFEVEED--SGRVI.RVN.NEE.TTIFKLV.-------.AFDDGEPVMSSSAT.K.L.LHPGEIPR
PCDH15.CA.11 VKATDKD-.GNY--SVMAYR.IIPPIKEGKEGFVVETY--TG.IK.AMLFHNMRRSYFKFQ.-------IATDDYGK.LSGKAD.LVS.VNQL.MQV
Consensus    .........d......!.Y.i...........ff..... TG.i.l...l#r#....%.l.! ....aa..........at.....l..##...
>CDH23_homSap domains: signal, 27 spacers, 27 cadhedrin domains (last weak), unknown extracellular, single pass transmembrane, unknown cytoplasmic: MGRHVATSCHVAWLLVLISGCWG QVNRLPFFTNHFFDTYLLISEDTPVGSSVTQ LLAQDMDNDPLVFGVSGEEASRFFAVEPDTGVVWLRQPLDRETKSEFTVEFSVSDHQGVITRKVNIQVGDVNDNAPTF HNQPYSVRIPENTPVGTPIFI VNATDPDLGAGGSVLYSFQPPSQFFAIDSARGIVTVIRELDYETTQAYQLTVNATDQDKTRPLSTLANLAIIITDVQDMDPIF INLPYSTNIYEHSPPGTTVRI ITAIDQDKGRPRGIGYTIVSGNTNSIFALDYISGVLTLNGLLDRENPLYSHGFILTVKGTELNDDRTPSDATVTTTFNILVIDINDNAPEF NSSEYSVAITELAQVGFALPLF IQVVDKDENLGLNSMFEVYLVGNNSHHFIISPTSVQGKADIRIRVAIPLDYETVDRYDFDLFANESVPDHVGYAKVKITLINENDNRPIF SQPLYNISLYENVTVGTSVLT VLATDNDAGTFGEVSYFFSDDPDRFSLDKDTGLIMLIARLDYELIQRFTLTIIARDGGGEETTGRVRINVLDVNDNVPTF QKDAYVGALRENEPSVTQLVR LRATDEDSPPNNQITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGLIYLTVMAMDAGNPPLNSTVPVTIEVFDENDNPPTF SKPAYFVSVVENIMAGATVLF LNATDLDRSREYGQESIIYSLEGSTQFRINARSGEITTTSLLDRETKSEYILIVRAVDGGVGHNQKTGIATVNITLLDINDNHPTW KDAPYYINLVEMTPPDSDVTT VVAVDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPDPHEAELMRKIVVSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF QNLPFVAEVLEGIPAGVSIYQ VVAIDLDEGLNGLVSYRMPVGMPRMDFLINSSSGVVVTTTELDRERIAEYQLRVVASDAGTPTKSSTSTLTIHVLDVNDETPTF FPAVYNVSVSEDVPREFRVVW LNCTDNDVGLNAELSYFITGGNVDGKFSVGYRDAVVRTVVGLDRETTAAYMLILEAIDNGPVGKRHTGTATVFVTVLDVNDNRPIF LQSSYEASVPEDIPEGHSILQ LKATDADEGEFGRVWYRILHGNHGNNFRIHVSNGLLMRGPRPLDRERNSSHVLIVEAYNHDLGPMRSSVRVIVYVEDINDEAPVF TQQQYSRLGLRETAGIGTSVIV VQATDRDSGDGGLVNYRILSGAEGKFEIDESTGLIITVNYLDYETKTSYMMNVSATDQAPPFNQGFCSVYITLLNELDEAVQF SNASYEAAILENLALGTEIVR VQAYSIDNLNQITYRFNAYTSTQAKALFKIDAITGVITVQGLVDREKGDFYTLTVVADDGGPKVDSTVKVYITVLDENDNSPRF DFTSDSAVSIPEDCPVGQRVAT VKAWDPDAGSNGQVVFSLASGNIAGAFEIVTTNDSIGEVFVARPLDREELDHYILQVVASDRGTPPRKKDHILQVTILDINDNPPVI ESPFGYNVSVNENVGGGTAVVQ VRATDRDIGINSVLSYYITEGNKDMAFRMDRISGEIATRPAPPDRERQSFYHLVATVEDEGTPTLSATTHVYVTIVDENDNAPMF QQPHYEVLLDEGPDTLNTSLIT IQALDLDEGPNGTVTYAIVAGNIVNTFRIDRHMGVITAAKELDYEISHGRYTLIVTATDQCPILSHRLTSTTTVLVNVNDINDNVPTF PRDYEGPFEVTEGQPGPRVWT 
FLAHDRDSGPNGQVEYSIMDGDPLGEFVISPVEGVLRVRKDVELDRETIAFYNLTICARDRGMPPLSSTMLVGIRVLDINDNDPVLL NLPMNITISENSPVSSFVAH VLASDADSGCNARLTFNITAGNRERAFFINATTGIVTVNRPLDRERIPEYKLTISVKDNPENPRIARRDYDLLLIFLSDENDNHPLF TKSTYQAEVMENSPAGTPLTVLNGP ILALDADQDIYAVVTYQLLGAQSGLFDINSSTGVVTVRSGVIIDREAFSPPILELLLLAEDIGLLNSTAHLLITILDDNDNRPTF SPATLTVHLLENCPPGFSVLQ VTATDEDSGLNGELVYRIEAGAQDRFLIHLVTGVIRVGNATIDREEQESYRLTVVATDRGTVPLSGTAIVTILIDDINDSRPEF LNPIQTVSVLESAEPGTVIAN ITAIDHDLNPKLEYHIVGIVAKDDTDRLVPNQEDAFAVNINTGSVMVKSPMNRELVATYEVTLSVIDNASDLPERSVSVPNAKLTVNVLDVNDNTPQF KPFGITYYMERILEGATPGTTLIA VAAVDPDKGLNGLVTYTLLDLVPPGYVQLEDSSAGKVIANRTVDYEEVHWLNFTVRASDNGSPPRAAEIPVYLEIVDINDNNPIF DQPSYQEAVFEDVPVGTIILT VTATDADSGNFALIEYSLGDGESKFAINPTTGDIYVLSSLDREKKDHYILTALAKDNPGDVASNRRENSVQVVIQVLDVNDCRPQF SKPQFSTSVYENEPAGTSVIT MMATDQDEGPNGELTYSLEGPGVEAFHVDMDSGLVTTQRPLQSYEKFSLTVVATDGGEPPLWGTTMLLVEVIDVNDNRPVF VRPPNGTILHIREEIPLRSNVYE VYATDKDEGLNGAVRYSFLKTAGNRDWEFFIIDPISGLIQTAQRLDRESQAVYSLILVASDLGQPVPYETMQPLQVALEDIDDNEPLF VRPPKGSPQYQLLTVPEHSPRGTLVGNV TGAVDADEGPNAIVYYFIAAGNEEKNFHLQPDGCLLVLRDLDREREAIFSFIVKASSNRSWTPPRGPSPTLDLVADLTLQEVRVVLEDINDQPPRF TKAEYTAGVATDAKVGSELIQ VLALDADIGNNSLVFYSILAIHYFRALANDSEDVGQVFTMGSMDGILRTFDLFMAYSPGYFVVDIVARDLAGHNDTAIIGIYILRDDQRV KIVINEIPDRVRGFEEEFIHLLSNITGAIVNTDNVQFHVDKKGRVNFAQTELLIHVVNRDTNRILDVDRVIQMIDENKEQLRNLFRNYNVLDVQPAISVRLPDDMSALQM AIIVLAILLFLAAMLFVLMNWYY RTVHKRKLKAIVAGSAGNRGFIDIMDMPNTNKYSFDGANPVWLDPFCRNLELAAQAEHEDDLPENLS RTVHKRKLKAIVAGSAGNRGFIDIMDMPNTNKYSFDGANPVWLDPFCRNLELAAQAEHEDDLPENLSEIADLWNSPTRTHGTFGREPAAVKPDDDRYLRAAIQEYDNIAKLGQIIREGPIKGSLLKVVLEDYLRLKKLFAQRMVQKASSCHSSISELIQTELDEEPGDHSPGQGSLRFRHKPPVELKGPDGIHVVHGSTGTLLATDLNSLPEEDQKGLGRSLETLTAAEATAFERNARTESAKSTPLHKLRDVIMETPLEITEL >PCDH15_homSap domains: signal, 11 spacers, 11 cadhedrin domains, unknown extracellular, single pass transmembrane, unknown cytoplasmic: MFRQFYLWTCLASGIILGSLFEICLG QYDDDCKLARGGPPATIVAIDEESRNGTILVDNMLIK GTAGGPDPTIELSLKDNVDYWVLMDPVKQMLFLNSTGRVLDRDPPMNIHSIVVQVQCINKKVGTIIYHEVRIVVRDRNDNSPTF KHESYYATVNELTPVGTTIFTGFSGD 
NGATDIDDGPNGQIEYVIQYNPDDPTSNDTFEIPLMLTGNIVLRKRLNYEDKTRYFVIIQANDRAQNLNERRTTTTTLTVDVLDGDDLGPMF LPCVLVPNTRDCRPLTYQAAIPELRTPEELNPIIVTPP IQAIDQDRNIQPPSDRPGILYSILVGTPEDYPRFFHMHPRTAELSLLEPVNRDFHQKFDLVIKAEQDNGHPLPAFAGLHIEILDENNQSPYF TMPSYQGYILESAPVGATISDSLNLTSPLR IVALDKDIEDTKDPELHLFLNDYTSVFTVTQTGITRYLTLLQPVDREEQQTYTFSITAFDGVQESEPVIVNIQVMDANDNTPTF PEISYDVYVYTDMRPGDSVIQ LTAVDADEGSNGEITYEILVGAQGDFIINKTTGLITIAPGVEMIVGRTYALTVQAADNAPPAERRNSICTVYIEVLPPNNQSPPRF PQLMYSLEISEAMRVGAVLLN LQATDREGDSITYAIENGDPQRVFNLSETTGILTLGKALDRESTDRYILIITASDGRPDGTSTATVNIVVTDVNDNAPVF DPYLPRNLSVVEEEANAFVGQ VKATDPDAGINGQVHYSLGNFNNLFRITSNGSIYTAVKLNREVRDYYELVVVATDGAVHPRHSTLTLAIKVLDIDDNSPVF TNSTYTVLVEENLPAGTTILQ IEAKDVDLGANVSYRIRSPEVKHFFALHPFTGELSLLRSLDYEAFPDQEASITFLVEAFDIYGTMPPGIATVTVIVKDMNDYPPVF SKRIYKGMVAPDAVKGTPITT VYAEDADPPGLPASRVRYRVDDVQFPYPASIFEVEEDSGRVITRVNLNEEPTTIFKLVVVAFDDGEPVMSSSATVKILVLHPGEIPRF TQEEYRPPPVSELATKGTM VGVISAAAINQSIVYSIVSGNEEDTFGINNITGVIYVNGPLDYETRTSYVLRVQADSLEVVLANLRVPSKSNTAKVYIEIQDENNHPPVF QKKFYIGGVSEDARMFTSVLR VKATDKDTGNYSVMAYRLIIPPIKEGKEGFVVETYTGLIKTAMLFHNMRRSYFKFQVIATDDYGKGLSGKADVLVSVVNQLDMQV IVSNVPPTLVEKKIEDLTEILDRYVQEQIPGAKVVVESIGARRHGDAFSLEDYTKCDLTVYAIDPQTNRAIDRNELFKFLDGKLLDINKDFQPYYGEGGRILEIRTPEAVTSIKKRGESLGYTE GALLALAFIIILCCIPAILVVLV SYRQFKVRQAECTKTARIQAALPAAKPAVPAPAPVAAPPPPPPPPPGAHLYEELGDSSILFLLYHFQQSRGNNSVSEDRKHQQVVMPFSSNTIEAHKSAHVDGSLKSNKLKSARKFTFLSDEDDLSAHNPLYKENISQVSTNSDISQRTDFVDPFSPKIQAKSKSLRGPREKIQRLWSQSVSLPRRLMRKVPNRPEIIDLQQWQGTRQKAENENTGICTNKRGSSNPLLTTEEANLTEKEEIRQGETLMIEGTEQLKSLSSDSSFCFPRPHFSFSTLPTVSRTVELKSEPNVISSPAECSLELSPSRPCVLHSSLSRRETPICMLPIETERNIFENFAHPPNISPSACPLPPPPPISPPSPPPAPAPLAPPPDISPFSLFCPPPSPPSIPLPLPPPTFFPLSVSTSGPPTPPLLPPFPTPLPPPPPSIPCPPPPSASFLSTECVCITGVKCTTNLMPAEKIKSSMTQLSTTTVCKTDPQREPKGILRHVKNLAELEKSVANMYSQIEKNYLRTNVSELQTMCPSEVTNMEITSEQNKGSLNNIVEGTEKQSHSQSTSL >CDH23_spacers color shows repeat reported by SMART QVNRLPFFTNHFFDTYLLISEDTPVGSSVTQ TFHNQPYSVRIPENTPVGTPIFI IFINLPYSTNIYEHSPPGTTVRI EFNSSEYSVAITELAQVGFALPLF IFSQPLYNISLYENVTVGTSVLT 
TFQKDAYVGALRENEPSVTQLVR TFSKPAYFVSVVENIMAGATVLF TWKDAPYYINLVEMTPPDSDVTT TFQNLPFVAEVLEGIPAGVSIYQ TFFPAVYNVSVSEDVPREFRVVW IFLQSSYEASVPEDIPEGHSILQ VFTQQQYSRLGLRETAGIGTSVIV QFSNASYEAAILENLALGTEIVR RFDFTSDSAVSIPEDCPVGQRVAT VIESPFGYNVSVNENVGGGTAVVQ MFQQPHYEVLLDEGPDTLNTSLIT TFPRDYEGPFEVTEGQPGPRVWT VLLNLPMNITISENSPVSSFVAH LFTKSTYQAEVMENSPAGTPLTVLNGP TFSPATLTVHLLENCPPGFSVLQ EFLNPIQTVSVLESAEPGTVIAN QFKPFGITYYMERILEGATPGTTLIA IFDQPSYQEAVFEDVPVGTIILT QFSKPQFSTSVYENEPAGTSVIT VFVRPPNGTILHIREEIPLRSNVYE LFVRPPKGSPQYQLLTVPEHSPRGTLVGNV RFTKAEYTAGVATDAKVGSELIQ >PCDH15_spacers colored region shows 26% identity alignment to CDH23 spacer repeat QYDDDCKLARGGPPATIVAIDEESRNGTILVDNMLIK TFKHESYYATVNELTPVGTTIFTGFSGD MFLPCVLVPNTRDCRPLTYQAAIPELRTPEELNPIIVTPP YFTMPSYQGYILESAPVGATISDSLNLTSPLR TFPEISYDVYVYTDMRPGDSVIQ PRFPQLMYSLEISEAMRVGAVLLN VFDPYLPRNLSVVEEEANAFVGQ VFTNSTYTVLVEENLPAGTTILQ VFSKRIYKGMVAPDAVKGTPITT FTQEEYRPPPVSELATKGTM VFQKKFYIGGVSEDARMFTSVLR
Localization of disease alleles
Of the 49 known disease alleles of CDH23, 14 have a clear explanation in terms of directly disrupting a calcium binding motif. Quite a few linker domains are also affected, suggesting that L1122V is positioned where it could have an adverse impact (it matches the location of deafness allele L0480Q). The first graphic below shows CDH23 marked up for these disease alleles as well as for signal peptide, linkers, cadherin domains, the single pass transmembrane region, potential glycosylation sites, and some apparent anomalies in wildtype calcium binding sites.
Glycosylation with a bulky complex carbohydrate would preclude that residue from participating in the parallel dimer. However, potential sites are not always utilized; utilized sites might be phylogenetically conserved, but that conservation would be difficult to distinguish from overall protein invariance. In any event, there is no evident pattern to sites with correct motifs, so no constraints emerge for the dimer.
Several anomalous calcium binding sites occur in human. This raises the possibility that the reference human sequence arose from a carrier or diseased individual. However, when these are investigated in the 44 species alignment, the 'anomalies' are invariant back to lamprey. So rather than being questionable alleles, they are instead highly conserved variants in these sites.
The second graphic examines each of the 49 mutations for phylogenetic conservation at its position, finding extreme conservation in almost all cases. Only rarely does the altered value of the allele show up in other species, and in no case as part of the reduced alphabet at that position. At position 755, an unusual phyloSNP H755P is observed after frog divergence, possibly correlating with the air/water shift in hearing. A more recent phyloSNP Q1496K occurs in gorilla, chimp and human but not in gibbon and earlier diverging species.
Finally, the bottom alignments illustrate local conservation around 3 alleles. They establish that CDH23 has high conservation overall (relative to the average protein), but that outside of mammals many positions are not nearly as well conserved as the disease allele positions. Consequently, depth of conservation has considerable predictive value for evaluating other alleles that surface in genome sequencing projects or clinical settings: only variations at very deeply conserved residues have surfaced to date in deafness or syndromic disease.
*  context    change  disease  location   effect
D  VNDNAPTFH  D0124G  DFNB12   EC1        Ca+2 binding motif DVNDNAPTF disrupted
P  YSTNIYEHS  P0240L  DFNB12   link
E  HSPPGTTVR  E0247K  USH1D    link
R  ENPLYSHGF  R0301Q  DFNB12   EC3        Ca+2 binding motif DRE disrupted
A  LPLFIQVVD  A0366T  USH1D    link
N  ENDNRPIFS  N0452S  DFNB12   EC4        Ca+2 binding motif NENDNRPIF disrupted
L  TVLATDNDA  L0480Q  DFNB12   link
A  TDNDAGTFG  A0484P  USH1D    link
R  LRATDEDSP  R0582Q  DFNB12   link
H  NQKTGIATV  H0755Y  USH1D    sheet
D  ETPTFFPAV  D0990N  DFNB12   EC5        Ca+2 binding motif DVNDETPTF disrupted
R  ETTAAYMLI  R1060W  DFNB12   EC6        Ca+2 binding motif DRE disrupted
V  TVLDVNDNR  V1090I  USH1D    link
N  RPIFLQSSY  N1098S  USH1D    EC6        Ca+2 binding motif DVNDNRPIF disrupted
G  PMRSSVRVI  G1186D  DFNB12   sheet
P  VFTQQQYSR  P1206R  USH1D    EC7        Ca+2 binding motif DINDEAPVF disrupted
T  QQQYSRLGL  T1209A  USH1D    link
D  NLNQITYRF  D1341N  DFNB12   EC9        Ca+2 binding motif DNL disrupted
Q  VVASDRGTP  Q1496H  USH1D    sheet
R  KKDHILQVT  R1507Q  USH1D    sheet
A  TRPAPPDRE  A1586P  DFNB12   sheet
E  RQSFYHLVA  E1595K  DFNB12   EC11       Ca+2 binding motif DRE disrupted
Q  CPILSHRLT  Q1716P  DFNB12   sheet
R  DYEGPFEVT  R1746Q  USH1D    link
P  LGEFVISPV  P1788L  USH1D    sheet
D  NDPVLLNLP  D1846N  DFNB12   EC17       Ca+2 binding motif DINDNDPVL disrupted
F  NITAGNRER  F1888S  DFNB12   sheet
R  PLDRERIPE  R1912W  USH1D    sheet
D  NPENPRIAR  D1930N  USH1D    sheet
G  VVTVRSGVI  G2017S  USH1D    sheet
R  EAFSPPILE  R2029W  DFNB12   EC19       Ca+2 binding motif DRE disrupted
D  IGLLNSTAH  D2045N  DFNB12   sheet
D  RGTVPLSGT  D2148N  DFNB12   sheet
D  LNPKLEYHI  D2202N  DFNB12   EC21       Ca+2 binding motif DHD disrupted
D  NGSPPRAAE  D2376V  USH1D    sheet
R  EKKDHYILT  R2465W  DFNB12   EC23       Ca+2 binding motif DRE disrupted
S  VYENEPAGT  S2517G  USH1D    link
T  MMATDQDEG  T2530I  USH1D    link
R  PVFVRPPNG  R2608H  DFNB12   EC23       Ca+2 binding motif DVNDNRPVF disrupted
G  TLVGNVTGA  G2744S  USH1D    link
G  NEEKNFHLQ  G2771S  USH1D    sheet
R  VVLEDINDQ  R2833G  USH1D    sheet
I  LRDDQRVKI  I2950N  DFNB12   sheet
R  VKIVINEIP  R2956C  DFNB12   sheet
V  RGFEEEFIH  V2968A  USH1D    link
P  DDMSALQMA  P3059T  DFNB12   cytoplasm
R  EPAAVKPDD  R3175H  USH1D    cytoplasm
R  AAIQEYDNI  R3189W  USH1D    cytoplasm
S  ELIQTELDE  S3245F  USH1D    cytoplasm
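The breakdown of the 49 alleles quoted in the text can be re-derived from the table with a short script. The sketch below is only illustrative: the compact `ALLELES` string is a hand transcription of the change, disease and location columns, and the category label `ca` (for any row whose effect column reports a disrupted Ca+2 binding motif) is a convention introduced here, not part of the original table.

```python
from collections import Counter

# change, disease, category triples transcribed from the table above.
# "ca" is a label invented here for rows with a disrupted Ca+2 binding motif;
# the other labels are the table's own location values.
ALLELES = """
D0124G DFNB12 ca    P0240L DFNB12 link  E0247K USH1D link   R0301Q DFNB12 ca
A0366T USH1D link   N0452S DFNB12 ca    L0480Q DFNB12 link  A0484P USH1D link
R0582Q DFNB12 link  H0755Y USH1D sheet  D0990N DFNB12 ca    R1060W DFNB12 ca
V1090I USH1D link   N1098S USH1D ca     G1186D DFNB12 sheet P1206R USH1D ca
T1209A USH1D link   D1341N DFNB12 ca    Q1496H USH1D sheet  R1507Q USH1D sheet
A1586P DFNB12 sheet E1595K DFNB12 ca    Q1716P DFNB12 sheet R1746Q USH1D link
P1788L USH1D sheet  D1846N DFNB12 ca    F1888S DFNB12 sheet R1912W USH1D sheet
D1930N USH1D sheet  G2017S USH1D sheet  R2029W DFNB12 ca    D2045N DFNB12 sheet
D2148N DFNB12 sheet D2202N DFNB12 ca    D2376V USH1D sheet  R2465W DFNB12 ca
S2517G USH1D link   T2530I USH1D link   R2608H DFNB12 ca    G2744S USH1D link
G2771S USH1D sheet  R2833G USH1D sheet  I2950N DFNB12 sheet R2956C DFNB12 sheet
V2968A USH1D link   P3059T DFNB12 cytoplasm R3175H USH1D cytoplasm
R3189W USH1D cytoplasm S3245F USH1D cytoplasm
"""


def tally(table):
    """Split the flat token list into (change, disease, category) rows and count them."""
    tokens = table.split()
    rows = [tuple(tokens[i:i + 3]) for i in range(0, len(tokens), 3)]
    by_category = Counter(category for _, _, category in rows)
    by_disease = Counter(disease for _, disease, _ in rows)
    return rows, by_category, by_disease


rows, by_category, by_disease = tally(ALLELES)
print(len(rows), "alleles")
print(dict(by_category))
print(dict(by_disease))
```

Counting by category rather than by EC domain keeps the tally independent of the domain numbering in the location column.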
       EC1 D0124G                      Link1-2 P0240L E0247K          EC3 R0301Q
homSap VITRKVNIQVGDVNDNAPTFHNQPYSVRIPE VQDMDPIFINLPYSTNIYEHSPPGTTVRII SGVLTLNGLLDRENPLYSHGFILTVK
panTro ............................... ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ..A.......................
gorGor ............................... .............................. ,,,,,,,,,,,,,,,,,,,,,,,,,,
ponAbe ............................... .............................. ..A.......................
rheMac ............................... ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ..A.......................
calJac ............................... .............................. ..A.......................
tarSyr ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ..A.......................
micMur ............................... .............................. ..A.......................
otoGar ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,
tupBel ............................... ............................V. ,,,,,,,,,,,,,,,,,,,,,,,,,,
musMus ............................... M...........................V. ..A.......................
ratNor ............................... ............................V. ..A.......................
dipOrd ............................... I...........................V. ,,,,,,,,,,,,,,,,,,,,,,,,,,
cavPor ............................... I...........................V. ..A.......................
speTri ........-.........-............ ............................V. ..A..................L....
oryCun ............................... ......V......................V ..A.......................
ochPri ............................... .............................. ..A.......................
vicPac ............................... ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,
turTru ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ................V.....L.....V. ..A.....V.................
bosTau ............................... .............................. ..A.......................
equCab ............................... I............................. ..A.......................
felCat ............................... ............................V. ..A.......................
canFam ............................... ............................V. ..A.......................
myoLuc ............................... ............................V. ..A.......................
pteVam ............................... .............................. ..A.......................
eriEur ............................... ......V....................... ..A.......................
sorAra ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,
loxAfr ............................... .............................. ,,,,,,,,,,,,,,,,,,,,,,,,,,
proCap ............................... I............................. ,,,,,,,,,,,,,,,,,,,,,,,,,,
echTel ............................... .............................. ..A.......................
dasNov ............................... I......................        ,,,,,,,,,,,,,,,,,,,,,,,,,,
choHof ............................... ............................V. ,,,,,,,,,,,,,,,,,,,,,,,,,,
monDom ............................... ............................M. ..A.....V.................
ornAna ..................R............ I...........................M. ,,,,,,,,,,,,,,,,,,,,,,,,,,
galGal ..KGT.............R............ ...................N........M. ..A.....P......F..S.......
taeGut ..KGT...........T.Q............ ...................N........M. ..A.....P......F..A..V....
anoCar ..KGS.............R............ ...................N........M. ..A.....P......F.GA.......
xenTro ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, I..................N...        ..A.....Q........TA.......
tetNig .VKDI.K..I.E....VAI.QGE..I.H.E. ........T........E.DV.L---..K. ..S..VS.Q.........A..T....
takRub .VKDT....I........I..G...T.H... ........T........E.DV.L---..K. ..S..VS.Q.........A..T....
gasAcu .VKDT....I........V..G...T..... ........T........E.DV.L.FE..K. ..S..V..Q.........A..S...R
oryLat .VKDT....I...........G...T..... ........T....N..LQ.NV.L.YQ..N. ..S..V..Q.........S..T....
danRer .VKDT....I........S.Y....AIQ... I.......T........M.DA..        ..S..VS.Q.........S...I...
petMar ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,,,,,,,,,,,,,,,,,,,,,,,,,,
homSap VITRKVNIQVGDVNDNAPTFHNQPYSVRIPE VQDMDPIFINLPYSTNIYEHSPPGTTVRII SGVLTLNGLLDRENPLYSHGFILTVK
       EC1: D0124G                     LK1-2 P0240L E0247K            EC3: R0301Q
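The "reduced alphabet" at a position can be read off a difference alignment like the one above by expanding each row against the human reference and collecting the residues seen in one column. The sketch below does this for a handful of rows from the EC3 block. It assumes that `.` means "same as human" and that `,` and `-` mark missing data or gaps; the source does not spell out the comma convention, so that part is an assumption, and the function names are hypothetical.

```python
REFERENCE = "SGVLTLNGLLDRENPLYSHGFILTVK"  # homSap, EC3 block around R0301

# A few rows copied from the EC3 block of the alignment above.
ROWS = {
    "panTro": "..A.......................",
    "musMus": "..A.......................",
    "galGal": "..A.....P......F..S.......",
    "danRer": "..S..VS.Q.........S...I...",
    "otoGar": ",,,,,,,,,,,,,,,,,,,,,,,,,,",
}


def resolve(reference, row):
    """Expand a difference-alignment row: '.' copies the reference residue."""
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


def column_alphabet(reference, rows, column, missing=",-"):
    """Set of residues observed at one column, ignoring missing data (assumed ',' and '-')."""
    residues = {reference[column]}
    for row in rows.values():
        full = resolve(reference, row)
        if column < len(full) and full[column] not in missing:
            residues.add(full[column])
    return residues


r0301_column = REFERENCE.index("RENPLYSHGF")  # R0301 is the R preceding ENPLYSHGF
print(column_alphabet(REFERENCE, ROWS, r0301_column))  # the disease position
print(column_alphabet(REFERENCE, ROWS, 2))             # a variable position nearby
```

On these rows the R0301 column contains only arginine, while the third column of the block admits V, A and S, which is the contrast between disease positions and their neighbours that the text describes.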
Final assessment of CDH23 allele L1122V
Note that L1122V will test out as benign at PolyPhen and SIFT because it is a common conservative substitution with comparative genomics support in fish that is not de-weighted for phylogenetic remoteness. The comparative genomics of these tools is limited to sequences at SwissProt and does not incorporate phylogenetic relations; consequently they miss the stable L-->V transition at the level of the tetrapod ancestor and all its descendant clades. Much is known about cadherin domains and their stability, but that literature is not used. For example, with 27 cadherin domains within the same molecule, the mutational record at homologous residues is likely informative. Consequently such initial screening tools do not utilize a significant part of the available information and are not authoritative here.
The evidence discussed above suggests a moderately adverse auditory outcome for the V1122V homozygote without syndromic retinitis pigmentosa, similar to but milder than non-syndromic L0480Q, which also occurs 6 residues back from a DxD motif. Offsetting this, weakly aligning linker regions 6, 10 and 15 have valine (as seen in the difference alignment above). The change is unlikely to improve CDH23 performance in hair cell tip links because leucine in this position has been stable for billions of years of branch length despite presumed opportunities for change (indeed residues around it have experienced change and fixation in the population). If valine were neutral, it would be part of the reduced alphabet at this position by now.
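Both positional claims in this paragraph can be checked mechanically against the sequences given earlier on the page. The sketch below is illustrative only. It assumes that the asterisk in the CDH23 consensus row marks position 1122 (the final column of the linker block), it uses short fragments cut from the human sequence starting at the residue of interest, and it includes only a subset of the linker rows; the function names are hypothetical.

```python
import re
from typing import Dict, Optional


def offset_to_dxd(fragment: str, index: int = 0) -> Optional[int]:
    """Residues from fragment[index] to the first D-x-D motif downstream of it."""
    match = re.search(r"D.D", fragment[index + 1:])
    return None if match is None else match.start() + 1


def final_linker_residues(reference_name: str, rows: Dict[str, str]) -> Dict[str, str]:
    """Residue in the last linker column of each row; '.' means same as the reference row."""
    reference_last = rows[reference_name][-1]
    return {name: reference_last if row[-1] == "." else row[-1]
            for name, row in rows.items()}


# Fragments start at the leucine in question (L1122 by assumption, L0480 per the allele table).
L1122_FRAGMENT = "LQLKATDADEGEF"
L0480_FRAGMENT = "LTVLATDNDAGTF"

# Subset of linker rows from the CDH23 difference alignment (reference is CDH23.11).
LINKERS = {
    "CDH23.11": "LQSSYEASVPED-IPEGHSIL",
    "CDH23.09": "QNLPFV.E.L.G-..A.V..Y",
    "CDH23.20": "SPATLTVHLL.N-C.P.F.V.",
    "CDH23.15": "SPFG.NV..N.N-VGG.TAVV",
    "CDH23.10": "FPAV.NV..S..-V.REFRVV",
    "CDH23.06": "QKDA.VGALR.N-E.SVTQLV",
    "CDH23.05": "S.PL.NI.LY.N-VTV.T.V.",
}

print(offset_to_dxd(L1122_FRAGMENT), offset_to_dxd(L0480_FRAGMENT))
last = final_linker_residues("CDH23.11", LINKERS)
valine_linkers = sorted(name for name, residue in last.items() if residue == "V")
print(valine_linkers)
```

Because only seven of the 27 linker rows are included, the list of valine-bearing linkers printed here should not be read as exhaustive.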
However, CDH23 is expressed at many other cellular sites, including but not limited to the retina. Perhaps a gain in functionality elsewhere could offset a slight loss of optimality in tip links. Compensation by an unseen allele in another gene, however, seems unlikely given the position of 1122 in a linker domain. The bottom line is that only clinical observation of older homozygotes and the mode of familial inheritance of effects can resolve the impact of this allele.