Phenotype based prediction of exome sequencing outcome using machine learning for neurodevelopmental disorders

  • Alexander J.M. Dingemans
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands

    Department of Artificial Intelligence, Faculty of Social Sciences, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Max Hinne
    Affiliations
    Department of Artificial Intelligence, Faculty of Social Sciences, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Sandra Jansen
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Jeroen van Reeuwijk
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Nicole de Leeuw
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Rolph Pfundt
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Bregje W. van Bon
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Anneke T. Vulto-van Silfhout
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Tjitske Kleefstra
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • David A. Koolen
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Marcel A.J. van Gerven
    Affiliations
    Department of Artificial Intelligence, Faculty of Social Sciences, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Lisenka E.L.M. Vissers
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
  • Bert B.A. de Vries
    Correspondence
    Correspondence and requests for materials should be addressed to Bert B.A. de Vries, Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Geert Grooteplein Zuid 10, 6500 HB, 9101 Nijmegen, The Netherlands
    Affiliations
    Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Radboud University, Nijmegen, The Netherlands
Published: November 30, 2021
DOI: https://doi.org/10.1016/j.gim.2021.10.019

      Abstract

      Purpose

      Although the introduction of exome sequencing (ES) has led to the diagnosis of a significant portion of patients with neurodevelopmental disorders (NDDs), the diagnostic yield in actual clinical practice has remained stable at approximately 30%. We hypothesized that improving the selection of patients to test on the basis of their phenotypic presentation will increase diagnostic yield and therefore reduce unnecessary genetic testing.

      Methods

      We tested 4 machine learning methods and, from these, developed PredWES, a statistical model that predicts the probability of a positive ES result solely from the phenotype of the patient.
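
      The abstract does not say which model underlies PredWES, so the following is only a minimal sketch of the general idea: mapping binary phenotype indicators to a probability of a positive ES result. It assumes a plain L2-regularized logistic regression fitted by gradient descent; the feature names, the toy data, and the helper functions `fit_logistic` and `predict_proba` are all hypothetical and are not taken from the article.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=2000, l2=0.01):
    """Fit L2-regularized logistic regression by batch gradient descent.

    X is a list of binary feature vectors (1 = phenotype present),
    y is a list of 0/1 outcomes (1 = positive ES result).
    Hypothetical helper for illustration only.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(w, b, x):
    # Predicted probability of a positive ES result for one patient.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Invented toy cohort; columns follow FEATURES. Not real patient data.
FEATURES = ["hypotonia", "microcephaly", "autism"]
X = [
    [1, 1, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
    [0, 0, 1], [0, 0, 1], [1, 0, 1], [0, 0, 0],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
for name, weight in zip(FEATURES, w):
    print(f"{name}: weight {weight:+.2f}")
print("P(positive | microcephaly only):", round(predict_proba(w, b, [0, 1, 0]), 2))
print("P(positive | autism only):", round(predict_proba(w, b, [0, 0, 1]), 2))
```

      In this toy setup the sign of each fitted weight plays the role of the positive or negative association with a diagnosis; a real model would need far more features, proper validation, and calibration checks.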

      Results

      We trained the tool on 1663 patients with NDDs and then showed that performing diagnostic ES only on the 10% of patients with the highest predicted probability of a positive result would give a diagnostic yield of 56%, a 114% relative increase. Inspection of the model revealed that, among patients with NDDs, comorbid abnormal (lower) muscle tone and microcephaly were positively associated with a conclusive ES diagnosis, whereas autism was negatively associated with a molecular diagnosis.
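
      The top-10% selection can be illustrated with a short sketch: rank patients by predicted probability, test only the highest-ranked fraction, and compare the resulting yield with a baseline. The function names and the toy numbers below are hypothetical. The baseline of roughly 26% is inferred arithmetically from the two figures in the abstract (0.56 / 2.14) and is not stated there.

```python
def yield_in_top_fraction(probs, outcomes, fraction=0.10):
    """Diagnostic yield among the patients with the highest predicted probability.

    probs: predicted probabilities of a positive ES result.
    outcomes: observed results (1 = conclusive diagnosis, 0 = none).
    """
    ranked = sorted(zip(probs, outcomes), key=lambda t: t[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))  # always test at least one patient
    return sum(outcome for _, outcome in ranked[:k]) / k


def relative_increase(new_yield, baseline_yield):
    # Relative change, e.g. 1.14 corresponds to a 114% increase.
    return (new_yield - baseline_yield) / baseline_yield


# Invented toy cohort of 20 patients, already listed by descending probability.
probs = [0.95 - 0.04 * i for i in range(20)]
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

overall = sum(outcomes) / len(outcomes)
top = yield_in_top_fraction(probs, outcomes, fraction=0.10)
print("overall yield:", overall)
print("top-10% yield:", top)
print("relative increase:", relative_increase(top, overall))

# Abstract figures: a 56% yield described as a 114% increase implies an
# inferred (not reported) comparison yield of about 0.56 / 2.14.
implied_baseline = 0.56 / 2.14
print("implied baseline yield:", round(implied_baseline, 3))
```

      Reporting the comparison yield alongside the relative increase makes the size of the improvement easier to interpret than the percentage alone.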

      Conclusion

      PredWES allows patients with NDDs who are eligible for diagnostic ES to be prioritized on the basis of their phenotypic presentation, increasing the diagnostic yield and making more efficient use of health care resources.
