Toward robust clinical genome interpretation: Developing a consistent terminology to characterize Mendelian disease-gene relationships—allelic requirement, inheritance modes, and disease mechanisms

Purpose: The terminology used for gene-disease curation and variant annotation to describe inheritance, allelic requirement, and both sequence and functional consequences of a variant is currently not standardized. There is considerable discrepancy in the literature and across clinical variant reporting in the derivation and application of terms. Here, we standardize the terminology for the characterization of disease-gene relationships to facilitate harmonized global curation and to support variant classification within the ACMG/AMP framework. Methods: Terminology for inheritance, allelic requirement, and both structural and functional consequences of a variant used by Gene Curation Coalition members and partner organizations was collated and reviewed. Harmonized terminology with definitions and use examples was created, reviewed, and validated. Results: We present a standardized terminology to describe gene-disease relationships, and to support variant annotation. We demonstrate application of the terminology for classification of variation in the ACMG SF 2.0 genes recommended for reporting of secondary findings. Consensus terms were agreed and formalized in both Sequence Ontology (SO) and Human Phenotype Ontology (HPO) ontologies. Gene Curation Coalition member groups intend to use or map to these terms in their respective resources. Conclusion: The terminology standardization presented here will improve harmonization, facilitate the pooling of curation datasets across international curation efforts and, in turn, improve consistency in variant classification and genetic test interpretation.

Note: not all founder members had separate allelic requirement terms in addition to inheritance terms. Abbreviations: PAR - pseudoautosomal region.
• An X-linked dominant condition would be curated as monoallelic_X_het, and we would understand that those diseases manifest when het or hem (or indeed hom/compound het, though this may be more severe or lethal).
• An X-linked recessive condition would be curated as monoallelic_X_hem and would not manifest when heterozygous (though it can manifest with an ameliorated phenotype, or manifest with skewed inactivation, etc.). We intend that this is implicit in the term, as characteristic of many sex-linked disorders, and do not anticipate that an additional qualifier term is needed to communicate this, unless the het phenotype is sufficiently distinct as to be classified as a different disease entity.
• Terms are specific to each disease-gene pair. For example, if there is good evidence for manifesting heterozygotes of X_Hem disorders presenting in infancy/early childhood, that would be coded as X_Hem in DDG2P. If heterozygotes only have late-onset cardiomyopathy (e.g. female heterozygotes for DMD), they would be X_het in the Cardiac panel but not DD.
• Mosaic is intended for conditions that are typically lethal when constitutive.
• Imprinted requires that the abnormal allele be paternal or maternal in origin.

G2P - Gene2Phenotype
• Requires heterozygosity covers edge cases such as craniofrontonasal dysplasia due to EFNB1, which requires heterozygosity and would not manifest (fully) if hemizygous. Importantly, the mutant allele can be inherited from a normal or very mildly affected father.
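As a rough illustration of how the X-linked notes above might be encoded, the sketch below maps an allelic requirement term and a zygosity to the expected manifestation. The function name, the `Expect` enum, and the zygosity labels (`het`, `hem`, `biallelic`) are assumptions made for this example; they are not part of any G2P or Gene Curation Coalition resource. The biallelic state under the "requires heterozygosity" qualifier is not described in the notes, so the sketch refuses to guess.

```python
from enum import Enum


class Expect(Enum):
    YES = "expected to manifest"
    USUALLY_NO = "not expected; milder or skewed-inactivation cases occur"
    NOT_FULLY = "not expected to manifest fully"


ZYGOSITIES = {"het", "hem", "biallelic"}  # hypothetical labels for this sketch


def expected_manifestation(term: str, zygosity: str,
                           requires_heterozygosity: bool = False) -> Expect:
    """Summarise the notes on monoallelic_X_het and monoallelic_X_hem."""
    if zygosity not in ZYGOSITIES:
        raise ValueError(f"unknown zygosity: {zygosity}")

    if term == "monoallelic_X_het":
        if requires_heterozygosity:
            if zygosity == "hem":
                # e.g. EFNB1: hemizygotes do not manifest (fully)
                return Expect.NOT_FULLY
            if zygosity == "biallelic":
                raise ValueError("not described in the notes")
        # X-linked dominant: manifests when het, hem, or hom/compound het
        return Expect.YES

    if term == "monoallelic_X_hem":
        # X-linked recessive: heterozygotes usually do not manifest
        return Expect.USUALLY_NO if zygosity == "het" else Expect.YES

    raise ValueError(f"term not covered by this sketch: {term}")
```

For instance, `expected_manifestation("monoallelic_X_hem", "het")` returns `Expect.USUALLY_NO`, reflecting that the milder heterozygous presentation is treated as implicit in the term and not recorded with a separate qualifier.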

Disease associated variant consequences:
High level terms to describe variant consequences: Other
Literature review: This is to gather new information, not to re-evaluate the gene-disease relationship. Evidence is collected primarily from published peer-reviewed literature, but can also be found in publicly accessible resources, such as variant databases. Up-to-date reviews from centres with particular expertise in a given gene or disease are particularly helpful.
Useful publication search engines include:
• PubMed
• Google Scholar
• LitVar
• GeneCards
• Mastermind
Other useful information:
• GeneReviews and the "Molecular Genetics" section
• OMIM
• ClinVar to search for relevant variant classes
• PanelApp (if using a resource like PanelApp, reference the assertion and check the original references)
As these gene-disease pairs have all been classified as "Definitive" or "Strong" by ClinGen, they are well established and there may be abundant information. The goal is not to re-evaluate the gene-disease validity, and the literature review therefore does not have to be exhaustive. The literature search should be focused on establishing inheritance pattern, allelic requirement and, where possible, disease-associated variant class and functional consequences.
For example, for some gene-disease pairs it may be well established that the pattern of inheritance is autosomal dominant, but there may be a small number of reports of recessive inheritance. A broad search of the literature can determine if other modes of inheritance have been reported, using search terms:
• Gene AND disease AND ("recessive" OR "autosomal recessive" OR "homozyg*" OR "compound heterozyg*" OR "biallelic")
• Gene AND disease AND (dominant OR "autosomal dominant" OR monoallelic OR heterozyg*)
• Gene AND disease AND ("x-linked" OR "x linked" OR "X chromosome" OR "X linked dominant" OR "X linked recessive")
Any reports of a different mode of inheritance should be reviewed to see if they are relevant or not. For many genes, a second hit may lead to a more severe phenotype, but that does not necessarily mean the inheritance follows a recessive or digenic pattern, as both the first and second hit would in fact cause disease in isolation.
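The three inheritance search strings given above could be generated for any gene-disease pair with a small helper such as the one below. This is a minimal sketch: the names `INHERITANCE_TERMS` and `build_queries` are hypothetical, and the output is a plain boolean string to paste into a search engine, not a call to any particular search API.

```python
# Term groups copied from the search strings listed in the text.
INHERITANCE_TERMS = {
    "recessive": ['"recessive"', '"autosomal recessive"', '"homozyg*"',
                  '"compound heterozyg*"', '"biallelic"'],
    "dominant": ['dominant', '"autosomal dominant"', 'monoallelic',
                 'heterozyg*'],
    "x_linked": ['"x-linked"', '"x linked"', '"X chromosome"',
                 '"X linked dominant"', '"X linked recessive"'],
}


def build_queries(gene: str, disease: str) -> dict[str, str]:
    """Return one boolean search string per inheritance mode."""
    return {
        mode: f"{gene} AND {disease} AND ({' OR '.join(terms)})"
        for mode, terms in INHERITANCE_TERMS.items()
    }


if __name__ == "__main__":
    # Example pair taken from the general notes (DSC2 and ARVC).
    for mode, query in build_queries("DSC2", "ARVC").items():
        print(f"{mode}: {query}")
```

Keeping the term groups in one dictionary means a curator can add a further mode (for example, a mitochondrial search) without touching the query-building logic.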
For disease-associated variant consequence and mechanism, literature review should focus on establishing the most likely consequences. Curators should review the evidence for haploinsufficiency in the ClinGen Dosage Sensitivity curation (http://search.clinicalgenome.org/kb/genedosage?page=1&size=25&order=asc&sort=symbol&search=), pathogenic/likely pathogenic variant classes on ClinVar (Simple ClinVar can be a helpful tool to search ClinVar, see screenshot below and link http://simple-clinvar.broadinstitute.org), and other public variant databases where available. For well-described genes, recent publications re-evaluating variants, expert reviews, meta-analyses, and reviews of burden testing are highly relevant.
It is not necessary to review every variant. However, if, for example, the predominant class of variant is missense but there are a small number of nonsense variants reported, extra time should be spent determining whether there is sufficient evidence to include these as a pathogenic variant class before expanding the disease mechanism. Sufficient evidence could include segregation or functional evidence.
If high-level reviews are not available for a gene-disease pair, then a broad literature search may be necessary, e.g. Gene AND disease AND (variant OR mutation). For a variant class that would add to the predicted functional consequence to be included, there should be sufficient qualitative evidence to support it, such as segregation, functional, or burden data.
Where there is uncertainty that cannot be resolved, a note should be made in the narrative summary. Include PMIDs where possible and/or links to other resources.

Inheritance and Allelic Requirement
List inheritance and allelic requirement terms and any number of appropriate qualifier terms.
Use of qualifiers enables recording of data important to reproductive advice and family screening.

Harmonised allelic requirement and Mendelian inheritance terms.
Abbreviations: HPO - Human Phenotype Ontology, PAR - pseudoautosomal region.
Inheritance qualifier terms: these optional terms can be combined with either inheritance terms or allelic requirement terms to provide additional information about the relationship of a disease-gene pair.
Imprinted requires that the abnormal allele be paternal or maternal in origin, depending on the disease-gene relationship. Imprinting refers to a normal developmental process in which either the paternal or maternal allele is inactivated, depending on the specific locus, thus leading to expression from only one copy of the gene. Disease typically manifests when a deleterious variant is inherited from a parent whose copy of the gene would normally be expressed, but not when a deleterious variant is inherited from a parent whose copy of the gene would normally be inactivated.

Displays anticipation HP:0003743
A phenomenon in which the severity of a disorder increases, or the age of onset decreases, as the disorder is passed from one generation to the next, typically due to expansion of a repeat sequence. For example, myotonic dystrophy is caused by triplet repeat expansion in the DMPK gene.

Requires heterozygosity HP:0034343
Covers rare instances of a condition that is most severe in the heterozygous state. Such disorders are rare and currently all are X-linked. Most X-linked recessive conditions manifest if hemizygous in males, or biallelic in females, though they may have a mild phenotype in the heterozygous state in females.
However, craniofrontonasal dysplasia due to EFNB1 and PCDH19-related epilepsy are both X-linked dominant and paradoxically more severe in females. Hemizygous males may be mildly affected but seldom manifest the full phenotype. Importantly, the mutant allele can be inherited from a normal or very mildly affected father. The currently accepted mechanism is cellular interference, whereby two distinct cell populations (those with and without the variant) exhibit abnormal cellular interactions in the mosaic state; this occurs in women, who are functionally mosaic due to random X inactivation, or in mosaic males. The same mechanism could theoretically be observed in autosomal genes with a mosaic variant.
Sex-limited expression HP:0001470
- Male-limited expression HP:0001475
- Female-limited expression HP:0034344
Condition in which the phenotype manifests in only one sex, i.e. in males or in females but not both. Example: autosomal recessive sex reversal due to DHH on chr12 manifests only in XY males, causing gonadal dysgenesis, while XX females are phenotypically normal.

Contiguous gene syndrome HP:0001466
Syndrome caused by the effects of abnormality (typically a deletion or duplication) of 2 or more adjacent genes.

Notes:
• Mitochondrial: the inheritance of a trait encoded in the mitochondrial genome. Persons with mitochondrial disease may be male or female, but the mode of inheritance is strictly maternal. No male with the disease can transmit it to their offspring.
• PAR: genes within the pseudoautosomal regions (PAR) are inherited like autosomal genes. PAR1 comprises 2.6 Mb of the short arm of both the X and Y chromosomes in humans. PAR2 is at the tip of the long arms, spanning 320 kb. Normal male mammals have two copies of these genes: one in the pseudoautosomal region of their Y chromosome, the other in the corresponding portion of their X chromosome. Normal females also possess two copies of pseudoautosomal genes, as each of their two X chromosomes contains a pseudoautosomal region. Crossing over between the X and Y chromosomes is normally restricted to the pseudoautosomal regions; thus, pseudoautosomal genes exhibit an autosomal, rather than sex-linked, pattern of inheritance. So, females can inherit an allele originally present on the Y chromosome of their father.
• For monoallelic_X_het (X-linked dominant) conditions, we would understand that those diseases manifest when het or hem (or indeed hom/compound het, though this may be more severe or lethal).
• For monoallelic_X_hem (X-linked recessive) conditions, we would understand that these would not manifest when heterozygous (though they can manifest with an ameliorated phenotype, or manifest with skewed inactivation, etc.; primarily recessive with milder female expression).
• Terms are specific to each disease-gene pair.
For PP2:
• If there are no variant hotspots, what is the etiological fraction for the whole gene? (This may not be available for very rare conditions where there may be insufficient case numbers to do the analysis.)

General notes:
- If monoallelic and biallelic inheritance can cause the same disease, they should be recorded as separate entities if biallelic variants lead to a different phenotype (not just a change in severity).
- If a dominant variant can also be seen on both alleles but the outcome is essentially the same disease, then this should be categorised as one entity using dominant and monoallelic.
For example:
- AD and AR DSC2 causing isolated ARVC are one disease-gene pair.
- AR DSC2 causing ARVC with cutaneous manifestations is a separate disease-gene pair.

De novo: description of conditions that are exclusively or predominantly de novo due to the limited reproductive fitness of affected individuals. Should not be used to describe a situation where a gene just has a high new mutation rate.
Digenic: for example, in Long QT syndrome, heterozygosity for pathogenic variants in two different genes occurs more frequently than would be expected by chance. It is generally associated with a more severe phenotype, and the possibility that both parents have pathogenic variants should be considered. The genes involved should be specified.

Inheritance qualifier HP:0034335 (Parent and child terms) Definition (Parent term)
Description of conditions in which only an incomplete but relatively high proportion of individuals with a given genotype exhibit the disease, regardless of age, assuming a full lifespan of 80 years. There is no commonly accepted definition of incomplete but high penetrance, but we suggest that this term be applied if at least 80 percent but less than 100 percent of individuals with the given genotype would manifest the disease within a full lifespan.
Description of conditions in which age of onset is highly variable, even in family members who share the same disease-associated variant or variants.
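The suggested threshold for incomplete but high penetrance (at least 80 percent but less than 100 percent over a full lifespan) can be written as a simple half-open interval check. The function name below is hypothetical, and the input is assumed to be a lifetime penetrance estimate expressed as a fraction between 0 and 1.

```python
def is_incomplete_high_penetrance(lifetime_penetrance: float) -> bool:
    """Apply the suggested rule: 0.80 <= penetrance < 1.00.

    `lifetime_penetrance` is an assumed input: the fraction of individuals
    with the genotype who would manifest the disease over a full lifespan.
    """
    if not 0.0 <= lifetime_penetrance <= 1.0:
        raise ValueError("penetrance must be a fraction between 0 and 1")
    return 0.80 <= lifetime_penetrance < 1.0
```

Note that complete penetrance (exactly 1.0) is deliberately excluded, matching the "less than 100 percent" wording.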

List other variant classes predicted to lead to the same functional consequence:
Considering a hypothetical example of a gene on the X chromosome in which biallelic or hemizygous monoallelic variation causes congenital structural heart abnormalities, but a heterozygous monoallelic variant typically presents with late-onset cardiomyopathy, this might be coded as monoallelic_X_hemizygous for congenital heart disease, with appropriate filtering applied in a developmental disorders panel for diagnosis of an infant, and as monoallelic_X_heterozygous (age-related onset) for cardiomyopathy, with different variant filtering applied for a cardiac gene panel analysis in an adult. This has the advantage of tracing the evidence for each disease association.
Other variant classes that could be predicted to lead to the same functional consequence based on inferred mechanism (score 4 or 5, see matrix below) and therefore might cause the same phenotype.
Matrix of six new high-level predicted functional consequences mapped to SO structural consequence terms via a semi-quantitative scale indicating likelihood of each high-level consequence. The semi-quantitative scale is characterized from first principles by expert evaluation.
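The hypothetical X chromosome gene described above could be recorded as two separate disease-gene entries, one per panel, roughly as follows. This is an illustrative sketch only: the `Curation` record, its field names, the placeholder gene symbol `GENE_X`, and the panel labels are assumptions for the example and do not reflect the schema of any curation resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Curation:
    gene: str                      # placeholder symbol, not a real gene
    disease: str
    allelic_requirement: str
    qualifiers: tuple[str, ...]
    panel: str


# One entry per disease association, so evidence can be traced separately.
CURATIONS = [
    Curation("GENE_X", "congenital heart disease",
             "monoallelic_X_hemizygous", (), "developmental disorders"),
    Curation("GENE_X", "cardiomyopathy",
             "monoallelic_X_heterozygous", ("age-related onset",), "cardiac"),
]


def curations_for_panel(panel: str) -> list[Curation]:
    """Select the entries that should drive variant filtering for a panel."""
    return [c for c in CURATIONS if c.panel == panel]
```

With this layout, an adult cardiac panel analysis would pick up only the heterozygous, age-related-onset entry, while an infant developmental disorders analysis would pick up only the hemizygous entry.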