
HK40062217B - Ancestry-specific genetic risk scores - Google Patents

Info

Publication number
HK40062217B
Authority
HK
Hong Kong
Prior art keywords
individual
specific
lineage
genetic
variants
Prior art date
Application number
HK62022051515.4A
Other languages
Chinese (zh)
Other versions
HK40062217A (en
Inventor
黄文耀
哈嘉怡
宝林·C·吴
王春萌
罗伯特·基姆斯·瓦伦祖埃拉
维什韦什瓦兰·斯里达尔
Original Assignee
亚洲基因私人有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 亚洲基因私人有限公司
Publication of HK40062217A
Publication of HK40062217B

Description

Ancestry-specific genetic risk scores

Cross-reference

This application claims the benefit of U.S. Provisional Application No. 62/772,565, filed November 28, 2018, and U.S. Patent Application No. 16/216,940, filed December 11, 2018, each of which is incorporated herein by reference in its entirety.

Sequence listing

The instant application contains a sequence listing, which is incorporated herein by reference in its entirety. The ASCII copy, created on October 28, 2019, is named 55075-701_601_SL.txt and is 47.9 KB in size.

Summary of the Invention

Genome-wide association studies (GWAS) have enabled scientists to identify genetic variations associated with a wide range of phenotypic traits. A genetic risk score (GRS) is used to predict whether an individual will develop a trait, based on the presence or absence of certain genetic variants detected in a sample obtained from the individual. However, data show that the underlying genetic variation, and its patterns, differ between discrete ancestral populations. Whether a detected genetic variant confers a risk that the individual will develop the trait therefore depends largely on the individual's ancestry. Current genetic risk prediction methods either do not consider the individual's ancestry at all or rely on consumer questionnaires to account for it, which leads to imprecise and often inaccurate genetic risk predictions.

In certain embodiments, disclosed herein are methods, media, and systems for calculating a GRS by analyzing an individual's genotype to determine the individual's ancestry, and basing the calculation on ancestry-specific genetic risk variants derived from GWAS of subjects who share the individual's ancestry. In some embodiments, the genetic variant(s) considered in the GRS can include single nucleotide variants (SNVs), insertions or deletions of nucleotide bases (indels), or copy number variants (CNVs). In some embodiments, if a genetic variant detected in the sample obtained from the individual does not correspond to a genetic variant reported in the GWAS of the ancestry-specific subject group (an unknown genetic variant), a proxy genetic variant is selected based on its non-random association, known as linkage disequilibrium (LD), with the unknown genetic variant within the particular ancestral population that serves as the basis for the risk prediction. Studies have shown that patterns of LD in the human genome differ between ancestral populations.
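The proxy selection described above can be illustrated with a minimal sketch. It assumes that haplotype and allele frequencies have already been estimated from a reference panel of the relevant ancestry; the function names, the candidate identifiers, and the 0.80 default threshold are illustrative assumptions, not the applicant's implementation.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B in the same population
    """
    d = p_ab - p_a * p_b
    # D' normalizes D by the largest magnitude it can reach given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = (d * d) / denom if denom > 0 else 0.0
    return d, d_prime, r2


def pick_proxy(candidates, min_r2=0.80):
    """Pick the candidate with the highest r^2 that meets the threshold.

    candidates: dict mapping a (hypothetical) variant id to (p_ab, p_a, p_b),
    all measured in the ancestry-specific reference panel.
    Returns the chosen variant id, or None if no candidate qualifies.
    """
    best_id, best_r2 = None, min_r2
    for variant_id, freqs in candidates.items():
        _, _, r2 = ld_stats(*freqs)
        if r2 >= best_r2:
            best_id, best_r2 = variant_id, r2
    return best_id
```

Because the frequencies come from a panel of one ancestry, the same pair of variants can pass the threshold in one population and fail it in another, which is the reason the text ties proxy selection to the individual's assigned ancestry.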

In some embodiments, disclosed herein are computer-implemented methods comprising: (a) assigning an ancestry to an individual by using a distance-based or model-based computer program to analyze the genotype of the individual, the genotype comprising one or more individual-specific genetic variants; (b) detecting, in the genotype of the individual, an ancestry-specific variant associated with a specific phenotypic trait, the ancestry-specific variant corresponding to: (i) an individual-specific genetic variant detectable in the genotype of the individual; or (ii) a genetic variant in linkage disequilibrium (LD) with the individual-specific genetic variant, determined by imputing individual-specific variants that are missing from ancestry-specific phased haplotypes, the ancestry-specific phased haplotypes being determined using a reference group of individuals of the same ancestry as the individual; and (c) calculating a genetic risk score (GRS) for the individual based on the ancestry-specific variant detected in (b), wherein the GRS indicates the likelihood that the individual has or will develop the specific phenotypic trait. In some embodiments, the ancestry-specific genetic variants and the individual-specific genetic variants are selected from single nucleotide variants (SNVs), copy number variants (CNVs), and indels. In some embodiments, the imputing in step (ii) comprises: (1) phasing unphased genotype data from the individual to generate ancestry-specific phased haplotypes based on the ancestry of the individual; and (2) imputing individual-specific genotypes that are absent from the ancestry-specific phased haplotypes, using phased haplotype data from a reference group of the same ancestry as the individual, to select a genetic variant in LD with the individual-specific genetic variant. In some embodiments, LD is defined by a D' value of at least about 0.80 or an r² value of at least 0.80.
In some embodiments, the specific trait includes a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, an allergy trait, or a mental trait, or a combination thereof. In some embodiments, the genotype of the individual is obtained by performing, or having performed, a genotyping assay on genetic material obtained from the individual. In some embodiments, the genotyping assay includes a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof. In some embodiments, the distance-based computer program is principal component analysis, and the model-based computer program is a maximum likelihood or Bayesian method. In some embodiments, the GRS of the individual based on the ancestry-specific variant is more accurate than a corresponding GRS of the individual based on variants that are not ancestry-specific. In some embodiments, the computer-implemented method further includes providing a notification that includes the GRS of the individual for the specific phenotypic trait. In some embodiments, the notification also includes a behavioral recommendation for the individual based on the GRS for the specific phenotypic trait. In some embodiments, behavioral changes related to the specific phenotypic trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption. In some embodiments, subclinical traits include phenotypes of diseases or conditions. In some embodiments, physical exercise traits include exercise aversion, aerobic capacity, difficulty losing weight, endurance, strength, fitness benefit, reduced heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fracture, overall injury risk, likelihood of obesity, or impaired resting metabolic rate.
In some embodiments, skin traits include collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmented spots, youthfulness, photoaging, dermal sensitivity, or sensitivity to sunlight. In some embodiments, hair traits include hair thickness, hair thinning, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. In some embodiments, nutritional traits include vitamin deficiency, mineral deficiency, antioxidant deficiency, fatty acid deficiency, metabolic imbalance, impaired metabolism, metabolic sensitivity, allergy, satiety, or the effectiveness of a healthy diet. In some embodiments, vitamin deficiency includes deficiencies in vitamins A, B1, B2, B3, B5, B6, B7, B8, B9, B12, C, D, E, and K. In some embodiments, mineral deficiency includes deficiencies in minerals including calcium, iron, magnesium, zinc, or selenium. In some embodiments, antioxidant deficiency includes deficiencies in antioxidants including glutathione or coenzyme Q10 (CoQ10). In some embodiments, fatty acid deficiency includes deficiencies in polyunsaturated or monounsaturated fatty acids. In some embodiments, metabolic imbalance includes glucose imbalance. In some embodiments, impaired metabolism includes impaired metabolism of caffeine or of a drug therapy. In some embodiments, metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, allergy includes allergies to food (food allergy) or allergies to environmental factors (environmental allergy). In some embodiments, the method further includes administering to the individual a treatment effective to ameliorate or prevent the specific trait in the individual, provided that the genetic risk score indicates a high likelihood that the individual has or will develop the specific trait. In some embodiments, the treatment includes a supplement or a drug therapy.
In some embodiments, supplements include vitamins, minerals, probiotics, antioxidants, anti-inflammatory agents, or combinations thereof. In some embodiments, behavioral changes associated with the specific trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption.
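The distance-based ancestry assignment in step (a) can be sketched as follows. The sketch assumes that the individual and the reference subjects have already been projected onto the same principal components by an upstream PCA step, and it then assigns the label of the nearest reference centroid; the population labels and the function name are hypothetical.

```python
import math


def assign_ancestry_by_distance(sample_pcs, reference_pcs):
    """Assign the ancestry whose reference centroid is closest in PC space.

    sample_pcs: tuple of principal-component coordinates for the individual
    reference_pcs: dict mapping a (hypothetical) ancestry label to a list of
        coordinate tuples for reference subjects of that ancestry
    Returns (label, distance_to_centroid).
    """
    dims = len(sample_pcs)
    best_label, best_dist = None, math.inf
    for label, points in reference_pcs.items():
        # Centroid of the reference group along each principal component.
        centroid = [sum(p[i] for p in points) / len(points) for i in range(dims)]
        dist = math.dist(sample_pcs, centroid)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist
```

A nearest-centroid rule is only one way to turn PCA coordinates into a label. The text leaves the choice of classifier open, and an admixed individual may sit between clusters, so a production system would likely also report the distance or a confidence measure.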

In some embodiments, disclosed herein are systems comprising: a computing device comprising at least one processor, a memory, and a software program, the software program including instructions executable by the at least one processor to assess the likelihood that an individual has or will develop a specific phenotypic trait, the instructions comprising the steps of: (a) assigning an ancestry to the individual by using a distance-based or model-based computer program to analyze the genotype of the individual, the genotype comprising one or more individual-specific genetic variants; (b) detecting, in the genotype of the individual, an ancestry-specific variant associated with the specific phenotypic trait, the ancestry-specific variant corresponding to: (i) an individual-specific genetic variant detectable in the genotype of the individual; or (ii) a genetic variant in linkage disequilibrium (LD) with the individual-specific genetic variant, determined by imputing individual-specific variants that are missing from ancestry-specific phased haplotypes, the ancestry-specific phased haplotypes being determined using a reference group of individuals of the same ancestry as the individual; and (c) calculating a genetic risk score (GRS) for the individual based on the ancestry-specific variant detected in (b), wherein the GRS indicates the likelihood that the individual has or will develop the specific phenotypic trait. In some embodiments, the ancestry-specific genetic variants and the individual-specific genetic variants are selected from single nucleotide variants (SNVs), copy number variants (CNVs), and indels.
In some embodiments, the imputing in step (ii) comprises: (1) phasing unphased genotype data from the individual to generate ancestry-specific phased haplotypes based on the ancestry of the individual; and (2) imputing individual-specific genotypes that are absent from the ancestry-specific phased haplotypes, using phased haplotype data from a reference group of the same ancestry as the individual, to select a genetic variant in LD with the individual-specific genetic variant. In some embodiments, LD is defined by a D' value of at least about 0.80 or an r² value of at least 0.80. In some embodiments, the specific trait includes a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, an allergy trait, or a mental trait. In some embodiments, the system further includes a genotyping assay. In some embodiments, the genotyping assay includes a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof. In some embodiments, the distance-based computer program is principal component analysis, and the model-based computer program is a maximum likelihood or Bayesian method. In some embodiments, the GRS of the individual based on the ancestry-specific variant is more accurate than a corresponding GRS of the individual based on variants that are not ancestry-specific. In some embodiments, the system further includes a reporting module configured to generate a report that includes the GRS of the individual for the specific phenotypic trait. In some embodiments, the system further includes an output module configured to present the report to the individual. In some embodiments, the report includes the risk that the individual has or will develop the specific trait. In some embodiments, the report further includes a behavioral recommendation for the individual based on the GRS for the specific phenotypic trait.
In some embodiments, behavioral changes related to the specific phenotypic trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption. In some embodiments, subclinical traits include phenotypes of diseases or conditions. In some embodiments, physical exercise traits include exercise aversion, aerobic capacity, difficulty losing weight, endurance, strength, fitness benefit, reduced heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fracture, overall injury risk, likelihood of obesity, or impaired resting metabolic rate. In some embodiments, skin traits include collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmented spots, youthfulness, photoaging, dermal sensitivity, or sensitivity to sunlight. In some embodiments, hair traits include hair thickness, hair thinning, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. In some embodiments, nutritional traits include vitamin deficiency, mineral deficiency, antioxidant deficiency, fatty acid deficiency, metabolic imbalance, impaired metabolism, metabolic sensitivity, allergy, satiety, or the effectiveness of a healthy diet. In some embodiments, vitamin deficiency includes deficiencies in vitamins A, B1, B2, B3, B5, B6, B7, B8, B9, B12, C, D, E, and K. In some embodiments, mineral deficiency includes deficiencies in minerals including calcium, iron, magnesium, zinc, or selenium. In some embodiments, antioxidant deficiency includes deficiencies in antioxidants including glutathione or coenzyme Q10 (CoQ10). In some embodiments, fatty acid deficiency includes deficiencies in polyunsaturated or monounsaturated fatty acids.
In some embodiments, metabolic imbalance includes glucose imbalance. In some embodiments, impaired metabolism includes impaired metabolism of caffeine or of a drug therapy. In some embodiments, metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, allergy includes allergies to food (food allergy) or allergies to environmental factors (environmental allergy). In some embodiments, the system further includes administering to the individual a treatment effective to ameliorate or prevent the specific trait in the individual, provided that the genetic risk score indicates a high likelihood that the individual has or will develop the specific trait. In some embodiments, the treatment includes a supplement or a drug therapy. In some embodiments, supplements include vitamins, minerals, probiotics, antioxidants, anti-inflammatory agents, or combinations thereof. In some embodiments, behavioral changes related to the specific trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption.
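As a counterpart to the distance-based option, a model-based (maximum likelihood) ancestry assignment might look like the sketch below. It assumes biallelic markers, known per-population allele frequencies, Hardy-Weinberg proportions, and independence between markers; none of these modelling choices is stated in the text, and the marker and population names are hypothetical.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a genotype under one population's allele frequencies.

    genotype: dict mapping marker id to the count (0, 1, or 2) of the tracked allele
    allele_freqs: dict mapping marker id to that allele's frequency in the population
    """
    total = 0.0
    for marker, count in genotype.items():
        # Clamp frequencies so a fixed allele never produces log(0).
        p = min(max(allele_freqs[marker], 1e-6), 1 - 1e-6)
        hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
        total += math.log(hwe[count])
    return total


def assign_ancestry_ml(genotype, population_freqs):
    """Return (best_label, scores), where scores maps each population to its log-likelihood."""
    scores = {
        label: genotype_log_likelihood(genotype, freqs)
        for label, freqs in population_freqs.items()
    }
    return max(scores, key=scores.get), scores
```

The per-population scores are returned alongside the label so that a caller can see how decisively one population was preferred over the others.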

In some embodiments, disclosed herein is the use of a system of the present disclosure for suggesting a behavioral change or recommending a product to an individual based on the GRS calculated in (c).

In some embodiments, also disclosed herein is a non-transitory computer-readable storage medium comprising computer-executable code configured to cause at least one processor to perform the steps of the methods disclosed herein.

In certain embodiments, disclosed herein are computer-implemented methods for suggesting a behavioral change to an individual based on the ancestry and genotype of the individual, the methods comprising: a) providing the genotype of the individual, the genotype comprising one or more individual-specific genetic variants; b) assigning an ancestry to the individual based, at least in part, on the genotype of the individual; c) using a database of trait-associated variants, which includes ancestry-specific genetic variants derived from subjects of the same ancestry as the individual (a subject group), to select one or more ancestry-specific genetic variants based, at least in part, on the ancestry of the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to: (i) an individual-specific genetic variant of the one or more individual-specific genetic variants, or (ii) a predetermined genetic variant in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a population of subjects of the same ancestry as the individual, and wherein each of the one or more ancestry-specific genetic variants and each of the individual-specific genetic variants comprises one or more units of risk; d) calculating a genetic risk score for the individual based on the selected one or more ancestry-specific genetic variants, wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific trait; and e) providing a recommendation to the individual, the recommendation comprising a behavioral change related to the specific trait based on the genetic risk score. In some embodiments, the method further includes providing the individual with a questionnaire comprising one or more questions related to the specific trait. In some embodiments, the method further includes receiving from the individual one or more answers to the one or more questions related to the specific trait in the questionnaire provided to the individual.
In some embodiments, the method further includes: a) providing the individual with a questionnaire comprising one or more questions related to the specific trait; and b) receiving from the individual one or more answers to the one or more questions, wherein the recommendation to the individual, comprising the behavioral change related to the specific trait, is further based on the one or more answers provided by the individual. In some embodiments, the method further includes storing, in the trait-associated variant database, the ancestry-specific genetic variants associated with the specific trait that are derived from the subject group. In some embodiments, the genetic risk score includes a percentile or a z-score. In some embodiments, LD is defined by (i) a D' value of at least about 0.20 or (ii) an r² value of at least about 0.70. In some embodiments, LD is defined by a D' value of about 0.20 to 0.25, 0.25 to 0.30, 0.30 to 0.35, 0.35 to 0.40, 0.40 to 0.45, 0.45 to 0.50, 0.50 to 0.55, 0.55 to 0.60, 0.60 to 0.65, 0.65 to 0.70, 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, LD is defined by an r² value of at least about 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, the genotype of the individual is obtained by performing, or having performed, a genotyping assay on genetic material obtained from the individual. In some embodiments, the genotype of the individual is obtained by subjecting genetic material obtained from the individual to a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof.
In some embodiments, the sequencing assay includes next-generation sequencing (NGS). In some embodiments, the method further includes updating the trait-associated variant database with the assigned ancestry, the specific trait, and the genotype of the individual. In some embodiments, the ancestry is assigned to the individual in (b) using principal component analysis (PCA), maximum likelihood estimation (MLE), or a combination thereof. In some embodiments, the one or more ancestry-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include single nucleotide variants (SNVs). In some embodiments, the one or more units of risk include a risk allele. In some embodiments, the one or more ancestry-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include indels characterized by the insertion or deletion of one or more nucleotides. In some embodiments, the one or more units of risk include an insertion (I) or a deletion (D) of a nucleotide base. In some embodiments, the one or more ancestry-specific genetic variants, or the one or more individual-specific genetic variants, include copy number variants (CNVs). In some embodiments, the one or more units of risk include a duplication or a deletion of a nucleic acid sequence. In some embodiments, the nucleic acid sequence includes about two, three, four, five, six, seven, eight, nine, or ten nucleotides. In some embodiments, the nucleic acid sequence includes more than three nucleotides. In some embodiments, the nucleic acid sequence includes an entire gene. In some embodiments, the method further includes providing the individual with a notification of the risk that the individual has or will develop the specific trait.
In some embodiments, the specific trait includes a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, or a mental trait. In some embodiments, a clinical trait includes a disease or condition. In some embodiments, a subclinical trait includes a phenotype of a disease or condition. In some embodiments, physical exercise traits include exercise aversion, aerobic capacity, difficulty losing weight, endurance, strength, fitness benefit, reduced heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fracture, overall injury risk, likelihood of obesity, or impaired resting metabolic rate. In some embodiments, skin traits include collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmented spots, youthfulness, photoaging, dermal sensitivity, or sensitivity to sunlight. In some embodiments, hair traits include hair thickness, hair thinning, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. In some embodiments, nutritional traits include vitamin deficiency, mineral deficiency, antioxidant deficiency, fatty acid deficiency, metabolic imbalance, impaired metabolism, metabolic sensitivity, allergy, satiety, or the effectiveness of a healthy diet. In some embodiments, vitamin deficiency includes deficiencies in vitamins A, B1, B2, B3, B5, B6, B7, B8, B9, B12, C, D, E, and K. In some embodiments, mineral deficiency includes deficiencies in minerals including calcium, iron, magnesium, zinc, or selenium. In some embodiments, antioxidant deficiency includes deficiencies in antioxidants including glutathione or coenzyme Q10 (CoQ10). In some embodiments, fatty acid deficiency includes deficiencies in polyunsaturated or monounsaturated fatty acids. In some embodiments, metabolic imbalance includes glucose imbalance.
In some embodiments, metabolic impairment includes metabolic impairment due to caffeine or drug therapy. In some embodiments, metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, allergy includes allergies to food (food allergy) or allergies to environmental factors (environmental allergy). In some embodiments, the method further includes treatment to effectively improve or prevent a specific trait in an individual, provided that a genetic risk score indicates a high probability that the individual has or will develop the specific trait. In some embodiments, the treatment includes supplements or pharmacological therapy. In some embodiments, supplements include vitamins, minerals, probiotics, antioxidants, anti-inflammatory agents, or combinations thereof. In some embodiments, behavioral changes associated with the specific trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption. In some embodiments, recommendations are presented in the report. In some embodiments, the report is presented to the individual via a user interface of an electronic device. In some embodiments, the report also includes the individual's genetic risk score for the specific trait. 
In some embodiments, the genetic risk score is calculated by: a) calculating a raw score comprising the total number of the one or more risk units for each lineage-specific genetic variant for each subject in the subject group, thereby generating a lineage-specific observation range for the raw score; b) calculating the total number of the one or more risk units for each individual-specific genetic variant among the one or more individual-specific genetic variants, thereby generating an individual raw score; and c) comparing the individual raw score to the lineage-specific observation range to generate the genetic risk score. In some embodiments, the genetic risk score is calculated by: a) determining an odds ratio for each lineage-specific genetic risk variant; and b) if two or more lineage-specific genetic variants are selected, multiplying together the odds ratios of the two or more lineage-specific genetic variants. In some embodiments, the genetic risk score is calculated by: a) determining a relative risk for each lineage-specific genetic risk variant; and b) if two or more lineage-specific genetic variants are selected, multiplying together the relative risks of the two or more lineage-specific genetic variants. In some embodiments, the predetermined genetic variant is determined by: a) providing unphased genotype data from the individual; b) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the individual's lineage; c) imputing individual-specific genotypes not present in the individual-specific phased haplotypes using phased haplotype data from a reference group with the same lineage as the individual; and d) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with the individual-specific genetic variant that is associated with the likelihood that the individual has or will develop the specific trait.
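The raw-score calculation in steps a) to c) can be illustrated with a minimal Python sketch. It assumes that each risk unit is a count of 0, 1, or 2 risk alleles per variant, and that the comparison against the lineage-specific observation range is expressed as a percentile and a z-score, which the text names as possible forms of the genetic risk score. The variant identifiers (`rsA`, `rsB`, `rsC`), the data, and the function names are hypothetical and do not come from the disclosure.

```python
from statistics import mean, pstdev


def raw_score(risk_unit_counts):
    """Sum the risk units (here: risk-allele counts) over the selected variants."""
    return sum(risk_unit_counts.values())


def score_against_range(individual_raw, reference_raws):
    """Place an individual raw score within the lineage-specific observed scores."""
    mu = mean(reference_raws)
    sigma = pstdev(reference_raws)  # population standard deviation of the reference group
    z_score = (individual_raw - mu) / sigma if sigma > 0 else 0.0
    at_or_below = sum(1 for r in reference_raws if r <= individual_raw)
    percentile = 100.0 * at_or_below / len(reference_raws)
    return {"z_score": z_score, "percentile": percentile}


# Hypothetical subject group sharing the individual's lineage (assumption).
reference_group = [
    {"rsA": 0, "rsB": 1, "rsC": 0},
    {"rsA": 1, "rsB": 1, "rsC": 0},
    {"rsA": 1, "rsB": 2, "rsC": 1},
    {"rsA": 2, "rsB": 2, "rsC": 1},
]
individual = {"rsA": 2, "rsB": 1, "rsC": 1}

reference_raws = [raw_score(subject) for subject in reference_group]
result = score_against_range(raw_score(individual), reference_raws)
print(result)
```

Because the reference scores come only from subjects of the same lineage, the same individual raw score could map to a different percentile under a different lineage assignment.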

In certain embodiments, disclosed herein is a computer-implemented method for determining the likelihood that an individual has or will develop a specific trait based on the individual's lineage, the method comprising: a) providing the individual's genotype, the genotype comprising one or more individual-specific
genetic variants; b) assigning the individual a lineage at least in part based on the individual's genotype; c) using a trait-related variant database, which comprises lineage-specific genetic variants derived from a group of subjects sharing the same lineage as the individual (the subject group), to select one or more lineage-specific genetic variants at least in part based on the individual's lineage, wherein each of the one or more lineage-specific genetic variants corresponds to: (i) an individual-specific genetic variant among the one or more individual-specific genetic variants, or (ii) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant among the one or more individual-specific genetic variants in a subject population with the same lineage as the individual, and wherein each of the one or more lineage-specific genetic variants and each of the individual-specific genetic variants includes one or more risk units; and d) calculating a genetic risk score for the individual based on the selected one or more lineage-specific genetic variants, wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific trait. In some embodiments, the method further includes providing the individual with a notification of the risk that the individual has or will develop the specific trait. In some embodiments, the notification includes recommendations for behavioral changes associated with the specific trait. In some embodiments, the behavioral changes associated with the specific trait include increasing, decreasing, or avoiding activities including physical exercise, ingestion of drugs, vitamins, or supplements, exposure to products, use of products, dietary changes, sleep changes, alcohol consumption, or caffeine consumption. In some embodiments, the notification is displayed in a report.
In some embodiments, the report is displayed to the individual via a user interface of an electronic device. In some embodiments, the method further includes providing the individual with a questionnaire comprising one or more questions relating to the specific trait. In some embodiments, the method further includes receiving from the individual one or more answers to the one or more questions relating to the specific trait in the questionnaire provided to the individual. In some embodiments, the method further includes: a) providing the individual with a questionnaire comprising one or more questions relating to the specific trait; and b) receiving from the individual one or more answers to the one or more questions, wherein the recommendations for behavioral changes related to the specific trait for the individual are further based on the one or more answers provided by the individual. In some embodiments, the method further includes storing, in the trait-related variant database, lineage-specific genetic variants that are associated with the specific trait and derived from the subject group. In some embodiments, the genetic risk score comprises a percentile or a z-score. In some embodiments, LD is defined by (i) a D' value of at least about 0.20 or (ii) an r² value of at least about 0.70. In some embodiments, LD is defined by a D' value of about 0.20 to 0.25, 0.25 to 0.30, 0.30 to 0.35, 0.35 to 0.40, 0.40 to 0.45, 0.45 to 0.50, 0.50 to 0.55, 0.55 to 0.60, 0.60 to 0.65, 0.65 to 0.70, 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by an r² value of about 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0.
In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, the individual's genotype is obtained by performing, or having performed, a genotyping assay on genetic material obtained from the individual. In some embodiments, the individual's genotype is obtained by performing a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof on genetic material obtained from the individual. In some embodiments, the sequencing assay includes next-generation sequencing (NGS). In some embodiments, the method further includes updating the trait-related variant database with the assigned lineage, specific trait, and genotype of the individual. In some embodiments, principal component analysis (PCA), maximum likelihood estimation (MLE), or a combination thereof is used to assign the lineage to the individual in (b). In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include single nucleotide variants (SNVs). In some embodiments, the one or more risk units include risk alleles. In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include indels characterized by the insertion or deletion of one or more nucleotides. In some embodiments, the one or more risk units comprise an insertion (I) or deletion (D) of one or more nucleotides. In some embodiments, the one or more lineage-specific genetic variants, or the one or more individual-specific genetic variants, comprise copy number variants (CNVs). In some embodiments, the one or more risk units comprise an insertion or deletion of a nucleic acid sequence.
In some embodiments, the nucleic acid sequence comprises about two, three, four, five, six, seven, eight, nine, or ten nucleotides. In some embodiments, the nucleic acid sequence comprises more than three nucleotides. In some embodiments, the nucleic acid sequence comprises the entire gene. In some embodiments, the method further includes providing the individual with notification of the risk that the individual has or will develop the specific trait. In some embodiments, the specific trait includes nutritional traits, clinical traits, subclinical traits, physical activity traits, skin traits, hair traits, allergic traits, or mental traits. In some embodiments, a clinical trait includes a disease or condition. In some embodiments, a subclinical trait includes a phenotype of a disease or condition. In some implementations, physical activity traits include aversion to exercise, aerobic capacity, difficulty losing weight, endurance, strength, physical fitness benefits, decreased heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fractures, overall injury risk, likelihood of obesity, or impaired resting metabolic rate. In some implementations, skin traits include collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmentation spots, premature aging, photoaging, dermal sensitivity, or sensitivity to sunlight. In some implementations, nutritional traits include vitamin deficiencies, mineral deficiencies, antioxidant deficiencies, fatty acid deficiencies, metabolic imbalances, impaired metabolism, metabolic sensitivity, allergies, satiety, or effectiveness of a healthy diet. In some implementations, hair traits include hair thickness, thinning hair, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. In some embodiments, vitamin deficiency includes deficiencies in vitamins A, B1, B2, B3, B5, B6, B7, B8, B9, B12, C, D, E, and K. 
In some embodiments, mineral deficiency includes deficiencies in minerals including calcium, iron, magnesium, zinc, or selenium. In some embodiments, antioxidant deficiency includes deficiencies in antioxidants including glutathione or coenzyme Q10 (CoQ10). In some embodiments, fatty acid deficiency includes deficiencies in polyunsaturated or monounsaturated fatty acids. In some embodiments, metabolic imbalance includes glucose imbalance. In some embodiments, metabolic impairment includes metabolic impairment due to caffeine or drug therapy. In some embodiments, metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, allergy includes allergies to food (food allergy) or allergies to environmental factors (environmental allergy). In some embodiments, the method further includes treatment to effectively improve or prevent a specific trait in an individual, provided that the genetic risk score indicates a high probability that the individual has or will develop the specific trait. In some embodiments, the treatment includes supplements or pharmacological therapy. In some embodiments, supplements include vitamins, minerals, probiotics, antioxidants, anti-inflammatory agents, or combinations thereof. In some embodiments, the genetic risk score is calculated by: a) calculating a raw score comprising the total number of one or more risk units for each lineage-specific genetic variant for each subject in the subject group, thereby generating a lineage-specific observation range for the raw score; b) calculating the total number of one or more risk units for each of one or more individual-specific genetic variants, thereby generating an individual raw score; and c) comparing the individual raw score to the lineage-specific observation range to generate the genetic risk score. 
In some embodiments, the genetic risk score is calculated by: a) determining an odds ratio for each lineage-specific genetic risk variant; and b) if two or more lineage-specific genetic variants are selected, multiplying together the odds ratios of the two or more lineage-specific genetic variants. In some embodiments, the genetic risk score is calculated by: a) determining the relative risk of each lineage-specific genetic risk variant; and b) if two or more lineage-specific genetic variants are selected, multiplying together the relative risks of the two or more lineage-specific genetic variants. In some embodiments, the predetermined genetic variant is determined by: a) providing unphased genotype data from the individual; b) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the individual's lineage; c) imputing individual-specific genotypes not present in the individual-specific phased haplotypes using phased haplotype data from a reference group with the same lineage as the individual; and d) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with the individual-specific genetic variant that is associated with the likelihood that the individual has or will develop the specific trait.
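The multiplicative calculation described above, in which per-variant ratios are multiplied together when two or more variants are selected, can be sketched as follows. The sketch reads those ratios as odds ratios or relative risks and assumes they have already been looked up for the individual's lineage. The variant identifiers and numeric values are invented for illustration only.

```python
from math import prod


def multiplicative_score(effect_sizes):
    """Combine per-variant odds ratios (or relative risks) by multiplication.

    With a single selected variant the result is that variant's own ratio.
    """
    values = list(effect_sizes)
    if not values:
        raise ValueError("at least one lineage-specific variant is required")
    return prod(values)


# Hypothetical lineage-specific odds ratios for three selected variants (assumption).
odds_ratios = {"rsA": 1.2, "rsB": 1.5, "rsC": 0.9}

combined = multiplicative_score(odds_ratios.values())
print(round(combined, 2))
```

A ratio below 1 (here the hypothetical `rsC`) lowers the combined value, so protective and risk-increasing variants offset one another in the product.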

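Step b) of the method assigns a lineage from the genotype, and the text names maximum likelihood estimation (MLE) as one option. The following is a minimal sketch of one way such an assignment could work, not the disclosed procedure: it assumes independent variants, Hardy-Weinberg genotype proportions, and known allele frequencies per reference group. The group names, variant identifiers, and frequencies are hypothetical.

```python
from math import log


def log_likelihood(genotype, allele_freqs):
    """Log-likelihood of allele counts (0, 1, or 2) under Hardy-Weinberg proportions."""
    total = 0.0
    for variant, count in genotype.items():
        p = allele_freqs[variant]  # frequency of the counted allele, strictly between 0 and 1
        genotype_probs = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
        total += log(genotype_probs[count])
    return total


def assign_lineage(genotype, reference_freqs):
    """Return the reference group under which the genotype is most likely."""
    return max(
        reference_freqs,
        key=lambda group: log_likelihood(genotype, reference_freqs[group]),
    )


# Hypothetical allele frequencies for two reference groups (assumption).
reference_freqs = {
    "group_1": {"rsA": 0.1, "rsB": 0.2, "rsC": 0.8},
    "group_2": {"rsA": 0.7, "rsB": 0.6, "rsC": 0.3},
}
genotype = {"rsA": 2, "rsB": 1, "rsC": 0}

assigned = assign_lineage(genotype, reference_freqs)
print(assigned)
```

In practice many more ancestry-informative variants would be needed, and a PCA-based assignment, which the text also names, would project the genotype onto components learned from reference panels instead.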
In certain embodiments, disclosed herein is a health reporting system comprising: a) a computing device comprising at least one processor, a memory, and a software program, the software program comprising instructions executable by the at least one processor to assess the likelihood that an individual has or will develop a specific trait, the instructions comprising the steps of: (i) providing the individual's genotype, the genotype comprising one or more
individual-specific genetic variants; (ii) assigning a lineage to the individual at least in part based on the individual's genotype; (iii) using a trait-related variant database, which includes lineage-specific genetic variants derived from a group of subjects with the same lineage as the individual (the subject group), to select one or more lineage-specific genetic variants at least in part based on the individual's lineage, wherein each of the one or more lineage-specific genetic variants corresponds to: (1) an individual-specific genetic variant among the one or more individual-specific genetic variants, or (2) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant among the one or more individual-specific genetic variants in a subject population of the same lineage as the individual, and wherein each of the one or more lineage-specific genetic variants and each of the individual-specific genetic variants includes one or more risk units; and (iv) calculating a genetic risk score for the individual based on the selected one or more lineage-specific genetic variants, wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific trait; b) a reporting module that generates a report including the individual's genetic risk score for the specific trait; and c) an output module configured to display the report to the individual. In some embodiments, the genetic risk score comprises a percentile or a z-score. In some embodiments, LD is defined by (i) a D' value of at least about 0.20 or (ii) an r² value of at least about 0.70. In some embodiments, LD is defined by a D' value of about 0.20 to 0.25, 0.25 to 0.30, 0.30 to 0.35, 0.35 to 0.40, 0.40 to 0.45, 0.45 to 0.50, 0.50 to 0.55, 0.55 to 0.60, 0.60 to 0.65, 0.65 to 0.70, 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0.
In some embodiments, LD is defined by an r² value of about 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, the individual's genotype is obtained by performing, or having performed, a genotyping assay on genetic material obtained from the individual. In some embodiments, the individual's genotype is obtained by performing a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof on genetic material obtained from the individual. In some embodiments, the sequencing assay includes next-generation sequencing (NGS). In some embodiments, the method further includes updating the trait-related variant database with the assigned lineage, specific trait, and genotype of the individual. In some embodiments, principal component analysis (PCA), maximum likelihood estimation (MLE), or a combination thereof is used to assign the lineage to the individual in (b). In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include single nucleotide variants (SNVs). In some embodiments, the one or more risk units include risk alleles. In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include indels characterized by the insertion or deletion of one or more nucleotides.
In some embodiments, one or more risk units comprise an insertion (I) or deletion (D) of one or more nucleotides. In some embodiments, the one or more lineage-specific genetic variants, or the one or more individual-specific genetic variants, comprise copy number variants (CNVs). In some embodiments, one or more risk units comprise an insertion or deletion of a nucleic acid sequence. In some embodiments, the nucleic acid sequence comprises about two, three, four, five, six, seven, eight, nine, or ten nucleotides. In some embodiments, the nucleic acid sequence comprises more than three nucleotides. In some embodiments, the nucleic acid sequence comprises the entire gene. In some embodiments, the method further includes providing the individual with notification of the risk that the individual has or will develop the specific trait. In some embodiments, the specific trait includes nutritional traits, clinical traits, subclinical traits, physical activity traits, skin traits, hair traits, allergic traits, or mental traits. In some embodiments, a clinical trait includes a disease or condition. In some embodiments, a subclinical trait includes a phenotype of a disease or condition. In some implementations, physical activity traits include aversion to exercise, aerobic capacity, difficulty losing weight, endurance, strength, physical fitness benefits, decreased heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fractures, overall injury risk, likelihood of obesity, or impaired resting metabolic rate. In some implementations, skin traits include collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmentation spots, premature aging, photoaging, dermal sensitivity, or sensitivity to sunlight. In some implementations, hair traits include hair thickness, thinning hair, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. 
In some implementations, nutritional traits include vitamin deficiencies, mineral deficiencies, antioxidant deficiencies, fatty acid deficiencies, metabolic imbalances, impaired metabolism, metabolic sensitivity, allergies, satiety, or effectiveness of a healthy diet. In some embodiments, vitamin deficiency includes deficiencies in vitamins A, B1, B2, B3, B5, B6, B7, B8, B9, B12, C, D, E, and K. In some embodiments, mineral deficiency includes deficiencies in minerals including calcium, iron, magnesium, zinc, or selenium. In some embodiments, antioxidant deficiency includes deficiencies in antioxidants including glutathione or coenzyme Q10 (CoQ10). In some embodiments, fatty acid deficiency includes deficiencies in polyunsaturated or monounsaturated fatty acids. In some embodiments, metabolic imbalance includes glucose imbalance. In some embodiments, metabolic impairment includes metabolic impairment due to caffeine or drug therapy. In some embodiments, metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, allergy includes allergies to food (food allergy) or allergies to environmental factors (environmental allergy). In some embodiments, the method further includes treatment to effectively improve or prevent a specific trait in an individual, provided that a genetic risk score indicates a high probability that the individual has or will develop the specific trait. In some embodiments, the treatment includes supplements or pharmacological therapy. In some embodiments, supplements include vitamins, minerals, probiotics, antioxidants, anti-inflammatory agents, or combinations thereof. In some embodiments, the instructions further include providing the individual with a questionnaire including one or more questions related to the specific trait. 
In some embodiments, the instructions further include receiving from the individual one or more answers to the one or more questions related to the specific trait in the questionnaire provided to the individual. In some embodiments, the instructions further include: (i) providing the individual with a questionnaire including one or more questions related to the specific trait; and (ii) receiving from the individual one or more answers to the one or more questions. In some embodiments, the instructions further include storing, in the trait-related variant database, lineage-specific genetic variants that are associated with the specific trait and derived from the subject group. In some embodiments, the output module is configured to display the report on a user interface of a personal electronic device. In some embodiments, the system also includes a personal electronic device having an application configured to communicate with the output module via a computer network to access the report. In some embodiments, the genetic risk score is calculated by: (1) calculating a raw score comprising the total number of the one or more risk units for each lineage-specific genetic variant for each subject in the subject group, thereby generating a lineage-specific observation range for the raw score; (2) calculating the total number of the one or more risk units for each individual-specific genetic variant among the one or more individual-specific genetic variants, thereby generating an individual raw score; and (3) comparing the individual raw score with the lineage-specific observation range to generate the genetic risk score. In some embodiments, the genetic risk score is calculated by: (1) determining an odds ratio for each lineage-specific genetic risk variant; and (2) if two or more lineage-specific genetic variants are selected, multiplying together the odds ratios of the two or more lineage-specific genetic variants.
In some embodiments, the system further includes steps of determining the predetermined genetic variant by: a) providing unphased genotype data from the individual; b) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the individual's lineage; c) imputing individual-specific genotypes not present in the individual-specific phased haplotypes using phased haplotype data from a reference group with the same lineage as the individual; and d) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with the individual-specific genetic variant that is associated with the likelihood that the individual has or will develop the specific trait.
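The selection in step d) depends on whether two variants are in LD within the reference group of the individual's lineage. The sketch below computes the standard D' and r² statistics from haplotype and allele frequencies and applies the thresholds recited in this section (D' of at least about 0.20, or r² of at least about 0.70). The function names and the example frequencies are hypothetical, and the frequencies are assumed to come from phased reference haplotypes of the matching lineage.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d_prime, r_squared


def in_ld(p_ab, p_a, p_b, min_d_prime=0.20, min_r_squared=0.70):
    """Apply the 'D' >= 0.20 or r^2 >= 0.70' definition of LD."""
    d_prime, r_squared = ld_stats(p_ab, p_a, p_b)
    return d_prime >= min_d_prime or r_squared >= min_r_squared


# Hypothetical frequencies in a lineage-matched reference group (assumption).
linked = ld_stats(0.45, 0.5, 0.5)       # haplotype AB is over-represented
independent = ld_stats(0.25, 0.5, 0.5)  # AB occurs at the product of allele frequencies
print(linked, in_ld(0.45, 0.5, 0.5))
print(independent, in_ld(0.25, 0.5, 0.5))
```

Because haplotype frequencies differ between populations, the same pair of variants can pass these thresholds in one lineage and fail them in another, which is why the reference group is matched to the individual's lineage.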

In certain embodiments, disclosed herein is a non-transitory computer-readable storage medium comprising computer-executable code configured to cause at least one processor to perform the steps of: a) providing a genotype of the individual, the genotype comprising one or more individual-specific genetic variants; b) assigning a lineage to the individual based, at least in part, on the genotype of the individual; c) using a trait-related variant database, which comprises lineage-specific genetic variants derived from subjects having the same lineage as the individual (the subject group), to select one or more lineage-specific genetic variants based, at least in part, on the lineage of the individual, wherein each of the one or more lineage-specific genetic variants corresponds to: (i) an individual-specific genetic variant of the one or more individual-specific genetic variants, or (ii) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a subject population having the same lineage as the individual, and wherein each of the one or more lineage-specific genetic variants and each of the individual-specific genetic variants comprises one or more risk units; and d) calculating a genetic risk score for the individual based on the selected one or more lineage-specific genetic variants, wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific trait. In some embodiments, the medium further includes providing the individual with a questionnaire comprising one or more questions relating to the specific trait. In some embodiments, the medium further includes receiving from the individual one or more answers to one or more questions relating to the specific trait in a questionnaire provided to the individual. In some embodiments, the medium further includes: a) providing the individual with a questionnaire comprising one or more questions relating to the specific trait; and b) receiving from the individual one or more answers to the one or more questions. In some embodiments, the medium further includes storing, in the trait-related variant database, lineage-specific genetic variants derived from the subject group that are associated with the specific trait. In some embodiments, the genetic risk score comprises a percentile or a z-score. In some embodiments, LD is defined by (i) a D' value of at least about 0.20 or (ii) an r² value of at least about 0.70.
In some embodiments, LD is defined by a D' value of about 0.20 to 0.25, 0.25 to 0.30, 0.30 to 0.35, 0.35 to 0.40, 0.40 to 0.45, 0.45 to 0.50, 0.50 to 0.55, 0.55 to 0.60, 0.60 to 0.65, 0.65 to 0.70, 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by an r² value of about 0.70 to 0.75, 0.75 to 0.80, 0.80 to 0.85, 0.85 to 0.90, 0.90 to 0.95, or 0.95 to 1.0. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, the genotype of the individual is obtained by performing, or having performed, a genotyping assay on genetic material obtained from the individual. In some embodiments, the genotype of the individual is obtained by subjecting genetic material obtained from the individual to a deoxyribonucleic acid (DNA) array, a ribonucleic acid (RNA) array, a sequencing assay, or a combination thereof. In some embodiments, the sequencing assay includes next-generation sequencing (NGS). In some embodiments, the method further includes updating the trait-related variant database with the assigned lineage, specific trait, and genotype of the individual. In some embodiments, the lineage is assigned to the individual in (b) using principal component analysis (PCA), maximum likelihood estimation (MLE), or a combination thereof. In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include single nucleotide variants (SNVs). In some embodiments, the one or more risk units include risk alleles.
In some embodiments, the one or more lineage-specific genetic variants, the one or more individual-specific genetic variants, and the genetic variants in LD with the one or more individual-specific genetic variants include indels characterized by the insertion or deletion of one or more nucleotides. In some embodiments, the one or more risk units comprise an insertion (I) or deletion (D) of one or more nucleotides. In some embodiments, the one or more lineage-specific genetic variants, or the one or more individual-specific genetic variants, comprise copy number variants (CNVs). In some embodiments, the one or more risk units comprise an insertion or deletion of a nucleic acid sequence. In some embodiments, the nucleic acid sequence comprises about two, three, four, five, six, seven, eight, nine, or ten nucleotides. In some embodiments, the nucleic acid sequence comprises more than three nucleotides. In some embodiments, the nucleic acid sequence comprises an entire gene. In some embodiments, the method further includes providing the individual with a notification of the risk that the individual has or will develop the specific trait. In some embodiments, the specific trait includes a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, or a mental trait. In some embodiments, the clinical trait includes a disease or condition. In some embodiments, the subclinical trait includes a phenotype of a disease or condition. In some embodiments, the physical exercise trait includes aversion to exercise, aerobic capacity, difficulty losing weight, endurance, strength, physical fitness benefits, decreased heart rate response to exercise, lean body mass, muscle soreness, risk of muscle injury, impaired muscle repair, stress fractures, overall injury risk, likelihood of obesity, or impaired resting metabolic rate.
In some embodiments, the skin trait includes collagen breakdown, dryness, antioxidant deficiency, impaired detoxification, skin glycation, pigmented spots, youthfulness, photoaging, dermal sensitivity, or sensitivity to sunlight. In some embodiments, the hair trait includes hair thickness, thinning hair, hair loss, baldness, oiliness, dryness, dandruff, or hair volume. In some embodiments, the nutritional trait includes vitamin deficiency, mineral deficiency, antioxidant deficiency, fatty acid deficiency, metabolic imbalance, impaired metabolism, metabolic sensitivity, allergy, satiety, or the effectiveness of a healthy diet. In some embodiments, the vitamin deficiency includes a deficiency of a vitamin including vitamin A, vitamin B1, vitamin B2, vitamin B3, vitamin B5, vitamin B6, vitamin B7, vitamin B8, vitamin B9, vitamin B12, vitamin C, vitamin D, vitamin E, and vitamin K. In some embodiments, the mineral deficiency includes a deficiency of a mineral including calcium, iron, magnesium, zinc, or selenium. In some embodiments, the antioxidant deficiency includes a deficiency of an antioxidant including glutathione or coenzyme Q10 (CoQ10). In some embodiments, the fatty acid deficiency includes a deficiency of polyunsaturated or monounsaturated fatty acids. In some embodiments, the metabolic imbalance includes glucose imbalance. In some embodiments, the impaired metabolism includes impaired metabolism of caffeine or of a drug therapy. In some embodiments, the metabolic sensitivity includes gluten sensitivity, glycan sensitivity, or lactose sensitivity. In some embodiments, the allergy includes an allergy to food (food allergy) or an allergy to environmental factors (environmental allergy). In some embodiments, the method further includes administering to the individual a treatment effective to ameliorate or prevent the specific trait in the individual, provided that the genetic risk score indicates a high likelihood that the individual has or will develop the specific trait. In some embodiments, the treatment includes a supplement or a drug therapy.
In some embodiments, the supplement includes a vitamin, a mineral, a probiotic, an antioxidant, an anti-inflammatory agent, or a combination thereof. In some embodiments, the genetic risk score is calculated by: (1) calculating a raw score comprising the total number of the one or more risk units for each of the one or more lineage-specific genetic variants for each subject in the subject group, thereby generating a lineage-specific observed range of raw scores; (2) calculating the total number of the one or more risk units for each of the one or more individual-specific genetic variants, thereby generating an individual raw score; and (3) comparing the individual raw score with the lineage-specific observed range to generate the genetic risk score. In some embodiments, the genetic risk score is calculated by: (1) determining an odds ratio for each lineage-specific genetic risk variant; and (2) if two or more lineage-specific genetic variants are selected, multiplying together the odds ratios of the two or more lineage-specific genetic variants. In some embodiments, the computer-executable code is further configured to cause the at least one processor to perform the steps of determining the predetermined genetic variant by: a) providing unphased genotype data from the individual; b) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the lineage of the individual; c) imputing individual-specific genotypes not present in the individual-specific phased haplotypes using phased haplotype data from a reference group having the same lineage as the individual; and d) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with the individual-specific genetic variant associated with the likelihood that the individual has or will develop the specific trait.
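
The odds-ratio variant of the calculation described above can be sketched in a few lines of Python. The lookup table keyed by lineage and variant, the function name, and all identifiers and values are illustrative assumptions; the disclosure does not prescribe a data structure.

```python
import math


def odds_ratio_grs(selected_variants, lineage, odds_ratio_table):
    """Multiply the lineage-specific odds ratios of the selected variants.

    odds_ratio_table: dict mapping lineage -> {variant id -> odds ratio}
    (hypothetical layout). With one selected variant the result is simply
    that variant's odds ratio.
    """
    ratios = [odds_ratio_table[lineage][v] for v in selected_variants]
    if not ratios:
        raise ValueError("at least one variant must be selected")
    return math.prod(ratios)
```

Because the table is keyed by lineage, the same variant can contribute a different odds ratio depending on the lineage assigned to the individual.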

Brief Description of the Drawings

Figure 1 is a block diagram illustrating an exemplary system for determining a lineage-specific genetic risk score for an individual.

Figure 2 is a flowchart illustrating an exemplary process for determining a genetic risk score for an individual.

Figure 3 is a flowchart illustrating an exemplary process for determining a lineage-specific genetic risk score for an individual using one or more reference genetic variants.

Figure 4 is a flowchart illustrating an exemplary process for determining a lineage-specific genetic risk score for an individual using one or more lineage-specific genetic variants from a trait-related database.

Figure 5 is a flowchart illustrating an exemplary process for determining a lineage-specific genetic risk score for an individual using one or more lineage-specific genetic variants from a trait-related database.

Figures 6A-6F illustrate reports according to the present embodiments, in which the GRS for multiple specific phenotypic traits is presented to the subject. Figure 6A illustrates a summary report for physical fitness phenotypic traits. Figure 6B illustrates behavioral recommendations related to the physical fitness phenotypic trait "likelihood of obesity". Figure 6C illustrates a summary report for skin phenotypic traits. Figure 6D illustrates behavioral recommendations related to the skin phenotypic trait "antioxidant deficiency". Figure 6E illustrates a summary report for nutritional phenotypic traits. Figure 6F illustrates behavioral recommendations related to the nutritional phenotypic trait "impaired satiety".

Figures 7A-7D illustrate a nutrition report focused on nutrition and health according to the present embodiments, in which the GRS for multiple specific phenotypic traits is presented to the subject. Figure 7A illustrates a summary report for food sensitivities. Figure 7B illustrates a summary report for mineral and nutrient deficiencies. Figure 7C illustrates a summary report for dietary management phenotypes. Figure 7D illustrates a summary report for vitamin deficiencies.

Detailed Description

Differences in haplotype heterogeneity and in recombination rates are thought to be important contributors to the variation in linkage disequilibrium (LD) among populations of different lineages. Current genetic risk prediction methods fail to consider the lineage of the subject group when selecting proxy genetic variants, which results in poor risk indicators being selected for a given population. The methods, media, and systems disclosed herein address this problem by selecting proxy genetic variants based on LD within the specific lineage population to which the individual belongs. Furthermore, the methods, media, and systems disclosed herein use software programs configured to use predetermined LD patterns, which can be leveraged when calculating a genetic risk score (GRS) for previously undisclosed individual-specific genetic variants. The solution disclosed herein therefore improves the accuracy and efficiency of genetic risk prediction compared with existing methods.
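
The lineage-aware proxy selection described above can be sketched as follows. The sketch assumes two precomputed lookups, both hypothetical: a trait-related variant set per lineage and a per-lineage table of predetermined LD proxies. It is an illustration under those assumptions, not the disclosed implementation.

```python
def select_lineage_specific_variants(individual_variants, lineage, trait_db, ld_proxies):
    """Pick trait-associated variants for one individual within one lineage.

    trait_db: dict mapping lineage -> set of trait-associated variant ids
    ld_proxies: dict mapping lineage -> {variant id -> list of variant ids in LD}
    Both layouts are assumptions made for this sketch.
    """
    known = trait_db.get(lineage, set())
    proxies_for = ld_proxies.get(lineage, {})
    selected = []
    for variant in individual_variants:
        if variant in known:
            # Case (i): the individual's own variant is in the database.
            candidate = variant
        else:
            # Case (ii): fall back to a predetermined proxy that is in LD
            # with the variant in this lineage, if any is in the database.
            candidate = next((p for p in proxies_for.get(variant, []) if p in known), None)
        # Skip variants already selected so that none is counted twice.
        if candidate is not None and candidate not in selected:
            selected.append(candidate)
    return selected
```

Passing a different lineage changes both the database slice and the proxy table, so the same individual variants can yield a different selection.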

Current risk prediction methods do not use lineage-specific LD information. However, whether a genetic variant is in LD with another genetic variant depends largely on the lineage population being studied. In a non-limiting example, two genetic variants that are in LD in a predominantly Caucasian population are not necessarily in LD in, for example, a Chinese population, and vice versa. Accounting for lineage-specific LD patterns when calculating an individual's GRS improves on the state of the art for many reasons, including but not limited to (i) avoiding errors (e.g., treating two genetic variants as being in LD when they are not in LD at all in that population), and (ii) avoiding counting a genetic variant more than once. Accounting for lineage-specific LD patterns yields more accurate GRS predictions by ensuring that genetic risk variants in LD are correctly identified and by preventing the GRS inflation that results from counting a single genetic variant more than once.
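
To make the lineage dependence of LD concrete, the following Python sketch computes the standard D' and r² statistics for two biallelic variants from phased haplotypes of a single reference population. The 0/1 allele coding and the function names are assumptions of this sketch; the default thresholds mirror the values of at least about 0.20 for D' and at least about 0.70 for r² given in this disclosure.

```python
def ld_stats(haplotypes):
    """Return (D', r^2) for two biallelic variants.

    haplotypes: list of (allele_at_variant_1, allele_at_variant_2) pairs,
    each allele coded 0 or 1, drawn from one lineage-specific reference panel.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(h[0] for h in haplotypes) / n          # frequency of allele 1 at variant 1
    p_b = sum(h[1] for h in haplotypes) / n          # frequency of allele 1 at variant 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0, 0.0                              # monomorphic variant: treat as no LD
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, d * d / denom


def in_ld(haplotypes, min_d_prime=0.20, min_r2=0.70):
    """Apply the 'D' at least X or r^2 at least Y' definition of LD."""
    d_prime, r2 = ld_stats(haplotypes)
    return d_prime >= min_d_prime or r2 >= min_r2
```

Running `in_ld` on the same variant pair with haplotypes from two different reference populations can give opposite answers, which is the situation the preceding paragraph describes.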

Disclosed herein, in some embodiments, are genetic risk prediction methods, media, and systems for calculating a genetic risk score (GRS) based on the lineage of an individual, the GRS representing the likelihood that the individual will develop a specific phenotypic trait. In some embodiments, the GRS is calculated based on the number and type of genetic variants that make up the individual's genotype, as detected in a sample obtained from the individual, compared with a subject population having the same lineage as the individual. In some embodiments, the lineage of the individual is determined by analyzing the individual's genotype. Also disclosed herein are methods, media, and systems for recommending to the individual behavioral changes related to a specific phenotypic trait based on the GRS calculated for that trait.
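
One simple way to determine lineage from a genotype is a maximum-likelihood comparison against reference allele frequencies. The sketch below assumes Hardy-Weinberg proportions, independent markers, and a small table of per-population allele frequencies; the names and the data layout are hypothetical, and practical pipelines typically use many more markers and may model admixture.

```python
import math


def log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a genotype under one population's allele frequencies.

    genotype: dict mapping marker id -> number of copies (0, 1, or 2) of the counted allele
    allele_freqs: dict mapping marker id -> frequency of that allele in the population
    """
    total = 0.0
    for marker, copies in genotype.items():
        # Clamp frequencies away from 0 and 1 so the logarithm stays finite.
        p = min(max(allele_freqs[marker], 1e-6), 1 - 1e-6)
        hwe = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)  # Hardy-Weinberg genotype probabilities
        total += math.log(hwe[copies])
    return total


def assign_lineage(genotype, reference_panels):
    """Return the reference population under which the genotype is most likely."""
    return max(reference_panels, key=lambda name: log_likelihood(genotype, reference_panels[name]))
```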

Genotypes and genetic variants

Genome-wide association studies (GWAS) consider thousands of genetic variants, including single nucleotide variations (SNVs), insertions/deletions (indels), and copy number variations (CNVs), to identify associations between genetic variants in a population and complex clinical conditions and phenotypic traits. Detecting, in a sample obtained from an individual, a genetic variant associated with a specific phenotypic trait is considered to indicate that the individual has or will develop that trait. In some embodiments, the individual obtains his or her own sample and provides it to a laboratory for processing and analysis. In some embodiments, genetic material is extracted from the sample obtained from the subject. In some embodiments, genetic variants are detected in the genetic material of the sample obtained from the individual using a genotyping assay (e.g., quantitative polymerase chain reaction (qPCR) and/or fluorescent qPCR). In some embodiments, the genetic information is analyzed to determine the lineage of the individual.

Genetic variants (e.g., SNVs, SNPs, indels, CNVs) can be located in the coding region of a gene, in a non-coding region of a gene, or in an intergenic region between genes. Because of redundancy in the genetic code, a genetic variant in the coding region of a gene may or may not result in a different protein isoform. A genetic variant in a non-coding region of a gene or in an intergenic region can affect the expression and/or activity of the gene or of the gene expression product expressed from the gene.

Disclosed herein, in some embodiments, are methods and systems for determining the genotype of an individual. In some embodiments, the individual is suffering from a disease or condition, or from symptoms associated with a disease or condition. In some embodiments, the disease or condition includes a deficiency disease, a genetic disease, or a psychological disease. In some embodiments, the disease or condition includes an immune disease and/or a metabolic disease. In some embodiments, the immune disease includes an autoimmune disease or disorder. Non-limiting examples of autoimmune diseases or disorders include Graves' disease, Hashimoto's thyroiditis, systemic lupus erythematosus (lupus), multiple sclerosis, rheumatoid arthritis, inflammatory bowel disease, Crohn's disease, ulcerative colitis, and cancer. Non-limiting examples of metabolic diseases or conditions include type 1 diabetes, type 2 diabetes, diseases affecting the absorption of macronutrients (e.g., amino acids, carbohydrates, or lipids), diseases affecting the absorption of micronutrients (e.g., vitamins or minerals), diseases affecting mitochondrial function, diseases affecting liver function (e.g., non-alcoholic fatty liver disease), and diseases affecting kidney function.

In some embodiments, disclosed herein are methods and systems for calculating a genetic risk score (GRS) using the genotypes and/or genetic variants disclosed herein, the GRS representing the likelihood that an individual has or will develop a specific phenotypic trait. In some embodiments, a single genetic variant is used. In some embodiments, two genetic variants are used. In some embodiments, three genetic variants are used. In some embodiments, four genetic variants are used. In some embodiments, five genetic variants are used. In some embodiments, six genetic variants are used. In some embodiments, seven genetic variants are used. In some embodiments, eight genetic variants are used. In some embodiments, nine genetic variants are used. In some embodiments, ten genetic variants are used. In some embodiments, at least about two genetic variants are used. In some embodiments, at least about three genetic variants are used. In some embodiments, at least about four genetic variants are used. In some embodiments, at least about five genetic variants are used. In some embodiments, at least about six genetic variants are used. In some embodiments, at least about seven genetic variants are used. In some embodiments, at least about eight genetic variants are used. In some embodiments, at least about nine genetic variants are used. In some embodiments, at least about ten genetic variants are used.

In some embodiments, disclosed herein are genotypes comprising one or more genetic variants (e.g., indel, SNV, SNP) provided in one or more of SEQ ID NOs: 1-218 for use in the methods, systems, and kits described herein. In some embodiments, the genotypes described herein comprise a single genetic variant. In some embodiments, the genotype comprises two genetic variants. In some embodiments, the genotype comprises three genetic variants. In some embodiments, the genotype comprises four genetic variants. In some embodiments, the genotype comprises five genetic variants. In some embodiments, the genotype comprises six genetic variants. In some embodiments, the genotype comprises seven genetic variants. In some embodiments, the genotype comprises eight genetic variants. In some embodiments, the genotype comprises nine genetic variants. In some embodiments, the genotype comprises ten genetic variants. In some embodiments, the genotype comprises more than ten genetic variants.

In some embodiments, the genotype includes at least about two genetic variants. In some embodiments, the genotype includes at least about three genetic variants. In some embodiments, the genotype includes at least about four genetic variants. In some embodiments, the genotype includes at least about five genetic variants. In some embodiments, the genotype includes at least about six genetic variants. In some embodiments, the genotype includes at least about seven genetic variants. In some embodiments, the genotype includes at least about eight genetic variants. In some embodiments, the genotype includes at least about nine genetic variants. In some embodiments, the genotype includes at least about ten genetic variants.

In some embodiments, at least one genetic variant listed in any one of Tables 1-44 below is used. In some embodiments, the genetic variant is detected using the detection methods disclosed herein. 
In some embodiments, the methods and systems described herein use (e.g., to detect, analyze) one or more genetic variants provided in Table 1. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 2. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 3. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 4. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 5. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 6. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 7. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 8. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 9. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 10. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 11. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 12. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 13. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 14. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 15. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 16. 
In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 17. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 18. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 19. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 20. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 21. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 22. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 23. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 24. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 25. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 26. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 27. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 28. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 29. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 30. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 31. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 32. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 33. 
In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 34. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 35. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 36. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 37. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 38. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 39. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 40. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 41. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 42. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 43. In some embodiments, the methods and systems described herein use one or more genetic variants provided in Table 44.

In some embodiments, the methods, systems, and kits utilize a major or minor allele of the one or more genetic variants. In some embodiments, the minor allele is used. In some embodiments, the major allele is used. In some embodiments, the methods, systems, and kits utilize nucleotides represented by non-nucleic acid letters or codes. In some cases, the non-nucleic acid letter or code is an International Union of Pure and Applied Chemistry (IUPAC) nucleotide code provided in any one of SEQ ID NOs: 1-218. Each genetic variant provided in any one of Tables 1-44 has a corresponding SEQ ID NO, which provides a nucleic acid sequence comprising the nucleotide or polynucleotide sequence associated with the risk of having or developing a particular phenotypic trait.

The methods and systems disclosed herein are generally suitable for analyzing a sample obtained from an individual. Similarly, the methods disclosed herein include processing and/or analysis of the sample. In some cases, the sample is obtained directly or indirectly from the individual. In some cases, the sample is obtained by a fluid draw, a swab, or fluid collection. In some cases, the sample comprises whole blood, peripheral blood, plasma, serum, saliva, a cheek swab, urine, or another bodily fluid or tissue.

In some embodiments, the genotype of the individual is determined by subjecting a sample obtained from the individual to a nucleic acid-based detection assay. In some cases, the nucleic acid-based detection assay comprises quantitative polymerase chain reaction (qPCR), gel electrophoresis (including, e.g., RNA or DNA blotting), immunochemistry, in situ hybridization such as fluorescence in situ hybridization (FISH), cytochemistry, or sequencing. In some embodiments, the sequencing technique comprises next-generation sequencing. In some embodiments, the methods involve a hybridization assay such as fluorescent qPCR (e.g., TaqMan™ or SYBR Green), which involves a nucleic acid amplification reaction with a specific primer pair and hybridization of the amplified nucleic acid with a probe comprising a detectable moiety or molecule that is specific to the target nucleic acid sequence. Additional exemplary nucleic acid-based detection assays use nucleic acid probes conjugated or otherwise immobilized on a bead, multi-well plate, array, or other substrate, wherein the nucleic acid probes are configured to hybridize with a target nucleic acid sequence. In some cases, the nucleic acid probe is specific to a genetic variant (e.g., a SNP, SNV, CNV, or indel). In some cases, a nucleic acid probe specific to a SNP or SNV comprises a probe sequence sufficiently complementary to the risk allele or protective allele of interest that hybridization is specific to that risk allele or protective allele. In some cases, a nucleic acid probe specific to an indel comprises a probe sequence sufficiently complementary to a nucleobase insertion within the polynucleotide sequence flanking the insertion that hybridization is specific to the indel. In some cases, a nucleic acid probe specific to an indel comprises a probe sequence sufficiently complementary to the polynucleotide sequence flanking a nucleobase deletion within the polynucleotide sequence that hybridization is specific to the indel. In some cases, a plurality of nucleic acid probes is required to detect a CNV, each probe being specific to a different region within the polynucleotide sequence comprising the CNV. In a non-limiting example, a plurality of nucleic acid probes specific to a single-exon CNV within a gene may comprise a high density of between 2 and 3, 3 and 4, 4 and 5, 5 and 6, or 6 and 7 nucleic acid probes, each probe being sufficiently complementary to an exonic region of the gene. In another non-limiting embodiment, a long CNV may be detected using a plurality of nucleic acid probes dispersed throughout the genome of the individual.

Exemplary nucleic acid probes for detecting the genotypes described herein are specific to the risk allele and comprise an oligonucleotide sequence provided in any one of SEQ ID NOs: 1-218. In some cases, the nucleic acid probe is at least 10 but no more than 50 contiguous nucleotides in length. In some cases, the nucleic acid probe is between about 15 and about 55 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 100 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 90 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 80 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 70 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 60 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 50 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 40 nucleotides in length. In some cases, the nucleic acid probe is between about 10 and about 30 nucleotides in length. In some cases, the nucleic acid probe is between about 20 and about 60 nucleotides in length. In some cases, the nucleic acid probe is between about 25 and about 65 nucleotides in length. In some cases, the nucleic acid probe is between about 30 and about 70 nucleotides in length. In some cases, the nucleic acid probe is between about 35 and about 75 nucleotides in length. In some cases, the nucleic acid probe is between about 40 and about 70 nucleotides in length.

In some embodiments, the method of detecting the genotype of the individual comprises subjecting a sample obtained from the individual to a nucleic acid amplification assay. In some cases, the amplification assay comprises polymerase chain reaction (PCR), qPCR, self-sustained sequence replication, a transcriptional amplification system, Q-beta replicase, rolling circle replication, or any other suitable nucleic acid amplification technique. A suitable nucleic acid amplification technique is configured to amplify a region of a nucleic acid sequence comprising the risk variant (e.g., a SNP, SNV, CNV, or indel). In some cases, the amplification assay requires primers. The known nucleic acid sequence of a gene or genetic variant within the genotype is sufficient to enable one of skill in the art to select primers to amplify any portion of the gene or genetic variant. For example, a DNA sample suitable as a primer may be obtained by PCR amplification of genomic DNA, fragments of genomic DNA, fragments of genomic DNA ligated to adaptor sequences, or cloned sequences. Any suitable computer program can be used to design primers with the desired specificity and optimal amplification properties, such as Oligo version 7.0 (National Biosciences). Exemplary primers for amplifying the genotypes described herein are at least 10 and no more than 30 nucleotides in length and comprise a nucleic acid sequence flanking the indel, SNV, SNP, or CNV of interest provided in one or more of SEQ ID NOs: 1-218.

In some embodiments, detecting the presence or absence of a genotype comprises sequencing genetic material in a sample obtained from the subject. Sequencing can be performed with any suitable sequencing technique, including but not limited to single-molecule real-time (SMRT) sequencing, Polony sequencing, sequencing by ligation, reversible terminator sequencing, proton detection sequencing, ion semiconductor sequencing, nanopore sequencing, electronic sequencing, pyrosequencing, Maxam-Gilbert sequencing, chain termination (e.g., Sanger) sequencing, +S sequencing, or sequencing by synthesis. Sequencing methods also include next-generation sequencing, e.g., modern sequencing technologies such as Illumina sequencing (e.g., Solexa), Roche 454 sequencing, Ion Torrent sequencing, and SOLiD sequencing. In some cases, next-generation sequencing involves high-throughput sequencing methods. Other sequencing methods available to one of skill in the art may also be employed.

In some cases, the number of nucleotides sequenced is at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 100, 150, 200, 300, 400, 500, 2000, 4000, 6000, 8000, 10000, 20000, 50000, 100000, or more than 100000 nucleotides. In some cases, the number of nucleotides sequenced is in the range of about 1 to about 100000 nucleotides, about 1 to about 10000 nucleotides, about 1 to about 1000 nucleotides, about 1 to about 500 nucleotides, about 1 to about 300 nucleotides, about 1 to about 200 nucleotides, about 1 to about 100 nucleotides, about 5 to about 100000 nucleotides, about 5 to about 10000 nucleotides, about 5 to about 1000 nucleotides, about 5 to about 500 nucleotides, about 5 to about 300 nucleotides, about 5 to about 200 nucleotides, about 5 to about 100 nucleotides, about 10 to about 100000 nucleotides, about 10 to about 10000 nucleotides, about 10 to about 1000 nucleotides, about 10 to about 500 nucleotides, about 10 to about 300 nucleotides, about 10 to about 200 nucleotides, about 10 to about 100 nucleotides, about 20 to about 100000 nucleotides, about 20 to about 10000 nucleotides, about 20 to about 1000 nucleotides, about 20 to about 500 nucleotides, about 20 to about 300 nucleotides, about 20 to about 200 nucleotides, about 20 to about 100 nucleotides, about 30 to about 100000 nucleotides, about 30 to about 10000 nucleotides, about 30 to about 1000 nucleotides, about 30 to about 500 nucleotides, about 30 to about 300 nucleotides, about 30 to about 200 nucleotides, about 30 to about 100 nucleotides, about 50 to about 100000 nucleotides, about 50 to about 10000 nucleotides, about 50 to about 1000 nucleotides, about 50 to about 500 nucleotides, about 50 to about 300 nucleotides, about 50 to about 200 nucleotides, or about 50 to about 100 nucleotides.

In some cases, the nucleic acid sequence of the genotype comprises a denatured DNA molecule or a fragment thereof. In some cases, the nucleic acid sequence comprises DNA selected from: genomic DNA, viral DNA, mitochondrial DNA, plasmid DNA, amplified DNA, circular DNA, circulating DNA, cell-free DNA, or exosomal DNA. In some cases, the DNA is single-stranded DNA (ssDNA), double-stranded DNA, denatured double-stranded DNA, synthetic DNA, or a combination thereof. The circular DNA may be cleaved or fragmented. In some cases, the nucleic acid sequence comprises RNA. In some cases, the nucleic acid sequence comprises fragmented RNA. In some cases, the nucleic acid sequence comprises partially degraded RNA. In some cases, the nucleic acid sequence comprises a microRNA or a portion thereof. In some cases, the nucleic acid sequence comprises an RNA molecule or a fragmented RNA molecule (RNA fragment) selected from: microRNA (miRNA), pre-miRNA, pri-miRNA, mRNA, pre-mRNA, viral RNA, viroid RNA, virusoid RNA, circular RNA (circRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), pre-tRNA, long non-coding RNA (lncRNA), small nuclear RNA (snRNA), circulating RNA, cell-free RNA, exosomal RNA, vector-expressed RNA, RNA transcripts, synthetic RNA, and combinations thereof.

Determining the likelihood that an individual has or will develop a specific phenotypic trait

Aspects disclosed herein provide methods, media, and systems for calculating a genetic risk score (GRS) representing the likelihood that an individual will develop a specific phenotypic trait. In some embodiments, the specific phenotypic trait comprises a phenotypic trait discussed herein, including but not limited to a clinical trait, a subclinical trait, a physical exercise trait, or a mental trait.

Figure 2 depicts an exemplary workflow for determining the likelihood that an individual has, or will develop, a specific trait by calculating a genetic risk score (GRS). The genotype of the individual is provided 202; the genotype comprises one or more individual-specific genetic variants. Next, an ancestry is assigned to the individual based at least in part on the individual's genotype 204. Next, one or more reference genetic variants are selected 206, wherein each of the one or more reference genetic variants corresponds either to an individual-specific genetic variant of the one or more individual-specific genetic variants or to a predetermined genetic variant that is in linkage disequilibrium (LD), within a subject population, with an individual-specific genetic variant of the one or more individual-specific genetic variants. Next, a genetic risk score for the individual is calculated 208 based on the selected one or more reference genetic variants within the subject population, wherein the genetic risk score is indicative of the likelihood that the individual has, or will develop, the specific trait. In some cases, the GRS is calculated using any one of the methods disclosed herein.

Figure 3 depicts an exemplary workflow for determining the likelihood that an individual has, or will develop, a specific trait by calculating a GRS as compared with a non-ancestry-specific subject population. The genotype of the individual is provided 302; the genotype comprises one or more individual-specific genetic variants. Next, an ancestry is assigned to the individual based at least in part on the individual's genotype 304. Next, one or more reference genetic variants are selected 306, wherein each of the one or more reference genetic variants corresponds either to an individual-specific genetic variant of the one or more individual-specific genetic variants or to a predetermined genetic variant that is in linkage disequilibrium (LD), within a subject population, with an individual-specific genetic variant of the one or more individual-specific genetic variants. Next, an individual-specific raw score is calculated 308. A numerical value is assigned to each risk unit within an individual-specific genetic variant, and the numerical values for all of the individual-specific genetic variants are added together to generate the individual-specific raw score. The same calculation is performed to generate a raw score for each individual within the subject group, thereby generating an observed range of raw scores (observed range) 310. Next, the individual-specific raw score is compared with the observed range to calculate a percent risk relative to the subject population 312. Next, a genetic risk score (GRS) is assigned to the individual 314. In some cases, the GRS is in the form of a percentile. In some cases, the percentile is in the form of a z-score.
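The raw-score and percentile steps of this workflow can be sketched as follows. This is a minimal illustration only, assuming that each risk unit is one copy of a risk allele and is assigned the value 1; the variant identifiers, genotypes, and function names are hypothetical and are not taken from the disclosure.

```python
from statistics import mean, pstdev


def raw_score(genotype, risk_alleles):
    """Sum the risk units (risk-allele copies) carried at each scored variant."""
    total = 0
    for variant_id, risk_allele in risk_alleles.items():
        alleles = genotype.get(variant_id, "")
        total += alleles.count(risk_allele)  # 0, 1, or 2 copies per variant
    return total


def percentile_rank(score, observed):
    """Percentage of the observed raw scores that are at or below `score`."""
    return 100.0 * sum(s <= score for s in observed) / len(observed)


def z_score(score, observed):
    """Standardize `score` against the observed range (population std. dev.)."""
    return (score - mean(observed)) / pstdev(observed)


# Hypothetical variants and genotypes, for illustration only.
risk_alleles = {"rs0001": "A", "rs0002": "T", "rs0003": "G"}
subject_group = [
    {"rs0001": "AA", "rs0002": "CT", "rs0003": "GG"},
    {"rs0001": "AC", "rs0002": "CC", "rs0003": "CG"},
    {"rs0001": "CC", "rs0002": "CC", "rs0003": "CC"},
    {"rs0001": "AC", "rs0002": "TT", "rs0003": "CC"},
]
individual = {"rs0001": "AC", "rs0002": "CT", "rs0003": "GG"}

observed_range = [raw_score(g, risk_alleles) for g in subject_group]
individual_raw = raw_score(individual, risk_alleles)
individual_percentile = percentile_rank(individual_raw, observed_range)
individual_z = z_score(individual_raw, observed_range)
```

For an ancestry-specific score, the same functions would simply be applied to a subject group restricted to the individual's assigned ancestry.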

Figure 4 depicts an exemplary workflow for determining the likelihood that an individual has, or will develop, a specific trait based on the ancestry of the individual. The genotype of the individual is provided 402; the genotype comprises one or more individual-specific genetic variants. Next, an ancestry is assigned to the individual based at least in part on the individual's genotype 404. Next, one or more ancestry-specific genetic variants are selected 406 from a trait-associated variant database, the variants being derived from subjects having the same ancestry as the individual (ancestry-specific subject group) and being selected based at least in part on the ancestry of the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to: (i) an individual-specific genetic variant of the one or more individual-specific genetic variants, or (ii) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a subject population having the same ancestry as the individual, and wherein each of the one or more ancestry-specific genetic variants and each of the individual-specific genetic variants comprises one or more risk units. Next, an individual-specific raw score is calculated 408. A numerical value is assigned to each risk unit within an individual-specific genetic variant, and the numerical values for all of the individual-specific genetic variants are added together to generate the individual-specific raw score. The same calculation is performed to generate a raw score for each individual within the ancestry-specific subject group, thereby generating an observed range of raw scores (observed range) 410. Next, the individual-specific raw score is compared with the ancestry-specific observed range to calculate a percent risk relative to the ancestry-specific subject population 412. Next, a genetic risk score (GRS) is assigned to the individual 414. In some cases, the GRS is in the form of a percentile. In some cases, the percentile is in the form of a z-score.

Figure 5 depicts an exemplary workflow for determining the likelihood that an individual has, or will develop, a specific trait based on the ancestry of the individual. The genotype of the individual is provided 502; the genotype comprises one or more individual-specific genetic variants. Next, an ancestry is assigned to the individual based at least in part on the individual's genotype 504. Next, one or more ancestry-specific genetic variants are selected 506 from a trait-associated variant database, the variants being derived from subjects having the same ancestry as the individual (ancestry-specific subject group) and being selected based at least in part on the ancestry of the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to: (i) an individual-specific genetic variant of the one or more individual-specific genetic variants, or (ii) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a subject population having the same ancestry as the individual, and wherein each of the one or more ancestry-specific genetic variants and each of the individual-specific genetic variants comprises one or more risk units. Next, a genetic risk score (GRS) for the individual is calculated 508 based on the selected one or more ancestry-specific genetic variants, wherein the genetic risk score is indicative of the likelihood that the individual has, or will develop, the specific trait. In some cases, the GRS is calculated using any one of the methods disclosed herein.

Assigning the ancestry of an individual

In some cases, an ancestry is assigned to the individual by analyzing the individual's genotype. In some cases, the individual's genotype is analyzed using a method comprising maximum likelihood or principal component analysis (PCA). In some cases, a computer program comprising SNPRelate, ADMIXTURE, PLINK, or STRUCTURE is used. For example, after PCA with SNPRelate, the first two principal components (PC1 and PC2) from each population of known ancestry are combined into a single data point, or centroid. The individual's ancestry is classified according to its proximity to the nearest centroid of known ancestry. This approach relies on a nearest-centroid classification model.
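A nearest-centroid assignment on PC1 and PC2 can be sketched as follows. This is an illustration under stated assumptions: the principal-component coordinates are taken as already computed (for example by a separate PCA step), the centroid is the mean of each population's points, proximity is Euclidean distance, and the population labels, coordinates, and function names are all hypothetical.

```python
from math import dist


def centroids(reference_pcs):
    """Average PC1/PC2 over each known-ancestry population to get one centroid each."""
    result = {}
    for ancestry, points in reference_pcs.items():
        n = len(points)
        result[ancestry] = (
            sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
        )
    return result


def assign_ancestry(individual_pc, population_centroids):
    """Return the ancestry label whose centroid is closest in Euclidean distance."""
    return min(
        population_centroids,
        key=lambda label: dist(individual_pc, population_centroids[label]),
    )


# Hypothetical (PC1, PC2) coordinates for two reference populations.
reference_pcs = {
    "POP_A": [(0.10, 0.02), (0.12, 0.04), (0.08, 0.00)],
    "POP_B": [(-0.09, 0.11), (-0.11, 0.09), (-0.10, 0.10)],
}
population_centroids = centroids(reference_pcs)
assigned = assign_ancestry((0.07, 0.03), population_centroids)
```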

Trait-associated database

In some embodiments, a trait-associated database is used. In some cases, the trait-associated database comprises genotype, phenotype, and/or ancestry data of a subject group. In some cases, the subject group is derived from a published genome-wide association study (GWAS). In some cases, the published GWAS is recorded in a peer-reviewed journal. In some cases, the trait-associated database enables selection of genetic variants present in a subject group of the same ancestry as the individual. In some cases, the trait-associated database is updated with genotype, phenotype, and/or ancestry data from the individual. Many databases are suitable for storage and retrieval of genotype data, phenotype data, and ancestry data. By way of non-limiting example, suitable databases include relational databases, non-relational databases, feature-oriented databases, feature databases, entity-relationship model databases, associative databases, and XML databases. In some embodiments, the database is Internet-based. In some embodiments, the database is web-based. In some embodiments, the database is cloud computing-based. In some embodiments, the database is connected to a distributed ledger. In some embodiments, the distributed ledger comprises a blockchain. The database may be based on one or more local computer storage devices.
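As one possible illustration of a relational layout, the sketch below stores trait-associated variants keyed by trait and ancestry and retrieves those recorded for a given ancestry. The schema, table name, column names, and rows are assumptions made for this example; the disclosure does not specify them.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trait_variant (
        variant_id  TEXT NOT NULL,
        trait       TEXT NOT NULL,
        ancestry    TEXT NOT NULL,
        risk_allele TEXT NOT NULL,
        source      TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO trait_variant VALUES (?, ?, ?, ?, ?)",
    [
        ("rs0001", "trait_x", "POP_A", "A", "published GWAS (placeholder)"),
        ("rs0002", "trait_x", "POP_B", "T", "published GWAS (placeholder)"),
        ("rs0003", "trait_x", "POP_A", "G", "published GWAS (placeholder)"),
    ],
)


def variants_for(connection, trait, ancestry):
    """Return (variant_id, risk_allele) rows recorded for one trait and ancestry."""
    cursor = connection.execute(
        "SELECT variant_id, risk_allele FROM trait_variant "
        "WHERE trait = ? AND ancestry = ? ORDER BY variant_id",
        (trait, ancestry),
    )
    return cursor.fetchall()


pop_a_variants = variants_for(conn, "trait_x", "POP_A")
```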

Selecting one or more reference genetic variants or ancestry-specific genetic variants

In some embodiments, reference genetic variants or ancestry-specific genetic variants are used to calculate the GRS of the individual. In some cases, the one or more genetic variants comprise reference genetic variants from a subject group of any ancestry. In some embodiments, the subject group comprises individuals of one or more ancestries, the ancestries comprising Japanese, German, Irish, African, South African, English, Mexican, Italian, Polish, French, Native American, Scottish, Dutch, Norwegian, Scotch-Irish, Swedish, Puerto Rican, Russian, Hispanic, French Canadian, Filipino, Korean, North Korean, Indonesian, Chinese, Malaysian, Afro-Caribbean, Caucasian, American Indian/Alaska Native (including persons of Central and South American origin with tribal affiliation), Pacific Islander (including Hawaii, Guam, Samoa, etc.), South Asian (including persons from Afghanistan, India, Pakistan, Bangladesh, Sri Lanka, and Nepal), Thai, and Indigenous Australian (Aboriginal, Torres Strait Islander). In some cases, the one or more reference genetic variants comprise ancestry-specific genetic variants, which are derived from a subject group comprising individuals of the same ancestry as the individual.

In some cases, the reference genetic variants are selected at least in part because they are derived from a subject group of the same ancestry as the individual (ancestry-specific genetic variants). In some cases, the ancestry of the individual is determined by analyzing the individual's genotype using the methods disclosed herein. In some cases, the ancestry-specific genetic variants are selected from the trait-associated variant database disclosed herein.

In some cases, an ancestry-specific genetic variant corresponds to an individual-specific genetic variant within the individual's genotype. In some cases, the corresponding individual-specific genetic variant is unknown, in which case another genetic variant is selected to serve as a proxy for the unknown individual-specific genetic variant.

Selecting proxy genetic variants

In some embodiments, a proxy genetic variant is used to calculate the GRS when an individual-specific genetic variant is unknown. In some cases, a predetermined genetic variant is selected to serve as the proxy. In some embodiments, disclosed herein are methods of predetermining a proxy genetic variant corresponding to an unknown individual-specific genetic variant, the method comprising: (i) providing unphased genotype data from an individual; (ii) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the ancestry of the individual; (iii) imputing individual-specific genotypes that are not present in the individual-specific phased haplotypes, using phased haplotype data from a reference group having the same ancestry as the individual; and (iv) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant associated with the likelihood that the individual has, or will develop, a specific trait.

In some cases, the method comprises selecting an indel (insertion/deletion) to serve as a proxy for an unknown individual-specific indel. In some cases, the method comprises selecting a copy number variant (CNV) to serve as a proxy for an unknown individual-specific CNV.
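The final selection step can be sketched as a lookup against an ancestry-specific LD table. This is a simplified illustration, not the disclosed procedure: phasing and imputation are assumed to have been performed upstream, the LD table is assumed to hold precomputed r2 values for the individual's assigned ancestry, and the identifiers, the default threshold argument, and the function name are hypothetical.

```python
def select_variants(individual_genotype, trait_variants, ld_table, min_r2=0.70):
    """Map each trait-associated variant to the variant actually used for scoring.

    individual_genotype: variant ids with a known (typed or imputed) genotype.
    trait_variants: variant ids associated with the trait in this ancestry.
    ld_table: {trait_variant: [(candidate_id, r2), ...]} for the same ancestry.
    """
    selected = {}
    for variant in trait_variants:
        if variant in individual_genotype:
            selected[variant] = variant  # observed directly, no proxy needed
            continue
        # Candidate proxies must be genotyped in the individual and pass the LD cutoff.
        candidates = [
            (r2, candidate)
            for candidate, r2 in ld_table.get(variant, [])
            if candidate in individual_genotype and r2 >= min_r2
        ]
        if candidates:
            selected[variant] = max(candidates)[1]  # strongest LD wins
    return selected


# Hypothetical data, for illustration only.
individual_genotype = {"rs0001": "AC", "rs0010": "GG", "rs0011": "CT"}
trait_variants = ["rs0001", "rs0002", "rs0003"]
ld_table = {
    "rs0002": [("rs0010", 0.92), ("rs0011", 0.75)],
    "rs0003": [("rs0012", 0.95), ("rs0011", 0.40)],
}
chosen = select_variants(individual_genotype, trait_variants, ld_table)
```

In this example, rs0003 is left unscored because its only genotyped candidate falls below the threshold; how such variants are handled is a design choice that the sketch leaves open.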

As used herein, "linkage disequilibrium" or "LD" refers to the non-random association of a risk unit with a genetic risk variant in a given population. LD can be defined by a D' value, which corresponds to the difference between the observed and expected frequencies of the risk units in the population (D = Pab - PaPb), scaled by the theoretical maximum value of D. LD can also be defined by an r² value, which corresponds to the same difference (D = Pab - PaPb), scaled by the individual frequencies at the different loci. In some embodiments, D' is at least 0.20. In some embodiments, r² is at least 0.70. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1. In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. LD differs among subject populations of different lineages. In a non-limiting example, an SNV that is in LD with a proxy SNV in a subject population of Chinese individuals is not necessarily in LD with it in a subject population of Caucasian individuals. Therefore, predetermining proxy genetic variants on the basis of lineage-specific phased haplotype data improves the accuracy of genetic risk predictions that rely at least in part on proxies.
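The two LD measures described above can be computed from haplotype and allele frequencies. The sketch below uses the standard population-genetics definitions of D, D' and r²; the text only describes the scaling in words, so the exact formulas, the function name `ld_measures`, and the example frequencies are assumptions.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: observed frequency of the haplotype carrying both risk units
    p_a, p_b: frequencies of each risk unit at its own locus
    """
    d = p_ab - p_a * p_b  # D = Pab - PaPb
    # Theoretical maximum of |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r^2 scales D^2 by the allele frequencies at both loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical frequencies: D = 0.15, D' = 0.6, r^2 = 0.36 (up to rounding).
d, d_prime, r2 = ld_measures(p_ab=0.40, p_a=0.50, p_b=0.50)
passes_thresholds = d_prime >= 0.20 and r2 >= 0.70  # False: r^2 is too low
```

Under these definitions a pair of variants can clear a D' threshold of 0.20 while failing an r² threshold of 0.70, so the two criteria are not interchangeable.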

Calculating genetic risk scores

In some embodiments, provided are methods of calculating an individual's genetic risk score (GRS) based on the individual's lineage. The genetic variants disclosed herein include SNVs, indels, and/or CNVs. Each genetic variant comprises a risk unit used to calculate the GRS. In some cases, the risk unit within an SNV comprises a risk allele. In some cases, the risk unit within an indel comprises an insertion or a deletion. In some cases, the risk unit within a CNV comprises an increase or decrease in the copy number of a gene or gene segment relative to the wild-type copy number. Those skilled in the art will understand that many methods of calculating a GRS can be used to calculate an individual's GRS in accordance with the methods and systems of the present invention.

In some embodiments, disclosed herein are methods of calculating an individual's GRS. In some cases, the risk units within SNVs (e.g., risk alleles), indels (e.g., insertions or deletions), and/or CNVs (e.g., copy numbers) can be assigned arbitrary numerical values. In a non-limiting example of calculating a GRS involving SNVs, a genotype homozygous for the risk allele within an SNV (RR) is assigned a value of 2; a genotype heterozygous for the risk allele within the SNV (R) is assigned a value of 1; and a non-risk genotype (N) is assigned a value of 0. Next, the values for all individual SNVs corresponding to lineage-specific SNVs are added together and divided by the total number of genetic variants used in the model to generate the individual's raw score (individual raw score). The same calculation is performed for each individual belonging to a subject group, yielding a range of raw scores (observed range). In some cases, the subject group comprises individuals of the same lineage as the individual. Next, the individual raw score is compared with the observed range to calculate a risk percentile relative to the subject population.
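The additive calculation above can be sketched in a few lines. The genotype codes RR, R and N follow the text; the function names, the example genotypes, and the percentile convention (share of the subject group scoring at or below the individual) are assumptions, since the text does not fix a particular percentile definition.

```python
from bisect import bisect_right
from typing import Sequence

# Values assigned to each genotype, as in the example above.
GENOTYPE_VALUE = {"RR": 2, "R": 1, "N": 0}

def raw_score(genotypes: Sequence[str]) -> float:
    """Sum the genotype values and divide by the number of variants in the model."""
    return sum(GENOTYPE_VALUE[g] for g in genotypes) / len(genotypes)

def risk_percentile(individual_score: float, group_scores: Sequence[float]) -> float:
    """Percent of the subject group with a raw score at or below the individual's."""
    ordered = sorted(group_scores)
    return 100.0 * bisect_right(ordered, individual_score) / len(ordered)

# Hypothetical four-SNV model and a subject group of the same lineage.
individual = ["RR", "R", "N", "R"]
subject_group = [
    ["N", "N", "N", "N"],
    ["R", "N", "N", "N"],
    ["RR", "RR", "R", "N"],
    ["R", "R", "N", "N"],
]

score = raw_score(individual)                           # 1.0
observed_range = [raw_score(g) for g in subject_group]  # [0.0, 0.25, 1.25, 0.5]
percentile = risk_percentile(score, observed_range)     # 75.0
```

In practice the subject group would be far larger than four individuals; the tiny group here only keeps the arithmetic easy to follow.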

In another non-limiting example of calculating a GRS involving SNVs, the allelic odds ratios (ORs) of each lineage-specific SNV selected as corresponding to an individual-specific SNV are provided and multiplied together. In some cases, the ORs are obtained from replicated, published, and/or peer-reviewed GWAS. In some cases, the OR of each lineage-specific SNV selected as corresponding to an individual-specific SNV is provided. The genotypic ORs of each lineage-specific SNV are then added together, and the genotypic ORs of the individual are multiplied together. The genotypic ORs of the individual and of the subject group are compared, and a percentile GRS is calculated.
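One common way to combine odds ratios across variants is a per-allele multiplicative model, sketched below. This is an assumption rather than the method prescribed by the text, which mentions both adding and multiplying ORs without giving a formula; the function names and the OR values are hypothetical.

```python
import math
from typing import Sequence

def genotype_or(risk_allele_counts: Sequence[int], allelic_ors: Sequence[float]) -> float:
    """Multiply allelic ORs across variants, once per risk allele carried.

    Assumes a multiplicative per-allele model: a variant with 0 risk alleles
    contributes a factor of 1, one with 2 risk alleles contributes OR squared.
    """
    return math.prod(o ** n for n, o in zip(risk_allele_counts, allelic_ors))

def percentile_rank(value: float, group_values: Sequence[float]) -> float:
    """Percent of the subject group whose combined OR is at or below value."""
    return 100.0 * sum(v <= value for v in group_values) / len(group_values)

# Hypothetical allelic ORs, e.g. taken from a published GWAS for this lineage.
allelic_ors = [1.2, 1.5, 2.0]
individual_or = genotype_or([2, 1, 0], allelic_ors)  # 1.2**2 * 1.5 = about 2.16

group_ors = [genotype_or(c, allelic_ors) for c in ([0, 0, 0], [1, 0, 0], [0, 1, 1], [2, 2, 2])]
percentile_grs = percentile_rank(individual_or, group_ors)
```

Summing `n * math.log(o)` over the variants and exponentiating gives the same product and is numerically safer when the model contains many variants.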

In another non-limiting example of calculating a GRS involving indels, a genotype homozygous for the insertion within an indel (II) is assigned a value of 2; a genotype heterozygous for the insertion within the indel (I) is assigned a value of 1; and a non-risk genotype (N) is assigned a value of 0. Next, the values for all individual indels corresponding to lineage-specific indels are added together and divided by the total number of genetic variants used in the model to generate the individual's raw score (individual raw score). The same calculation is performed for each individual belonging to a subject group, yielding a range of raw scores (observed range). In some cases, the subject group comprises individuals of the same lineage as the individual. Next, the individual raw score is compared with the observed range to calculate a risk percentile relative to the subject population.

In another non-limiting example of calculating a GRS involving indels, the odds ratios (ORs) of each lineage-specific indel selected as corresponding to an individual-specific indel are provided and multiplied together. In some cases, the ORs are obtained from replicated, published, and/or peer-reviewed GWAS. In some cases, the OR of each lineage-specific indel selected as corresponding to an individual-specific indel is provided, and the ORs of each risk indel allele are multiplied together to generate a genotypic OR for each subject in the subject group. Next, the same calculation is performed for the individual to generate the individual's genotypic OR. The genotypic ORs of the individual and of the subject group are compared, and a percentile GRS is calculated.

In a non-limiting example of calculating a GRS involving CNVs, a non-risk genotype (e.g., a copy number identical to that of the wild type or of a normal control) is assigned a value of 0, a genotype consisting of 1 CNV is assigned a value of 1, and a genotype consisting of 2 CNVs is assigned a value of 2. Next, the values for all individual CNVs corresponding to lineage-specific CNVs are added together and divided by the total number of genetic variants used in the model to generate the individual's raw score (individual raw score). The same calculation is performed for each individual belonging to a subject group, yielding a range of raw scores (observed range). In some cases, the subject group comprises individuals of the same lineage as the individual. Next, the individual raw score is compared with the observed range to calculate a risk percentile relative to the subject population.

In another non-limiting example of calculating a GRS involving CNVs, the odds ratios (ORs) of each lineage-specific CNV selected as corresponding to an individual-specific CNV are provided and multiplied together. In some cases, the ORs are obtained from replicated, published, and/or peer-reviewed GWAS. In some cases, the OR of each lineage-specific CNV selected as corresponding to an individual-specific CNV is provided, and the ORs of each CNV are multiplied together to generate a genotypic OR for each subject in the subject group. Next, the same calculation is performed for the individual to generate the individual's genotypic OR. The genotypic ORs of the individual and of the subject group are compared, and a percentile GRS is calculated.

In some embodiments, disclosed herein are methods, media, and systems for calculating a genetic risk score (GRS) using the methods disclosed above, wherein the calculation involves one or more SNVs and one or more CNVs; one or more SNVs and one or more indels; one or more CNVs and one or more indels; or one or more SNVs, one or more CNVs, and one or more indels.
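A model mixing variant types can reuse the additive scheme by first converting each call to a risk-unit value. The sketch below is one possible arrangement: the SNV and indel genotype codes follow the examples above, while the CNV codes ("CC", "C"), the function name, and the example calls are hypothetical.

```python
from typing import Sequence, Tuple

# Risk-unit values per variant type. SNV and indel codes follow the text;
# the CNV codes are invented here: "CC" = 2 CNVs, "C" = 1 CNV, "N" = non-risk.
RISK_UNIT_VALUE = {
    "SNV": {"RR": 2, "R": 1, "N": 0},
    "indel": {"II": 2, "I": 1, "N": 0},
    "CNV": {"CC": 2, "C": 1, "N": 0},
}

def mixed_raw_score(calls: Sequence[Tuple[str, str]]) -> float:
    """Additive raw score over (variant_type, genotype_code) pairs."""
    total = sum(RISK_UNIT_VALUE[vtype][code] for vtype, code in calls)
    return total / len(calls)

# Hypothetical model with one SNV, one indel and two CNVs.
calls = [("SNV", "RR"), ("indel", "I"), ("CNV", "N"), ("CNV", "CC")]
mixed_score = mixed_raw_score(calls)  # (2 + 1 + 0 + 2) / 4 = 1.25
```

Keeping the value table separate from the scoring function means a further variant type only needs a new table entry.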

Phenotypic traits

Most phenotypic traits and complex diseases result from a combination of genetic and environmental factors, each of which increases or decreases susceptibility to developing the phenotypic trait. The ability to predict whether an individual has or will develop a phenotypic trait is useful for a variety of purposes, including but not limited to selecting a treatment regimen for the individual, managing the individual's diet, and recommending products (e.g., skin care, hair care, cosmetics, supplements, vitamins, exercise, etc.).

The terms "phenotypic trait" and "specific phenotypic trait" are used interchangeably herein to refer to an observable characteristic of an individual caused at least in part by the individual's genotype. The genetic risk prediction methods, media, and systems disclosed herein quantify the burden of genetic variation in an individual's genotype by analyzing the number and type of genetic variants relative to a reference population. The number and type of genetic variants present in a sample obtained from the individual can indicate whether the individual's likelihood (or risk) of developing a given phenotypic trait is increased or decreased. In some cases, the specific phenotypic trait adversely affects the individual's health or wellness. In some embodiments, disclosed herein are methods, systems, and media for recommending behavioral changes to prevent, mitigate, or ameliorate the adverse effects of a specific phenotypic trait in an individual.

Aspects disclosed herein provide methods and systems for calculating a genetic risk score (GRS) representing the likelihood that an individual will develop a specific phenotypic trait. The GRS is based on one or more genetic variants present in the individual's genome or genotype. In some embodiments, the one or more genetic variants are detected in a sample obtained from the individual using the methods disclosed herein. In some embodiments, the one or more genetic variants comprise SNVs, indels, and/or CNVs. In some embodiments, the one or more genetic variants present in the individual's genotype are associated with an increased likelihood that the individual has or will develop the specific phenotypic trait. In some embodiments, the one or more genetic variants present in the individual's genotype are associated with a decreased likelihood that the individual has or will develop the specific phenotypic trait. In some embodiments, the phenotypic trait comprises a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, a nutrition trait, or a mental trait.

Clinical and subclinical traits

In some embodiments, the clinical trait comprises a disease or condition, or a subclinical trait of the disease or condition. In some embodiments, the clinical trait comprises a diagnosable disease or condition. In some embodiments, the subclinical trait comprises a sub-diagnosable disease, a condition, or another phenotype associated with a disease or condition. In some embodiments, the disease or condition comprises a deficiency disease, a hereditary disease, or a psychological disease. In some embodiments, the disease or condition comprises an immune disease and/or a metabolic disease, cataract risk, glaucoma risk, joint inflammation risk, kidney stone risk, overall inflammation risk, pelvic floor dysfunction, inflammatory biomarkers (CRP, ESR, IL18), age-related cognitive decline, age-related hearing loss, vitiligo, or risk of elevated homocysteine. Non-limiting examples include insomnia risk, kidney stone risk, and periodontitis. In some embodiments, the immune disease comprises an autoimmune disease or disorder. Non-limiting examples of autoimmune diseases or disorders include Graves' disease, Hashimoto's thyroiditis, systemic lupus erythematosus (lupus), multiple sclerosis, rheumatoid arthritis, inflammatory bowel disease, Crohn's disease, ulcerative colitis, and cancer. Non-limiting examples of metabolic diseases or conditions include type 1 diabetes, type 2 diabetes, diseases affecting the absorption of macronutrients (e.g., amino acids, carbohydrates, or lipids), diseases affecting the absorption of micronutrients (e.g., vitamins or minerals), diseases affecting mitochondrial function, diseases affecting liver function (e.g., non-alcoholic fatty liver disease), and diseases affecting kidney function. Subclinical traits can include sub-diagnosable conditions or disorders associated with the diseases or conditions disclosed herein.

Skin traits

In some embodiments, the phenotypic trait comprises a trait associated with the individual's skin (a skin trait). In some embodiments, the skin trait comprises collagen breakdown rate. Collagen breakdown rate can be influenced by genetic variations within genes encoding the collagen-degrading enzymes MMP, MMP-3, and MMP-1. Non-limiting examples of genetic variations within genes encoding collagen-degrading enzymes include the single nucleotide variants (SNVs) disclosed in Table 1.

Table 1

In some embodiments, the skin trait comprises dryness. Genetic variations within the gene encoding aquaporin 3 can affect skin hydration and, in turn, skin dryness. Non-limiting examples of genetic variations within the gene encoding aquaporin 3 include the SNVs disclosed in Table 2.

Table 2

In some embodiments, the skin trait comprises antioxidant deficiency of the skin. Antioxidant deficiency of the skin can be influenced by genetic variations within genes encoding NQO1, SOD2, NFE2L2, GPX1, and/or CAT. Non-limiting examples of genetic variations within genes encoding NQO1, SOD2, NFE2L2, GPX1, and CAT include the SNVs disclosed in Table 3.

Table 3

In some embodiments, the skin trait comprises impaired skin detoxification. Skin detoxification capacity can be influenced by genetic variations within genes encoding LOC157273, SGOL1, TBC1D22B, FST, MIR4432, RNASEH2C, and/or TGFB2. Non-limiting examples of genetic variations within genes encoding LOC157273, SGOL1, TBC1D22B, FST, MIR4432, RNASEH2C, and TGFB2 include the SNVs disclosed in Table 4.

Table 4

In some embodiments, the skin trait comprises skin glycation. Glycation can be influenced by genetic variations within genes encoding SLC24A5, SLC45A2, BCN2, MC1R, C16orf55, SPATA33, ASIP, RALY, and/or NAT2. Non-limiting examples of genetic variations within genes encoding SLC24A5, SLC45A2, BCN2, MC1R, C16orf55, SPATA33, ASIP, RALY, and NAT2 include the SNVs disclosed in Table 5.

Table 5

In some embodiments, the skin trait comprises pigmented spots. Pigmented spots of the skin can be influenced by genetic variations within genes encoding SEC5L1, IRF4, MC1R, SLC45A2, TYR, NTM, ASIP, and RALY. Non-limiting examples of genetic variations within genes encoding SEC5L1, IRF4, MC1R, SLC45A2, TYR, NTM, ASIP, and RALY include the SNVs disclosed in Table 6.

Table 6

In some embodiments, the skin trait comprises youthfulness. As disclosed herein, "youthfulness" refers to a skin quality characterized by a slow rate of aging, or by an appearance younger than the individual's actual age. Youthfulness can be influenced by genetic variations within the gene encoding EDEM1. Non-limiting examples of genetic variations within the gene encoding EDEM1 include the SNVs disclosed in Table 7. In some embodiments, youthfulness refers to a skin quality characterized by a rate of aging that is slower by 1 month, 2 months, 3 months, 4 months, 5 months, 6 months, 7 months, 8 months, 9 months, 10 months, 11 months, 12 months, 1 year, 2 years, 3 years, 4 years, or 5 years than the rate of aging of an individual who does not express the SNVs disclosed in Table 7.

Table 7

In some embodiments, the skin trait comprises photoaging. As disclosed herein, "photoaging" refers to damage to the skin caused by ultraviolet radiation, which is a major cause of premature aging. Photoaging can be influenced by genetic variations within genes encoding MC1R, NTM, TYR, FBXO40, STXBP5L, ASIP, RALY, FANCA, and ID4-RPL29P17. Non-limiting examples of genetic variations within genes encoding MC1R, NTM, TYR, FBXO40, STXBP5L, ASIP, RALY, FANCA, and ID4-RPL29P17 include the SNVs disclosed in Table 8.

Table 8

In some embodiments, the skin trait comprises dermal sensitivity. As disclosed herein, "dermal sensitivity" refers to genetic variations that can lead to skin barrier defects and promote skin sensitivity and irritation. Dermal sensitivity can be influenced by genetic variations within genes encoding RNASEH2C, DDB2, C11orf49, SELL, TGFB2, SGOL1, ERI1, LOC157273, MFHAS1, MIR597, MIR4660, PPP1R3B, U6, TNKS, BC017578, TBC1D22B, AL833181, BCL11A, JB153659, PAPOLG, MIR4432, and Mir_562. Non-limiting examples of genetic variations within genes encoding RNASEH2C, DDB2, C11orf49, SELL, TGFB2, SGOL1, ERI1, LOC157273, MFHAS1, MIR597, MIR4660, PPP1R3B, U6, TNKS, BC017578, TBC1D22B, AL833181, BCL11A, JB153659, PAPOLG, MIR4432, and Mir_562 include the SNVs disclosed in Table 9.

Table 9

In some embodiments, the skin trait comprises sun sensitivity. Sun sensitivity refers to the susceptibility of some skin types to damage from moderate sun exposure. Sun sensitivity can be influenced by genetic variations within genes encoding NTM, TYR, and MC1R. Non-limiting examples of genetic variations within genes encoding NTM, TYR, and MC1R include the SNVs disclosed in Table 10.

Table 10

Physical exercise traits

In some embodiments, disclosed herein are physical exercise traits comprising traits related to an individual's physical fitness (fitness traits). In some embodiments, the fitness trait comprises exercise aversion. "Exercise aversion" refers to avoiding and/or disliking exercise. Exercise aversion can be influenced by genetic variations within genes encoding PAPSS2, C18orf2, DNAPTP6, TMEM18, LEP, and MC4R. Non-limiting examples of genetic variations within genes encoding PAPSS2, C18orf2, DNAPTP6, TMEM18, LEP, and MC4R include the single nucleotide variants (SNVs) disclosed in Table 11.

Table 11

In some embodiments, the fitness trait comprises aerobic capacity. Aerobic capacity can be influenced by genetic variations within genes encoding TSHR, ACSL1, PRDM1, DBX1, GRIN3A, ESRRB, ZIC4, and CDH13. Non-limiting examples of genetic variations within the genes TSHR, ACSL1, PRDM1, DBX1, GRIN3A, ESRRB, ZIC4, and CDH13 include the SNVs disclosed in Table 12.

Table 12

In some embodiments, the fitness trait comprises difficulty losing weight. Difficulty losing weight can be influenced by genetic variations within genes encoding FTO, TMEM18, MC4R, KCTD15, CHST8, PPARG, NEGR1, IRS1, SFRS10, ETV5, DGKG, ATP2A1, SH2B1, BDNF, SEC16B, RASAL2, NOS1AP, AIF1, NCR3, MSRA, TNKS, SPRY2, SH3PXD2B, NEURL1B, BCDIN3D, FAIM2, CHRNA9, RBM47, RGMA, MCTP2, MIR4275, PCDH7, TENM2, PRR16, FTMT, SLC24A5, SDCCAG8, COL25A1, NEURL1B, SH3PXD2B, ERBB4, MIR4776-2, STXBP6, NOVA1, DEFB112, TFAP2D, EEF1A1P11-LOC105378866, MTIF3-RNU6-63P, NRXN3, CEP120, and/or LOC105378866-RN7SL831P. Non-limiting examples of genetic variations within genes encoding FTO, TMEM18, MC4R, KCTD15, CHST8, PPARG, NEGR1, IRS1, SFRS10, ETV5, DGKG, ATP2A1, SH2B1, BDNF, SEC16B, RASAL2, NOS1AP, AIF1, NCR3, MSRA, TNKS, SPRY2, SH3PXD2B, NEURL1B, BCDIN3D, FAIM2, CHRNA9, RBM47, RGMA, MCTP2, MIR4275, PCDH7, TENM2, PRR16, FTMT, SLC24A5, SDCCAG8, COL25A1, NEURL1B, SH3PXD2B, ERBB4, MIR4776-2, STXBP6, NOVA1, DEFB112, TFAP2D, EEF1A1P11-LOC105378866, MTIF3-RNU6-63P, NRXN3, CEP120, and/or LOC105378866-RN7SL831P include the SNVs disclosed in Table 13.

Table 13

In some embodiments, the fitness trait comprises endurance. Endurance can be influenced by genetic variations within genes encoding PPARGC1A, PPAR-a, TSHR, ESRRB, and/or CDH13. Non-limiting examples of genetic variations within genes encoding PPARGC1A, PPAR-a, TSHR, ESRRB, and CDH13 include the SNVs disclosed in Table 14.

Table 14

In some embodiments, the fitness trait comprises strength. Strength can be influenced by genetic variations within genes encoding TSHR, ESRRB, and/or CDH13. Non-limiting examples of genetic variations within genes encoding TSHR, ESRRB, and CDH13 include the SNVs disclosed in Table 15.

Table 15

In some embodiments, the fitness trait comprises fitness benefit. "Fitness benefit" refers to an individual having certain genetic variations that lead to faster and greater benefits from exercise, whereas with other genetic variations the benefits may take longer to appear and be less pronounced. Fitness benefit can be influenced by genetic variations within genes encoding KLKB1, F12, CETP, APOE, APOC1, EDN1, SORT1, PLA2G7, LPL, LIPC, GALNT2, SCARB1, LIPG, MS4A4E, ABCA1, TMEM49, LOC101928635, MVK, MMAB, FLJ41733, FADS1, RREB1, COL8A1, and/or GCKR. Non-limiting examples of genetic variations within genes encoding KLKB1, F12, CETP, APOE, APOC1, EDN1, SORT1, PLA2G7, LPL, LIPC, GALNT2, SCARB1, LIPG, MS4A4E, ABCA1, TMEM49, LOC101928635, MVK, MMAB, FLJ41733, FADS1, RREB1, COL8A1, and GCKR include the SNVs disclosed in Table 16.

Table 16

In some embodiments, the fitness trait comprises heart rate reduction in response to exercise (e.g., recovery rate). Heart rate reduction in response to exercise can be influenced by genetic variations within genes encoding RBPMS, PIWIL1, OR6N2, ERBB4, CREB1, MAP2, and/or IKZF2. Non-limiting examples of genetic variations within genes encoding RBPMS, PIWIL1, OR6N2, ERBB4, CREB1, MAP2, and IKZF2 include the SNVs disclosed in Table 17.

Table 17

In some embodiments, the fitness trait comprises lean body mass. Lean body mass can be influenced by genetic variations within genes encoding TRHR, DARC, GLYAT, FADS1, and/or FADS2. Non-limiting examples of genetic variations within genes encoding TRHR, DARC, GLYAT, FADS1, and FADS2 include the SNVs disclosed in Table 18.

Table 18

In some embodiments, the fitness trait comprises muscle soreness. Muscle soreness can be influenced by genetic variations within genes encoding CD163L1, DARC, CD163, ABO, CRP, CD163, CADM3, CR1, NRNR, NINJ1, CFH, DARC, CPN1, CSF1, HBB, CCL2, and/or IGF2. Non-limiting examples of genetic variations within genes encoding CD163L1, DARC, CD163, ABO, CRP, CD163, CADM3, CR1, NRNR, NINJ1, CFH, DARC, CPN1, CSF1, HBB, CCL2, and IGF2 include the SNVs disclosed in Table 19.

Table 19

In some embodiments, the fitness trait comprises muscle injury risk. "Muscle injury" here refers to a predisposition toward an increased risk of muscle injury. Muscle injury risk can be influenced by genetic variations within genes encoding IGF-II, MLCK, ACTN3, IL-6, and/or COL5A1. Non-limiting examples of genetic variations within genes encoding IGF-II, MLCK, ACTN3, IL-6, and COL5A1 include the SNVs disclosed in Table 20.

Table 20

In some embodiments, the fitness trait comprises impaired muscle repair. Impaired muscle repair can be influenced by genetic variations within genes encoding HCP5, HCG26, MICB, ATP6V1G2, and/or DDX39B. Non-limiting examples of genetic variations within genes encoding HCP5, HCG26, MICB, ATP6V1G2, and DDX39B include the SNVs disclosed in Table 21.

Table 21

In some embodiments, the fitness trait comprises stress fracture risk. Stress fracture risk can be influenced by genetic variations within genes encoding LOC101060363-LOC105376856, ZBTB40, EN1, FLJ42280, COLEC10, WNT16, ESR1, ATP6V1G1, CLDN14, ESR1FABP3P2, ADAMTS18, SOST, CLDN14, MEF2C, KCNH1, C6orf97, CKAP5, C17orf53, SOST, TNFRSF11A, LOC105373519-LOC728815, PTCH1, SMOC1, LOC646794-LOC101928765, and LOC105377045-MRPS31P1. Non-limiting examples of genetic variations within genes encoding LOC101060363-LOC105376856, ZBTB40, EN1, FLJ42280, COLEC10, WNT16, ESR1, ATP6V1G1, CLDN14, ESR1FABP3P2, ADAMTS18, SOST, CLDN14, MEF2C, KCNH1, C6orf97, CKAP5, C17orf53, SOST, TNFRSF11A, LOC105373519-LOC728815, PTCH1, SMOC1, LOC646794-LOC101928765, and LOC105377045-MRPS31P1 include the SNVs disclosed in Table 22.

Table 22

In some embodiments, physical fitness traits include overall injury risk. Overall injury risk can be influenced by genetic variations within the genes encoding HAO1, RSPO2, EMC2, EIF3E, CCDC91, PTHLH, LOC100506393, LINC00536, EIF3H, CDC5L, SUPT3H, and/or MIR4642. Non-limiting examples of genetic variations within the genes encoding HAO1, RSPO2, EMC2, EIF3E, CCDC91, PTHLH, LOC100506393, LINC00536, EIF3H, CDC5L, SUPT3H, and MIR4642 include the SNVs disclosed in Table 23.

表23Table 23

在一些实施方案中,身体素质性状包括静息代谢心率受损。静息代谢心率受损可受FTO编码的基因内的遗传变异的影响。编码FTO的基因内的遗传变异的非限制性示例包括在表24中公开的SNV。In some implementations, physical fitness traits include impaired resting metabolic heart rate. Impaired resting metabolic heart rate can be influenced by genetic variations within the gene encoding FTO. Non-limiting examples of genetic variations within the gene encoding FTO include the SNVs disclosed in Table 24.

表24Table 24

营养性状Nutritional traits

在一些实施方案中,本文公开了营养性状,包括维生素缺乏、矿物质缺乏、抗氧化剂缺乏、代谢失衡、代谢受损、代谢敏感性、过敏、饱腹感和/或健康饮食的有效性。In some implementations, this document discloses nutritional traits, including vitamin deficiency, mineral deficiency, antioxidant deficiency, metabolic imbalance, metabolic impairment, metabolic sensitivity, allergies, satiety, and/or the effectiveness of a healthy diet.

In some embodiments, nutritional traits include vitamin deficiency. In some cases, vitamin deficiency includes a deficiency of vitamin A, vitamin B1, vitamin B2, vitamin B3, vitamin B5, vitamin B6, vitamin B7, vitamin B8, vitamin B9, vitamin B12, vitamin C, vitamin D, vitamin E, or vitamin K. Vitamin deficiency can be influenced by genetic variations within the genes encoding GC, FUT2, HAAO, BCMO1, ALPL, CYP2R1, MS4A3, FFAR4, TTR, CUBN, FUT6, ZNF259, LOC100128347, APOA5, SIK3, BUD13, ZNF259, APOA5, BUD13, KYNU, NBPF3, TCN1, CYP4F2, PDE3B, CYP2R1, CALCA, CALCP, OR7E41P, APOA5, CLYBL, NADSYN1, DHCR7, SCARB1, RNU7-49P, COPB1, RRAS2, PSMA1, PRELID2, CYP2R1, PDE3B, CALCA, CALCP, OR7E41P, MUT, ZNF259, CTNAA2, CDO1, SLC23A1, KCNK9, CYP4F2, LOC729645, ZNF259, BUD13, ST6GALNAC3, NKAIN3, VDAC1P12, RASIP1, MYT1L, PAX3, NPY, ADCYAP1R1, HSF5, RNF43, MTMR4, TMEM215-ASS1P12, FAM155A, CD44, BRAF, CD4, LEPREL2, GNB3, MKLN1, SLC6A1, PRICKLE2, SVCT1, and/or SVCT2. Non-limiting examples of genetic variations within the genes encoding GC, FUT2, HAAO, BCMO1, ALPL, CYP2R1, MS4A3, FFAR4, TTR, CUBN, FUT6, ZNF259, LOC100128347, APOA5, SIK3, BUD13, ZNF259, APOA5, BUD13, KYNU, NBPF3, TCN1, CYP4F2, PDE3B, CYP2R1, CALCA, CALCP, OR7E41P, APOA5, CLYBL, NADSYN1, DHCR7, SCARB1, RNU7-49P, COPB1, RRAS2, PSMA1, PRELID2, CYP2R1, PDE3B, CALCA, CALCP, OR7E41P, MUT, ZNF259, CTNAA2, CDO1, SLC23A1, KCNK9, CYP4F2, LOC729645, ZNF259, BUD13, ST6GALNAC3, NKAIN3, VDAC1P12, RASIP1, MYT1L, PAX3, NPY, ADCYAP1R1, HSF5, RNF43, MTMR4, TMEM215-ASS1P12, FAM155A, CD44, BRAF, CD4, LEPREL2, GNB3, MKLN1, SLC6A1, PRICKLE2, SVCT1, and SVCT2 include the SNVs listed in Table 25.

表25Table 25

In some embodiments, nutritional traits include mineral deficiency. In some cases, mineral deficiency includes a deficiency of calcium, iron, magnesium, zinc, and/or selenium. In some cases, mineral deficiency can be influenced by genetic variations within the genes encoding CASR, TF, TFR2, SCAMP5, PPCDC, ARSB, BHMT2, DMGDH, ATP2B1, DCDC5, TRPM6, SHROOM3, CYP24A1, BHMT, BHMT2, JMY, TMPRSS6, GCKR, KIAA0564, DGKH, HFE, GATA3, VKORC1L1, MDS1, MUC1, CSTA, JMY, HOMER1, MAX, FNTB, SLC36A4, CCDC67, MIR379, FGFR2, LUZP2, PAPSS2, HOXD9, LOC102724653-IGLV4-60, HOOK3, FNTA, MEOX2, LOC101928964, PRPF8, MGC14376, SMYD4, SERPINF2, SERPINF1, WDR81, MIR4778, MEIS1-AS3, PRDM9, CALCOCO1, HOXC13, GPR39, SLC22A16, CDK19, TMOD1, TXNRD1, NFYB, MYOM2, CSMD1, KBTBD11, ARHGEF10, DYNC2H1, DCUN1D5, PDGFD, PRMT7, SERPINF2, WDR81, CRMP1, FLJ46481, KHDRBS2-LOC100132056, CD109, LOC100616530, SLC16A7, FLRT2, KYNU, ARHGAP15, RARB, C3orf58, PLOD2, RPRM, GALNT13, EPHA6, RGS14, SLC34A1, SLC22A18, PHLDA2, CDKN1C, NAP1L4, LOC101929578, ZNF14, ZNF101, ATP13A1, PYGB, CHD5, SDCCAG8, XDH, SRD5A2, CMYA5, RP11-314C16.1, TFAP2A, PTPRN2, CA1, KNOP1P1, RNU7-14P-LOC107987283, FNDC4, IFT172, GCKR, C2orf16, CBLB, LINC00882, LOC107983965, MIR4790, AC069277.1, IRX2, C5orf38, ZNF521, SS18, ATG4C, LPHN2, TTLL7, SAG, DGKD, RN7SKP61-MRPS17P3, GPBP1, STXBP6, NOVA1, TMEM211, and/or MT2A. Non-limiting examples of genetic variations within the genes encoding CASR, TF, TFR2, SCAMP5, PPCDC, ARSB, BHMT2, DMGDH, ATP2B1, DCDC5, TRPM6, SHROOM3, CYP24A1, BHMT, BHMT2, JMY, TMPRSS6, GCKR, KIAA0564, DGKH, HFE, GATA3, VKORC1L1, MDS1, MUC1, CSTA, JMY, HOMER1, MAX, FNTB, SLC36A4, CCDC67, MIR379, FGFR2, LUZP2, PAPSS2, HOXD9, LOC102724653-IGLV4-60, HOOK3, FNTA, MEOX2, LOC101928964, PRPF8, MGC14376, SMYD4, SERPINF2, SERPINF1, WDR81, MIR4778, MEIS1-AS3, PRDM9, CALCOCO1, HOXC13, GPR39, SLC22A16, CDK19, TMOD1, TXNRD1, NFYB, MYOM2, CSMD1, KBTBD11, ARHGEF10, DYNC2H1, DCUN1D5, PDGFD, PRMT7, SERPINF2, WDR81, CRMP1, FLJ46481, KHDRBS2-LOC100132056, CD109, LOC100616530, SLC16A7, FLRT2, KYNU, ARHGAP15, RARB, C3orf58, PLOD2, RPRM, GALNT13, EPHA6, RGS14, SLC34A1, SLC22A18, PHLDA2, CDKN1C, NAP1L4, LOC101929578, ZNF14, ZNF101, ATP13A1, PYGB, CHD5, SDCCAG8, XDH, SRD5A2, CMYA5, RP11-314C16.1, TFAP2A, PTPRN2, CA1, KNOP1P1, RNU7-14P-LOC107987283, FNDC4, IFT172, GCKR, C2orf16, CBLB, LINC00882, LOC107983965, MIR4790, AC069277.1, IRX2, C5orf38, ZNF521, SS18, ATG4C, LPHN2, TTLL7, SAG, DGKD, RN7SKP61-MRPS17P3, GPBP1, STXBP6, NOVA1, TMEM211, and MT2A include the SNVs listed in Table 26.

表26Table 26

In some embodiments, nutritional traits include antioxidant deficiency. In some cases, antioxidant deficiency includes a deficiency of glutathione and/or coenzyme Q10 (CoQ10). Antioxidant deficiency can be influenced by genetic variations within the genes encoding GGT1, GGTLC2, MYL2, C12orf27, HNF1A, OAS1, C14orf73, ZNF827, RORA, EPHA2, RSG1, MICAL3, DPM3, EFNA1, PKLR, GCKR, C2orf16, NEDD4L, MYO1B, STAT4, CCBL2, PKN2, SLC2A2, ITGA1, DLG5, FUT2, ATP8B1, EFHD1, CDH6, CD276, FLJ37644, SOX9, DDT, DDTL, GSTT1, GSTT2B, MIF, MLIP, MLXIPL, DYNLRB2, CEPT1, DENND2D, COLEC12, LOC101927479-ARHGEF19, LOC105377979, MMP26, DNM1, LUZP1, ADH5P2-LOC553139, FST, MIR4708-LOC105370537, LOC105373450-KCNS3, LOC107984041-GRIK2, LINC01520, and/or NQO1. Non-limiting examples of genetic variations within the genes encoding GGT1, GGTLC2, MYL2, C12orf27, HNF1A, OAS1, C14orf73, ZNF827, RORA, EPHA2, RSG1, MICAL3, DPM3, EFNA1, PKLR, GCKR, C2orf16, NEDD4L, MYO1B, STAT4, CCBL2, PKN2, SLC2A2, ITGA1, DLG5, FUT2, ATP8B1, EFHD1, CDH6, CD276, FLJ37644, SOX9, DDT, DDTL, GSTT1, GSTT2B, MIF, MLIP, MLXIPL, DYNLRB2, CEPT1, DENND2D, COLEC12, LOC101927479-ARHGEF19, LOC105377979, MMP26, DNM1, LUZP1, ADH5P2-LOC553139, FST, MIR4708-LOC105370537, LOC105373450-KCNS3, LOC107984041-GRIK2, LINC01520, and NQO1 include the SNVs listed in Table 27.

表27Table 27

In some embodiments, nutritional traits include metabolic imbalance. In some cases, metabolic imbalance includes glucose imbalance. Metabolic imbalance can be influenced by genetic variations within the genes encoding G6PC2, MTNR1B, GCK, ADCY5, MADD, ADRA2A, GCKR, MRPL33, ABCB11, FADS1, PCSK1, CRY2, ARAP1, SIX2, SIX3, PPP1R3B, SLC2A2, GLIS3, DPYSL5, SLC30A8, PROX1, CDKN2A, CDKN2B, FOXA2, TMEM195, DGKB, PDK1, RAPGEF4, PDX1, CDKAL1, KANK1, IGF1R, C2CD4B, LEPR, GRB10, LMO1, RREB1, FBXL10, and/or FOXN3. Non-limiting examples of genetic variations within the genes encoding G6PC2, MTNR1B, GCK, ADCY5, MADD, ADRA2A, GCKR, MRPL33, ABCB11, FADS1, PCSK1, CRY2, ARAP1, SIX2, SIX3, PPP1R3B, SLC2A2, GLIS3, DPYSL5, SLC30A8, PROX1, CDKN2A, CDKN2B, FOXA2, TMEM195, DGKB, PDK1, RAPGEF4, PDX1, CDKAL1, KANK1, IGF1R, C2CD4B, LEPR, GRB10, LMO1, RREB1, FBXL10, and FOXN3 include the SNVs listed in Table 28.

表28Table 28

在一些实施方案中,营养性状包括代谢受损。在一些情况下,代谢代谢受损包括咖啡因和/或药物的代谢受损。代谢受损可受到编码MTNR1B、CACNA2D3、NEDD4L、AC105008.1、P2RY2、RP11-479A21.1、MTUS2、PRIMA1和/或RP11-430J3.1的基因内的遗传变异的影响。编码MTNR1B、CACNA2D3、NEDD4L、AC105008.1、P2RY2、RP11-479A21.1、MTUS2、PRIMA1和RP11-430J3.1的基因内的遗传变异的非限制性示例包括表29中列出的SNV。In some implementations, nutritional traits include impaired metabolism. In some cases, impaired metabolism includes impaired metabolism of caffeine and/or drugs. Impaired metabolism can be affected by genetic variations within the genes encoding MTNR1B, CACNA2D3, NEDD4L, AC105008.1, P2RY2, RP11-479A21.1, MTUS2, PRIMA1, and/or RP11-430J3.1. Non-limiting examples of genetic variations within the genes encoding MTNR1B, CACNA2D3, NEDD4L, AC105008.1, P2RY2, RP11-479A21.1, MTUS2, PRIMA1, and RP11-430J3.1 include the SNVs listed in Table 29.

表29Table 29

在一些实施方案中,营养性状包括代谢敏感性。在一些情况下,代谢敏感性包括麸质敏感性、对盐的敏感性、聚糖敏感性和/或乳糖敏感性。代谢敏感性可受到编码PIBF1、IRAK1BP1、PRMT6、CDCA7、NOTCH4、HLA-DRA、BTNL2、ARSJ、CSMD1、ALX4、NSUN3、RAB9BP1、GPR65、C15orf32、TSN、CREB1和/或ARMC9的基因内的遗传变异的影响。编码PIBF1、IRAK1BP1、PRMT6、CDCA7、NOTCH4、HLA-DRA、BTNL2、ARSJ、CSMD1、ALX4、NSUN3、RAB9BP1、GPR65、C15orf32、TSN、CREB1和ARMC9的基因内的遗传变异的非限制性示例包括表30中列出的SNV。In some implementations, nutritional traits include metabolic sensitivity. In some cases, metabolic sensitivity includes gluten sensitivity, salt sensitivity, polysaccharide sensitivity, and/or lactose sensitivity. Metabolic sensitivity can be affected by genetic variations within the genes encoding PIBF1, IRAK1BP1, PRMT6, CDCA7, NOTCH4, HLA-DRA, BTNL2, ARSJ, CSMD1, ALX4, NSUN3, RAB9BP1, GPR65, C15orf32, TSN, CREB1, and/or ARMC9. Non-limiting examples of genetic variations within the genes encoding PIBF1, IRAK1BP1, PRMT6, CDCA7, NOTCH4, HLA-DRA, BTNL2, ARSJ, CSMD1, ALX4, NSUN3, RAB9BP1, GPR65, C15orf32, TSN, CREB1, and ARMC9 include the SNVs listed in Table 30.

表30Table 30

In some embodiments, nutritional traits include food allergy. In some embodiments, food allergy includes peanut allergy. Peanut allergy can be influenced by genetic variations within the genes encoding HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DQA2, HCG27, HLA-C, ADGB, RPS15P9, MUM1, RYR1, LINC00992, LOC100129526, FAM118A, SMC1B, MIATNB, ATP2C2, PLAGL1, MRPL42, and/or STAT6. Non-limiting examples of genetic variations within the genes encoding HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DQA2, HCG27, HLA-C, ADGB, RPS15P9, MUM1, RYR1, LINC00992, LOC100129526, FAM118A, SMC1B, MIATNB, ATP2C2, PLAGL1, MRPL42, and STAT6 include the SNVs listed in Table 31.

表31Table 31

在一些实施方案中,营养性状包括饱腹感。饱腹感可受到编码LEPR的基因内的遗传变异的影响。编码LEPR的基因内的遗传变异的非限制性示例包括表32中列出的SNV。In some implementations, the nutritional trait includes satiety. Satiety can be influenced by genetic variations within the gene encoding LEPR. Non-limiting examples of genetic variations within the gene encoding LEPR include the SNVs listed in Table 32.

表32Table 32

The effectiveness of a healthy diet can be influenced by genetic variations within the genes encoding FGF21, ZPR1, TANK, FNBP1, RNU6-229P-LOC105375346, ARGFX, BEND3, SUMO2P6-LOC105377740, LOC101929216-GDF10, LOC105377451-LOC105377622, CPA3, KCNQ3, THBS4, TENM2, HSPA9P2-LOC105372045, LINC00113-LINC00314, SH3BGRL2, NKAIN2, OPRM1, LOC105377795, NCALD, LOC728503, LOC105370491, LOC107985318-MIA3, BECN1P2-LYPLA1P3, LOC105376778-LINC01082, SOX5, LHX5-AS1-LOC105369990, NBAS, ABCG2, PPARγ2, CLOCK, RARB, FTO, IRS1, TCF7L2, HNMT, and/or PFKL. Non-limiting examples of genetic variations within the genes encoding FGF21, ZPR1, TANK, FNBP1, RNU6-229P-LOC105375346, ARGFX, BEND3, SUMO2P6-LOC105377740, LOC101929216-GDF10, LOC105377451-LOC105377622, CPA3, KCNQ3, THBS4, TENM2, HSPA9P2-LOC105372045, LINC00113-LINC00314, SH3BGRL2, NKAIN2, OPRM1, LOC105377795, NCALD, LOC728503, LOC105370491, LOC107985318-MIA3, BECN1P2-LYPLA1P3, LOC105376778-LINC01082, SOX5, LHX5-AS1-LOC105369990, NBAS, ABCG2, PPARγ2, CLOCK, RARB, FTO, IRS1, TCF7L2, HNMT, and PFKL include the SNVs listed in Table 33.

表33Table 33

Allergic traits

在一些实施方案中,本文公开了过敏性状。在一些实施方案中,过敏性状包括皮肤过敏、灰尘过敏、昆虫叮咬过敏、宠物过敏、眼睛过敏、药物过敏、乳胶过敏、霉菌过敏和/或有害生物过敏。在一些实施方案中,过敏性状包括过敏性炎症。本文所用的“过敏性炎症”是指由过敏反应引起的炎症或与过敏反应相关的炎症。In some embodiments, allergic traits are disclosed herein. In some embodiments, allergic traits include skin allergies, dust allergies, insect bite allergies, pet allergies, eye allergies, drug allergies, latex allergies, mold allergies, and/or pest allergies. In some embodiments, allergic traits include allergic inflammation. As used herein, "allergic inflammation" refers to inflammation caused by or associated with an allergic reaction.

In some embodiments, nutritional traits include allergic inflammation. In some cases, allergic inflammation can be influenced by genetic variations within the genes encoding FCER1A, LRRC32, C11orf30, IL13, OR10J3, HLA-A, STAT6, TSLP, SLC25A46, WDR36, CAMK4, HLA-DQB1, HLA-DQA1, STAT6, NAB2, DARC, IL18R1, IL1RL1, IL18RAP, FAM114A1, MIR574, TLR10, TLR1, TLR6, LPP, BCL6, MYC, PVT1, IL2, ADAD1, KIAA1109, IL21, the HLA region, TMEM232, SLCA25A46, HLA-DQA2, HLA-G, MICA, HLA-C, HLA-B, MICB, HLA-DRB1, IL4R, ID2, LOC730217, OPRK1, WWP2, EPS15, ANAPC1, LPP, LOC101927026, IL4R, IL21R, SUCLG2, TMEM108, DNAH5, OR6X1, DOCK10, ABL2, COL21A1, and/or CDH13. Non-limiting examples of genetic variations within the genes encoding FCER1A, LRRC32, C11orf30, IL13, OR10J3, HLA-A, STAT6, TSLP, SLC25A46, WDR36, CAMK4, HLA-DQB1, HLA-DQA1, STAT6, NAB2, DARC, IL18R1, IL1RL1, IL18RAP, FAM114A1, MIR574, TLR10, TLR1, TLR6, LPP, BCL6, MYC, PVT1, IL2, ADAD1, KIAA1109, IL21, the HLA region, TMEM232, SLCA25A46, HLA-DQA2, HLA-G, MICA, HLA-C, HLA-B, MICB, HLA-DRB1, IL4R, ID2, LOC730217, OPRK1, WWP2, EPS15, ANAPC1, LPP, LOC101927026, IL4R, IL21R, SUCLG2, TMEM108, DNAH5, OR6X1, DOCK10, ABL2, COL21A1, and CDH13 include the SNVs listed in Table 34.

表34Table 34

在一些实施方案中,过敏性状包括有害生物过敏。在一些实施方案中,有害生物过敏包括对螨虫过敏。对螨虫过敏可受到编码LOC730217、OPRK1、OR6X1、DOCK10、CDH13、CapS、IL4、ADAM33、IRS2、ABHD13、LINC00299、IL18、CYP2R1和/或VDR的基因内的遗传变异的影响。编码LOC730217、OPRK1、OR6X1、DOCK10、CDH13、Cap S、IL4、ADAM33、IRS2、ABHD13、LINC00299、IL18、CYP2R1和VDR的基因内的遗传变异的非限制性示例包括表35中列出的SNV。In some embodiments, the allergic trait includes pest allergy. In some embodiments, pest allergy includes mite allergy. Mite allergy can be affected by genetic variations within genes encoding LOC730217, OPRK1, OR6X1, DOCK10, CDH13, CapS, IL4, ADAM33, IRS2, ABHD13, LINC00299, IL18, CYP2R1, and/or VDR. Non-limiting examples of genetic variations within genes encoding LOC730217, OPRK1, OR6X1, DOCK10, CDH13, CapS, IL4, ADAM33, IRS2, ABHD13, LINC00299, IL18, CYP2R1, and VDR include the SNVs listed in Table 35.

表35Table 35

Mental traits

在一些实施方案中,本文公开了精神性状,其包括与个体的精神健康或精神敏锐度、精神疾病、精神状况有关的性状。精神健康或精神敏锐度的非限制性示例包括一定程度的压力、短期记忆保留、长期记忆保留、创造性或艺术性(例如,“右脑”)、分析性和有条不紊性(例如,“左脑”)。精神疾病的非限制性示例包括精神分裂症、双相情感障碍、躁狂抑郁障碍、自闭症谱系障碍和唐氏综合征。精神状况的非限制性示例包括抑郁风险、社交焦虑、内向的可能性、外向的可能性。精神性状的非限制性示例包括早起者、同理心、焦虑人格、数学能力、成瘾人格、记忆表现、OCD倾向、探究行为、阅读能力、体验性学习困难、一般创造力、一般智力、冲动性、注意力不集中症状、数学能力、心理反应时、音乐创造力、咬指甲、阅读和拼写困难、语言和数字推理以及发音错误。In some implementations, this document discloses mental traits, which include traits related to an individual's mental health or mental acuity, mental illness, or mental condition. Non-limiting examples of mental health or mental acuity include a degree of stress, short-term memory retention, long-term memory retention, creativity or artistry (e.g., "right brain"), analytical and orderly behavior (e.g., "left brain"). Non-limiting examples of mental illness include schizophrenia, bipolar disorder, manic-depressive disorder, autism spectrum disorder, and Down syndrome. Non-limiting examples of mental condition include risk of depression, social anxiety, likelihood of introversion, and likelihood of extroversion. Non-limiting examples of mental traits include early riser, empathy, anxious personality, mathematical ability, addictive personality, memory performance, OCD tendency, exploratory behavior, reading ability, experiential learning difficulties, general creativity, general intelligence, impulsivity, symptoms of inattention, mathematical ability, mental reaction time, musical creativity, nail biting, reading and spelling difficulties, verbal and numerical reasoning, and pronunciation errors.

In some embodiments, mental traits include memory performance. Memory performance can be influenced by genetic variations within the genes encoding APOC1, APOE, FASTKD2, MIR3130-1, MIR3130-2, SPOCK3, ANXA10, ISL1, PARP8, BAIAP2, HS3ST4, C16orf82, AJAP1, C1orf174, ODZ4, NARS2, PRR16, FTMT, PCDH20, TDRD3, LBXCOR1, MAP2K5, PTGER3, ZRANB2, AXUD1, TTC21A, GFRA2, DOK2, SLC39A14, PPP3CC, VPS26B, NCAPD3, ZNF236, MBP, RIN2, NAT5, SEMA5A, MTRR, DGKB, ETV1, BHLHB5, CYP7B1, TMEPAI, ZBP1, TBC1D1, KLHL1, DACH1, LRRTM4, C2orf3, B3GAT1, LOC89944, ATP8B4, SLC27A2, CHD6, EMILIN3, RWDD3, TMEM56, SCN1A, KIBRA, and/or NCAN. Non-limiting examples of genetic variations within the genes encoding APOC1, APOE, FASTKD2, MIR3130-1, MIR3130-2, SPOCK3, ANXA10, ISL1, PARP8, BAIAP2, HS3ST4, C16orf82, AJAP1, C1orf174, ODZ4, NARS2, PRR16, FTMT, PCDH20, TDRD3, LBXCOR1, MAP2K5, PTGER3, ZRANB2, AXUD1, TTC21A, GFRA2, DOK2, SLC39A14, PPP3CC, VPS26B, NCAPD3, ZNF236, MBP, RIN2, NAT5, SEMA5A, MTRR, DGKB, ETV1, BHLHB5, CYP7B1, TMEPAI, ZBP1, TBC1D1, KLHL1, DACH1, LRRTM4, C2orf3, B3GAT1, LOC89944, ATP8B4, SLC27A2, CHD6, EMILIN3, RWDD3, TMEM56, SCN1A, KIBRA, and NCAN include the SNVs listed in Table 36.

表36Table 36

在一些实施方案中,精神状况包括强迫症(OCD)倾向。OCD倾向可受到编码PTPRD、LOC646114、LOC100049717、FAIM2、AQP2、TXNL1、WDR7、CDH10、MSNL1、GRIK2、HACE1、DACH1、MZT1、DLGAP1、EFNA5和/或GRIN2B的基因内的遗传变异的影响。编码PTPRD、LOC646114、LOC100049717、FAIM2、AQP2、TXNL1、WDR7、CDH10、MSNL1、GRIK2、HACE1、DACH1、MZT1、DLGAP1、EFNA5和GRIN2B的基因内的遗传变异的非限制性示例包括表37中列出的SNV。In some implementations, the mental condition includes obsessive-compulsive disorder (OCD) predisposition. OCD predisposition can be influenced by genetic variations within the genes encoding PTPRD, LOC646114, LOC100049717, FAIM2, AQP2, TXNL1, WDR7, CDH10, MSNL1, GRIK2, HACE1, DACH1, MZT1, DLGAP1, EFNA5, and/or GRIN2B. Non-limiting examples of genetic variations within the genes encoding PTPRD, LOC646114, LOC100049717, FAIM2, AQP2, TXNL1, WDR7, CDH10, MSNL1, GRIK2, HACE1, DACH1, MZT1, DLGAP1, EFNA5, and GRIN2B include the SNVs listed in Table 37.

表37Table 37

Hair traits

In some embodiments, hair traits are disclosed herein. In some embodiments, hair traits include hair thickness, hair thinning, hair loss, baldness, oiliness, dryness, dandruff, pseudofolliculitis barbae (razor bumps), monilethrix, pili trianguli, pili torti (twisted hair), and/or hair volume. In some embodiments, the term "baldness" as used herein refers to androgenetic alopecia (AGA). In some embodiments, pili trianguli can be influenced by genetic variations within the genes encoding PADI3, TGM3, and/or TCHH. In some embodiments, pseudofolliculitis barbae can be influenced by genetic variations within the gene encoding K6HF. In some embodiments, monilethrix can be influenced by genetic variations within the genes encoding KRT81, KRT83, KRT86, and/or DSG4. In some embodiments, pili torti can be influenced by genetic variations within the gene encoding BCS1L. In some embodiments, baldness can be influenced by genetic variations within the genes encoding PAX1, TARDBP, HDAC4, HDAC9, AUTS2, MAPT-AS1, SPPL2C, SETBP1, GRID1, WNT10A, EBF1, SUCNR1, MBNL1, SSPN, ITPR2, AR, EDA2R, EDA2R, ICOS, CTLA4, IL2, IL21, ULBP3, ULBP6, STX17, IL2RA, PRDX5, IKZF4, and/or HLA-DQA2. Non-limiting examples of genetic variations that influence baldness include, but are not limited to, the SNVs listed in Table 38.

表38Table 38

Behavioral changes

本文公开的方面提供了用于至少部分地基于针对特定表型性状的遗传风险得分(GRS)向个体建议与该性状相关的行为改变的方法和系统。在一些情况中,向个体提供多个行为改变建议。在一些情况下,由个体提供个体的调查表,包括与感兴趣的特定表型性状有关的问题。在一些情况下,行为改变是基于针对性状的GRS和从个体那里收到的问题的答案。在一些情况下,行为改变包括增加、减少或避免活动。活动的非限制性示例包括但不限于包括体育锻炼、摄入一物质(例如,补充剂或药物)、接触产品(例如,烟雾、毒素、刺激物等)、使用产品(例如,护肤品、护发品、护甲品等)、饮食、生活方式、睡眠和消耗(例如,酒精、药物、咖啡因、过敏原、食物或一类食物的消耗)。在一些情况下,行为改变包括用于补救或防止特定表型性状的活动(用于例如,从事或不从事作为特定表型性状的发生原因或与特定表型性状的发生相关的活动)。The aspects disclosed herein provide methods and systems for recommending behavioral changes associated with a specific phenotypic trait to an individual, at least in part, based on a Genetic Risk Score (GRS) for that trait. In some cases, multiple behavioral change recommendations are provided to the individual. In some cases, the individual provides a questionnaire including questions related to the specific phenotypic trait of interest. In some cases, behavioral changes are based on the GRS for the trait and the answers to questions received from the individual. In some cases, behavioral changes include increasing, decreasing, or avoiding activities. Non-limiting examples of activities include, but are not limited to, physical exercise, ingestion of a substance (e.g., supplements or drugs), exposure to products (e.g., smoke, toxins, irritants, etc.), use of products (e.g., skin care products, hair care products, nail care products, etc.), diet, lifestyle, sleep, and consumption (e.g., consumption of alcohol, drugs, caffeine, allergens, food, or a class of foods). In some cases, behavioral changes include activities used to remedy or prevent a specific phenotypic trait (e.g., engaging in or refraining from activities that are a cause of or associated with the occurrence of the specific phenotypic trait).

The present disclosure provides, by way of non-limiting example, various recommendations for behavioral changes related to the specific phenotypic traits described herein. In some embodiments, an individual with a GRS indicating an increased likelihood of dry skin, as compared to a subject population, is advised to engage in activities that remedy and/or prevent dry skin (e.g., applying moisturizer daily). In some embodiments, an individual with a GRS indicating an increased likelihood of collagen breakdown, as compared to a subject population, is advised to engage in activities that remedy and/or prevent collagen breakdown (e.g., taking collagen supplements, using a particular product or device, or avoiding a particular product or device). In some embodiments, an individual with a GRS indicating an increased likelihood of exercise aversion, as compared to a subject population, is advised to engage in non-routine physical activities (e.g., hobbies such as rock climbing, hiking, or backpacking). In some embodiments, an individual with a GRS indicating an increased risk of muscle injury, as compared to a subject population, is advised to avoid activities (e.g., fitness training or extreme endurance events) in order to remedy or prevent muscle injury. In some embodiments, an individual with a GRS indicating an increased likelihood of stress fracture, as compared to a subject population, is advised to avoid activities (e.g., repetitive and/or high-impact activities such as running) in order to remedy or prevent stress fractures. In some embodiments, an individual with a GRS indicating an increased likelihood of poor alcohol metabolism, as compared to a subject population, is advised to avoid or reduce alcohol consumption. In some embodiments, the subject population is lineage-specific to the individual.
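The trait-to-recommendation pairings described above can be illustrated with a small lookup. This is only a sketch: the trait keys, the `RECOMMENDATIONS` table, and the `recommend` function are hypothetical names introduced for illustration, and the suggestion strings merely paraphrase the examples given in the text. How a GRS is judged to indicate an "increased likelihood" relative to the subject population is not specified here, so it is taken as a boolean input.

```python
# Hypothetical lookup table; entries paraphrase the non-limiting examples in the text.
RECOMMENDATIONS = {
    "dry_skin": "Apply moisturizer daily.",
    "collagen_breakdown": "Consider taking a collagen supplement.",
    "exercise_aversion": "Try non-routine physical activities such as rock climbing or hiking.",
    "muscle_injury": "Avoid fitness training and extreme endurance events.",
    "stress_fracture": "Avoid repetitive, high-impact activities such as running.",
    "poor_alcohol_metabolism": "Avoid or reduce alcohol consumption.",
}


def recommend(increased_likelihood: dict[str, bool]) -> list[str]:
    """Return suggestions for every trait flagged as having increased likelihood.

    `increased_likelihood` maps a trait key to True when the individual's GRS
    indicates an increased likelihood relative to the subject population.
    Traits without a known suggestion are skipped.
    """
    return [
        RECOMMENDATIONS[trait]
        for trait, flagged in increased_likelihood.items()
        if flagged and trait in RECOMMENDATIONS
    ]
```

For example, calling `recommend({"dry_skin": True, "muscle_injury": False})` yields only the dry-skin suggestion.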

Report

In some embodiments, disclosed herein are reports, such as health reports. Non-limiting examples of such reports are provided in Figures 6A-6F and 7A-7D. The reports are generated using the methods and systems described herein and provide the individual with the results of a lineage-specific genetic risk score (GRS) analysis of the individual's genotype for one or more of the specific phenotypic traits described herein. In some cases, the report includes recommendations for the individual, such as behavioral changes or product recommendations based on the individual's GRS.

In some embodiments, the report presents the results of the GRS analysis as a range of risk (e.g., normal to high) of developing or having the specific phenotypic trait of interest, relative to a reference population. In some cases, the reference population consists of individuals of the same lineage as the individual. In other cases, the reference population is not lineage-specific to the individual. Generally, a "normal" result indicates that the individual is not predisposed to developing or having the phenotypic trait. In contrast, a "high" result indicates that the individual is more likely than the reference population to develop or have the phenotypic trait. A "low" result indicates that the individual is predisposed not to have or develop the specific phenotypic trait. A "slightly high" or "slightly low" result indicates a score falling between a normal score and a high or low score, respectively.
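By way of non-limiting illustration, the mapping from a GRS to the five report labels might be sketched as follows. The percentile cut-offs in `CUTOFFS`, the function names, and the use of a percentile rank are assumptions made for this example only; the disclosure does not specify how the ranges are delimited.

```python
from bisect import bisect_right

# Hypothetical percentile cut-offs separating the five labels (assumed, not from the source).
CUTOFFS = [10, 35, 65, 90]
LABELS = ["low", "slightly low", "normal", "slightly high", "high"]


def percentile_rank(score: float, reference_scores: list[float]) -> float:
    """Percentage of reference scores that are less than or equal to `score`."""
    if not reference_scores:
        raise ValueError("reference population is empty")
    ordered = sorted(reference_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def risk_label(score: float, reference_scores: list[float]) -> str:
    """Map a GRS to a report label via its percentile in the reference population."""
    pct = percentile_rank(score, reference_scores)
    return LABELS[bisect_right(CUTOFFS, pct)]
```

Passing the GRS values of a same-lineage reference population as `reference_scores` would give the lineage-specific variant of the report; passing a mixed population would give the non-lineage-specific variant.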

In some cases, the reports described herein provide product recommendations based on the individual's GRS for a specific phenotypic trait. In a non-limiting example, an individual predisposed to premature collagen breakdown (e.g., scoring at the 50th percentile or higher) is recommended a product that reverses, halts, or prevents collagen breakdown, such as a collagen supplement. In various embodiments, the report also includes a hyperlink for the recommended product. The hyperlink directs the individual to an online resource related to the product, such as an online commerce platform where the product can be purchased, or to research articles or literature reviews related to the specific phenotype.
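A minimal sketch of the threshold-based product recommendation described above is given below. The `CATALOGUE` table, the `Recommendation` type, and the example URL are hypothetical; only the 50th-percentile threshold and the collagen supplement example come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    product: str
    url: str  # hyperlink embedded in the report (placeholder address)


# Hypothetical catalogue: trait -> (percentile threshold, recommendation).
CATALOGUE = {
    "collagen breakdown": (
        50.0,
        Recommendation("collagen supplement", "https://example.com/collagen-supplement"),
    ),
}


def recommend(trait: str, grs_percentile: float) -> Optional[Recommendation]:
    """Return a product recommendation if the individual's percentile meets the threshold."""
    entry = CATALOGUE.get(trait)
    if entry is None:
        return None  # no product is associated with this trait
    threshold, recommendation = entry
    return recommendation if grs_percentile >= threshold else None
```

Under these assumptions, `recommend("collagen breakdown", 62.0)` returns the collagen supplement entry, while a percentile below 50 returns `None`.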

In some embodiments, the reports disclosed herein provide the individual with GRS results for multiple specific phenotypic traits, such as those described herein. For example, in some cases, a single report includes results for one or more specific phenotypic traits related to one or more of skin, physical fitness, nutrition, and other categories, such as those shown in Figures 6A-6F and 7A-7D and described herein.

The report is formatted for delivery to the individual by any suitable method, including electronically or by mail. In some embodiments, the report is an electronic report. In some cases, the electronic report is formatted for transmission over a computer network to the individual's personal electronic device (e.g., a tablet, laptop, smartphone, or fitness tracking device). In some cases, the report is integrated into a mobile application on the personal electronic device. In some cases, the application is interactive and allows the individual to click hyperlinks embedded in the report, which automatically redirect the user to online resources. In some cases, the report is encrypted or otherwise secured to protect the individual's privacy. In some cases, the report is printed and mailed to the individual.

System

Aspects disclosed herein provide systems configured to implement the methods described in this disclosure, including, but not limited to, determining the likelihood that an individual has or will develop a specific phenotypic trait.

Figure 1 depicts an exemplary health reporting system that includes a computing device comprising at least one processor 104, 110, a memory, and a software program 118, the software program 118 including instructions executable by the at least one processor to assess the likelihood that an individual has or will develop a specific phenotypic trait. In some cases, the system includes a reporting module configured to generate a report of the individual's GRS. In some cases, the report includes recommendations for behavioral changes related to the specific phenotypic trait. In some cases, the system includes an output module configured to display the report to the individual. In some cases, the system includes a central processing unit (CPU), memory (e.g., random access memory, flash memory), an electronic storage unit, a software program, a communication interface for communicating with one or more other systems, or any combination thereof. In some cases, the system is coupled to a computer network, for example, the Internet, an intranet and/or extranet in communication with the Internet, or a telecommunications or data network. In some cases, the system is connected to a distributed ledger. In some cases, the distributed ledger includes a blockchain. In some embodiments, the system includes a storage unit for storing data and information relating to any aspect of the methods described in this disclosure. Various aspects of the system are products, articles, or articles of manufacture.

The exemplary health reporting system of Figure 1 features a software program comprising a sequence of instructions, executable by at least one processor, written to perform a specified task. In some embodiments, the computer-readable instructions are implemented as program modules, such as functions, objects, application programming interfaces (APIs), and data structures, that perform particular tasks or implement particular data types. In light of the disclosure provided herein, those skilled in the art will recognize that the software program may be written in various versions of various languages. In some embodiments, the software program 118 includes instructions executable by the at least one processor described herein. In some embodiments, the instructions include the steps of: (i) providing the genotype of the individual, the genotype including one or more individual-specific genetic variants; (ii) assigning a lineage 106 to the individual based, at least in part, on the genotype of the individual; (iii) using a trait-associated variant database 108, which includes lineage-specific genetic variants derived from subjects (a subject group) having the same lineage as the individual, to select one or more lineage-specific genetic variants based, at least in part, on the lineage of the individual, wherein each of the one or more lineage-specific genetic variants corresponds to: (1) an individual-specific genetic variant of the one or more individual-specific genetic variants, or (2) a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a subject population having the same lineage as the individual, and wherein each of the one or more lineage-specific genetic variants and each of the individual-specific genetic variants includes one or more risk units; and (iv) calculating a genetic risk score 112 for the individual based on the selected one or more lineage-specific genetic variants, wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific trait. In some embodiments, the software program 118 further includes instructions, executable by the at least one processor described herein, concerning predetermined genetic variants that are in LD with individual-specific genetic variants. In some cases, the software program includes instructions executable by the at least one processor to determine the predetermined genetic variant, the instructions including the steps of: (i) providing unphased genotype data from the individual; (ii) phasing the unphased genotype data to generate individual-specific phased haplotypes based on the lineage of the individual; (iii) imputing individual-specific genotypes that are absent from the individual-specific phased haplotypes, using phased haplotype data from a reference group having the same lineage as the individual; and (iv) selecting, from the imputed individual-specific genotypes, a genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant associated with the likelihood that the individual has or will develop the specific trait. In some embodiments, LD is defined by a D' value of at least about 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0. In some embodiments, LD is defined by an r² value of at least about 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, or 1.0.
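By way of non-limiting illustration, the LD measures and the scoring step recited above might be sketched as follows. The D' and r² formulas are the standard population-genetics definitions for two biallelic loci. The data layout (`TraitVariant`, genotypes as two-letter allele strings), the rule of one risk unit per copy of the risk allele, and the default r² threshold of 0.70 are assumptions made for this example, not the method of the disclosure.

```python
from dataclasses import dataclass, field


def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B in the same population
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


@dataclass
class TraitVariant:
    """One lineage-specific entry of a (hypothetical) trait-associated variant database."""
    variant_id: str
    risk_allele: str
    # proxy variant id -> (proxy risk allele, r^2 with this variant in the same lineage)
    proxies: dict = field(default_factory=dict)


def genetic_risk_score(genotype: dict, trait_variants: list, min_r2: float = 0.70) -> int:
    """Sum risk units (assumed: one per risk-allele copy) over the variants that can be matched."""
    total = 0
    for tv in trait_variants:
        if tv.variant_id in genotype:
            # Case (1): the trait variant was observed directly in the individual.
            total += genotype[tv.variant_id].count(tv.risk_allele)
            continue
        # Case (2): fall back to the observed proxy in strongest LD, if any passes the threshold.
        usable = [
            (r2, proxy_id, allele)
            for proxy_id, (allele, r2) in tv.proxies.items()
            if proxy_id in genotype and r2 >= min_r2
        ]
        if usable:
            _, proxy_id, allele = max(usable)
            total += genotype[proxy_id].count(allele)
    return total
```

Because allele frequencies and haplotype frequencies differ between populations, the same pair of variants can pass an LD threshold in one lineage and fail it in another, which is why the proxy table in this sketch is keyed to a single lineage. Note also that D' can equal 1 while r² remains well below 1 when the two allele frequencies differ.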

The functionality of the computer-readable instructions is combined or distributed as desired in various environments. In some cases, the software program comprises one sequence of instructions or multiple sequences of instructions. The software program may be provided from one location or from multiple locations. In some embodiments, the software program includes one or more software modules. In some embodiments, the computer program includes, in part or in whole, one or more web applications, one or more mobile applications, one or more standalone applications, one or more web browser plug-ins, extensions, add-ins, or add-ons, or combinations thereof.

Figure 1 depicts an exemplary health reporting system that includes a reporting module 114. The reporting module 114 described herein includes at least one processor configured to perform the task of generating a report that includes the calculated GRS of the individual, which indicates the likelihood that the individual has or will develop the specific phenotypic trait of interest. In some cases, the at least one processor is the same processor 118 described above, additionally configured to perform the report generation step. In some cases, the at least one processor includes a separate processor, for example in a dual-CPU arrangement. In some cases, the reporting module 114 is configured to retrieve one or more answers to one or more questions relating to the specific trait from a questionnaire that the individual has provided to the system. In some cases, the report also includes recommendations for trait-related behavioral changes based, at least in part, on the GRS. In some cases, the report generated by the reporting module 114 includes recommendations for behavioral changes related to the specific phenotypic trait of interest, the recommendations being based on the GRS for the trait and on the retrieved answers to the one or more questions relating to the trait.

In some embodiments, the exemplary health reporting system of Figure 1 includes an output module 116. The output module 116 described herein includes hardware, or a software program executable on a processor, configured to present the report to the individual. In some embodiments, the output module 116 includes a user interface, including a screen or other output display (e.g., a projector). In some embodiments, the output module 116 includes an email service capable of emailing an electronic version of the report to the individual to whom it pertains. In some embodiments, the output module 116 includes a user interface on a personal computing device such as a computer, smartphone, or tablet. In some embodiments, the personal computing device is connected remotely, via a computer network, to the system described herein. In some cases, the personal computing device belongs to the individual. In some embodiments, the personal electronic device is configured to run an application that communicates with the reporting module via a computer network in order to access the report.

Web Applications

In some embodiments, the software programs described herein include a web application. In light of the disclosure provided herein, those skilled in the art will recognize that a web application may utilize one or more software frameworks and one or more database systems. For example, a web application is created on a software framework such as .NET or Ruby on Rails (RoR). In some cases, a web application utilizes one or more database systems including, by way of non-limiting example, relational, non-relational, object-oriented, associative, and XML database systems. By way of non-limiting example, suitable relational database systems include SQL Server and MySQL. Those skilled in the art will also recognize that a web application may be written in one or more versions of one or more languages. In some embodiments, a web application is written in one or more markup languages, presentation definition languages, client-side scripting languages, server-side coding languages, database query languages, or combinations thereof. In some embodiments, a web application is written to some extent in a markup language such as Hypertext Markup Language (HTML), Extensible Hypertext Markup Language (XHTML), or Extensible Markup Language (XML). In some embodiments, a web application is written to some extent in a presentation definition language such as Cascading Style Sheets (CSS). In some embodiments, a web application is written to some extent in a client-side scripting language such as Asynchronous JavaScript and XML (AJAX), ActionScript, or JavaScript. In some embodiments, a web application is written to some extent in a server-side coding language such as Active Server Pages (ASP), Perl, Java, JavaServer Pages (JSP), Hypertext Preprocessor (PHP), Python, Ruby, Tcl, Smalltalk, or Groovy. In some embodiments, a web application is written to some extent in a database query language such as Structured Query Language (SQL). A web application may integrate enterprise server products such as Lotus. A web application may include a media player element. A media player element may utilize one or more of many suitable multimedia technologies including, by way of non-limiting example, HTML 5 and Java.

Mobile Applications

In some cases, the software programs described herein include a mobile application provided to a mobile digital processing device. The mobile application may be provided to the mobile digital processing device at the time the device is manufactured. The mobile application may also be provided to the mobile digital processing device via the computer network described herein.

Mobile applications are created by techniques known to those skilled in the art, using hardware, languages, and development environments known in the art. Those skilled in the art will recognize that mobile applications may be written in several languages. By way of non-limiting example, suitable programming languages include C, C++, C#, Objective-C, Java, JavaScript, Pascal, Object Pascal, Python, Ruby, VB.NET, WML, and XHTML/HTML with or without CSS, or combinations thereof.

Suitable mobile application development environments are available from several sources. Commercially available development environments include, by way of non-limiting example, AirplaySDK, alcheMo, Celsius, Bedrock, Flash Lite, .NET Compact Framework, Rhomobile, and WorkLight Mobile Platform. Other development environments are available without cost including, by way of non-limiting example, Lazarus, MobiFlex, MoSync, and PhoneGap. In addition, mobile device manufacturers distribute software development kits including, by way of non-limiting example, the iPhone and iPad (iOS) SDK, Android SDK, BREW SDK, OS SDK, Symbian SDK, webOS SDK, and Mobile SDK.

Those skilled in the art will recognize that several commercial forums are available for the distribution of mobile applications including, by way of non-limiting example, App Store, Android Market, App World, App Store for Palm devices, App Catalog for webOS, Marketplace for Mobile, Ovi Store for devices, Apps, and DSi Shop.

Standalone Applications

In some embodiments, the software programs described herein include a standalone application, which is a program that runs as an independent computer process rather than as an add-on to an existing process, i.e., not a plug-in. Those skilled in the art will recognize that standalone applications are sometimes compiled. In some cases, a compiler is a computer program that transforms source code written in a programming language into binary object code, such as assembly language or machine code. By way of non-limiting example, suitable compiled programming languages include C, C++, Objective-C, COBOL, Delphi, Eiffel, Java, Lisp, Perl, R, Python, Visual Basic, and VB.NET, or combinations thereof. Compilation is often performed, at least in part, to create an executable program. In some cases, a computer program includes one or more executable compiled applications.

Web Browser Plug-ins

In some embodiments, disclosed herein are software programs that, in some aspects, include a web browser plug-in. In computing, a plug-in is, in some cases, one or more software components that add specific functionality to a larger software application. Makers of software applications may support plug-ins to enable third-party developers to create capabilities that extend an application, to support easily adding new features, and to reduce the size of an application. When supported, plug-ins enable customization of the functionality of a software application. For example, plug-ins are commonly used in web browsers to play video, generate interactivity, scan for viruses, and display particular file types. Those skilled in the art will be familiar with several web browser plug-ins, including Player. A toolbar may include one or more web browser extensions, add-ins, or add-ons. A toolbar may include one or more browser bars, tool bands, or desk bands. Those skilled in the art will recognize that several plug-in frameworks are available that enable development of plug-ins in various programming languages including, by way of non-limiting example, C++, Delphi, Java, PHP, Python, and VB.NET, or combinations thereof.

In some embodiments, a web browser (also called an Internet browser) is a software application, designed for use with network-connected digital processing devices, for retrieving, presenting, and traversing information resources on the World Wide Web. By way of non-limiting example, suitable web browsers include Chrome, Opera, and KDE Konqueror. In some cases, the web browser is a mobile web browser. Mobile web browsers (also called microbrowsers, mini-browsers, and wireless browsers) may be designed for use on mobile digital processing devices including, by way of non-limiting example, handheld computers, tablet computers, netbook computers, subnotebook computers, smartphones, music players, personal digital assistants (PDAs), and handheld video game systems. By way of non-limiting example, suitable mobile web browsers include the RIM browser, Blazer, Basic Web, Opera Mobile, and the PSP browser.

Software Modules

The media, methods, and systems disclosed herein include one or more software, server, and database modules, or use of the same. In light of the disclosure provided herein, software modules may be created by techniques known to those skilled in the art, using machines, software, and languages known in the art. The software modules disclosed herein may be implemented in a multitude of ways. In some embodiments, a software module includes a file, a section of code, a programming object, a programming structure, or combinations thereof. A software module may include a plurality of files, a plurality of sections of code, a plurality of programming objects, a plurality of programming structures, or combinations thereof. By way of non-limiting example, the one or more software modules include a web application, a mobile application, and/or a standalone application. Software modules may reside in one computer program or application, or in more than one computer program or application. Software modules may be hosted on one machine, on more than one machine, or on a cloud computing platform. Software modules may be hosted on one or more machines in one location, or on one or more machines in more than one location.

Databases

The media, methods, and systems disclosed herein include one or more databases, such as the trait-associated variant database described herein, or use of the same. Those skilled in the art will recognize that many databases are suitable for storing and retrieving geographic profiles, operator activities, target zones, and/or contact information of concessionaires. By way of non-limiting example, suitable databases include relational databases, non-relational databases, object-oriented databases, object databases, entity-relationship model databases, associative databases, and XML databases. In some embodiments, a database is Internet-based. In some embodiments, a database is web-based. In some embodiments, a database is cloud computing-based. A database may be based on one or more local computer storage devices.

Data Transmission

The methods, systems, and media described herein are configured to be performed in one or more facilities at one or more locations. Facility locations are not limited by country and include any country or territory. In some cases, one or more steps of a method herein are performed in a different country than another step of the method. In some cases, one or more steps for obtaining a sample are performed in a different country than one or more steps for analyzing the genotype of the sample. In some embodiments, one or more method steps involving a computer system are performed in a different country than another step of the methods provided herein. In some embodiments, data processing and analysis are performed in a different country or location than one or more steps of the methods described herein. In some embodiments, one or more articles, products, or data are transferred from one or more facilities to one or more different facilities for analysis or further analysis. An article includes, but is not limited to, one or more components obtained from a sample of a subject, and any article or product disclosed herein as an article or product. Data includes, but is not limited to, information regarding genotype and any data produced by the methods disclosed herein. In some embodiments of the methods and systems described herein, an analysis is performed and a subsequent data transmission step conveys or transmits the results of the analysis.

In some embodiments, any step of any method described herein is performed by a software program or module on a computer. In additional or further embodiments, data from any step of any method described herein is transferred to and from facilities located within the same or different countries, including analysis performed in one facility at a particular location and data shipped to another location or directly to an individual within the same or a different country. In additional or further embodiments, data from any step of any method described herein is transferred to and/or received from a facility located within the same or different countries, including analysis of a data input (e.g., cellular material) performed in one facility at a particular location and the corresponding data transmitted to another location, or directly to an individual, such as data related to diagnosis, prognosis, or responsiveness to therapy, within the same or a different location or country.

Non-Transitory Computer-Readable Storage Medium

Aspects disclosed herein provide one or more non-transitory computer-readable storage media encoded with a software program that includes instructions executable by an operating system. In some embodiments, the encoded software includes one or more of the software programs described herein. In further embodiments, the computer-readable storage medium is a tangible component of a computing device. In still further embodiments, the computer-readable storage medium is optionally removable from the computing device. In some embodiments, a computer-readable storage medium includes, by way of non-limiting example, CD-ROMs, DVDs, flash memory devices, solid-state memory, magnetic disk drives, magnetic tape drives, optical disk drives, and cloud computing systems and services. In some cases, the programs and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.

Kits and articles of manufacture

In some embodiments, disclosed herein are compositions for detecting a genotype or biomarker in a sample obtained from a subject according to the methods described herein. Aspects disclosed herein provide compositions comprising a polynucleotide sequence comprising at least 10 but fewer than 50 contiguous nucleotides of one or more of SEQ ID NOS: 1-218, or the reverse complement thereof, wherein the contiguous polynucleotide sequence comprises a detectable molecule. In some embodiments, the polynucleotide sequence comprises the nucleobase at position 26 or 31 of one or more of SEQ ID NOS: 1-218. In various embodiments, the detectable molecule comprises a fluorophore. In other embodiments, the polynucleotide sequence further comprises a quencher.

In some embodiments, also disclosed herein are kits for detecting the genotypes described herein. In some embodiments, the kits disclosed herein can be used to predict whether an individual has, or will develop, a specific phenotypic trait. In some cases, the kits aid in diagnosing or prognosing a disease or condition in an individual. In some cases, the kits are useful for selecting a patient for treatment. In some cases, the kits are accompanied by product recommendations, such as supplements or over-the-counter medications. In some cases, the kits are accompanied by a recommendation to consult a physician or healthcare professional.

In some embodiments, a kit comprises the compositions described herein, which can be used to perform the methods of detecting a genotype described herein. A kit comprises an assemblage of materials or components that includes at least one of the compositions. In other embodiments, a kit contains all of the components necessary and/or sufficient to perform an assay for detecting a genotype, including all controls, instructions for performing the assay, and any necessary software for analyzing and presenting results. In some cases, the kits disclosed herein are suitable for assays such as PCR and qPCR. In some cases, the kit includes a genotyping chip that can be used when needed. The exact nature of the components configured in the kit depends on its intended purpose.

Instructions for use may be included in the kit. Optionally, the kit also contains other useful components, such as diluents, buffers, pharmaceutically acceptable carriers, syringes, catheters, applicators, pipetting or measuring tools, bandaging materials, or other useful supplies. The materials or components assembled in the kit can be provided to the practitioner stored in any convenient and suitable way that preserves their operability and utility. For example, the components can be in dissolved, dehydrated, or lyophilized form, and they can be provided at room, refrigerated, or frozen temperatures. The components are typically contained in suitable packaging materials. As used herein, the phrase "packaging material" refers to one or more physical structures used to house the contents of the kit, such as compositions and the like. The packaging material is constructed by well-known methods, preferably to provide a sterile, contaminant-free environment. The packaging materials employed in the kit are those customarily used in gene expression assays and in the administration of treatments. As used herein, the term "package" refers to a suitable solid matrix or material, such as glass, plastic, paper, foil, and the like, capable of holding the individual kit components. Thus, for example, a package can be a glass vial or a prefilled syringe used to contain a suitable quantity of a pharmaceutical composition. The packaging material has an external label that indicates the contents and/or purpose of the kit and its components.

Certain terminology

In the following description, certain specific details are set forth in order to provide a thorough understanding of various embodiments. However, one skilled in the art will understand that the embodiments provided may be practiced without these details. Unless the context requires otherwise, throughout the specification and claims that follow, the word "comprise" and variations thereof, such as "comprises" and "comprising," are to be construed in an open, inclusive sense, that is, as "including, but not limited to." As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. It should also be noted that the term "or" is generally employed in its sense including "and/or" unless the content clearly dictates otherwise. Further, the headings provided herein are for convenience only and do not interpret the scope or meaning of the claimed embodiments.

As used herein, the term "about" refers to an amount that is within about 10%, 5%, or 1% of the stated amount.

"Consisting essentially of," when used herein to define compositions and methods, shall mean excluding other elements of any essential significance to the combination for the stated purpose. Thus, a composition consisting essentially of the elements as defined herein would not exclude other materials or steps that do not materially affect the basic and novel characteristics of the claimed disclosure, such as compositions for treating skin disorders like acne, eczema, psoriasis, and rosacea.

The terms "increased" or "increase" are used herein to generally mean an increase by a statistically significant amount. In some embodiments, the terms "increased" or "increase" mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, or at least about 90%, or up to and including a 100% increase, or any increase between 10% and 100%, as compared to a reference level, standard, or control. Other examples of "increase" include an increase of at least 2-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 50-fold, at least 100-fold, or at least 1000-fold or more as compared to a reference level.

The terms "decreased" or "decrease" are used herein generally to mean a decrease by a statistically significant amount. In some embodiments, "decreased" or "decrease" means a reduction of at least 10% as compared to a reference level, for example a decrease of at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, or at least about 90%, or up to and including a 100% decrease (e.g., an absent or non-detectable level as compared to a reference level), or any decrease between 10% and 100%, as compared to a reference level. In the context of a marker or symptom, these terms mean a statistically significant decrease in such level. The decrease can be, for example, at least 10%, at least 20%, at least 30%, at least 40%, or more, and is preferably down to a level accepted as within the range of normal for an individual without the given disease.

"Ancestry," as disclosed herein, refers to the genetic lineage of an individual.

The term "genotype," as disclosed herein, refers to the chemical composition of polynucleotide sequences within the genome of an individual.

"Treatment," as used herein, refers to both therapeutic treatment and prophylactic or preventative measures, wherein the object is to prevent or slow down (lessen) the targeted condition, prevent the condition, pursue or obtain good overall results, or lower the chances of the individual developing the condition, even if the treatment is ultimately unsuccessful. In some aspects provided herein, subjects in need of treatment include those already with a disease or condition, as well as those prone to developing the disease or condition or those in whom the disease or condition is to be prevented. In some cases, the treatment comprises a supplement. Non-limiting examples of supplements include vitamins, minerals, antioxidants, probiotics, and anti-inflammatory agents. In some cases, the treatment comprises a drug therapy. In some cases, the drug therapy comprises an antibiotic, an antibody, or a small molecule compound targeting a gene disclosed herein or a gene expression product thereof.

"Genotype," as disclosed herein, refers to the chemical composition of polynucleotide sequences within the genome of an individual. In some embodiments, the genotype comprises an SNV, a single nucleotide polymorphism (SNP), an indel, and/or a CNV. The term "single nucleotide variant" or "single nucleotide variation" or SNV, as disclosed herein, refers to a variation in a single nucleotide within a polynucleotide sequence. The variation of an SNV may take multiple different forms. A single form of an SNV is referred to as an "allele." By way of example, a reference polynucleotide sequence reading 5' to 3' is TTACG. An SNV at position 3 of this sequence (5'-TTACG-3') comprises a substitution of the reference allele "A" with the non-reference allele "C." If the "C" allele of the SNV is associated with an increased probability of developing a phenotypic trait, the allele is considered a "risk" allele. However, the same SNV may also comprise a substitution of the "A" allele with a "T" allele. If the "T" allele of the SNV is associated with a decreased probability of developing a phenotypic trait, the allele is considered a "protective" allele. An SNV may include a single nucleotide polymorphism (SNP), which in some cases is an SNV observed in at least 1% of a given population. In some embodiments, the SNV is represented by an "rs" number, which refers to the accession of a reference cluster of one or more submitted SNVs in the dbSNP bioinformatics database, and which is characterized by a sequence comprising the total number of nucleobases from 5' to 3', including the submitted variation. In some embodiments, an SNV may be further defined by the position of the SNV (nucleobase) within the provided sequence, which position is always located at the 5' length of the sequence plus 1. In some embodiments, an SNV is defined by a genomic position in a reference genome and the allele change (e.g., from a G allele to an A allele at position 234,123,567 on chromosome 7 in reference human genome build 37). In some embodiments, the SNV is defined as the genomic position identified with a non-nucleotide letter or code (e.g., an IUPAC nucleotide code) in a sequence disclosed herein.
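The TTACG illustration above can be reproduced with a few lines of Python. This is only a sketch of the definition: the function name `apply_snv` and the use of 1-based positions are assumptions made for this illustration and are not part of the disclosure.

```python
def apply_snv(reference: str, position: int, alt: str) -> str:
    """Return `reference` with the base at a 1-based `position` replaced by `alt`."""
    if not 1 <= position <= len(reference):
        raise ValueError("position is outside the reference sequence")
    if len(alt) != 1:
        raise ValueError("an SNV changes exactly one nucleotide")
    return reference[: position - 1] + alt + reference[position:]


reference = "TTACG"                               # read 5' to 3'
risk_form = apply_snv(reference, 3, "C")          # A -> C, the "risk" allele in the text
protective_form = apply_snv(reference, 3, "T")    # A -> T, the "protective" allele

print(risk_form)        # TTCCG
print(protective_form)  # TTTCG
```

Both alleles occupy the same position, which is why one SNV can carry a risk form and a protective form at once.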

"Indel," as disclosed herein, refers to an insertion or a deletion of a nucleobase within a polynucleotide sequence. In some embodiments, the indel is represented by an "rs" number, which refers to the accession of a reference cluster of one or more submitted indels in the dbSNP bioinformatics database, and which is characterized by a sequence comprising the total number of nucleobases from 5' to 3', including the submitted variation. In some embodiments, an indel may be further defined by the position of the insertion/deletion within the provided sequence, which position is always located at the 5' length of the sequence plus 1. In some embodiments, an indel is defined by a genomic position in a reference genome and the allele change. In some embodiments, the indel is defined as the genomic position identified with a non-nucleotide letter or code (e.g., an IUPAC nucleotide code) in a sequence disclosed herein.

"Copy number variant" or "copy number variation" or "CNV," as disclosed herein, refers to a phenomenon in which sections of a polynucleotide sequence are repeated or deleted, with the number of repeats in the genome varying between individuals in a given population. In some embodiments, the section of the polynucleotide sequence is "short," comprising about two nucleotides (a dinucleotide CNV) or three nucleotides (a trinucleotide CNV). In some embodiments, the section of the polynucleotide sequence is "long," comprising any number of nucleotides between four nucleotides and the entire length of a gene.

Non-limiting examples of a "sample" include any material from which nucleic acids and/or proteins can be obtained. By way of non-limiting example, this includes whole blood, peripheral blood, plasma, serum, saliva, mucus, urine, semen, lymph, fecal extract, cheek swab, cells, or other bodily fluid or tissue, including but not limited to tissue obtained through surgical biopsy or surgical resection. In various embodiments, the sample comprises tissue from the large and/or small intestine. In various embodiments, the large intestine sample comprises the cecum, colon (the ascending colon, the transverse colon, the descending colon, and the sigmoid colon), rectum, and/or anal canal. In some embodiments, the small intestine sample comprises the duodenum, jejunum, and/or ileum. Alternatively, a sample can be obtained through primary patient-derived cell lines, or as an archived patient sample in the form of a preserved sample or a fresh frozen sample.

Examples

Example 1. Calculating an ancestry-specific genetic risk score for an individual, representing the likelihood that the individual will have better aerobic exercise capacity

First, a genotype of the individual is provided. The genotype of the individual can be in the format of an Illumina genotyping array. The genotype includes genetic risk variants specific to the individual (individual-specific genetic risk variants). The genetic risk variants may include single nucleotide variants (SNVs), single nucleotide polymorphisms (SNPs), indels, and/or copy number variants (CNVs). The Illumina genotyping array comprises nucleic acid probes specific to various SNVs, indels, SNPs, and/or CNVs. The genotype is analyzed using principal component analysis (PCA) to determine the ancestry of the individual, and the individual is determined to be of African ancestry.

Next, reference genetic variants are selected from genome-wide association studies (GWAS) of subjects with the same ancestry as the individual, as determined by PCA (e.g., African) (the ancestry-specific subject group). The ancestry-specific variants are located at reported susceptibility genetic loci for aerobic exercise capacity, comprising TSHR, ACSL1, PRDM1, DBX1, GRIN3A, ESRRB, ZIC4, and/or CDH13, and are selected based on a strong association (P = 1.0 × 10⁻⁴ or lower) between the ancestry-specific genetic variant and the aerobic exercise capacity trait. The variants are provided in Table 39.

Table 39

If an individual-specific genetic risk variant is unknown, meaning that the genotyping array identification number corresponding to the individual-specific genetic variant is not published in the GWAS described above, a proxy genetic variant is selected to serve as the basis for the genetic risk calculation. A proxy genetic variant is selected if it is in linkage disequilibrium (LD) with the unknown individual-specific genetic risk variant (an r² value of at least 0.70 or a D' value of at least about 0.20); this selection is also referred to as "imputation."
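The proxy-selection rule can be sketched as a simple filter over candidate variants. This is a minimal illustration under stated assumptions: the `LdPair` record, the `pick_proxy` function, and the variant identifiers are hypothetical, and preferring the candidate with the highest r² is a tie-breaking choice made here that the text does not specify. Only the two thresholds come from the text.

```python
from typing import Iterable, NamedTuple, Optional


class LdPair(NamedTuple):
    """LD statistics between one candidate proxy and the unknown variant."""
    proxy_id: str
    r2: float
    d_prime: float


def pick_proxy(candidates: Iterable[LdPair],
               min_r2: float = 0.70,
               min_d_prime: float = 0.20) -> Optional[LdPair]:
    """Return an eligible proxy (r2 >= min_r2 or D' >= min_d_prime), or None."""
    eligible = [c for c in candidates
                if c.r2 >= min_r2 or c.d_prime >= min_d_prime]
    if not eligible:
        return None
    # Assumption for this sketch: prefer the candidate with the highest r2.
    return max(eligible, key=lambda c: c.r2)


candidates = [
    LdPair("proxy_a", r2=0.55, d_prime=0.10),  # fails both thresholds
    LdPair("proxy_b", r2=0.82, d_prime=0.95),  # passes on r2
    LdPair("proxy_c", r2=0.40, d_prime=0.60),  # passes on D' only
]
best = pick_proxy(candidates)
print(best.proxy_id)  # proxy_b
```

In practice the LD statistics would be computed in a reference panel of the same ancestry as the individual, as the later examples state.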

Next, an individual-specific raw score is calculated. Numerical values are assigned to the units of risk (e.g., risk alleles) within each individual-specific genetic variant; the values for all individual-specific genetic variants are added together and divided by the total number of individual-specific genetic variants and/or proxy genetic variants to generate the individual-specific raw score.
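Read against the worked examples in this section, where the numerical value assigned to each variant is the number of risk alleles carried, the raw score can be written compactly. The symbols below are notation introduced here for illustration and do not appear in the original text:

```latex
S_{\text{raw}} = \frac{1}{N} \sum_{i=1}^{N} a_i, \qquad a_i \in \{0, 1, 2\}
```

Here $N$ is the number of individual-specific (or proxy) variants in the model and $a_i$ is the risk-allele count at variant $i$, so that $0 \le S_{\text{raw}} \le 2$ under this reading.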

Next, the same calculation is performed to generate a raw score for each individual within the ancestry-specific subject group, thereby generating an observed range of raw scores (the observed range). Next, the individual-specific raw score is compared with the ancestry-specific observed range to calculate a percentile of risk relative to the ancestry-specific subject population. Next, a genetic risk score (GRS) is assigned to the individual.

For example, calculating a GRS for an individual's aerobic exercise capacity composed of seven genetic variants, which in this example are SNPs (rs7144481 with risk allele C, rs6552828 with risk allele G, rs1049904 with risk allele A, rs10500872 with risk allele A, rs1535628 with risk allele G, rs1289359 with risk allele T, and rs1171582 with risk allele A), requires determining each genotype by actual genotyping or by imputation and calculating the average of the sum of all risk alleles. Thus, an individual with genotypes rs7144481 (CC), rs6552828 (AA), rs1049904 (GG), rs10500872 (AG), rs1535628 (AA), rs1289359 (CT), and rs1171582 (AA) has 2, 0, 0, 1, 0, 1, and 2 risk alleles, respectively, for a sum of 6 and an average genetic risk score of 0.86 (= 6/7; the risk alleles divided by the total number of variants making up the model). Table 40 provides an exemplary calculation according to this example.
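The arithmetic of this worked example can be checked with a short script. The rs identifiers, risk alleles, and genotypes are taken from the paragraph above; the function names `risk_allele_counts` and `raw_score` are illustrative choices, not names from the disclosure.

```python
RISK_ALLELES = {
    "rs7144481": "C", "rs6552828": "G", "rs1049904": "A", "rs10500872": "A",
    "rs1535628": "G", "rs1289359": "T", "rs1171582": "A",
}
GENOTYPES = {
    "rs7144481": "CC", "rs6552828": "AA", "rs1049904": "GG", "rs10500872": "AG",
    "rs1535628": "AA", "rs1289359": "CT", "rs1171582": "AA",
}


def risk_allele_counts(genotypes, risk_alleles):
    """Count copies (0, 1, or 2) of the risk allele at each variant in the model."""
    return [genotypes[rsid].count(allele) for rsid, allele in risk_alleles.items()]


def raw_score(genotypes, risk_alleles):
    """Average number of risk alleles per variant."""
    counts = risk_allele_counts(genotypes, risk_alleles)
    return sum(counts) / len(counts)


counts = risk_allele_counts(GENOTYPES, RISK_ALLELES)
score = raw_score(GENOTYPES, RISK_ALLELES)
print(counts)           # [2, 0, 0, 1, 0, 1, 2]
print(round(score, 2))  # 0.86
```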

Table 40

GRS scores are calculated similarly for the ancestry-specific population. When the individual's GRS score is compared with the distribution of GRS scores from the same ancestry-specific population, the individual's GRS score falls at the 50th percentile. The individual is predicted to have average aerobic exercise capacity.
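Placing an individual's score within the ancestry-specific distribution can be sketched as a percentile rank. The definition used here (the share of reference scores at or below the individual's score) is one common convention and is an assumption, since the text does not state which one is intended. The reference scores are made-up values for illustration only.

```python
def percentile_rank(score, reference_scores):
    """Percentage of reference scores that are at or below `score`."""
    if not reference_scores:
        raise ValueError("reference_scores must not be empty")
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100.0 * at_or_below / len(reference_scores)


# Hypothetical raw scores for ten subjects of the same ancestry.
reference_scores = [0.29, 0.43, 0.57, 0.71, 0.86, 1.00, 1.14, 1.29, 1.43, 1.57]
individual_score = 0.86

print(percentile_rank(individual_score, reference_scores))  # 50.0
```

A real reference group would be far larger, and ties are common because raw scores take only a limited number of distinct values.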

Example 2. Calculating an ancestry-specific genetic risk score for an individual, representing the likelihood that the individual will experience collagen breakdown

First, a genotype of the individual is provided. The genotype of the individual can be in the format of an Illumina genotyping array. The genotype includes genetic risk variants specific to the individual (individual-specific genetic risk variants). The genetic risk variants may include single nucleotide variants (SNVs), single nucleotide polymorphisms (SNPs), indels, and/or copy number variants (CNVs). The Illumina genotyping array comprises nucleic acid probes specific to various SNVs, SNPs, and/or CNVs. The genotype is analyzed using principal component analysis (PCA) to determine the ancestry of the individual, and the individual is determined to be Chinese.

Next, reference genetic variants are selected from GWAS. The variants are located at the reported susceptibility genetic loci for collagen breakdown, MMP1, MMP3, and MMP9, and are selected based on a strong association (P = 1.0 × 10⁻⁴ or lower) between the genetic variant and the physical fitness trait. The variants are provided in Table 41.

Table 41

If an individual-specific genetic risk variant is unknown, meaning that the array identification number corresponding to the individual-specific genetic variant is not published in the GWAS described above, a proxy genetic variant is selected to serve as the basis for the genetic risk calculation. A proxy genetic variant is selected if it is in linkage disequilibrium (LD) with the unknown individual-specific genetic risk variant (an r² value of at least 0.70 or a D' value of at least about 0.20, based on subjects with the same ancestry as the individual).

Next, an individual-specific raw score is calculated. Numerical values are assigned to the units of risk (e.g., risk alleles) within each individual-specific genetic variant; the values for all individual-specific genetic variants are added together and divided by the total number of individual-specific genetic variants or proxy genetic variants to generate the individual-specific raw score.

Next, the same calculation is performed to generate a raw score for each individual within the ancestry-specific subject group, thereby generating an observed range of raw scores (the observed range). Next, the individual-specific raw score is compared with the ancestry-specific observed range to calculate a percentile of risk relative to the ancestry-specific subject population. Next, a genetic risk score (GRS) is assigned to the individual.

For example, calculating a GRS for an individual's collagen breakdown trait composed of two genetic variants, which in this example are SNPs (rs495366 with risk allele G and rs11226373 with risk allele G), requires determining each genotype by actual genotyping or by imputation and calculating the average of the sum of all risk alleles. Thus, an individual with genotypes rs495366 (GG) and rs11226373 (GA) has 2 and 1 risk alleles, respectively, for a sum of 3 and an average genetic risk score of 1.5 (= 3/2; the risk alleles divided by the total number of variants making up the model). Table 42 provides an exemplary calculation according to this example.

Table 42

GRS scores are calculated similarly for the ancestry-specific population. When the individual's GRS score is compared with the distribution of GRS scores from the same ancestry-specific population, the individual's GRS score falls at the 90th percentile. The individual is predicted to be at high risk of collagen breakdown and is advised to hydrate their skin and apply collagen cream.

Example 3. Calculating an ancestry-specific genetic risk score for an individual, representing the likelihood that the individual will experience vitamin A deficiency

First, a genotype of the individual is provided. The genotype of the individual can be in the format of an Illumina genotyping array. The genotype includes genetic risk variants specific to the individual (individual-specific genetic risk variants). The genetic risk variants may include single nucleotide variants (SNVs), single nucleotide polymorphisms (SNPs), indels, and/or copy number variants (CNVs). The Illumina genotyping chip comprises nucleic acid probes specific to various SNVs, SNPs, indels, and/or CNVs. The genotype is analyzed using principal component analysis (PCA) to determine the ancestry of the individual, and the individual is determined to be Chinese.

Next, reference genetic variants are selected from GWAS published in high-impact journals. The variants are located at the reported susceptibility genetic loci for vitamin A deficiency, BCMO1, FFAR4, and TTR, and are selected based on a strong association (P = 1.0 × 10⁻⁴ or lower) between the genetic variant and the nutritional trait. The ancestry-specific variants are provided in Table 43.

Table 43

If an individual-specific genetic risk variant is unknown, meaning that the array identification number corresponding to the individual-specific genetic variant is not published in the GWAS described above, a proxy genetic variant is selected to serve as the basis for the genetic risk calculation. A proxy genetic variant is selected if it is in linkage disequilibrium (LD) with the unknown individual-specific genetic risk variant (an r² value of at least 0.70 or a D' value of at least about 0.20, based on subjects with the same ancestry as the individual).

Next, an individual-specific raw score is calculated. Numerical values are assigned to the units of risk (e.g., risk alleles) within each individual-specific genetic variant; the values for all individual-specific genetic variants are added together and divided by the total number of individual-specific genetic variants or proxy genetic variants to generate the individual-specific raw score.

Next, the same calculation is performed to generate a raw score for each individual within the ancestry-specific subject group, thereby generating an observed range of raw scores (the observed range). Next, the individual-specific raw score is compared with the ancestry-specific observed range to calculate a percentile of risk relative to the ancestry-specific subject population. Next, a genetic risk score (GRS) is assigned to the individual.

For example, calculating a GRS for an individual's vitamin A deficiency trait composed of three genetic variants, which in this example are SNPs (rs6564851 with risk allele T, rs1082272 with risk allele C, and rs1667255 with risk allele A), requires determining each genotype by actual genotyping or by imputation and calculating the average of the sum of all risk alleles. Thus, an individual with genotypes rs6564851 (TG), rs1082272 (TT), and rs1667255 (AC) has 1, 0, and 1 risk alleles, respectively, for a sum of 2 and an average genetic risk score of 0.67 (= 2/3; the risk alleles divided by the total number of variants making up the model). Table 44 provides an exemplary calculation according to this example.

Table 44

GRS scores are calculated similarly in the ancestry-specific population. When the individual's GRS score is compared with the distribution of GRS scores from the same ancestry-specific population, the individual's GRS score is 1 standard deviation above the mean. The individual is predicted to be at risk of vitamin A deficiency and is advised to take a vitamin A supplement.

Example 4. Ancestry-specific genetic risk score for alcohol flush reaction

Alcohol flush reaction is a condition in which an individual develops flushes or blotches on the face, neck, shoulders, and in some cases the entire body after consuming alcoholic beverages. Approximately one third of people of East Asian descent experience facial flushing from drinking alcohol. The A allele of the single nucleotide polymorphism rs671 in the gene encoding alcohol dehydrogenase 1B (ADH1B) is associated with the flush reaction. This SNP is present in the trait-associated variants database disclosed herein. By way of example and not limitation, Tables 45-46 show the likelihood that an individual has or will develop alcohol flush reaction, comparing a genetic risk score (GRS) calculated using a predominantly European reference population with a GRS calculated using a reference population composed of individuals of the same ancestry as the subject. Individuals with a score greater than or equal to 1 are predicted to have a moderate alcohol flush reaction, because rs671 acts in a dominant-negative manner.
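The single-variant scoring rule can be sketched as follows. The score is taken here to be the number of A alleles in the rs671 genotype, which is an interpretation consistent with the risk-allele counting used in the earlier examples. The function names are illustrative.

```python
def rs671_score(genotype: str, risk_allele: str = "A") -> int:
    """Number of copies (0, 1, or 2) of the risk allele in a two-letter genotype."""
    if len(genotype) != 2:
        raise ValueError("expected a diploid genotype such as 'GA'")
    return genotype.upper().count(risk_allele)


def predicts_flush(genotype: str) -> bool:
    """Dominant rule from the text: a score of 1 or more predicts a flush reaction."""
    return rs671_score(genotype) >= 1


for genotype in ("GG", "GA", "AA"):
    print(genotype, rs671_score(genotype), predicts_flush(genotype))
```

Under this rule a single copy of the A allele is enough for a positive prediction, so the accuracy of the imputed rs671 genotype drives the result directly.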

Cohort. A dataset comprising genotypes from 1669 individuals was analyzed. Of these 1669 individuals, 193 had European ancestry (EUR) and 1476 had East Asian ancestry (EAS). The ancestry-specific variant (rs671) lies at the reported alcohol dehydrogenase ADH1B susceptibility locus and was selected on the basis of its strong association with the alcohol flush reaction trait in the reference population.

Genotyping. A saliva sample was obtained from each subject. Samples were genotyped on the Illumina Core-24 BeadChip platform (Illumina, Inc., San Diego, Calif. 92121). Quality control measures were applied according to the manufacturer's instructions for the Illumina Core-24 platform. SNPs were excluded from the analysis if: genotyping rate <93%; SNP missing rate >10%; and minor allele frequency <0.01.
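The exclusion thresholds can be written as a simple filter. The sketch below is an illustration under stated assumptions: the `SnpStats` record and its field names are hypothetical, the SNP identifiers are placeholders, and a SNP is excluded as soon as any one threshold is violated, which is a common reading of such criteria but is not spelled out in the text.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    snp_id: str             # placeholder identifier, not a real rsID
    genotyping_rate: float  # fraction of successful genotype calls
    missing_rate: float     # fraction of samples with no call at this SNP
    maf: float              # minor allele frequency


def passes_qc(s: SnpStats, min_rate: float = 0.93,
              max_missing: float = 0.10, min_maf: float = 0.01) -> bool:
    """Keep a SNP only if it violates none of the three thresholds (assumed)."""
    return (s.genotyping_rate >= min_rate
            and s.missing_rate <= max_missing
            and s.maf >= min_maf)


snps = [
    SnpStats("snp_a", 0.99, 0.01, 0.20),   # passes all thresholds
    SnpStats("snp_b", 0.99, 0.01, 0.005),  # minor allele frequency too low
    SnpStats("snp_c", 0.90, 0.02, 0.20),   # genotyping rate too low
    SnpStats("snp_d", 0.95, 0.15, 0.20),   # too many missing calls
]
kept = [s.snp_id for s in snps if passes_qc(s)]
```

In practice this kind of filtering is usually done with a dedicated genotype QC tool; the function above only makes the thresholds explicit.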

Assigning ancestry to subjects. Population clustering was performed using principal component analysis (PCA) of the genotypes. For each subject, the population closest to the sample was computed using a distance-based method (K-means), and ancestry was assigned accordingly.
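The distance-based assignment can be illustrated with a nearest-centroid rule, which is the assignment step of K-means. The sketch assumes that principal component coordinates have already been computed for each subject and that one centroid per reference population is available; the coordinates, subject IDs, and the function name `nearest_population` are made up for illustration, and the PCA and centroid-fitting steps are not shown.

```python
import math


def nearest_population(pcs: tuple, centroids: dict) -> str:
    """Label of the population centroid closest to the subject in PC space."""
    return min(centroids, key=lambda label: math.dist(pcs, centroids[label]))


# Hypothetical centroids in the space of the first two principal components.
centroids = {"EUR": (-2.0, 0.5), "EAS": (3.0, -1.0)}

# Hypothetical subjects with their PC coordinates.
subjects = {"S1": (2.6, -0.8), "S2": (-1.7, 0.1)}

assigned = {sid: nearest_population(pcs, centroids)
            for sid, pcs in subjects.items()}
print(assigned)
```

Euclidean distance is used here because it is what K-means minimizes; a different metric or more components could be substituted without changing the structure.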

Imputation of rs671. To score the alcohol flush reaction trait, the rs671 genotype was imputed. Imputation was performed using reference populations from the 1000 Genomes Project that were either ancestry-specific or not ancestry-specific, for comparison.

Ancestry-specific reference population. Each subject's genotype was imputed according to the subject's assigned ancestry. The ancestry-specific reference populations were selected from the 1000 Genomes Project described in "A global reference for human genetic variation," Nature 526, 68-74 (Oct. 1, 2015). If a subject was assigned East Asian ancestry in the previous step, the subject's genotype was imputed using the East Asian population from the 1000 Genomes Project as the reference population. If a subject was assigned European ancestry in the previous step, the subject's genotype was imputed using the European population from the 1000 Genomes Project as the reference population.

Non-ancestry-specific reference population. To compare GRS accuracy, reference genetic variants were also selected using a reference population that was not ancestry-specific. The subjects' genotypes were imputed using the European population from the 1000 Genomes Project as the reference population.
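The difference between the two pipelines comes down to which reference panel is handed to the imputation step. A minimal sketch of that choice follows; the panel labels `1000G_EAS` and `1000G_EUR` and the function name `reference_panel` are hypothetical placeholders, and the imputation itself is not shown.

```python
def reference_panel(assigned_ancestry: str, ancestry_specific: bool) -> str:
    """Pick the imputation reference panel for one subject."""
    panels = {"EAS": "1000G_EAS", "EUR": "1000G_EUR"}  # hypothetical labels
    if not ancestry_specific:
        # Single-reference pipeline: the European panel is used for everyone.
        return panels["EUR"]
    if assigned_ancestry not in panels:
        raise ValueError(f"no reference panel for ancestry {assigned_ancestry!r}")
    return panels[assigned_ancestry]


eas_specific = reference_panel("EAS", ancestry_specific=True)
eas_single = reference_panel("EAS", ancestry_specific=False)
```

For a subject assigned East Asian ancestry, the two calls return different panels, which is the only variable that changes between the compared pipelines.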

Calculation of GRS. Subjects were analyzed for alcohol flush scores based on the imputed rs671 genotype, and the results obtained with the ancestry-specific and non-ancestry-specific reference populations were compared. Each subject's alcohol flush reaction score was calculated from the number of non-reference alleles.
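The scoring rule for this example can be sketched as follows. The sketch assumes that G is the reference allele of rs671 (allele A being the variant allele named in the text) and that genotypes are two-letter strings; the function names and the toy cohort are illustrative, not data from the study.

```python
def non_reference_count(genotype: str, reference_allele: str = "G") -> int:
    """Number of alleles in a diploid genotype that differ from the reference."""
    ref = reference_allele.upper()
    return sum(1 for allele in genotype.upper() if allele != ref)


def predicts_flush(score: int) -> bool:
    """A score of 1 or more predicts the alcohol flush reaction."""
    return score >= 1


# Toy imputed rs671 genotypes (illustrative only).
cohort = ["GG", "GA", "AA", "GG"]
scores = [non_reference_count(g) for g in cohort]
fraction_predicted = sum(predicts_flush(s) for s in scores) / len(cohort)
print(scores, fraction_predicted)
```

The fraction of subjects with a score of at least 1 is the kind of figure reported per ancestry group when the two pipelines are compared.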

Results. Table 45 illustrates the pipeline that uses a single reference population without regard to ancestry, whereas Table 46 illustrates the ancestry-specific pipeline, in which the pipeline is specific to the individual's ancestry. It is known that 36%-45% of East Asians experience facial flushing after drinking alcohol. This is accurately reflected in Table 46, where the ancestry-specific pipeline predicts that 42% of individuals with East Asian ancestry have or will develop the alcohol flush reaction. By contrast, Table 45 shows that when Asian ancestry is not taken into account, no East Asian individual is predicted to have the alcohol flush reaction, which does not match what is known. "EUR" denotes European and "EAS" denotes East Asian.

Table 45. Pipeline using a single reference population (EUR), without regard to ancestry. EUR: European

Table 46. Ancestry-specific pipeline (pipeline specific to the individual's ancestry, where EUR is used for EUR individuals and EAS for EAS individuals). EUR: European; EAS: East Asian

Example 5. Ancestry-specific genetic risk score for lactose tolerance

Lactose-tolerant individuals are adults who can consume animal milk and animal milk products without risk of lactose intolerance symptoms such as bloating, pain, cramps, diarrhea, flatulence, or nausea. The lactose tolerance genetic trait is associated with functional single nucleotide variants (SNVs) in a regulatory region about 14 kb upstream of the lactase gene (LCT), including -13910*T (rs4988235), -13915*G (rs41380347), and -14010*C (rs145946881). These three SNVs are present in the trait-associated variant database disclosed herein. As an example, and not a limitation, Tables 47-48 show whether an individual is lactose tolerant or lactose intolerant when the genetic risk score (GRS) is calculated using a predominantly European reference population, compared with a GRS calculated using a reference population made up of individuals with the same ancestry as the subject. Individuals with a score of zero are predicted to be lactose intolerant, and individuals with a score greater than zero are predicted to be lactose tolerant, because one allele is sufficient to metabolize lactose.

Cohort. A dataset comprising 1669 individuals was analyzed. Of these 1669 individuals, 193 had European ancestry (EUR) and 1476 had East Asian ancestry (EAS). The ancestry-specific variants -13910*T (rs4988235), -13915*G (rs41380347), and -14010*C (rs145946881), located at the reported susceptibility locus upstream of the LCT gene, were selected on the basis of their strong association with the lactose tolerance trait in the reference population.

Genotyping. Saliva samples were obtained from the subjects. Samples were genotyped on the Illumina Core BeadChip platform (Illumina, Inc., San Diego, Calif. 92121). Quality control measures were applied according to the manufacturer's instructions for the Illumina Core-24 platform. SNVs were excluded from the analysis if: genotyping rate <93%; SNP missing rate >10%; and minor allele frequency <0.01.

Assigning ancestry to subjects. Population clustering was performed using principal component analysis (PCA) of the genotypes. For each subject, the population closest to the sample was computed using a distance-based method (K-means), and ancestry was assigned accordingly.

Imputation of rs4988235, rs41380347, and rs145946881. To score lactose tolerance, the rs4988235, rs41380347, and rs145946881 genotypes were imputed. Imputation was performed using reference populations from the 1000 Genomes Project that were either ancestry-specific or not ancestry-specific, for comparison.

Ancestry-specific reference population. Each subject's genotype was imputed according to the subject's assigned ancestry. If a subject was assigned East Asian ancestry in the previous step, the subject's genotype was imputed using the East Asian population from the 1000 Genomes Project as the reference population. If a subject was assigned European ancestry in the previous step, the subject's genotype was imputed using the European population from the 1000 Genomes Project as the reference population.

Non-ancestry-specific reference population. To compare GRS accuracy, a reference population that was not ancestry-specific was also used: the subjects' genotypes were imputed using the European population from the 1000 Genomes Project as the reference population.

Calculation of GRS. Lactose tolerance based on the imputed rs4988235, rs41380347, and rs145946881 genotypes was analyzed, and the results obtained with the ancestry-specific and non-ancestry-specific reference populations were compared. Each subject's lactose tolerance score was calculated from the fraction of non-reference alleles.
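One way to read this scoring rule is sketched below. Several points are assumptions rather than statements of the disclosure: the score is taken to be the fraction of alleles across the three SNVs that match the tolerance-associated alleles named in this example (T, G, and C), genotypes are assumed to be reported on the same strand as those alleles, and the function names are hypothetical.

```python
# Tolerance-associated alleles as named in this example.
TOLERANCE_ALLELES = {"rs4988235": "T", "rs41380347": "G", "rs145946881": "C"}


def lactose_score(genotypes: dict) -> float:
    """Fraction of alleles across the three SNVs that are tolerance-associated."""
    hits = sum(genotypes[rsid].upper().count(allele)
               for rsid, allele in TOLERANCE_ALLELES.items())
    return hits / (2 * len(TOLERANCE_ALLELES))


def predicted_status(score: float) -> str:
    """Zero predicts intolerance; any score above zero predicts tolerance."""
    return "lactose tolerant" if score > 0 else "lactose intolerant"


# Toy genotypes (illustrative only).
carrier = {"rs4988235": "CT", "rs41380347": "TT", "rs145946881": "GG"}
non_carrier = {"rs4988235": "CC", "rs41380347": "TT", "rs145946881": "GG"}

carrier_score = lactose_score(carrier)
non_carrier_score = lactose_score(non_carrier)
```

Because a single allele is sufficient for a tolerant prediction, only whether the score is zero matters for the classification; the fraction itself carries no further weight in this sketch.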

Results. Table 47 illustrates the pipeline that uses a single reference population without regard to ancestry, whereas Table 48 illustrates the ancestry-specific pipeline, in which the pipeline is specific to the individual's ancestry. It is known that in Asian countries near the equator, 98% of the population cannot digest lactose and is lactose intolerant. This is accurately reflected in Table 48, where the ancestry-specific pipeline predicts that 100% of individuals with East Asian ancestry living in Singapore are lactose intolerant. By contrast, Table 47 shows that when Asian ancestry is not taken into account, 57% of East Asian individuals are predicted to be lactose tolerant, which does not match what is known. "EUR" denotes European and "EAS" denotes East Asian.

Table 47. Pipeline using a single reference population, without regard to ancestry

Table 48. Ancestry-specific pipeline (pipeline specific to the individual's ancestry, where EUR is used for EUR individuals and EAS for EAS individuals).

Example 7. Gene symbols and gene names

Disclosed herein are a number of gene symbols representing human genes of interest. Table 49 provides a list of the genes and their corresponding gene names.

Table 49

While preferred embodiments of the methods, media, and systems disclosed herein have been shown and described, these embodiments are provided by way of example only. Numerous variations, changes, and substitutions may be made without departing from the methods, media, and systems disclosed herein. It should be understood that various alternatives to the embodiments of the methods, media, and systems disclosed herein may be employed in practicing the inventive concepts disclosed herein. The appended claims are intended to define the scope of the methods, media, and systems, and thereby cover the methods and structures within the scope of these claims and their equivalents.

Claims (20)

1. A computer-implemented method for determining, based on the ancestry of an individual, a likelihood that the individual has or will develop a specific phenotypic trait, the method comprising:
a. assigning the ancestry of the individual by using a distance-based or model-based computer program to analyze a genotype of the individual, the genotype comprising one or more individual-specific genetic variants;
b. selecting, based at least in part on the ancestry of the individual, one or more ancestry-specific genetic variants from a trait-associated variant database comprising ancestry-specific genetic variants derived from subjects with the same ancestry as the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to:
i. an individual-specific genetic variant of the one or more individual-specific genetic variants, or
ii. a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a population of subjects with the same ancestry as the individual, wherein the predetermined genetic variant is predetermined by:
Process 1: phasing unphased genotype data from the individual to generate individual-specific phased haplotypes based on the ancestry of the individual;
Process 2: imputing individual-specific genotypes not present in the individual-specific phased haplotypes, using phased haplotype data from a reference group with the same ancestry as the individual; and
Process 3: selecting, from the imputed individual-specific genotypes, a genetic variant that matches the individual-specific genetic variant, the individual-specific genetic variant being associated with the likelihood that the individual has or will develop the specific phenotypic trait and corresponding to the one or more ancestry-specific genetic variants,
wherein each of the one or more ancestry-specific genetic variants and each of the one or more individual-specific genetic variants comprises one or more risk units; and
c. calculating a genetic risk score for the individual based on the selected one or more ancestry-specific genetic variants,
wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific phenotypic trait.

2. The method of claim 1, wherein the one or more ancestry-specific genetic variants, the one or more individual-specific genetic variants, and the predetermined genetic variant in LD with the one or more individual-specific genetic variants comprise a single nucleotide variant (SNV), an indel, and/or a copy number variant (CNV).

3. The method of claim 2, wherein the one or more risk units of the SNV comprise a risk allele; the one or more risk units of the indel comprise an insertion or deletion of nucleotides; and the one or more risk units of the CNV comprise an insertion or deletion of a nucleic acid sequence.

4. The method of claim 1, further comprising providing a notification to the individual, the notification comprising the risk that the individual has or will develop the specific phenotypic trait.

5. The method of claim 1, wherein the specific phenotypic trait comprises a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, or a mental trait.

6. The method of claim 4, wherein the notification further comprises a recommendation for a behavioral change related to the specific phenotypic trait.

7. The method of claim 6, wherein the behavioral change related to the specific phenotypic trait comprises increasing, decreasing, or avoiding an activity, the activity comprising: performing physical exercise; ingesting a drug, vitamin, or supplement; exposure to a product; use of a product; a dietary change; a sleep change; alcohol consumption; or caffeine consumption.

8. The method of claim 1, wherein the distance-based computer program is principal component analysis, and wherein the model-based computer program is a maximum likelihood or Bayesian method.

9. A health reporting system, comprising:
a computing device comprising at least one processor, a memory, and a software program, the software program comprising instructions executable by the at least one processor to assess a likelihood that an individual has or will develop a specific phenotypic trait, the instructions comprising the steps of:
a. assigning the ancestry of the individual by using a distance-based or model-based computer program to analyze a genotype of the individual, the genotype comprising one or more individual-specific genetic variants;
b. selecting, based at least in part on the ancestry of the individual, one or more ancestry-specific genetic variants from a trait-associated variant database comprising ancestry-specific genetic variants derived from subjects with the same ancestry as the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to:
i. an individual-specific genetic variant of the one or more individual-specific genetic variants, or
ii. a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a population of subjects with the same ancestry as the individual, wherein the predetermined genetic variant is predetermined by:
Process 1: phasing unphased genotype data from the individual to generate individual-specific phased haplotypes based on the ancestry of the individual;
Process 2: imputing individual-specific genotypes not present in the individual-specific phased haplotypes, using phased haplotype data from a reference group with the same ancestry as the individual; and
Process 3: selecting, from the imputed individual-specific genotypes, a genetic variant that matches the individual-specific genetic variant, the individual-specific genetic variant being associated with the likelihood that the individual has or will develop the specific phenotypic trait and corresponding to the one or more ancestry-specific genetic variants,
wherein each of the one or more ancestry-specific genetic variants and each of the one or more individual-specific genetic variants comprises one or more risk units; and
c. calculating a genetic risk score for the individual based on the selected one or more ancestry-specific genetic variants,
wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific phenotypic trait;
a reporting module configured to generate a report comprising the genetic risk score of the individual for the specific phenotypic trait; and
an output module configured to display the report to the individual.

10. The system of claim 9, wherein the one or more ancestry-specific genetic variants, the one or more individual-specific genetic variants, and the predetermined genetic variant in LD with the one or more individual-specific genetic variants comprise a single nucleotide variant (SNV), an indel, and/or a copy number variant (CNV).

11. The system of claim 10, wherein the one or more risk units of the SNV comprise a risk allele; the one or more risk units of the indel comprise an insertion or deletion of nucleotides; and the one or more risk units of the CNV comprise an insertion or deletion of a nucleic acid sequence.

12. The system of claim 9, wherein the report further comprises a recommendation for a behavioral change related to the specific phenotypic trait.

13. The system of claim 9, wherein the specific phenotypic trait comprises a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, or a mental trait.

14. The system of claim 9, further comprising a personal electronic device having an application configured to communicate with the output module via a computer network to access the report.

15. The system of claim 9, wherein the distance-based computer program is principal component analysis, and wherein the model-based computer program is a maximum likelihood or Bayesian method.

16. A non-transitory computer-readable storage medium comprising computer-executable code configured to cause at least one processor to perform steps comprising:
a. assigning the ancestry of an individual by using a distance-based or model-based computer program to analyze a genotype of the individual, the genotype comprising one or more individual-specific genetic variants;
b. selecting, based at least in part on the ancestry of the individual, one or more ancestry-specific genetic variants from a trait-associated variant database comprising ancestry-specific genetic variants derived from subjects with the same ancestry as the individual, wherein each of the one or more ancestry-specific genetic variants corresponds to:
i. an individual-specific genetic variant of the one or more individual-specific genetic variants, or
ii. a predetermined genetic variant that is in linkage disequilibrium (LD) with an individual-specific genetic variant of the one or more individual-specific genetic variants in a population of subjects with the same ancestry as the individual, wherein the predetermined genetic variant is predetermined by:
Process 1: providing unphased genotype data from the individual;
Process 2: phasing the unphased genotype data to generate individual-specific phased haplotypes based on the ancestry of the individual;
Process 3: imputing individual-specific genotypes not present in the individual-specific phased haplotypes, using phased haplotype data from a reference group with the same ancestry as the individual; and
Process 4: selecting, from the imputed individual-specific genotypes, a genetic variant that matches the individual-specific genetic variant, the individual-specific genetic variant being associated with the likelihood that the individual has or will develop a specific phenotypic trait; and
c. calculating a genetic risk score for the individual based on the selected one or more ancestry-specific genetic variants,
wherein the genetic risk score indicates the likelihood that the individual has or will develop the specific phenotypic trait.

17. The medium of claim 16, wherein the one or more ancestry-specific genetic variants, the one or more individual-specific genetic variants, and the predetermined genetic variant in LD with the one or more individual-specific genetic variants comprise a single nucleotide variant (SNV), an indel, and/or a copy number variant (CNV).

18. The medium of claim 17, wherein each of the one or more ancestry-specific genetic variants and each of the individual-specific genetic variants comprises one or more risk units, and wherein the one or more risk units of the SNV comprise a risk allele; the one or more risk units of the indel comprise an insertion or deletion of nucleotides; and the one or more risk units of the CNV comprise an insertion or deletion of a nucleic acid sequence.

19. The medium of claim 16, wherein the steps further comprise providing a notification to the individual, the notification comprising the likelihood that the individual has or will develop the specific phenotypic trait.

20. The medium of claim 16, wherein the specific phenotypic trait comprises a nutritional trait, a clinical trait, a subclinical trait, a physical exercise trait, a skin trait, a hair trait, an allergy trait, or a mental trait.
HK62022051515.4A 2018-11-28 2019-10-31 Ancestry-specific genetic risk scores HK40062217B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US62/772,565 2018-11-28
US16/216,940 2018-12-11

Publications (2)

Publication Number Publication Date
HK40062217A HK40062217A (en) 2022-06-10
HK40062217B true HK40062217B (en) 2024-08-09
