CN115161408A

CN115161408A - DNA methylation detection of target segments of the maize genome

Info

Publication number: CN115161408A
Application number: CN202210575458.4A
Authority: CN
Inventors: 李青; 许强; 李若楠; 李林; 李娟�; 韩瑞
Original assignee: Huazhong Agricultural University
Current assignee: Huazhong Agricultural University
Priority date: 2022-05-25
Filing date: 2022-05-25
Publication date: 2022-10-11

Abstract

The application discloses DNA methylation detection of target segments of the maize genome, and in particular a construction method, a detection method, a kit and a system for a maize genomic DNA methylation detection library containing the target segment. The DNA methylation detection library is constructed by two rounds of PCR amplification, and detection is then achieved by sequencing the library and aligning the DNA methylation information. The method, kit and system can detect the DNA methylation information of multiple target segments in multiple maize materials at one time, and can also be applied to detecting the DNA methylation of the two parental alleles of hybrid materials. Each target segment contains one or more SNP sites capable of distinguishing different maize materials or different alleles. The method, kit and system simplify tedious experimental operations and greatly reduce detection cost while accurately detecting the DNA methylation level.

Description

DNA methylation detection of target segments of the maize genome

Technical field

The present application relates to the technical field of maize genomic DNA methylation detection, in particular to DNA methylation detection of target segments of the maize genome, and specifically to a construction method, a detection method, a kit and a system for a maize genomic DNA methylation detection library comprising the target segment.

Background

DNA methylation is a form of epigenetic modification of DNA that can affect the inheritance of phenotypes without changing the DNA sequence. The DNA methylation level of the maize genome changes dynamically and is affected by the growth and developmental stage of maize and by various environmental factors, such as salt stress and drought stress. Detecting maize DNA methylation is therefore an important means of exploring how DNA methylation affects the regulation of maize growth and development and the response to adversity stress.

Whole-genome sequencing is a common way to detect plant genomic DNA methylation and can do so on a genome-wide scale. However, the DNA methylation that regulates gene expression is usually located in certain specific segments. For detecting DNA methylation in such segments, whole-genome DNA methylation sequencing is not only costly; even when the sequencing depth is increased, these specific segments are often still poorly covered. In addition, detecting DNA methylation of multiple specific segments in multiple genomes by whole-genome DNA methylation sequencing requires multiple assays, which raises the cost and lowers the efficiency of detection.

Summary of the invention

In view of this, the purpose of the present application is to provide at least a library construction method, a detection method, a kit and a system for DNA methylation detection of target segments of the maize genome, so as to solve one of the above technical problems to a certain extent.

In a first aspect, the embodiments of the present application disclose a method for constructing a maize genomic DNA methylation detection library comprising a target segment, the target segment comprising one or more SNP sites capable of distinguishing different maize genomes, the DNA methylation detection library being used to detect a mixed DNA sample of at least two maize genomes comprising the target segment, wherein the construction method comprises:

converting unmethylated cytosine in the DNA molecules of the mixed DNA sample into uracil;

performing a first amplification, the first amplification comprising amplifying the mixed DNA sample in a single PCR reaction system to obtain a first amplification product, the first amplification product comprising a DNA molecule having the target segment and a first recognition sequence and a bridge sequence attached to the 3' end of the DNA molecule, the first recognition sequence being used to label each DNA molecule of the target segment in the mixed DNA sample;

performing a second amplification, the second amplification comprising the step of PCR-amplifying the first amplification product to obtain a second amplification product, the second amplification product comprising the first amplification product and a second recognition sequence attached to the 3' end of the first amplification product, the second recognition sequence being used to label the first amplification product; and

constructing, from the second amplification product, the DNA methylation detection library of the maize genome comprising the target segment.

In the embodiments of the present application, the first amplification uses an upstream primer and a first downstream primer; the C bases in the upstream primer are designed as the degenerate base Y; the G bases in the first downstream primer are designed as the degenerate base R; the first downstream primer comprises the first recognition sequence and a bridge sequence attached to the 3' end of the first recognition sequence, the bridge sequence being used for homologous matching with the second bridge sequence; and the SNP site in the first amplification product is located within 150 bp of the position at which the DNA molecule having the target segment starts to bind the upstream primer or the first downstream primer.

In the embodiments of the present application, the reaction steps of the first amplification comprise:

performing pre-denaturation at 95 °C for 5 min;

performing a first amplification cycle, the first amplification cycle comprising sequentially performing a first melting treatment, a first annealing treatment and a first extension treatment;

performing a second amplification cycle, the second amplification cycle comprising sequentially performing a second melting treatment, a second annealing treatment and a second extension treatment, namely 95 °C for 30 s, 60 °C for 30 s and 70 °C for 30 s; and

treating at 72 °C for 5 min and at 12 °C for 1 s;

wherein the first amplification cycle comprises at least 8 to 12 cycles, and the annealing temperature of the first annealing treatment decreases successively from one first amplification cycle to the next.

In the embodiments of the present application, the treatment temperature of the first annealing treatment is 68-55 °C, 65-53 °C, 65-55 °C or 63-55 °C.
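The successively decreasing annealing temperature of the first amplification cycle can be illustrated with a short sketch. The text only states that the temperature drops from cycle to cycle within a range such as 65-55 °C; the equal step size used here, and the function name `touchdown_annealing_temps`, are assumptions made for illustration.

```python
def touchdown_annealing_temps(start_c, end_c, n_cycles):
    """Return one annealing temperature per cycle.

    Assumption: the temperature falls in equal steps from start_c
    (first cycle) to end_c (last cycle). The source only requires
    that it decreases successively.
    """
    if n_cycles < 2:
        raise ValueError("need at least two cycles to step the temperature down")
    if start_c <= end_c:
        raise ValueError("start temperature must be above end temperature")
    step = (start_c - end_c) / (n_cycles - 1)
    return [round(start_c - i * step, 2) for i in range(n_cycles)]


# Example: the 65-55 °C range spread over 11 cycles gives a 1 °C drop per cycle.
temps = touchdown_annealing_temps(65, 55, 11)
```

A different number of cycles within the stated range simply changes the step size, for example 10 cycles over 63-55 °C would drop by roughly 0.89 °C per cycle under the same assumption.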

In the embodiments of the present application, the second amplification uses the upstream primer and a second downstream primer, the second downstream primer comprising the bridge sequence and the second recognition sequence, and the reaction steps of the second amplification comprise:

performing an amplification cycle, the amplification cycle comprising sequentially performing a melting treatment, an annealing treatment and an extension treatment; and

treating at 72 °C for 5 min and at 12 °C for 1 s.

In the embodiments of the present application, the second amplification product further comprises a bridge sequence located between the first recognition sequence and the second recognition sequence at the 3' end of the first amplification product, the bridge sequence being used to link the first recognition sequence and the second recognition sequence.

In a second aspect, a method for detecting DNA methylation in a target segment of the maize genome is provided, the target segment comprising one or more SNP sites capable of distinguishing different maize genomes, the detection method being used to detect a mixed DNA sample of at least two maize genomes comprising the target segment, wherein the detection method comprises:

constructing, by the construction method of the first aspect, the DNA methylation detection library of the maize genome comprising the target segment;

sequencing the DNA methylation detection library to obtain a first reads library;

performing quality control, splitting and pairing on the first reads library to obtain a second reads library; and

performing alignment, deduplication and calculation on the second reads library to obtain the DNA methylation information of each maize genome in the mixed DNA sample.

In the embodiments of the present application, the first reads library comprises a number of first reads corresponding to the DNA methylation detection library, and the second reads library comprises second reads grouped by identical second recognition sequence, the second reads being reads from which the first recognition sequence, the second recognition sequence and the bridge sequence have been removed.
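The splitting and deduplication described above can be sketched as follows. This is a minimal illustration, not the applicant's pipeline: it assumes each read is laid out as target insert + first recognition sequence (6 bases) + bridge sequence + second recognition sequence, and the bridge `GATCGATCG` and all read sequences are invented for the example. The first recognition sequence is kept alongside each trimmed read only so that the later deduplication step can use it.

```python
from collections import defaultdict


def split_read(read, bridge, first_len=6):
    """Split a read assumed to be insert + first recognition sequence + bridge + second recognition sequence.

    Returns (insert, first_recognition, second_recognition),
    or None when the bridge cannot be located.
    """
    pos = read.rfind(bridge)
    if pos < first_len:  # also covers rfind() returning -1
        return None
    insert = read[:pos - first_len]
    first_recognition = read[pos - first_len:pos]
    second_recognition = read[pos + len(bridge):]
    return insert, first_recognition, second_recognition


def build_second_reads(reads, bridge, first_len=6):
    """Group trimmed reads by their second recognition sequence."""
    grouped = defaultdict(list)
    for read in reads:
        parts = split_read(read, bridge, first_len)
        if parts is None:
            continue  # discard reads without a recognisable bridge
        insert, first_recognition, second_recognition = parts
        grouped[second_recognition].append((insert, first_recognition))
    return dict(grouped)


def deduplicate(tagged_reads):
    """Keep one copy of each (insert, first recognition sequence) pair, preserving order."""
    return list(dict.fromkeys(tagged_reads))


# Hypothetical reads: two PCR duplicates, one distinct molecule, one other sample.
bridge = "GATCGATCG"
reads = [
    "TTGTATTGA" + "AGAGGG" + bridge + "ACGT",
    "TTGTATTGA" + "AGAGGG" + bridge + "ACGT",
    "TTGTATTGA" + "CCCGGG" + bridge + "ACGT",
    "TTGTATTGA" + "AGAGGG" + bridge + "TTAA",
    "NNNNNNNN",
]
second_reads = build_second_reads(reads, bridge)
unique_acgt = deduplicate(second_reads["ACGT"])
```

In a real workflow deduplication would follow alignment, as the method states; it is applied directly to the trimmed sequences here only to keep the sketch self-contained.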

In a third aspect, the embodiments of the present application disclose a kit for constructing a maize genomic DNA methylation detection library comprising a target segment, the target segment comprising one or more SNP sites capable of distinguishing different maize genomes, the DNA methylation detection library being used to detect a mixed DNA sample of at least two maize genomes comprising the target segment, wherein the kit comprises:

a maize genomic DNA processing reagent for extracting the maize genomic DNA and converting the unmethylated cytosine therein into uracil;

an upstream primer, in whose sequence the C bases are replaced with the degenerate base Y;

a first downstream primer, in whose sequence the G bases are replaced with the degenerate base R, the first downstream primer comprising a first recognition sequence and a bridge sequence; and

a second downstream primer comprising the bridge sequence and a second recognition sequence;

wherein the upstream primer and the first downstream primer are used to perform a first amplification to obtain a first amplification product, the first amplification product comprising a DNA molecule having the target segment and the first recognition sequence and the bridge sequence attached to the 3' end of the DNA molecule;

the upstream primer and the second downstream primer are used to obtain a second amplification product in a second amplification, the second amplification product comprising the first amplification product and the second recognition sequence attached to the 3' end of the first amplification product, so as to construct the DNA methylation detection library; and

the first recognition sequence is used to label each DNA molecule of the target segment in the mixed DNA sample, and the second recognition sequence is used to label the first amplification product.

In a fourth aspect, the embodiments of the present application disclose a system for detecting DNA methylation of a maize genome comprising a target segment, the target segment comprising one or more SNP sites capable of distinguishing different maize genomes, the DNA methylation detection library being used to detect a mixed DNA sample of at least two maize genomes comprising the target segment, wherein the system comprises:

the kit of the third aspect, for obtaining the DNA methylation information of each maize genome in the mixed DNA sample; and

a processing device for running a program which, when run, executes the following steps of the detection method of the third aspect:

"sequencing the DNA methylation detection library to obtain a first reads library;

performing alignment, deduplication and calculation on the second reads library to obtain the DNA methylation information of each maize genome in the mixed DNA sample".

Compared with the prior art, the present application has at least one of the following beneficial effects:

The construction method, detection method, kit and system for a maize genomic DNA methylation detection library comprising a target segment described in the embodiments of the present application are directed at target segments that contain SNP sites. They can detect multiple maize genome samples containing the target segment at one time with high coverage and without repeated assays, which improves detection efficiency, and they also improve the reliability and accuracy of DNA methylation detection.

Description of drawings

FIG. 1 is a schematic diagram of the principle of DNA methylation detection of the maize genome comprising the target segment, provided in the embodiments of the present application.

FIG. 2 is a structural diagram of the primers and products of the first amplification and the second amplification in the construction of the maize genomic DNA methylation detection library comprising the target segment, provided in the embodiments of the present application.

FIG. 3 shows the target segments of the four maize inbred lines used in the embodiments of the present application; SNP sites are marked in red.

FIG. 4 shows the results obtained for the mixed sample of four maize inbred line genomes with the methylation detection method or system provided in the present application; CHH, CHG and CG denote the three sequence contexts of methylated C bases.

FIG. 5 shows the DNA methylation results obtained for the two target segments by whole-genome DNA methylation sequencing; CHH, CHG and CG denote the three sequence contexts of methylated C bases.

FIG. 6 is a gel electrophoresis image of the purified first amplification product provided in the embodiments of the present application; lane 1 is the target segment and lane 2 is the DNA molecular weight marker.

FIG. 7 is a gel electrophoresis image of the purified second amplification product provided in the embodiments of the present application; lane 1 is the target segment and lane 2 is the DNA molecular weight marker.

Detailed description

To make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it. Reagents not individually described in detail in this application are conventional reagents available commercially; methods not specifically described in detail are conventional experimental methods known from the prior art.

It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence, nor do they substantively limit the technical features that follow them. It should be understood that data so used may be interchanged where appropriate, so that the embodiments described here can be practiced in orders other than those illustrated or described. Furthermore, the terms "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.

Construction method of the maize genomic DNA methylation detection library

When a specific segment of maize genomic DNA is taken as the target segment for DNA methylation detection, the whole-genome DNA methylation sequencing familiar to those skilled in the art gives unsatisfactory results because its coverage is limited and its cost is high; even with increased sequencing depth, the target segment is not well covered. In addition, because whole-genome DNA methylation sequencing can assay only one genome at a time, detecting DNA methylation of multiple specific segments in multiple genomes is very expensive.

To this end, the embodiments of the present application disclose a method for constructing a DNA methylation detection library of the maize genome comprising the target segment. The target segment comprises one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to detect a mixed DNA sample of at least two maize genomes comprising the target segment. The present application uses SNP sites to locate the target segment. The methylation library established in this way can be used to detect, at one time, the DNA methylation information of at least one target segment in multiple maize genomes, or of one target segment in a single maize genome, which greatly improves coverage of the target segment and detection efficiency.

Specifically, as shown in FIGS. 1-2, the method disclosed in the present application for constructing the DNA methylation detection library of the maize genome comprising the target segment comprises:

S1. converting unmethylated cytosine in the DNA molecules of the mixed DNA sample into uracil;

S2. performing a first amplification, the first amplification comprising amplifying the mixed DNA sample in a single PCR reaction system to obtain a first amplification product, the first amplification product comprising a DNA molecule having the target segment and a first recognition sequence and a bridge sequence attached to the 3' end of the DNA molecule, the first recognition sequence being used to label each DNA molecule of the target segment in the mixed DNA sample;

S3. performing a second amplification, the second amplification comprising the step of PCR-amplifying the first amplification product to obtain a second amplification product, the second amplification product comprising the first amplification product and a second recognition sequence attached to the 3' end of the first amplification product, the second recognition sequence being used to label the first amplification product; and

S4. constructing, from the second amplification product, the DNA methylation detection library of the maize genome comprising the target segment.

Steps S1 to S4 are described in more detail below.

S1. Obtaining the mixed DNA sample of maize genomes

In the embodiments of step S1 of the present application, the mixed DNA sample is obtained by mixing extracted maize genomic DNA samples; in further embodiments, the number of maize genomic DNA samples may be two or more.

Specifically, the "conversion" may be carried out with chemical mutagens or by mutagenesis under physical conditions such as ultraviolet irradiation. It is only required that unmethylated cytosine be deaminated to uracil; the specific means of "conversion" is not limited, and treatment with bisulfite is one example.
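The effect of the conversion on a sequence can be illustrated in silico. The sketch below is only a model of one strand: unmethylated C is written as T (uracil is read as T after PCR), and methylated C is left as C. The function name and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Simulate conversion of one DNA strand.

    Unmethylated C becomes T (U before amplification); C at a
    0-based index listed in methylated_positions is protected.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


# Example: only the C at index 1 is methylated, so it alone survives as C.
converted = bisulfite_convert("ACGTCCGA", methylated_positions={1})
```

Comparing the converted read with the unconverted reference at each C position is what later allows a methylation level to be calculated.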

A specific implementation of step S1 is as follows:

(1) Extraction of maize genomic DNA

In this example, the third leaf at the two-leaf-one-heart stage of four common maize inbred lines, B73, Mo17, W22 and SK, was used as material, and genomic DNA was extracted by the 2% CTAB method. The DNA of the four inbred lines was mixed in equal mass to prepare the mixed DNA sample to be tested.

The specific steps are: place an appropriate amount of leaf tissue in a 1.5 mL sterile centrifuge tube and grind it with a mortar under liquid nitrogen; quickly add 750 μL of 2% CTAB extraction buffer and vortex to mix; incubate in a 65 °C water bath for 45 min, shaking gently a few times every 10 min; take out the tube, add 750 μL of chloroform:isoamyl alcohol (24:1) and shake gently for a moment; centrifuge at 8000 r/min for 10 min at room temperature and transfer about 500 μL of supernatant to a new 1.5 mL centrifuge tube; add 500 μL of pre-cooled (-20 °C) isopropanol, mix by gentle shaking and let stand for a moment; centrifuge at 12000 r/min for 2 min at room temperature and discard the supernatant; wash with 500 μL of 75% ethanol, centrifuge at 12000 r/min for 1 min at room temperature and discard the supernatant, then repeat the wash once; after air-drying, add 100 μL of ddH₂O to dissolve the DNA fully; mix the DNA of the four inbred lines in equal mass to prepare the mixed DNA sample to be tested.

(2) Bisulfite treatment

The mixed DNA sample to be tested was treated with bisulfite using the EZ DNA Methylation-Lightning Kit (Zymo, D5031) and purified to obtain the treated mixed DNA sample. The treatment steps are:

1) Add 130 μL of Zymo Lightning Conversion Reagent to 20 μL (about 1 μg) of the DNA sample to be tested. Mix by pipetting up and down more than 10 times.

2) Split the 150 μL sample from 1) into two 200 μL PCR tubes, 75 μL each, and run the program shown in Table 1 in a PCR machine:

Table 1

Temperature    Time
98 °C          8 min
54 °C          60 min
4 °C           hold

3) Add 600 μL of M-Binding Buffer to a Zymo Spin IC column, transfer the two 75 μL samples from 2) into the column already containing M-Binding Buffer, and mix by inverting more than 10 times. Centrifuge at 13000 rpm for 30 s and discard the flow-through.

4) Add 100 μL of M-Wash Buffer and centrifuge at 13000 rpm for 30 s.

5) Add 200 μL of L-Desulphonation Buffer and let stand at room temperature for 20 min. Centrifuge at 13000 rpm for 30 s and discard the flow-through.

6) Add 200 μL of M-Wash Buffer. Centrifuge at 13000 rpm for 30 s and discard the flow-through.

7) Repeat 6) once.

8) Place the Zymo Spin IC column on a new 1.5 mL centrifuge tube, add 21 μL of ddH₂O, let stand at room temperature for 2 min and centrifuge at 13000 rpm for 30 s. The collected flow-through is the purified, treated maize genome sample. Mixing the maize genome samples of the four common inbred lines B73, Mo17, W22 and SK gives the mixed DNA sample.

S2. First amplification

In the embodiments of this step, the first amplification uses an upstream primer and a first downstream primer to PCR-amplify the target segment and obtain a first amplification product. The first amplification product carries a first recognition sequence and a bridge sequence, so that each DNA molecule of the target segment in the mixed DNA sample is labeled. As shown in FIG. 2, the SNP site in the first amplification product obtained with the upstream primer and the first downstream primer is located within 150 bp of its 3' end.

A specific embodiment of step S2 is as follows:

(1) Selection of the target segment

A target segment suitable for the present application should contain one or more SNP sites capable of distinguishing different maize genomes. The number of SNP sites in the target segment determines how many maize genomes can be mixed into one mixed DNA sample and detected at one time by the method provided in the embodiments of the present application. For example, if the target segment contains one SNP site with two polymorphic types, the extracted samples of two maize genomes can be mixed into one mixed DNA sample, and so on.
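Whether a set of SNP sites is sufficient for a given mix of genomes can be checked mechanically: every genome must show a different combination of alleles across the sites. The sketch below illustrates this check and the assignment of a read to its genome of origin. The allele strings are invented for illustration and are not the real alleles of B73, Mo17, W22 or SK.

```python
def are_distinguishable(haplotypes):
    """True when every genome has a unique allele combination at the SNP sites."""
    return len(set(haplotypes.values())) == len(haplotypes)


def assign_genome(observed_alleles, haplotypes):
    """Return the single genome whose SNP alleles match the read, else None."""
    matches = [name for name, alleles in haplotypes.items()
               if alleles == observed_alleles]
    return matches[0] if len(matches) == 1 else None


# Hypothetical alleles at two SNP sites inside one target segment.
haplotypes = {"B73": "AA", "Mo17": "AG", "W22": "GA", "SK": "GG"}

# One biallelic SNP can separate only two genomes.
single_snp = {"B73": "A", "Mo17": "G", "W22": "A"}

origin = assign_genome("GA", haplotypes)
```

With a single two-allele site only two genomes can be separated, which matches the example in the text; adding sites increases the number of distinct allele combinations available.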

In a specific embodiment, segment 1 (1:83554963-83555226) and segment 2 (7:46311503-46311767) of the maize B73 RefGen_v4 genome are used as target segments; both segments contain multiple SNP sites capable of distinguishing the sequences of the four genomes. As shown in FIG. 3, the four common maize inbred lines to be assayed for DNA methylation, B73, Mo17, W22 and SK, all contain these two target segments; the red dots in the figure represent SNP sites.

(2) Primer design

An upstream primer and a first downstream primer are designed for segment 1 (1:83554963-83555226) and for segment 2 (7:46311503-46311767). The C bases in the upstream primer sequence are designed as the degenerate base Y; the G bases in the first downstream primer sequence are designed as the degenerate base R; and the first downstream primer comprises a first recognition sequence and a bridge sequence.
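The degenerate substitution can be expressed in a few lines. In IUPAC notation Y stands for C or T and R stands for A or G, so a primer written this way can anneal whether or not a given cytosine in the template was converted. The sketch below applies the substitution to the target-specific part of a primer only; the input sequence is a made-up example, not one of the primers in Table 2.

```python
def degenerate_upstream(target_specific):
    """Replace every C with Y (C/T) in the target-specific part of the upstream primer."""
    return target_specific.upper().replace("C", "Y")


def degenerate_downstream(target_specific):
    """Replace every G with R (A/G) in the target-specific part of the first downstream primer."""
    return target_specific.upper().replace("G", "R")


def n_variants(degenerate_primer):
    """Number of concrete sequences in the degenerate pool (each Y or R doubles it)."""
    return 2 ** sum(degenerate_primer.count(code) for code in "YR")


# Hypothetical 8-base target-specific region.
up = degenerate_upstream("ATCGGCTA")
down = degenerate_downstream("ATCGGCTA")
```

Each degenerate position doubles the number of oligonucleotide species in the synthesized pool, which is one reason to keep the number of C (or G) bases in the primer binding site low.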

In some embodiments, the first recognition sequence is a sequence of 6 random bases, such as CCCCCC, TTTTTT, GGGGGG, AAAAAA, AGAGGG or CCCGGG (the underlined sequences in Table 2); the first recognition sequence is used to label each DNA molecule of the target segment in the mixed DNA sample.
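A six-base random sequence offers 4^6 = 4096 possible labels. The sketch below counts the label space and draws a random label; it illustrates the arithmetic only, since in practice the random bases are introduced during primer synthesis rather than by software. The function names are hypothetical.

```python
import random


def n_possible_sequences(length=6):
    """Number of distinct sequences of the given length over A, C, G, T."""
    return 4 ** length


def random_recognition_sequence(length=6, rng=None):
    """Draw one random sequence; pass a seeded random.Random for reproducibility."""
    rng = rng or random.Random()
    return "".join(rng.choice("ACGT") for _ in range(length))


label_space = n_possible_sequences()  # 4 ** 6
example_label = random_recognition_sequence(rng=random.Random(0))
```

Two reads from the same sample that share both the target sequence and the label are most likely copies of one original molecule, which is the basis for deduplication.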

在一些实施例中，桥序列在最终的第二扩增产物中起到连接第一识别序列和第二识别序列的作用。具体的，桥序列的长度不受限制，只需起到连接和减少对目标区段序列的干扰即可，延长桥序列可以增加第二次PCR扩增实验的稳定性，缩短桥序列可以减少引物合成成本，例如可为一个长度为9～18bp的短序列(如表2中的加粗序列)，一些实施例中，桥序列的核苷酸序列如表2所示。In some embodiments, the bridge sequence acts to link the first recognition sequence and the second recognition sequence in the final second amplification product. Specifically, the length of the bridge sequence is not limited, it only needs to connect and reduce the interference to the target segment sequence. Extending the bridge sequence can increase the stability of the second PCR amplification experiment, and shortening the bridge sequence can reduce the number of primers. The synthesis cost can be, for example, a short sequence with a length of 9-18 bp (such as the bold sequence in Table 2). In some embodiments, the nucleotide sequence of the bridge sequence is shown in Table 2.

Table 2

(3) First amplification reaction

In some embodiments, multiple target segments are set and primers are designed for each of them. Either one first amplification reaction is performed per target segment with its own primer pair, or the primers for all target segments are mixed and a single first amplification reaction is performed.

In a specific embodiment, the upstream primers and first downstream primers designed for segment 1 and segment 2 are mixed and added together to one reaction, with the DNA mixed sample "converted" in step S1 as the template, and a single first amplification PCR is run to obtain the first amplification product.

In some embodiments, the reaction system of the first amplification reaction is as shown in Table 3. In Table 3, the upstream primer may be a mixture of the upstream primers designed for multiple target segments, and the first downstream primer may be a mixture of the first downstream primers designed for multiple target segments.

Table 3 First amplification PCR reaction system

To increase the specificity of the amplification, some embodiments provide first amplification reaction steps comprising:

S21. Pre-denaturation at 95°C for 5 min;

S22. A first amplification cycle, comprising a first melting treatment, a first annealing treatment and a first extension treatment performed in sequence;

S23. A second amplification cycle, comprising a second melting treatment, a second annealing treatment and a second extension treatment performed in sequence;

S24. Treatment at 72°C for 5 min and at 12°C for 1 s.

In some embodiments of step S22, the first amplification cycle is run for at least 8 to 12 cycles, so as to raise the concentration of the substrate strands sufficiently and improve the synthesis efficiency of the first amplification product.

In some embodiments of step S22, the first melting treatment is performed at 95°C for 30 s.

In some embodiments of step S22, the temperature of the first annealing treatment is lowered step by step as the cycle number of the first amplification cycle increases. This not only raises the concentration of the substrate strands and improves the synthesis efficiency of the first amplification product, but also improves the specificity of the amplification and reduces the probability that the first amplification product contains target fragments lacking the first recognition sequence.

In some embodiments of step S22, the temperature of the first annealing treatment runs from 68 to 55°C, 65 to 53°C, 65 to 55°C or 63 to 55°C. For example, when it runs from 65 to 55°C, the first amplification cycle is performed 11 times and the annealing temperature drops with each cycle: 65°C in the 1st cycle, 64°C in the 2nd cycle, and so on down to 55°C in the 11th cycle.
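
The stepwise annealing schedule can be written out explicitly. The sketch below assumes a fixed decrement of 1°C per cycle, as in the 65 to 55°C example; the function name is illustrative.

```python
from typing import List


def touchdown_temperatures(start_c: int, end_c: int, step_c: int = 1) -> List[int]:
    """Annealing temperature for each first amplification cycle, highest first."""
    if step_c <= 0:
        raise ValueError("step_c must be positive")
    if start_c < end_c:
        raise ValueError("start_c must not be lower than end_c")
    # range() excludes its stop value, so stop one step below end_c.
    return list(range(start_c, end_c - 1, -step_c))


schedule = touchdown_temperatures(65, 55)
for cycle, temperature in enumerate(schedule, start=1):
    print(f"cycle {cycle}: anneal at {temperature} C")
```

With a 1°C step, the 65 to 55°C range yields exactly the 11 cycles stated in the example.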

In some embodiments of step S23, the second amplification cycle is run for at least 20 to 24 cycles to raise the concentration of the first amplification product.

In a specific embodiment of step S23, 24 second amplification cycles are performed, each consisting of 95°C for 30 s, 60°C for 30 s and 70°C for 30 s in sequence.

Table 4 A specific example of the first amplification reaction steps S21 to S24

(4) Purification of the first amplification product

In this step, the first amplification product may be purified by well-known techniques. For example, as shown in Figure 6, the first amplification product is purified with magnetic beads: Beckman magnetic beads are added at 2.2 times the volume of the PCR product, and the product is finally eluted with 30 μL ddH₂O to obtain the purified first amplification product.

S3. Second amplification

In embodiments of this step, the upstream primer and a second downstream primer are used to amplify the first amplification product as the template, yielding the second amplification product. The second amplification product comprises the first amplification product and a second recognition sequence attached to the 3' end of the first amplification product; the second recognition sequence identifies the first amplification product.

In some embodiments, because the first downstream primer contains the bridge sequence, the 3' end of the second amplification product has the structure first recognition sequence, bridge sequence, second recognition sequence, and the second recognition sequence is used to identify the first amplification product during sequencing. For example, the nucleotide sequences of the second downstream primers are shown in Table 5, in which the bold part is the second recognition sequence and the remainder is the bridge sequence.

Table 5 Artificial sequences

In some embodiments, if multiple first amplification reactions are performed on multiple maize genomes and yield multiple first amplification products, one second recognition sequence is designed for each first amplification product, so that a corresponding number of second amplification products is obtained.
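
A minimal sketch of this one-to-one assignment follows. It assumes the second downstream primer is the second recognition sequence followed by the bridge sequence, which is how SEQ ID NO: 5 to 21 in the sequence listing read (eight variable bases followed by a constant nine-base tail); the product labels and function names are hypothetical.

```python
from typing import Dict, Sequence

# Constant 3' tail shared by SEQ ID NO: 5 to 21 (interpreted here as the bridge).
BRIDGE = "ATAGCGACG"


def second_downstream_primer(recognition: str, bridge: str = BRIDGE) -> str:
    """Second recognition sequence followed by the bridge sequence."""
    return recognition.upper() + bridge


def assign_recognition_sequences(products: Sequence[str],
                                 recognitions: Sequence[str]) -> Dict[str, str]:
    """Give each first amplification product its own second downstream primer."""
    if len(set(recognitions)) != len(recognitions):
        raise ValueError("second recognition sequences must be unique")
    if len(products) > len(recognitions):
        raise ValueError("not enough second recognition sequences")
    return {name: second_downstream_primer(rec)
            for name, rec in zip(products, recognitions)}


# Hypothetical product labels; the recognition sequences are the first
# eight bases of SEQ ID NO: 5 and SEQ ID NO: 6.
primers = assign_recognition_sequences(["product_A", "product_B"],
                                       ["atcacgtt", "cgatgttt"])
for name, primer in primers.items():
    print(name, primer)
```

The uniqueness check matters because two products sharing a second recognition sequence could no longer be separated after sequencing.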

A specific reaction system and reaction program for the second amplification reaction are shown in Tables 6 and 7, respectively.

Some embodiments provide second amplification reaction steps comprising:

S31. Pre-denaturation at 95°C for 5 min;

S32. An amplification cycle, comprising a melting treatment, an annealing treatment and an extension treatment performed in sequence; and

S33. Treatment at 72°C for 5 min and at 12°C for 1 s.

In embodiments of step S32, the melting treatment is performed at 95°C for 30 s, the annealing treatment at 60°C for 30 s and the extension treatment at 72°C for 30 s. These conditions ensure that the second downstream primer specifically recognizes the first amplification product, which gives the amplification its specificity.

Table 6 Second amplification PCR reaction system

Table 7 A specific second amplification PCR program for S31 to S33

(4) Purification of the second amplification product

In this step, the second amplification product may be purified by well-known techniques. For example, as shown in Figure 7, the second amplification product is purified with magnetic beads: Beckman magnetic beads are added at 2.2 times the volume of the PCR product, and the product is finally eluted with 30 μL ddH₂O to obtain the purified second amplification product.

Thus, the embodiments of the present application in substance also disclose a kit for constructing a maize genomic DNA methylation detection library comprising a target segment. The target segment contains one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to test a DNA mixed sample of at least two maize genomes containing the target segment, wherein the kit comprises:

Method for detecting DNA methylation of a maize genome comprising a target segment

In addition, the embodiments of the present application also disclose a method for detecting DNA methylation of a maize genome comprising a target segment, the method comprising:

Steps S1 to S3, to construct the maize genomic DNA methylation detection library comprising the target segment;

S5. Sequencing the DNA methylation detection library to obtain a first reads library;

S6. Performing quality control, splitting and pairing on the first reads library to obtain a second reads library;

S7. Aligning, deduplicating and calculating on the second reads library to obtain the DNA methylation information of each maize genome in the DNA mixed sample.

In embodiments of step S5, the constructed DNA methylation detection library is sequenced with second-generation high-throughput sequencing technology. The first reads library contains first reads corresponding in number to the DNA methylation detection library. The term "reads" denotes short sequenced fragments, i.e. the raw sequencing data produced by a second-generation high-throughput sequencer. The second reads library consists of second reads grouped by identical second recognition sequence; the second reads are reads from which the first recognition sequence, the second recognition sequence and the bridge sequence have been removed.

In embodiments of step S6, the step specifically comprises S61 to S63.

S61. Obtain the reference sequences of the target segments, for example the sequences of segment 1 and segment 2 described above in the genomes of the four maize inbred lines B73, Mo17, W22 and SK.

S62. Perform quality control on the first reads library to remove low-quality first reads. Specifically, reads may be removed using the commonly used sequencing quality threshold Q20 as the cutoff.
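
One possible form of such a filter is sketched below. The text does not say how Q20 is applied, so the sketch assumes that a read passes when its mean Phred score is at least 20 and that qualities use the Phred+33 encoding; the record layout (name, sequence, quality string) and function names are illustrative.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str]  # (read name, sequence, quality string)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def passes_quality(quality: str, threshold: int = 20) -> bool:
    """True if the mean Phred score reaches the threshold (assumed criterion)."""
    scores = phred_scores(quality)
    if not scores:
        return False
    return sum(scores) / len(scores) >= threshold


def filter_reads(records: Iterable[Record], threshold: int = 20) -> List[Record]:
    """Keep only the reads that pass the quality criterion."""
    return [record for record in records if passes_quality(record[2], threshold)]


# Toy records: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
reads = [("read1", "ACGT", "IIII"), ("read2", "ACGT", "####")]
print(filter_reads(reads))
```

A per-base criterion (for example a minimum fraction of bases at or above Q20) would be an equally plausible reading and only requires changing `passes_quality`.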

S63. Split and pair the first reads in the quality-controlled first reads library to obtain the second reads library. "Splitting" means separating first reads that carry different second recognition sequences; "pairing" means pooling first reads that carry the same second recognition sequence. For example, if there are multiple second recognition sequences, a corresponding number of first-read groups is obtained.

In some embodiments of step S63, as shown in Table 5, 17 second recognition sequences are designed in total, which yields 17 groups of first reads. The first reads of each group are pooled into one fq file per group, and the first recognition sequence, the bridge sequence and the second recognition sequence are removed from every first read, giving the second reads library. The first recognition sequence of each first read is output at the same time.
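
The splitting and trimming described above can be sketched as follows. The sketch assumes that each read starts with the second recognition sequence, followed by the bridge sequence, the 6-base first recognition sequence and then the target sequence, and that the tags must match exactly; the read orientation, the constants and the function names are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

BRIDGE = "ATAGCGACG"          # constant tail of SEQ ID NO: 5 to 21 (interpretation)
FIRST_RECOGNITION_LENGTH = 6  # the first recognition sequence has 6 random bases

Record = Tuple[str, str, str]        # (name, sequence, quality)
Trimmed = Tuple[str, str, str, str]  # (name, first recognition seq, sequence, quality)


def split_read(record: Record,
               recognitions: Sequence[str]) -> Optional[Tuple[str, Trimmed]]:
    """Return (second recognition sequence, trimmed read), or None if no tag matches."""
    name, seq, qual = record
    for recognition in recognitions:
        prefix = recognition + BRIDGE
        if not seq.startswith(prefix):
            continue
        start = len(prefix)
        end = start + FIRST_RECOGNITION_LENGTH
        if len(seq) <= end:
            return None  # nothing left of the target after trimming
        return recognition, (name, seq[start:end], seq[end:], qual[end:])
    return None


def demultiplex(records: Iterable[Record],
                recognitions: Sequence[str]) -> Dict[str, List[Trimmed]]:
    """Group trimmed reads by their second recognition sequence."""
    groups: Dict[str, List[Trimmed]] = {rec: [] for rec in recognitions}
    for record in records:
        hit = split_read(record, recognitions)
        if hit is not None:
            groups[hit[0]].append(hit[1])
    return groups


# Toy read: tag ATCACGTT, bridge, first recognition sequence ACGTAC, target TTTTGAGT.
toy = ("read1", "ATCACGTT" + BRIDGE + "ACGTAC" + "TTTTGAGT", "I" * 31)
groups = demultiplex([toy], ["ATCACGTT", "CGATGTTT"])
print(groups)
```

Each group in the returned dictionary corresponds to one per-group fq file, and the retained first recognition sequence is what the later deduplication step relies on.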

In a specific embodiment of step S7, aligning the second reads library comprises aligning the second reads of each group to the reference sequences such that C/T mismatches are allowed at C sites and no mismatch is allowed at any other site; the reference sequences are the maize genome sequences containing the target segments. For example, the sequences of 1:83554963-83555226 and 7:46311503-46311767 in the four maize inbred lines B73, Mo17, W22 and SK are taken as reference sequences and compiled into fa format. The split reads are then aligned to these reference sequences with the Bsmap alignment software in strict mode with no mismatches allowed (-v 0), and the resulting bam file is indexed with samtools. Note that the no-mismatch alignment is essential here: each new fq file obtained after processing by second recognition sequence contains reads from a mixture of four genome samples, and it is the SNPs between the genomes that allow each read to be assigned to its corresponding genome during alignment.
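
The matching rule can be made concrete with a small stand-in for the aligner. The sketch below is not Bsmap's algorithm: it assumes reads and references are already the same length and on the same strand, accepts a T in the read only where the reference has C, and assigns a read to a genome only when exactly one reference matches. The toy reference sequences and function names are hypothetical.

```python
from typing import Dict, Optional


def bisulfite_match(read: str, reference: str) -> bool:
    """True if the read equals the reference except for C-to-T changes at C sites."""
    if len(read) != len(reference):
        return False
    for read_base, ref_base in zip(read, reference):
        if read_base == ref_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue  # converted, i.e. unmethylated, cytosine
        return False
    return True


def assign_genome(read: str, references: Dict[str, str]) -> Optional[str]:
    """Return the single genome whose reference matches, else None."""
    hits = [name for name, ref in references.items() if bisulfite_match(read, ref)]
    return hits[0] if len(hits) == 1 else None


# Toy references that differ by one C/G SNP at the fifth base.
references = {"B73": "ACGTCA", "Mo17": "ACGTGA"}
print(assign_genome("ATGTTA", references))
print(assign_genome("ATGTGA", references))
print(assign_genome("ATGTAA", references))
```

One consequence of the rule is that a SNP whose alleles are C and T cannot separate genomes on this strand when the C is unmethylated, because the converted read matches both references and is left unassigned.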

In a specific embodiment of step S7, deduplicating the second reads library comprises removing, from the aligned reads, duplicate reads caused by PCR amplification. For example, based on the first recognition sequence, duplicates caused by PCR amplification are removed from the preliminary DNA methylation alignment information to obtain deduplicated DNA methylation alignment information.
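
A minimal sketch of this deduplication follows. It assumes that reads sharing the same genome, target segment and first recognition sequence derive from one template molecule, and it keeps the first read seen for each such combination; keeping the first read rather than building a consensus is a simplification, and the tuple layout is illustrative.

```python
from typing import Iterable, List, Set, Tuple

# (genome, segment, first recognition sequence, read sequence)
Alignment = Tuple[str, str, str, str]


def deduplicate(alignments: Iterable[Alignment]) -> List[Alignment]:
    """Keep one read per (genome, segment, first recognition sequence)."""
    seen: Set[Tuple[str, str, str]] = set()
    kept: List[Alignment] = []
    for genome, segment, recognition, sequence in alignments:
        key = (genome, segment, recognition)
        if key in seen:
            continue  # PCR duplicate of a molecule already counted
        seen.add(key)
        kept.append((genome, segment, recognition, sequence))
    return kept


aligned = [
    ("B73", "segment 1", "ACGTAC", "ATGTTA"),
    ("B73", "segment 1", "ACGTAC", "ATGTTA"),   # duplicate of the first read
    ("B73", "segment 1", "GGATCC", "ACGTTA"),
    ("Mo17", "segment 1", "ACGTAC", "ATGTGA"),  # same tag, different genome
]
print(deduplicate(aligned))
```

Including the genome and segment in the key keeps reads apart that happen to carry the same random first recognition sequence but come from different templates.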

In a specific embodiment of step S7, the calculation on the second reads library comprises counting, over the aligned and deduplicated reads, the number of C bases and T bases aligned at each C site of each target segment in each sample, and calculating the DNA methylation level of each C site as: number of C bases at the site / (number of C bases at the site + number of T bases at the site).
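
The formula can be applied per site as in the sketch below, which assumes the deduplicated reads are ungapped and start at the first base of the reference; positions are reported 0-based, and the function name and toy data are illustrative.

```python
from collections import Counter
from typing import Dict, Sequence


def methylation_levels(reads: Sequence[str], reference: str) -> Dict[int, float]:
    """Methylation level C / (C + T) for every C site of the reference."""
    levels: Dict[int, float] = {}
    for position, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        counts = Counter(read[position] for read in reads if position < len(read))
        c_count, t_count = counts["C"], counts["T"]
        if c_count + t_count > 0:  # skip sites without coverage
            levels[position] = c_count / (c_count + t_count)
    return levels


# Toy example: two C sites (0-based positions 1 and 3) covered by four reads.
reference = "ACGC"
reads = ["ACGT", "ATGT", "ACGC", "ACGT"]
print(methylation_levels(reads, reference))
```

In the toy data the first C site is read as C in three of four reads and the second in one of four, giving levels of 0.75 and 0.25.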

A specific result obtained with the detection method of steps S1 to S7 is shown in Figure 4. In addition, published whole-genome DNA methylation sequencing data for leaf tissue of the maize inbred lines B73 and Mo17 were obtained from the NCBI public database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128859), and the DNA methylation levels of target segments 1 and 2 obtained by whole-genome DNA methylation sequencing (the conventional method) were calculated, as shown in Figure 5. Comparing Figures 4 and 5 shows that the DNA methylation levels of target segments 1 and 2 in B73 and Mo17 measured by the method of the present application are consistent with those detected by the conventional method.

From the above, the embodiments of the present application in substance also disclose a system for detecting DNA methylation of a maize genome comprising a target segment. The target segment contains one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to test a DNA mixed sample of at least two maize genomes containing the target segment, wherein the system comprises:

the kit provided in the foregoing embodiments, for obtaining the DNA methylation information of each maize genome in the DNA mixed sample; and

a processing device for running a program which, when run, executes the steps of the detection method provided in the above embodiments from "sequencing the DNA methylation detection library to obtain the first reads library" onward.

The above are only preferred specific embodiments of the present application, and the scope of protection of the present application is not limited to them. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application.

Sequence listing

<110> Huazhong Agricultural University

<120> DNA methylation detection of target segments of the maize genome

<160> 21

<170> SIPOSequenceListing 1.0

<210> 1

<211> 21

<212> DNA

<213> Artificial Sequence

<400> 1

gtatyggtgg ygtgtggaat g 21

<210> 2

<211> 21

<212> DNA

<213> Artificial Sequence

<400> 2

gaagtggtga yyagyagtgt g 21

<210> 3

<211> 39

<212> DNA

<213> Artificial Sequence

<220>

<221> misc_feature

<222> (10)..(15)

<223> n is a, c, g, or t

<400> 3

atagcgacgn nnnnncccca atraaaraaa carcaactc 39

<210> 4

<211> 38

<212> DNA

<213> Artificial Sequence

<220>

<221> misc_feature

<222> (10)..(15)

<223> n is a, c, g, or t

<400> 4

atagcgacgn nnnnncttca ctraccttcc aartcctc 38

<210> 5

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 5

atcacgttat agcgacg 17

<210> 6

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 6

cgatgtttat agcgacg 17

<210> 7

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 7

ttaggcatat agcgacg 17

<210> 8

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 8

tgaccactat agcgacg 17

<210> 9

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 9

acagtggtat agcgacg 17

<210> 10

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 10

gccaatgtat agcgacg 17

<210> 11

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 11

cagatctgat agcgacg 17

<210> 12

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 12

acttgatgat agcgacg 17

<210> 13

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 13

gatcagcgat agcgacg 17

<210> 14

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 14

tagcttgtat agcgacg 17

<210> 15

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 15

ggctacagat agcgacg 17

<210> 16

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 16

cttgtactat agcgacg 17

<210> 17

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 17

tggttgttat agcgacg 17

<210> 18

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 18

tctcggttat agcgacg 17

<210> 19

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 19

taagcgttat agcgacg 17

<210> 20

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 20

tccgtcttat agcgacg 17

<210> 21

<211> 17

<212> DNA

<213> Artificial Sequence

<400> 21

ttctgtgtat agcgacg 17

Claims

1. A method for constructing a DNA methylation detection library of a maize genome comprising a target segment, wherein the target segment comprises one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to test a DNA mixed sample of at least two maize genomes comprising the target segment, the construction method comprising:

converting unmethylated cytosine in DNA molecules in the DNA mixed sample into uracil;

performing a first amplification, the first amplification comprising amplifying the DNA mixed sample in a single PCR reaction system to obtain a first amplification product, the first amplification product comprising a DNA molecule having the target segment and a first recognition sequence and a bridge sequence attached to the 3' end of the DNA molecule, the first recognition sequence being used to identify each DNA molecule of the target segment in the DNA mixed sample;

performing a second amplification, the second amplification comprising performing PCR amplification on the first amplification product to obtain a second amplification product, the second amplification product comprising the first amplification product and a second recognition sequence attached to the 3' end of the first amplification product, the second recognition sequence being used to identify the first amplification product; and

using the second amplification product to construct the DNA methylation detection library of the maize genome comprising the target segment.

2. The construction method according to claim 1, wherein the first amplification uses an upstream primer and a first downstream primer, the C bases in the upstream primer are designed as the degenerate base Y, the G bases in the first downstream primer are designed as the degenerate base R, and the first downstream primer comprises the first recognition sequence and the bridge sequence;

the SNP site in the first amplification product is located within 150 bp of the position at which the upstream primer, or the first downstream primer, starts to bind the DNA molecule of the target segment, so as to ensure that the SNP site can be detected in subsequent sequencing.

3. The construction method according to claim 2, wherein the reaction steps of the first amplification comprise:

pre-denaturation at 95°C for 5 min;

performing a first amplification cycle, the first amplification cycle comprising a first melting treatment, a first annealing treatment and a first extension treatment performed in sequence;

performing a second amplification cycle, the second amplification cycle comprising a second melting treatment, a second annealing treatment and a second extension treatment performed in sequence; and

treatment at 72°C for 5 min and at 12°C for 1 s;

wherein the first amplification cycle comprises at least 8 to 12 cycles, and the temperature of the first annealing treatment is lowered step by step as the cycle number of the first amplification cycle increases.

4. The construction method according to claim 3, wherein the temperature of the first annealing treatment runs from 68 to 55°C, 65 to 53°C, 65 to 55°C or 63 to 55°C.

5. The construction method according to claim 1, wherein the second amplification uses the upstream primer and a second downstream primer, the second downstream primer comprises the second recognition sequence, and the reaction steps of the second amplification comprise:

performing an amplification cycle, the amplification cycle comprising a melting treatment, an annealing treatment and an extension treatment performed in sequence; and

treatment at 72°C for 5 min and at 12°C for 1 s.

6. The construction method according to claim 5, wherein the second amplification product further comprises the bridge sequence, located between the first recognition sequence and the second recognition sequence attached at the 3' end of the first amplification product, the bridge sequence being used to connect the first recognition sequence and the second recognition sequence.

7. A method for detecting DNA methylation in a target segment of a maize genome, wherein the target segment comprises one or more SNP sites capable of distinguishing different maize genomes, and the method is used to test a DNA mixed sample of at least two maize genomes comprising the target segment, the detection method comprising:

constructing, by the construction method according to any one of claims 1 to 6, a DNA methylation detection library of the maize genome comprising the target segment;

sequencing the DNA methylation detection library to obtain a first reads library;

performing quality control, splitting and pairing on the first reads library to obtain a second reads library; and

aligning, deduplicating and calculating on the second reads library to obtain the DNA methylation information of each maize genome in the DNA mixed sample.

8. The detection method according to claim 7, wherein the first reads library comprises first reads corresponding in number to the DNA methylation detection library, and the second reads library comprises second reads grouped by identical second recognition sequence, the second reads being reads from which the first recognition sequence, the second recognition sequence and the bridge sequence have been removed.

9. A kit for constructing a DNA methylation detection library of a maize genome comprising a target segment, wherein the target segment comprises one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to test a DNA mixed sample of at least two maize genomes comprising the target segment, the kit comprising:

a maize genomic DNA processing reagent for extracting the maize genomic DNA and converting unmethylated cytosine into uracil;

an upstream primer, in whose sequence the C bases are replaced with the degenerate base Y;

a first downstream primer, in whose sequence the G bases are replaced with the degenerate base R, the first downstream primer comprising a first recognition sequence and a bridge sequence; and

a second downstream primer comprising the bridge sequence and a second recognition sequence;

wherein the upstream primer and the first downstream primer are used to perform a first amplification to obtain a first amplification product, the first amplification product comprising a DNA molecule having the target segment and the first recognition sequence and the bridge sequence attached to the 3' end of the DNA molecule;

the upstream primer and the second downstream primer are used in a second amplification to obtain a second amplification product, the second amplification product comprising the first amplification product and the second recognition sequence attached to the 3' end of the first amplification product, so as to construct the DNA methylation detection library; and

the first recognition sequence is used to identify each DNA molecule of the target segment in the DNA mixed sample, and the second recognition sequence is used to identify the first amplification product.

10. A system for detecting DNA methylation of a maize genome comprising a target segment, wherein the target segment comprises one or more SNP sites capable of distinguishing different maize genomes, and the DNA methylation detection library is used to test a DNA mixed sample of at least two maize genomes comprising the target segment, the system comprising:

the kit of claim 9, for obtaining the DNA methylation information of each maize genome in the DNA mixed sample; and

a processing device for running a program which, when run, executes the following steps of the detection method of claim 7 or 8: "sequencing the DNA methylation detection library to obtain the first reads library; aligning, deduplicating and calculating on the second reads library to obtain the DNA methylation information of each maize genome in the DNA mixed sample".