HK1228461B - A method for dna amplification - Google Patents
A method for DNA amplification
- Publication number: HK1228461B (HK17101943.7A)
- Authority: HK (Hong Kong)
- Prior art keywords: sequence, primer, temperature, universal, DNA
Description
Technical Field
The present invention relates to a method for amplifying DNA, and in particular to a method for amplifying the whole-genome DNA of a single cell and sequencing it.
Background Art
Single-cell whole-genome sequencing is a new technology that amplifies and sequences the entire genome at the single-cell level. Its principle is to amplify the trace amount of whole-genome DNA from an isolated single cell, obtain a complete genome with high coverage, and then perform high-throughput sequencing.
Currently, there are four main types of whole-genome amplification technology: Primer Extension Preamplification-Polymerase Chain Reaction (PEP-PCR; for the method, see Zhang L, Cui X, Schmitt K, Hubert R, Navidi W, Arnheim N. 1992. Whole genome amplification from a single cell: implications for genetic analysis. Proc Natl Acad Sci U S A. 89(13):5847-51); Degenerate Oligonucleotide-Primed Polymerase Chain Reaction (DOP-PCR; see Telenius H, Carter NP, Bebb CE, Nordenskjöld M, Ponder BA, Tunnacliffe A. 1992. Degenerate oligonucleotide-primed PCR: general amplification of target DNA by a single degenerate primer. Genomics 13:718-25); Multiple Displacement Amplification (MDA; see Dean FB, Nelson JR, Giesler TL, Lasken RS. 2001. Rapid amplification of plasmid and phage DNA using phi29 DNA polymerase and multiply-primed rolling circle amplification. Genome Res. 11:1095-99); and Multiple Annealing and Looping Based Amplification Cycles (MALBAC; see PCT patent application WO2012166425).
Gene sequencing technology has passed through three stages of development. First-generation DNA sequencing technologies include the chemical degradation method, the dideoxy chain termination method, and the various sequencing technologies developed from them; the most representative is the chain termination method proposed by Sanger and Coulson in 1975. First-generation technology offers high accuracy and long read lengths and remains the only method capable of "end-to-end" sequencing, but its high cost and low speed mean it is not the ideal sequencing method. The subsequent second- and third-generation technologies share high throughput as a common feature and are also known as "next-generation sequencing (NGS)". Second-generation sequencing was originally typified by pyrosequencing, sequencing by synthesis (SBS), and sequencing by ligation. After several years of development, pyrosequencing and sequencing by ligation are now rarely used, and the mainstream second-generation technologies are sequencing by synthesis, semiconductor sequencing, and CG sequencing. Third-generation sequencing technologies fall broadly into two categories: single-molecule fluorescence sequencing, represented by TSMS and SMRT, and nanopore single-molecule sequencing. Compared with the previous two generations, the defining feature of third-generation sequencing is that it sequences single molecules. Although third-generation sequencing has made some progress, the mainstream sequencing technology at this stage is still second-generation sequencing.
The whole-genome sequences produced by current whole-genome amplification technologies cannot be used directly for second-generation sequencing. Therefore, whether such a whole-genome sequence is to be sequenced by sequencing by synthesis, semiconductor sequencing, or CG sequencing, a library preparation process is required before loading onto the sequencer. Each sequencing technology has its own library preparation method. Library preparation for sequencing-by-synthesis platforms falls into two main categories: adding Y-shaped adapters or stem-loop adapters to fragmented DNA after end repair, and transposon technology. Library preparation for semiconductor sequencing platforms likewise falls into two categories: adding adapters to fragmented DNA after end repair, and transposon technology. Library preparation for the CG platform is more complicated: after end repair, the fragmented DNA requires enzyme digestion and two rounds of circularization, which makes the operation cumbersome and time-consuming.
When products amplified by the current mainstream amplification methods are used with the sequencing technologies above, either a separate library construction step is required or the sequencing results are poor. There is therefore an urgent need for an improved amplification method that overcomes one, several, or all of the drawbacks of the mainstream amplification methods.
Summary of the Invention
The present invention provides a method for amplifying the genomic DNA of a cell and a kit for amplifying genomic DNA.
In one aspect of the present application, a method for amplifying genomic DNA is provided, the method comprising:
(a) providing a first reaction mixture comprising a sample containing the genomic DNA, a first primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the first primer comprises, from the 5' end to the 3' end, a universal sequence and a first variable sequence; the first variable sequence comprises a first random sequence, which from the 5' end to the 3' end is Xa1Xa2...Xan; every Xai (i = 1 to n) of the first random sequence belongs to the same set, the set being selected from B, D, H, or V, wherein B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}; Xai denotes the i-th nucleotide from the 5' end of the first random sequence; and n is a positive integer selected from 3 to 20; optionally, the first reaction mixture further comprises a third primer, wherein the third primer comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence; the third variable sequence comprises a third random sequence, which from the 5' end to the 3' end is Xb1Xb2...Xbn; every Xbi (i = 1 to n) of the third random sequence belongs to the same set, the set being selected from B, D, H, or V as defined above; Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets; Xbi denotes the i-th nucleotide from the 5' end of the third random sequence; and n is a positive integer selected from 3 to 20;
(b) subjecting the first reaction mixture to a first temperature cycling program for pre-amplification to obtain a pre-amplification product;
(c) providing a second reaction mixture comprising the pre-amplification product obtained in step (b), a second primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the second primer comprises or consists of, from the 5' end to the 3' end, a specific sequence and the universal sequence;
(d) subjecting the second reaction mixture to a second temperature cycling program for amplification to obtain an amplification product.
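The set constraint on the random sequences can be illustrated with a short sketch. It assumes uniform random draws at each position, which the text does not specify, and the function names are hypothetical; the four sets B, D, H, and V are the ones defined above, each omitting exactly one nucleotide.

```python
import random

# Three-base sets named in the text; each omits exactly one nucleotide.
BASE_SETS = {
    "B": "TGC",  # no A
    "D": "ATG",  # no C
    "H": "TAC",  # no G
    "V": "ACG",  # no T
}

def random_sequence(set_name: str, n: int, rng: random.Random) -> str:
    """Draw X1..Xn, taking every position from the same three-base set."""
    if not 3 <= n <= 20:
        raise ValueError("n must be between 3 and 20")
    return "".join(rng.choice(BASE_SETS[set_name]) for _ in range(n))

def check_primer_pair(first_set: str, third_set: str) -> None:
    """The first and third random sequences must draw from different sets."""
    if first_set == third_set:
        raise ValueError("first and third primers must use different sets")
```

Because each set lacks one base, a random sequence drawn from B can never contain A, and one drawn from D can never contain C.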
In some embodiments, every Xai (i = 1 to n) of the first random sequence belongs to set B, and every Xbi (i = 1 to n) of the third random sequence belongs to set D.
In some embodiments, the first variable sequence and the third variable sequence each further comprise a fixed sequence at the 3' end, the fixed sequence being a base combination capable of improving genome coverage. In some embodiments, the fixed sequence is selected from CCC, AAA, TGGG, GTTT, GGG, TTT, TNTNG, or GTGG.
In some embodiments, the first variable sequence is selected from Xa1Xa2...XanTGGG or Xa1Xa2...XanGTTT, and the third variable sequence is selected from Xb1Xb2...XbnTGGG or Xb1Xb2...XbnGTTT.
In some embodiments, the universal sequence is chosen so that it does not substantially bind to genomic DNA and give rise to amplification, and the universal sequence is 6-60 bp long. In some embodiments, the universal sequence is chosen so that the amplification product can be sequenced directly. In some embodiments, the universal sequence is selected from SEQ ID NO: 1 [TTGGTAGTGAGTG], SEQ ID NO: 2 [GAGGTGTGATGGA], SEQ ID NO: 3 [GTGATGGTTGAGGTA], SEQ ID NO: 4 [AGATGTGTATAAGAGACAG], SEQ ID NO: 5 [GTGAGTGATGGTTGAGGTAGTGTGGAG], or SEQ ID NO: 6 [GCTCTTCCGATCT].
In some embodiments, the universal sequence and the first variable sequence are directly connected, or are connected through a first spacer sequence Ya1...Yam, wherein Yaj (j = 1 to m) ∈ {A, T, G, C}, Yaj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1 to 3.
In some embodiments, the universal sequence and the third variable sequence are directly connected, or are connected through a third spacer sequence Yb1...Ybm, wherein Ybj (j = 1 to m) ∈ {A, T, G, C}, Ybj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1 to 3.
In some embodiments, m = 1.
In some embodiments, the first primer comprises GCTCTTCCGATCTYa1Xa1Xa2Xa3Xa4Xa5TGGG, GCTCTTCCGATCTYa1Xa1Xa2Xa3Xa4Xa5GTTT, or a mixture thereof, and the third primer comprises GCTCTTCCGATCTYb1Xb1Xb2Xb3Xb4Xb5TGGG, GCTCTTCCGATCTYb1Xb1Xb2Xb3Xb4Xb5GTTT, or a mixture thereof, wherein Ya1 ∈ {A, T, G, C}, Yb1 ∈ {A, T, G, C}, Xai (i = 1 to 5) ∈ {T, G, C}, and Xbi (i = 1 to 5) ∈ {A, T, G}.
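To make the primer layout concrete, the following sketch enumerates every sequence covered by the formulas above (universal sequence, one spacer base, five random bases, fixed tail). The helper name `expand_primer` is hypothetical, and the enumeration is only an illustration of the notation, not a statement about how the primer pools are synthesized.

```python
from itertools import product

UNIVERSAL = "GCTCTTCCGATCT"  # SEQ ID NO: 6

def expand_primer(random_set: str, fixed_tail: str, n: int = 5) -> list[str]:
    """List every primer: universal + Y (any base) + X1..Xn + fixed tail."""
    spacer_bases = "ATGC"
    return [
        UNIVERSAL + y + "".join(xs) + fixed_tail
        for y in spacer_bases
        for xs in product(random_set, repeat=n)
    ]

# First primer: random positions from {T, G, C}; third primer: from {A, T, G}.
first_primers = expand_primer("TGC", "TGGG") + expand_primer("TGC", "GTTT")
third_primers = expand_primer("ATG", "TGGG") + expand_primer("ATG", "GTTT")
```

Each formula expands to 4 × 3^5 = 972 distinct 23-nt sequences, so each primer mixture written this way covers 1,944 sequences.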
In some embodiments, the method further comprises a step of sequencing the amplification product obtained in step (d), wherein the second primer comprises a sequence that is complementary or identical to part or all of a sequencing primer.
In some embodiments, the universal sequence comprises a sequence that is complementary or identical to part or all of a sequencing primer.
In some embodiments, the specific sequence of the second primer comprises a sequence that is complementary or identical to part or all of a sequencing primer.
In some embodiments, the specific sequence of the second primer further comprises a sequence that is complementary or identical to part or all of a capture sequence of the sequencing platform.
In some embodiments, the sequence within the specific sequence of the second primer that is complementary or identical to part or all of the sequencing primer comprises or consists of SEQ ID NO: 31 [ACACTCTTTCCCTACACGAC] or SEQ ID NO: 32 [GTGACTGGAGTTCAGACGTGT].
In some embodiments, the sequence within the specific sequence of the second primer that is complementary or identical to part or all of the capture sequence of the sequencing platform comprises or consists of SEQ ID NO: 33 [AATGATACGGCGACCACCGAGATCT] or SEQ ID NO: 34 [CAAGCAGAAGACGGCATACGAGAT].
In some embodiments, the specific sequence of the second primer further comprises an identification sequence located between the sequence that is complementary or identical to part or all of the capture sequence of the sequencing platform and the sequence that is complementary or identical to part or all of the sequencing primer.
In some embodiments, the second primer comprises a mixture of primers having the same universal sequence and different specific sequences, the different specific sequences being complementary or identical to part or all of the different primers of the sequencing primer pair used in the same sequencing run.
In some embodiments, the second primer comprises a mixture of the sequences shown in SEQ ID NO: 35 [AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT] and SEQ ID NO: 36 [CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT].
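The two second-primer sequences can be checked against the layout described above (capture sequence, optional identification sequence, sequencing-primer sequence, universal sequence). In this sketch the helper `second_primer` is hypothetical, and reading the six bases CGTGAT in SEQ ID NO: 36 as the identification sequence is an inference from their position, not something the text states.

```python
UNIVERSAL = "GCTCTTCCGATCT"              # SEQ ID NO: 6
SEQ_PRIMER_1 = "ACACTCTTTCCCTACACGAC"    # SEQ ID NO: 31
SEQ_PRIMER_2 = "GTGACTGGAGTTCAGACGTGT"   # SEQ ID NO: 32
CAPTURE_1 = "AATGATACGGCGACCACCGAGATCT"  # SEQ ID NO: 33
CAPTURE_2 = "CAAGCAGAAGACGGCATACGAGAT"   # SEQ ID NO: 34

SEQ_35 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
SEQ_36 = "CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"

def second_primer(capture: str, sequencing: str, universal: str,
                  identifier: str = "") -> str:
    """Assemble 5'->3': capture + identifier + sequencing-primer part + universal."""
    return capture + identifier + sequencing + universal

# SEQ ID NO: 35 carries no identifier between the capture and sequencing parts.
assert second_primer(CAPTURE_1, SEQ_PRIMER_1, UNIVERSAL) == SEQ_35
# SEQ ID NO: 36 has six extra bases in the identifier position (assumed index).
assert second_primer(CAPTURE_2, SEQ_PRIMER_2, UNIVERSAL, identifier="CGTGAT") == SEQ_36
```

Both assertions hold, so each listed second primer ends in the universal sequence SEQ ID NO: 6 at its 3' end.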
In some embodiments, the nucleic acid polymerase is thermostable and/or has strand displacement activity. In some embodiments, the nucleic acid polymerase is selected from: Phi29 DNA polymerase, Bst DNA polymerase, Pyrophage 3137, Vent polymerase, TOPOTaq DNA polymerase, 9°Nm polymerase, Klenow Fragment DNA polymerase I, MMLV reverse transcriptase, AMV reverse transcriptase, HIV reverse transcriptase, T7 phage DNA polymerase variant, super-fidelity DNA polymerase, Taq polymerase, Bst DNA polymerase, E. coli DNA polymerase, LongAmp Taq DNA polymerase, OneTaq DNA polymerase, Deep Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent (exo-) DNA polymerase, and any combination thereof.
In some embodiments, step (b) allows the variable sequence of the first-type primer to pair with the genomic DNA and amplify it to obtain a genomic pre-amplification product, wherein the 5' end of the genomic pre-amplification product comprises the universal sequence and the 3' end comprises the complementary sequence of the universal sequence.
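The end structure of the pre-amplification product can be expressed as a simple check on a strand written 5' to 3'. This is a minimal sketch: the function names are hypothetical, the toy product below is invented for illustration, and "complementary sequence" is read here as the reverse complement when the strand is written 5' to 3'.

```python
COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, so the result reads 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def has_universal_ends(strand: str, universal: str) -> bool:
    """True if the strand starts with the universal sequence and ends with
    its reverse complement."""
    return strand.startswith(universal) and strand.endswith(
        reverse_complement(universal)
    )

universal = "GCTCTTCCGATCT"  # SEQ ID NO: 6
# Invented toy insert between the two ends, for illustration only.
toy_product = universal + "ACGTACGT" + reverse_complement(universal)
```

For SEQ ID NO: 6 the reverse complement is AGATCGGAAGAGC, which is the site the universal sequence at the 3' end of the second primer can pair with in step (d).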
In some embodiments, the first temperature cycling program comprises: (b1) a temperature program capable of opening the DNA double strand to obtain a single-stranded DNA template; (b2) a temperature program capable of allowing the first primer, and the optional third primer, to bind to the single-stranded DNA template; (b3) a temperature program capable of extending, under the action of the nucleic acid polymerase, the first-type primer bound to the single-stranded DNA template to produce a pre-amplification product; and (b4) repeating steps (b1) to (b3) for a specified first number of cycles, wherein the specified first number of cycles is greater than 1.
In some embodiments, in the first cycle, the DNA double strand in step (b1) is the genomic DNA double strand, and the temperature program comprises denaturing at a temperature between 90-95°C for 1-20 minutes. In some embodiments, after the first cycle, the temperature program in step (b1) comprises melting at a temperature between 90-95°C for 3-50 seconds.
In some embodiments, from the second cycle onward, the pre-amplification product comprises a genomic pre-amplification product whose 5' end comprises the universal sequence and whose 3' end comprises the complementary sequence of the universal sequence.
In some embodiments, the method does not include, after step (b1) and before step (b2), an additional step (b2') of subjecting the first reaction mixture to a temperature program that causes the 3' end and the 5' end of the genomic pre-amplification product to hybridize and form a hairpin structure. In some embodiments, step (b2) comprises subjecting the reaction mixture to more than one temperature program so that the first-type primer binds sufficiently and effectively to the DNA template. In some embodiments, the more than one temperature program comprises a first temperature between 10-20°C, a second temperature between 20-30°C, and a third temperature between 30-50°C. In some embodiments, step (b2) comprises annealing at the first temperature for 3-60 seconds, at the second temperature for 3-50 seconds, and at the third temperature for 3-50 seconds. In some embodiments, the temperature program in step (b3) comprises extending at a temperature between 60-80°C for 10 seconds to 15 minutes. In some embodiments, the first number of cycles in step (b4) is 2-40.
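The temperature and time windows of the first cycling program lend themselves to a small validation table. The sketch below is an assumption-laden illustration: the `Step` class, the step names, and the `violations` helper are hypothetical, and the example settings are arbitrary values picked inside the stated windows, not conditions taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: float

# Allowed windows from the text: (min °C, max °C, min s, max s).
FIRST_PROGRAM_LIMITS = {
    "initial_denaturation": (90, 95, 60, 1200),  # (b1), first cycle: 1-20 min
    "melting": (90, 95, 3, 50),                  # (b1), later cycles
    "anneal_1": (10, 20, 3, 60),                 # (b2), first temperature
    "anneal_2": (20, 30, 3, 50),                 # (b2), second temperature
    "anneal_3": (30, 50, 3, 50),                 # (b2), third temperature
    "extension": (60, 80, 10, 900),              # (b3): 10 s to 15 min
}

def violations(steps: list[Step], cycles: int) -> list[str]:
    """Return one message for every setting outside the stated windows."""
    problems = []
    if not 2 <= cycles <= 40:
        problems.append(f"cycles={cycles} outside 2-40")
    for step in steps:
        t_min, t_max, s_min, s_max = FIRST_PROGRAM_LIMITS[step.name]
        if not t_min <= step.temp_c <= t_max:
            problems.append(f"{step.name}: {step.temp_c} °C outside {t_min}-{t_max} °C")
        if not s_min <= step.seconds <= s_max:
            problems.append(f"{step.name}: {step.seconds} s outside {s_min}-{s_max} s")
    return problems

# Illustrative values only, chosen inside the windows; not taken from the patent.
example = [
    Step("initial_denaturation", 94, 180),
    Step("anneal_1", 15, 40),
    Step("anneal_2", 25, 40),
    Step("anneal_3", 40, 40),
    Step("extension", 70, 120),
]
```

Calling `violations(example, cycles=8)` returns an empty list, while a setting such as a 25°C first annealing temperature would be reported.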
In some embodiments, step (d) allows the universal sequence of the second primer to pair with the 3' end of the genomic pre-amplification product and amplify the genomic pre-amplification product to obtain an expanded genomic amplification product.
In some embodiments, step (d) comprises: (d1) a temperature program capable of opening the DNA double strand; (d2) a temperature program capable of further opening the DNA double strand; (d3) a temperature program capable of allowing the second primer to bind to the single strand of the genomic pre-amplification product obtained in step (b); (d4) a temperature program capable of extending, under the action of the nucleic acid polymerase, the second primer bound to the single strand of the genomic pre-amplification product; and (d5) repeating steps (d2) to (d4) for a specified second number of cycles, wherein the specified second number of cycles is greater than 1.
In some embodiments, the DNA double strand in step (d1) is the genomic pre-amplification product and includes the double strand contained in a DNA hairpin structure, and the temperature program comprises denaturing at a temperature between 90-95°C for 5 seconds to 20 minutes.
In some embodiments, the temperature program in step (d2) comprises melting at a temperature between 90-95°C for 3-50 seconds. In some embodiments, the temperature program in step (d3) comprises annealing at a temperature between 45-65°C for 3-50 seconds. In some embodiments, the temperature program in step (d4) comprises extending at a temperature between 60-80°C for 10 seconds to 15 minutes.
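Since step (d1) runs once and steps (d2) to (d4) repeat, the total hold time of the second program follows from simple arithmetic. The function below is a hypothetical helper that ignores ramp times between temperatures; the numbers in the usage note are illustrative and not taken from the patent.

```python
def second_program_hold_seconds(denature_s: float, melt_s: float,
                                anneal_s: float, extend_s: float,
                                cycles: int) -> float:
    """Total hold time: (d1) once, then (d2)-(d4) repeated `cycles` times."""
    if cycles <= 1:
        raise ValueError("the second number of cycles must be greater than 1")
    return denature_s + cycles * (melt_s + anneal_s + extend_s)
```

For example, a 30-second denaturation followed by 20 cycles of 20 s melting, 30 s annealing, and 120 s extension gives 30 + 20 × 170 = 3,430 seconds of hold time.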
In some embodiments, the method further comprises analyzing the amplification product to identify sequence features associated with a disease or phenotype. In some embodiments, the sequence features associated with a disease or phenotype include chromosome-level abnormalities, chromosomal translocations, aneuploidy, deletion or duplication of part or all of a chromosome, fetal HLA haplotypes, and paternally derived mutations, or the disease or phenotype is selected from the group consisting of: β-thalassemia, Down syndrome, cystic fibrosis, sickle cell disease, Tay-Sachs disease, fragile X syndrome, spinal muscular atrophy, hemoglobinopathy, α-thalassemia, X-linked diseases (diseases governed by genes on the X chromosome), spina bifida, anencephaly, congenital heart disease, obesity, diabetes, cancer, fetal sex, and fetal RHD. In some embodiments, the genomic DNA is derived from blastomeres, blastocyst trophoblast, cultured cells, extracted gDNA, or blastocyst culture medium.
One aspect of the present application provides a method for amplifying genomic DNA, the method comprising:
(a) providing a first reaction mixture comprising a sample containing the genomic DNA, a first primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the first primer comprises, from the 5' end to the 3' end, a universal sequence and a first variable sequence; the first variable sequence comprises a first random sequence, which from the 5' end to the 3' end is Xa1Xa2...Xan; every Xai (i = 1 to n) of the first random sequence belongs to the same set, the set being selected from B, D, H, or V, wherein B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}; Xai denotes the i-th nucleotide from the 5' end of the first random sequence; n is a positive integer selected from 3 to 20; and the universal sequence and the first variable sequence are directly connected, or are connected through a first spacer sequence Ya1...Yam, wherein Yaj (j = 1 to m) ∈ {A, T, G, C} and Yaj denotes the j-th nucleotide from the 5' end of the spacer sequence; optionally, the first reaction mixture further comprises a third primer, wherein the third primer comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence; the third variable sequence comprises a third random sequence, which from the 5' end to the 3' end is Xb1Xb2...Xbn; every Xbi (i = 1 to n) of the third random sequence belongs to the same set, the set being selected from B, D, H, or V as defined above; Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets; Xbi denotes the i-th nucleotide from the 5' end of the third random sequence; n is a positive integer selected from 3 to 20; and the universal sequence and the third variable sequence are directly connected, or are connected through a third spacer sequence Yb1...Ybm, wherein Ybj (j = 1 to m) ∈ {A, T, G, C}, Ybj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1 to 3;
(b) subjecting the first reaction mixture to a first temperature cycling program so that the first variable sequence of the first primer, and the third variable sequence of the optional third primer, can pair with the genomic DNA and amplify it to obtain a genomic pre-amplification product whose 5' end comprises the universal sequence and whose 3' end comprises the complementary sequence of the universal sequence, wherein the first temperature cycling program comprises: (b1) in the first cycle, reacting at a first denaturation temperature between 90-95°C for 1-20 minutes and, in cycles after the first, reacting at a second melting temperature between 90-95°C for 3-50 seconds; (b2) reacting at a first annealing temperature between 10-20°C for 3-60 seconds, at a second annealing temperature between 20-30°C for 3-50 seconds, and at a third annealing temperature between 30-50°C for 3-50 seconds; (b3) reacting at a first extension temperature between 60-80°C for 10 seconds to 15 minutes; and (b4) repeating steps (b1) to (b3) for 2-40 cycles;
(c) providing a second reaction mixture comprising the genomic pre-amplification product obtained in step (b), a second primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the second primer comprises or consists of, from the 5' end to the 3' end, a specific sequence and the universal sequence;
(d) subjecting the second reaction mixture to a second temperature cycling program so that the universal sequence of the second primer can pair with the 3' end of the genomic pre-amplification product and amplify the genomic pre-amplification product to obtain an expanded genomic amplification product, wherein the second temperature cycling program comprises: (d1) reacting at a second denaturation temperature between 90-95°C for 5 seconds to 20 minutes; (d2) reacting at a second melting temperature between 90-95°C for 3-50 seconds; (d3) reacting at a fourth annealing temperature between 45-65°C for 3-50 seconds; (d4) reacting at a second extension temperature between 60-80°C for 10 seconds to 15 minutes; and (d5) repeating steps (d2) to (d4) for 2-40 cycles.
In some embodiments, the universal sequence comprises or consists of SEQ ID NO: 6, every Xai (i = 1 to n) of the first random sequence belongs to D, and every Xbi (i = 1 to n) of the third random sequence belongs to B.
In some embodiments, the amplification product obtained in step (d) is already a completed library.
In yet another aspect of the present application, a kit for amplifying genomic DNA is provided, the kit comprising a first primer, wherein the first primer comprises, from the 5' end to the 3' end, a universal sequence and a first variable sequence; the first variable sequence comprises a first random sequence, which from the 5' end to the 3' end is Xa1Xa2...Xan; every Xai (i = 1 to n) of the first random sequence belongs to the same set, the set being selected from B, D, H, or V, wherein B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}; Xai denotes the i-th nucleotide from the 5' end of the first random sequence; n is a positive integer selected from 3 to 20; and the universal sequence and the first variable sequence are directly connected, or are connected through a first spacer sequence Ya1...Yam, wherein Yaj (j = 1 to m) ∈ {A, T, G, C}, Yaj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1 to 3; optionally, the first reaction mixture further comprises a third primer, wherein the third primer comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence; the third variable sequence comprises a third random sequence, which from the 5' end to the 3' end is Xb1Xb2...Xbn; every Xbi (i = 1 to n) of the third random sequence belongs to the same set, the set being selected from B, D, H, or V as defined above; Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets; Xbi denotes the i-th nucleotide from the 5' end of the third random sequence; n is a positive integer selected from 3 to 20; and the universal sequence and the third variable sequence are directly connected, or are connected through a third spacer sequence Yb1...Ybm, wherein Ybj (j = 1 to m) ∈ {A, T, G, C}, Ybj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1 to 3.
In some embodiments, the universal sequence comprises or consists of SEQ ID NO: 6; every Xai (i = 1 to n) of the first random sequence belongs to D, and every Xbi (i = 1 to n) of the third random sequence belongs to B. In some embodiments, the universal sequence comprises or consists of SEQ ID NO: 1; every Xai (i = 1 to n) of the first random sequence belongs to D, and every Xbi (i = 1 to n) of the third random sequence belongs to B. In some embodiments, the universal sequence comprises or consists of SEQ ID NO: 2; every Xai (i = 1 to n) of the first random sequence belongs to D, and every Xbi (i = 1 to n) of the third random sequence belongs to B.
In some embodiments, the kit is used to construct a whole-genome DNA library.
In some embodiments, the kit further comprises a nucleic acid polymerase, wherein the nucleic acid polymerase is selected from: Phi29 DNA polymerase, Bst DNA polymerase, Pyrophage 3137, Vent polymerase, TOPOTaq DNA polymerase, 9°Nm polymerase, Klenow Fragment DNA polymerase I, MMLV reverse transcriptase, AMV reverse transcriptase, HIV reverse transcriptase, T7 phage DNA polymerase variant, super-fidelity DNA polymerase, Taq polymerase, Bst DNA polymerase, E. coli DNA polymerase, LongAmp Taq DNA polymerase, OneTaq DNA polymerase, Deep Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent (exo-) DNA polymerase, and any combination thereof.
In some embodiments, the kit further comprises one or more reagents, each comprising one or more components selected from the group consisting of: a nucleotide monomer mixture, Mg²⁺, DTT, bovine serum albumin, a pH regulator, a DNase inhibitor, RNase, SO₄²⁻, Cl⁻, K⁺, Ca²⁺, Na⁺, and NH₄⁺.
In some embodiments, the mixture further comprises a cell lysis agent selected from one or more of: proteinase K, pepsin, papain, NP-40, Tween, SDS, Triton X-100, EDTA, and guanidine isothiocyanate.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other features of the present application are described more fully in the following description and the appended claims, taken together with the accompanying drawings. These drawings depict only several embodiments of the present application and should therefore not be construed as limiting its scope. With the aid of the drawings, the present application is explained more clearly and in greater detail.
Figure 1 shows the basic principle of the amplification method of the present application.
Figure 2 shows a schematic structural diagram of the first type of primers (linear amplification primers) used in the amplification method of the present application.
Figure 3 shows the gel electrophoresis results for the amplification products obtained by amplifying 50 pg of human genomic DNA with different first-type primer mixtures. From left to right, lane 1 is a molecular weight marker (M), lanes 2-13 are the amplified samples obtained by amplifying gDNA with the primer mixtures of experimental groups 1-12 (see Table 1 for details), and lane 14 is a molecular weight marker.
Figure 4 shows the distribution of A, T, C, and G at each read position in SBS sequencing of the amplification products obtained in experimental groups 1-12.
Figure 5 shows the amplification results obtained with the primer mixtures of experimental groups 1-12 shown in Table 1, using normal human epidermal fibroblasts (AFP cells) as the starting sample. From left to right, lane 1 is a molecular weight marker, lanes 2-11 are single-cell amplification samples, and lane 12 is a molecular weight marker.
Figure 6 shows the gel electrophoresis results for the amplification products obtained with the primer mixtures of experimental groups 9/10 and 11/12 shown in Table 1, using normal human epidermal fibroblasts (AFP cells) as the starting sample. From left to right, lane 1 is a molecular weight marker, lanes 2-11 are the amplified samples obtained by amplifying single cells with the primer mixture of experimental group 11/12, lane 12 is a molecular weight marker, lanes 13-22 are the amplified samples obtained by amplifying single cells with the primer mixture of experimental group 9/10, and lane 23 is a molecular weight marker.
Figure 7 shows the data volume obtained in SBS sequencing for each of the samples 1_1, 1_2 ... 1_10 and 2_1, 2_2 ... 2_10 in Figure 6 (equal volumes of amplification product were sequenced).
Figure 8 shows the coefficient of variation of copy number in SBS sequencing for each of the samples 1_1, 1_2 ... 1_10 and 2_1, 2_2 ... 2_10 in Figure 6.
Figures 9A-9D show the copy number of each chromosome in SBS sequencing for each of the samples 1_1, 1_2 ... 1_10 and 2_1, 2_2 ... 2_10 in Figure 6.
Figure 10 shows the gel electrophoresis results obtained when the amplified samples 1_1, 1_2 and 2_1, 2_2 in Figure 6 were each further subjected to PCR amplification of the 35 pathogenic loci listed in Table 8. From left to right, the lanes are: a molecular weight marker, the amplification results for pathogenic loci 1-23 of Table 8, a molecular weight marker, the amplification results for pathogenic loci 24-35 of Table 8, and a molecular weight marker.
Figure 11 shows the gel electrophoresis results for the amplification products obtained with the primer mixture of experimental group 9/10 shown in Table 1, using normal human epidermal fibroblasts (AFP cells) as the starting sample. From left to right, the lanes are: a molecular weight marker, the amplified samples obtained by amplifying single cells with the primer mixture of experimental group 9/10 (4 parallel wells), and a molecular weight marker.
Figure 12 shows the gel electrophoresis results obtained when the amplified samples 1 and 2 in Figure 11 were each further subjected to PCR amplification of the 35 pathogenic loci listed in Table 8. From left to right, the lanes are: a molecular weight marker, the amplification results for pathogenic loci 1-23 of Table 8, a molecular weight marker, the amplification results for pathogenic loci 24-35 of Table 8, and a molecular weight marker.
Figure 13 shows the copy number of each chromosome obtained by semiconductor sequencing of the amplified samples in Figure 11.
Figure 14 shows the chromosome copy numbers obtained by amplifying DNA from blastocyst culture medium, used as the starting sample, with the primer mixture of experimental group 9/10 shown in Table 1 and subjecting the amplified sample to SBS sequencing.
DETAILED DESCRIPTION
The present invention provides a method for amplifying genomic DNA, in particular a method for amplifying the whole-genome DNA of a single cell.
Before the present invention, library construction was typically performed only after genome amplification was complete, and sequencing followed library construction; this workflow is complex and time-consuming. By designing primers with a special structure and optimizing the amplification process, the inventors of the present application have made it possible to obtain a library directly from single-cell amplification, greatly reducing the time needed to construct a single-cell whole-genome DNA library. Although certain primer designs have been reported in the literature, each has its shortcomings. For example, in the single-cell whole-genome pre-amplification step of WO2012/166425, the random sequence of the primers is drawn from all four bases (A, T, C, and G); when such primers are used for direct amplification and library construction, they inevitably form loops or dimers, either within a single primer or between primers, which significantly reduces amplification efficiency. As another example, US 8,206,913 reports primers whose random sequence is drawn from only two bases (G and T, G and A, A and C, or C and T) to avoid intra- or inter-primer looping. However, the bases preceding the target sequence in products amplified with such primers have very poor randomness, so a positive control must be added to correct for base randomness when a whole plate is loaded for SBS sequencing; otherwise detection fails. This approach therefore inevitably wastes part of the sequencing data. In contrast to this prior art, the primers of the present invention retain high base randomness, yet form essentially no loops or dimers, within a primer or between primers, or far fewer than four-base random primers do. In addition, the libraries constructed according to the present invention have high base randomness in front of the target sequence. The amplification products obtained by the method of the present invention therefore contain few dimers, can serve directly as a library, can be loaded as a whole plate, and give good sequencing results.
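The trade-off described above can be illustrated with a small enumeration. The following Python sketch is not part of the application; it assumes a deliberately simplified model in which only perfect, full-length Watson-Crick pairing between two random sequences of equal length counts as a potential dimer, ignoring partial overlaps and G-T wobble pairs. The function name `fraction_with_partner` and the choice n = 5 are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def fraction_with_partner(alphabet, n):
    """Fraction of all n-mers over `alphabet` whose perfect reverse
    complement can also be written with the same alphabet, i.e. could
    occur in the same random-primer pool (simplified model)."""
    allowed = set(alphabet)
    total = 0
    paired = 0
    for bases in product(sorted(allowed), repeat=n):
        total += 1
        if set(reverse_complement("".join(bases))) <= allowed:
            paired += 1
    return Fraction(paired, total)


if __name__ == "__main__":
    print(fraction_with_partner("ATGC", 5))  # four-base random sequence: 1
    print(fraction_with_partner("ATG", 5))   # set D = {A, T, G}: 32/243
    print(fraction_with_partner("GT", 5))    # two-base G/T sequence: 0
```

Under this model, every four-base random sequence has a perfectly complementary partner in the pool and no two-base (G/T) sequence does, while set D keeps a partner only for sequences made solely of A and T and still offers three possible bases at each position.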
In one aspect, the present application provides a method for amplifying genomic DNA, the method comprising:
(a) providing a first reaction mixture, wherein the first reaction mixture comprises a sample containing the genomic DNA, a first primer, a nucleotide monomer mixture, and a nucleic acid polymerase; wherein the first primer comprises, from the 5' end to the 3' end, a universal sequence and a first variable sequence, and the first variable sequence comprises a first random sequence; wherein the first random sequence is, from the 5' end to the 3' end, Xa1Xa2...Xan, every Xai (i = 1 to n) belongs to the same set, the set is selected from B, D, H, and V, where B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}, Xai denotes the i-th nucleotide from the 5' end of the first random sequence, and n is a positive integer selected from 3 to 20; optionally, the first reaction mixture further comprises a third primer, wherein the third primer comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence, and the third variable sequence comprises a third random sequence; wherein the third random sequence is, from the 5' end to the 3' end, Xb1Xb2...Xbn, every Xbi (i = 1 to n) belongs to the same set, the set is selected from B, D, H, and V as defined above, Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets, Xbi denotes the i-th nucleotide from the 5' end of the third random sequence, and n is a positive integer selected from 3 to 20;
(b) subjecting the first reaction mixture to a first temperature cycle program for pre-amplification to obtain a pre-amplification product;
(c) providing a second reaction mixture, wherein the second reaction mixture comprises the pre-amplification product obtained in step (b), a second primer, a nucleotide monomer mixture, and a nucleic acid polymerase, and wherein the second primer comprises or consists of, from the 5' end to the 3' end, a specific sequence and the universal sequence;
(d) subjecting the second reaction mixture to a second temperature cycle program for amplification to obtain an amplification product.
A schematic of one embodiment of the method provided herein is shown in Figure 1.
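The structure of the first and third primers defined in step (a) can be sketched in code. The following Python example is only an illustration under stated assumptions: the universal sequence `HYPOTHETICAL_UNIVERSAL` is an invented placeholder (it is not SEQ ID NO: 1, 2, or 6), the function name `make_first_type_primer` is hypothetical, and the variable sequence is simplified to consist of the random sequence alone.

```python
import random

# Base sets as defined in the text.
BASE_SETS = {"B": "TGC", "D": "ATG", "H": "TAC", "V": "ACG"}

# Placeholder only; NOT one of the SEQ ID NOs of the application.
HYPOTHETICAL_UNIVERSAL = "GATTGATGGATTGA"


def make_first_type_primer(universal, set_name, n, spacer="", rng=None):
    """Build one 5'->3' primer: universal + optional spacer + random part.

    The random part has n bases (3 <= n <= 20), all drawn from one set
    (B, D, H, or V). The spacer is either empty (direct linkage) or
    1-3 bases drawn from A, T, G, C.
    """
    if set_name not in BASE_SETS:
        raise ValueError("set_name must be one of B, D, H, V")
    if not 3 <= n <= 20:
        raise ValueError("n must be between 3 and 20")
    if len(spacer) > 3 or set(spacer) - set("ATGC"):
        raise ValueError("spacer must be 0-3 bases from A, T, G, C")
    rng = rng or random.Random()
    random_part = "".join(rng.choice(BASE_SETS[set_name]) for _ in range(n))
    return universal + spacer + random_part


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    # First primer from set D, third primer from a different set (B).
    first = make_first_type_primer(HYPOTHETICAL_UNIVERSAL, "D", 5, rng=rng)
    third = make_first_type_primer(HYPOTHETICAL_UNIVERSAL, "B", 5, rng=rng)
    print(first)
    print(third)
```

Drawing the first and third random sequences from different sets, such as D and B in the embodiments above, mirrors the requirement that Xai and Xbi belong to different sets.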
Step (a): Providing a first reaction mixture
The method of the present application is broadly applicable to the amplification of genomic DNA, especially trace amounts of genomic DNA.
i. Genomic DNA
The method of the present application is preferably applied to genomic DNA. In certain embodiments, the starting amount of genomic DNA in the reaction mixture is no more than 10 ng, no more than 5 ng, no more than 1 ng, no more than 500 pg, no more than 200 pg, no more than 100 pg, no more than 50 pg, no more than 20 pg, or no more than 10 pg.
Genomic DNA can come from a biological sample, such as a biological tissue or a body fluid containing cells or cell-free DNA. Samples containing genomic DNA can be obtained by known methods, for example from oral mucosa samples, nasal samples, hair, mouthwash, umbilical cord blood, plasma, amniotic fluid, embryonic tissue, endothelial cells, nail samples, hoof samples, and the like. The biological sample can be provided in any suitable form, for example paraffin-embedded or freshly isolated. The genomic DNA can come from any species or type of organism, including but not limited to humans, mammals, cattle, pigs, sheep, horses, rodents, birds, fish, zebrafish, shrimp, plants, yeast, viruses, and bacteria.
In certain embodiments, the genomic DNA is from a single cell, or from two or more cells of the same type. The single cell or cells of the same type can come from, for example, a pre-implantation embryo, embryonic cells in the peripheral blood of a pregnant woman, a single sperm, an egg cell, a fertilized egg, a cancer cell, a bacterial cell, a circulating tumor cell, a tumor tissue cell, or one or more cells of the same type obtained from any tissue. The method of the present application can be used to amplify DNA from precious samples or samples with a low starting amount, such as human egg cells, germ cells, circulating tumor cells, and tumor tissue cells.
In some embodiments, the genomic DNA is derived from blastomeres, blastocyst trophoblast, cultured cells, extracted gDNA, or blastocyst culture medium.
Methods for obtaining single cells are also well known in the art, for example flow cytometric sorting (Herzenberg et al., Proc Natl Acad Sci USA 76:1453-55, 1979; Iverson et al., Prenatal Diagnosis 1:61-73, 1981; Bianchi et al., Prenatal Diagnosis 11:523-28, 1991), fluorescence-activated cell sorting, magnetic bead separation (MACS; Ganshirt-Ahlert et al., Am J Obstet Gynecol 166:1350, 1992), the use of a semi-automated cell picker (e.g., the Quixell™ cell transfer system manufactured by Stoelting), or a combination of these methods. In some embodiments, gradient centrifugation and flow cytometry can be used to improve the efficiency of separation and sorting. In some embodiments, cells of a particular type, such as cells expressing a specific biomarker, can be selected on the basis of the differing properties of individual cells.
Methods for obtaining genomic DNA are also well known in the art. In certain embodiments, genomic DNA can be released by lysing the cells of a biological sample or a single cell. Lysis can be performed by any suitable method known in the art, for example thermal lysis, alkaline lysis, enzymatic lysis, mechanical lysis, or any combination thereof (see, e.g., U.S. 7,521,246; Thermo Scientific Pierce Cell Lysis Technical Handbook v2; and Current Protocols in Molecular Biology (1995), John Wiley & Sons, Inc. (supplement 29), pp. 9.7.1-9.7.2).
Mechanical lysis covers methods that disrupt cells by mechanical force, including sonication, high-speed stirring, homogenization, pressurization (e.g., a French press), decompression, and grinding. The most commonly used mechanical method is liquid homogenization, which forces a cell suspension through a very narrow space and thereby applies shear force to the cell membrane (e.g., as described in WO2013153176A1).
In certain embodiments, a mild lysis method can be used. For example, cells can be heated at 72°C for 2 minutes in a solution containing Tween-20; heated at 65°C for 10 minutes in water (Esumi et al., Neurosci Res 60(4):439-51 (2008)); heated at 70°C for 90 seconds in PCR buffer II (Applied Biosystems) containing 0.5% NP-40 (Kurimoto et al., Nucleic Acids Res 34(5):e42 (2006)); or lysed with a protease (e.g., proteinase K) or a chaotropic salt (e.g., guanidine isothiocyanate) (e.g., as described in U.S. patent application US 20070281313).
Thermal lysis includes heating and repeated freeze-thaw cycles. In some embodiments, the thermal lysis is carried out at a temperature of 20-100°C for 10-100 minutes. In some embodiments, the thermal lysis temperature can be any temperature within 20-90, 30-90, 40-90, 50-90, 60-90, 70-90, 80-90, 30-80, 40-80, 50-80, 60-80, or 70-80°C. In some embodiments, the thermal lysis temperature is not lower than 20, 30, 40, or 50°C. In some embodiments, the thermal lysis temperature is not higher than 100, 90, or 80°C. In some embodiments, the thermal lysis time can be any time within 20-100, 20-90, 20-80, 20-70, 20-60, 20-50, 20-40, 20-30, 30-100, 30-90, 30-80, 30-70, 30-60, 30-50, or 30-40 minutes. In some embodiments, the thermal lysis time is not less than 20, 30, 40, 50, 60, 70, 80, or 90 minutes. In some embodiments, the thermal lysis time is not more than 90, 80, 70, 60, 50, 40, 30, or 20 minutes. In some embodiments, the thermal lysis temperature varies over time. In some embodiments, the thermal lysis consists of holding at 30-60°C for 10-30 minutes and then at 70-90°C for 5-20 minutes.
In some embodiments, the thermal lysis is carried out in the presence of a lysis reagent. A lysis reagent can shorten the time or lower the temperature required for lysis. Lysis reagents disrupt protein-protein, lipid-lipid, and/or protein-lipid interactions, thereby promoting the release of genomic DNA from the cell.
In some embodiments, the lysis reagent comprises a surfactant and/or a lytic enzyme. Surfactants can be classified as ionic, amphoteric, or non-ionic. In general, amphoteric and non-ionic surfactants lyse cells less effectively than ionic surfactants. Exemplary surfactants include, but are not limited to, one or more of NP-40, Tween, SDS, CHAPS, Triton X-100, Triton X-114, EDTA, sodium deoxycholate, sodium cholate, and guanidine isothiocyanate. Those skilled in the art can choose the type and concentration of surfactant according to actual needs. In some embodiments, the working concentration of the surfactant is 0.01%-5%, 0.1%-3%, 0.3%-2%, or 0.5%-1%.
Exemplary lytic enzymes include proteinase K, pepsin, papain, and the like, or any combination thereof. In some embodiments, the working concentration of the lytic enzyme is 0.01%-1%, 0.02%-0.5%, 0.03%-0.2%, or 0.4-0.1%.
In the method provided herein, a lysate containing genomic DNA can be used directly in the first reaction mixture. For example, the biological sample can be lysed in advance to obtain a lysate, which is then mixed with the other components of the first reaction mixture. If necessary, the lysate can be processed further to isolate the genomic DNA, and the isolated genomic DNA is then mixed with the other components to give the first reaction mixture.
In some embodiments, the nucleic acid sample obtained by lysis is amplified without purification. In some embodiments, the nucleic acid sample obtained by lysis is purified before amplification. In some embodiments, the DNA has already been fragmented to varying degrees during lysis and can be amplified without a dedicated fragmentation step. In some embodiments, the nucleic acid sample obtained by lysis is fragmented before amplification.
The present application also provides a simpler approach: cells containing genomic DNA are mixed directly with the other components required for amplification to give the first reaction mixture, so that the genomic DNA in the first reaction mixture is still inside the cells. In this case, the first reaction mixture can further contain a surfactant capable of lysing the cells (for example, but not limited to, one or more of NP-40, Tween, SDS, Triton X-100, EDTA, and guanidine isothiocyanate) and/or a lytic enzyme (for example, one or more of proteinase K, pepsin, and papain). Cell lysis and pre-amplification of the genomic DNA then take place in the same reaction mixture, which improves reaction efficiency and shortens reaction time.
In certain embodiments, the method provided herein can further comprise, after step (a) and before step (b), subjecting the reaction mixture to a lysis temperature cycle program so that the cells are lysed and release the genomic DNA. Those skilled in the art can choose an appropriate lysis temperature cycle program according to the lysis components in the reaction mixture, the cell type, and so on. An exemplary lysis temperature cycle program holds the reaction mixture at 50°C for 3 minutes to 8 hours (e.g., any time within 3 minutes to 7 hours, 3 minutes to 6 hours, 3 minutes to 5 hours, 3 minutes to 4 hours, 3 minutes to 3 hours, 3 minutes to 2 hours, 3 minutes to 1 hour, 3 minutes to 40 minutes, or 3 minutes to 20 minutes, such as 10, 20, or 30 minutes) and then at 80°C for 2 minutes to 8 hours (e.g., any time within 2 minutes to 7 hours, 2 minutes to 6 hours, 2 minutes to 5 hours, 2 minutes to 4 hours, 2 minutes to 3 hours, 2 minutes to 2 hours, 2 minutes to 1 hour, 2 minutes to 40 minutes, or 2 minutes to 20 minutes, such as 10, 20, or 30 minutes). The lysis temperature program can be run for one cycle or, if needed, for two or more cycles, depending on the specific lysis conditions.
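The exemplary lysis temperature cycle program can be written down as a small data structure. The following Python sketch is an illustration only; the names `Hold` and `lysis_program` are hypothetical, and the range checks simply restate the exemplary limits given above (3 minutes to 8 hours at 50°C, 2 minutes to 8 hours at 80°C, one or more cycles).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hold:
    """One isothermal hold of a temperature program."""
    temp_c: float
    minutes: float


def lysis_program(minutes_at_50, minutes_at_80, cycles=1):
    """Return the exemplary 50 C -> 80 C lysis program as a list of holds."""
    if not 3 <= minutes_at_50 <= 8 * 60:
        raise ValueError("50 C hold must last 3 minutes to 8 hours")
    if not 2 <= minutes_at_80 <= 8 * 60:
        raise ValueError("80 C hold must last 2 minutes to 8 hours")
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    return [Hold(50, minutes_at_50), Hold(80, minutes_at_80)] * cycles


def total_minutes(program):
    """Total hold time of a program, ignoring ramp times."""
    return sum(hold.minutes for hold in program)


if __name__ == "__main__":
    program = lysis_program(minutes_at_50=20, minutes_at_80=10)
    for hold in program:
        print(f"{hold.temp_c} C for {hold.minutes} min")
    print("total:", total_minutes(program), "min")
```

Keeping the program as plain data makes it easy to compare candidate hold times before transferring them to a thermal cycler; ramp rates and instrument-specific settings are deliberately left out.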
ii. First type of primers
The method described in the present application involves two different types of primers. Primers of the first type comprise, from the 5' end to the 3' end, a universal sequence and a variable sequence; primers of the second type comprise a specific sequence and the universal sequence but no variable sequence. The "first primer" and the "third primer" described herein both belong to the first type. The first primer included in the first reaction mixture comprises, from the 5' end to the 3' end, the universal sequence and a first variable sequence; the third primer optionally included in the first reaction mixture comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence. In some embodiments, a primer of the first type consists of the universal sequence and a variable sequence. In other embodiments, a primer of the first type consists of the universal sequence, a variable sequence, and a spacer sequence.
Universal sequence
In the present application, the universal sequence is the nucleotide sequence that primers of both the first type and the second type carry at the 5' end. The universal sequence can be, for example, 6-60, 8-50, 9-40, 10-30, 10-15, or 25-30 bases long. An appropriate universal sequence is chosen so that it essentially does not bind to genomic DNA and give rise to amplification, and so that the following are avoided: polymerization between primers of the first type (for example, between first primers, between third primers, or between a first primer and a third primer); looping of a primer of the first type on itself (for example, part of the 5' end of the first primer being complementary to part of its 3' end so that the first primer forms a hairpin, or likewise for the third primer); and polymerization or looping between primers of the first type and primers of the second type.
In certain embodiments, the universal sequence comprises all four bases A, T, C, and G. In certain embodiments, the universal sequence comprises only three or two types of bases with weak self-complementary pairing ability and lacks the remaining one or two types. In certain embodiments, the universal sequence consists of the three bases G, A, and T, i.e., it contains no C. In certain embodiments, the universal sequence consists of the three bases C, A, and T, i.e., it contains no G. In certain embodiments, the universal sequence consists of two bases, namely A and T, A and C, A and G, T and C, or T and G, i.e., it does not contain G and C at the same time. Without wishing to be bound by theory, it is believed that C or G bases in the universal sequence may cause primers to polymerize with one another and form multimers, thereby weakening the ability to amplify genomic DNA. Preferably, the universal sequence contains no sequence capable of self-pairing, no sequence that would cause primer-to-primer pairing, and no run of several consecutive identical bases.
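The composition rules above can be checked mechanically. The following is a minimal sketch and not part of the described method: the function names and the 4-base window `k` are illustrative assumptions, and a real primer design would also consider melting temperature and pairing against the other primers in the mixture.

```python
# Illustrative screen for a candidate universal sequence (assumed helper names).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def longest_run(seq):
    """Length of the longest stretch of one repeated base."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def has_self_pairing(seq, k=4):
    """True if the reverse complement of some k-mer also occurs in seq.

    Such a k-mer could pair with another copy of the primer or with a
    different part of the same primer. k=4 is an assumed threshold.
    """
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return any(reverse_complement(kmer) in kmers for kmer in kmers)


def screen_universal(seq, k=4):
    """Summarize the properties the text asks a universal sequence to have."""
    return {
        "bases_used": "".join(sorted(set(seq))),
        "contains_both_G_and_C": "G" in seq and "C" in seq,
        "longest_run": longest_run(seq),
        "self_pairing_kmer": has_self_pairing(seq, k),
    }


if __name__ == "__main__":
    # SEQ ID NO:1 and SEQ ID NO:6 as listed in this document.
    for name, seq in [("SEQ ID NO:1", "TTGGTAGTGAGTG"),
                      ("SEQ ID NO:6", "GCTCTTCCGATCT")]:
        print(name, screen_universal(seq))
```

A three-base sequence with no C, such as SEQ ID NO:1, cannot contain a k-mer together with its reverse complement whenever that k-mer includes a G, which is one way to read the preference for universal sequences that do not contain G and C at the same time.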
In certain embodiments, the base sequence of the universal sequence and the proportion of each base in it can be chosen so that the universal sequence itself does not base-pair with the genomic DNA template or give rise to amplification.
In certain embodiments, the universal sequence can be selected so that the amplification product can be sequenced directly. Without wishing to be bound by theory, the universal sequence can be designed to include a sequence that is partially or fully identical, or partially or fully complementary, to a sequencing primer. In certain embodiments, the universal sequence is selected specifically for a given sequencing platform. In certain embodiments, the universal sequence is selected for a second-generation or third-generation sequencing platform. In certain embodiments, the universal sequence is selected for Illumina's NGS sequencing platform. In certain embodiments, the universal sequence is selected for the Ion Torrent sequencing platform.
In certain embodiments, the universal sequence is selected from the group consisting of SEQ ID NO:1 [TTGGTAGTGAGTG], SEQ ID NO:2 [GAGGTGTGATGGA], SEQ ID NO:3 [GTGATGGTTGAGGTA], SEQ ID NO:4 [AGATGTGTATAAGAGACAG], SEQ ID NO:5 [GTGAGTGATGGTTGAGGTAGTGTGGAG], and SEQ ID NO:6 [GCTCTTCCGATCT].
Variable sequence
A first-category primer comprises, from the 5' end to the 3' end, a universal sequence and a variable sequence (for example, the first primer and the third primer comprise the first and the third variable sequence, respectively). The universal sequence is the same in all first-category primers, but the variable sequences may differ. For example, in some embodiments, the first primer and the third primer are each a mixture of primers that share the same universal sequence and have different variable sequences. In this application, a variable sequence refers to a stretch of bases whose sequence is not fixed; it may comprise a random sequence (for example, the first and the third variable sequence comprise the first and the third random sequence, respectively). In some embodiments, the variable sequence consists of a random sequence. In other embodiments, the variable sequence consists of a random sequence and a fixed sequence.
a) Random sequence
A random sequence means that the base at each position of the sequence is selected independently and at random from a specified set; such a random sequence therefore represents a collection of base sequences made up of different base combinations.
Specifically, for example, the first variable sequence may include a first random sequence of n bases, where n is a positive integer selected from 3-20. From the 5' end to the 3' end, the first random sequence can be written Xa1Xa2...Xan, where Xai denotes the base at position i (i.e., the i-th nucleotide from the 5' end of the first random sequence, i = 1 to n) and each Xai is selected at random from a specified set, for example a set consisting of a particular two or three of the nucleotides A, T, G, and C. The set available at any such position is usually written with a degenerate code. For example, the set containing only the two nucleotides A and G is written R (i.e., R = {A, G}). Other sets that can be written with degenerate codes include: Y = {C, T}, M = {A, C}, K = {G, T}, S = {C, G}, W = {A, T}, H = {A, C, T}, B = {C, G, T}, V = {A, C, G}, D = {A, G, T}, and N = {A, C, G, T}.
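The degenerate codes listed above map directly onto a lookup table. The sketch below is illustrative only (the name `matches` is an assumption, not terminology from this application); it tests whether a concrete base sequence is one of the sequences represented by a degenerate pattern.

```python
# Degenerate codes as listed in the text; plain bases map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}


def matches(pattern, seq):
    """True if seq is one of the sequences described by the degenerate pattern."""
    if len(pattern) != len(seq):
        return False
    # Every base must belong to the set named by the code at its position.
    return all(base in IUPAC[code] for code, base in zip(pattern, seq))


if __name__ == "__main__":
    print(matches("BBBBB", "TGCTT"))  # every base is C, G, or T
    print(matches("BBBBB", "TGATT"))  # contains A, which is outside set B
```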
The random sequence (i.e., the base at each position of the random sequence) can be chosen completely at random, or additional constraints can be imposed on top of the random choice in order to exclude undesirable cases or to improve matching with the target genomic DNA. In certain embodiments, to prevent complementary pairing between the variable sequence and the universal sequence: when the universal sequence contains many G bases, every position of the random sequence is selected from set D (i.e., is not C); when the universal sequence contains many C bases, every position is selected from set H (i.e., is not G); when the universal sequence contains many T bases, every position is selected from set B (i.e., is not A); and when the universal sequence contains many A bases, every position is selected from set V (i.e., is not T).
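The selection rule above can be sketched as a small function. This is a hedged illustration: the text does not define what counts as "many" of a base, so the sketch assumes the most frequent base of the universal sequence decides, and the function name `choose_random_set` is hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Three-base set that leaves out the given base.
SET_EXCLUDING = {"C": "D", "G": "H", "A": "B", "T": "V"}
SET_MEMBERS = {"D": "AGT", "H": "ACT", "B": "CGT", "V": "ACG"}


def choose_random_set(universal):
    """Pick the degenerate set whose bases cannot pair with the dominant base.

    Assumption: 'contains many X' is read as 'X is the most frequent base';
    ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(universal)
    dominant = min(counts, key=lambda base: (-counts[base], base))
    excluded = COMPLEMENT[dominant]  # the base that would pair with it
    code = SET_EXCLUDING[excluded]
    return code, SET_MEMBERS[code]


if __name__ == "__main__":
    # SEQ ID NO:1 as listed in this document; G is its most frequent base.
    print(choose_random_set("TTGGTAGTGAGTG"))
```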
The random sequence can have an appropriate length, for example 2-20, 2-19, 2-18, 2-17, 2-16, 2-15, 2-14, 2-13, 2-12, 2-11, 2-10, 2-9, 2-8, 3-18, 3-16, 3-14, 3-12, 3-10, 4-16, 4-12, 4-9, or 5-8 bases. In certain embodiments, the random sequence is 5 bases long. In certain embodiments, the random sequence is 8 bases long. In theory, if each position of the random sequence is chosen at random from the three bases A, T, and G, a random sequence 4 bases long has 3^4 = 81 possible sequences, a random sequence 5 bases long has 3^5 = 243 possible sequences, and so on. These random sequences can base-pair with the complementary sequences at different positions on the genomic DNA, so that replication starts at different positions on the genomic DNA.
In one embodiment, the bases Xai (i = 1 to n) at all positions of the first random sequence belong to the same set, and that set is one of B, D, H, or V. As a non-limiting example, the first primer has a universal sequence and a first random sequence with n = 5 in which every Xai (i = 1 to 5) belongs to set B; the random sequence can then be written BBBBB or (B)5 and can be any of {TTTTT, TGTTT, TCTTT, TTGTT, TTCTT, ...}, for a total of 3^5 = 243 sequence combinations. In a particular first reaction mixture that includes such first primers, the first primers all have the same universal sequence and the first random sequence described above; that is, the first primer in this particular first reaction mixture is a group of primers that all share the same universal sequence and carry the same or different random sequences composed of bases selected from set B.
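The count of 243 follows from three choices at each of five positions. As a minimal sketch (the function name is illustrative), the pool can be enumerated and checked against the examples given above:

```python
from itertools import product


def enumerate_random_sequences(allowed, n):
    """All sequences of length n whose bases are drawn from `allowed`."""
    return ["".join(bases) for bases in product(allowed, repeat=n)]


if __name__ == "__main__":
    # Set B = {T, G, C}, n = 5, i.e. the (B)5 example from the text.
    pool = enumerate_random_sequences("TGC", 5)
    print(len(pool), 3 ** 5)
    print(all(s in pool for s in ["TTTTT", "TGTTT", "TCTTT", "TTGTT", "TTCTT"]))
```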
Unless expressly stated otherwise, every description herein of the first primer and its parts applies equally to the third primer and its corresponding parts. Similarly, when the first reaction mixture further comprises a third primer, the third variable sequence in the third primer may include a third random sequence, written Xb1Xb2...Xbn from the 5' end to the 3' end. Preferably, the bases Xbi (i = 1 to n) of the third random sequence all belong to the same set, selected from B, D, H, or V, where B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}, and Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets; here Xbi denotes the i-th nucleotide from the 5' end of the third random sequence, and n is a positive integer selected from 3-20. A particular first reaction mixture includes a certain amount of first primers, all of which have the same universal sequence and a first random sequence of length n in which every base Xai belongs to the same set, selected from B, D, H, or V. The same first reaction mixture further includes a certain amount of third primers, all of which have the same universal sequence and a third random sequence of length n in which every base Xbi belongs to the same set, selected from B, D, H, or V, and Xbi and Xai belong to different sets. In some embodiments, the first random sequence and the third random sequence have the same length. In other embodiments, they have different lengths.
b) Fixed sequence
The variable sequence may further include a fixed sequence at its 3' end; the fixed sequence can be any base combination that improves genome coverage. Fixed sequences described in this application include, but are not limited to, CCC, AAA, TGGG, GTTT, GGG, TTT, TNTNG, and GTGG. When used to describe a fixed sequence in this application, N denotes any single nucleotide selected from A, T, C, and G, not a random sequence drawn from the set N. Within one primer group, for example the first primer, each primer may comprise, from the 5' end to the 3' end, the same universal sequence, a random sequence with varying base combinations, and the same fixed sequence (for example, all first primers carry either TGGG or GTTT at the 3' end). Alternatively, within one primer group, for example the first primer, the primers may comprise, from the 5' end to the 3' end, the same universal sequence, a random sequence with varying base combinations, and different fixed sequences (for example, the first primer is a mixture of primers that all carry TGGG at the 3' end and/or primers that all carry GTTT at the 3' end). In some embodiments, the first reaction mixture includes a first primer and a third primer, wherein the first variable sequence in the first primer is selected from Xa1Xa2...XanGGG, Xa1Xa2...XanTTT, Xa1Xa2...XanTGGG, or Xa1Xa2...XanGTTT, and the third variable sequence in the third primer is selected from Xb1Xb2...XbnGGG, Xb1Xb2...XbnTTT, Xb1Xb2...XbnTGGG, or Xb1Xb2...XbnGTTT.
In certain embodiments, statistical calculation can also be used to select variable sequences that are distributed more evenly across the genome and give higher coverage, thereby increasing the chance that a variable sequence recognizes genomic DNA.
In certain embodiments, the variable sequence is selected from the group consisting of (B)nCCC, (B)nAAA, (B)nTGGG, (B)nGTTT, (B)nGGG, (B)nTTT, (B)nTNTNG, (B)nGTGGGGG, (D)nCCC, (D)nAAA, (D)nTGGG, (D)nGTTT, (D)nGGG, (D)nTTT, (D)nTNTNG, (D)nGTGGGGG, (H)nCCC, (H)nAAA, (H)nTGGG, (H)nGTTT, (H)nGGG, (H)nTTT, (H)nTNTNG, (H)nGTGGGGG, (V)nCCC, (V)nAAA, (V)nTGGG, (V)nGTTT, (V)nGGG, (V)nTTT, (V)nTNTNG, and (V)nGTGGGGG, where n is a positive integer selected from 3-17. In certain embodiments, the first variable sequence in the first primer may have one or more of the sequences (B)nCCC, (B)nAAA, (B)nTGGG, (B)nGTTT, (B)nGGG, (B)nTTT, (B)nTNTNG, and (B)nGTGGGGG. In certain embodiments, the third variable sequence in the third primer may have one or more of the sequences (D)nCCC, (D)nAAA, (D)nTGGG, (D)nGTTT, (D)nGGG, (D)nTTT, (D)nTNTNG, and (D)nGTGGGGG.
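The notation (B)nTGGG describes a random part followed by a fixed 3' part. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the range check on n mirrors the 3-17 range given for this group, and the sampled sequence is just one arbitrary member of the pool, not a sequence taken from this application.

```python
import random

# Three-base sets used for the random part.
SETS = {"B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG"}


def variable_pattern(set_code, n, fixed):
    """Degenerate notation for (set_code)n + fixed, e.g. ('B', 5, 'TGGG')."""
    if set_code not in SETS:
        raise ValueError("set_code must be one of B, D, H, V")
    if not 3 <= n <= 17:
        raise ValueError("n is expected to be in the range 3-17")
    return set_code * n + fixed


def sample_variable_sequence(set_code, n, fixed, rng):
    """Draw one concrete variable sequence: n random bases, then the fixed 3' part."""
    random_part = "".join(rng.choice(SETS[set_code]) for _ in range(n))
    return random_part + fixed


if __name__ == "__main__":
    print(variable_pattern("B", 5, "TGGG"))
    print(sample_variable_sequence("D", 5, "GTTT", random.Random(0)))
```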
Spacer sequence
In a first-category primer, the universal sequence and the variable sequence may be directly adjacent, or they may be separated by a spacer sequence of one or more bases. In certain embodiments, the universal sequence and the variable sequence are joined by a spacer sequence of length m, where m is a positive integer selected from 1-3. When the random sequence within the variable sequence is restricted to some degree, either to exclude undesirable cases (such as primer dimers) or to improve matching with the target genomic DNA, m bases chosen completely at random from A, T, G, and C (a spacer sequence of length m) can be introduced between the universal sequence and the variable sequence, to further increase the coverage of the first-category primers on the target genomic DNA without increasing primer-dimer formation.
In some embodiments, the universal sequence and the first variable sequence in the first primer are joined by a first spacer sequence Ya1...Yam, where Yaj (j = 1 to m) ∈ {A, T, G, C}, Yaj denotes the j-th nucleotide from the 5' end of the first spacer sequence, and m is a positive integer selected from 1-3. In some embodiments, the universal sequence and the third variable sequence in the third primer are joined by a third spacer sequence Yb1...Ybm, where Ybj (j = 1 to m) ∈ {A, T, G, C}, Ybj denotes the j-th nucleotide from the 5' end of the third spacer sequence, and m is a positive integer selected from 1-3. In some embodiments, m is 1; that is, the universal sequence and the first variable sequence in the first primer are joined by one base selected from set N, and the universal sequence and the third variable sequence in the third primer are joined by one base selected from set N.
In certain embodiments, the first primer (and the optional third primer) is designed so that its amplification product can be used directly on Illumina's NGS sequencing platform. The first primer comprises the sequence GCTCTTCCGATCTYa1Xa1Xa2Xa3Xa4Xa5TGGG, the sequence GCTCTTCCGATCTYa1Xa1Xa2Xa3Xa4Xa5GTTT, or a mixture thereof, and the third primer comprises the sequence GCTCTTCCGATCTYb1Xb1Xb2Xb3Xb4Xb5TGGG, the sequence GCTCTTCCGATCTYb1Xb1Xb2Xb3Xb4Xb5GTTT, or a mixture thereof. The bases Xai (i = 1 to n) at all positions belong to the same set, which is one of B, D, H, or V; the bases Xbi (i = 1 to n) at all positions belong to the same set, which is one of B, D, H, or V; Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets; and Ya1 ∈ {A, T, G, C}, Yb1 ∈ {A, T, G, C}. In certain specific embodiments, Xai (i = 1 to 5) ∈ {T, G, C} and Xbi (i = 1 to 5) ∈ {A, T, G}; that is, the first primer comprises the sequence shown in SEQ ID NO:7, the sequence shown in SEQ ID NO:11, or a mixture thereof, and the third primer comprises the sequence shown in SEQ ID NO:8, the sequence shown in SEQ ID NO:12, or a mixture thereof.
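The layout described above (universal sequence, one spacer base, five restricted random bases, fixed 3' sequence) can be written out in degenerate notation and its pool size counted. This is a sketch only: `assemble_primer` and `pool_size` are hypothetical helper names, and the degenerate strings are built from the description in this paragraph rather than copied from the sequence listing.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

UNIVERSAL = "GCTCTTCCGATCT"  # SEQ ID NO:6 as listed in this document


def assemble_primer(universal, spacer_len, set_code, n, fixed):
    """5' universal + N spacer of length m + (set_code)n + fixed 3' sequence."""
    if not 1 <= spacer_len <= 3:
        raise ValueError("spacer length m is expected to be 1-3")
    return universal + "N" * spacer_len + set_code * n + fixed


def pool_size(degenerate):
    """Number of distinct concrete primers a degenerate string stands for."""
    size = 1
    for code in degenerate:
        size *= len(IUPAC[code])
    return size


if __name__ == "__main__":
    first = assemble_primer(UNIVERSAL, 1, "B", 5, "TGGG")
    third = assemble_primer(UNIVERSAL, 1, "D", 5, "GTTT")
    for label, primer in [("first", first), ("third", third)]:
        print(label, primer, len(primer), pool_size(primer))
```

With one fully random spacer base and five three-way positions, each such degenerate primer stands for 4 x 3^5 concrete oligonucleotides.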
In certain embodiments, the first-category primers comprise or consist of a sequence selected from SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, or SEQ ID NO:14, wherein the universal sequence of each first-category primer comprises or consists of SEQ ID NO:6. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:7 and/or a primer consisting of the sequence shown in SEQ ID NO:11. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:8 and a primer consisting of the sequence shown in SEQ ID NO:12. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:7 or a primer consisting of the sequence shown in SEQ ID NO:11, together with a primer consisting of the sequence shown in SEQ ID NO:8 or a primer consisting of the sequence shown in SEQ ID NO:12. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:7, a primer consisting of the sequence shown in SEQ ID NO:11, a primer consisting of the sequence shown in SEQ ID NO:8, and a primer consisting of the sequence shown in SEQ ID NO:12.
In certain embodiments, the first-category primers comprise or consist of a sequence selected from SEQ ID NO:15-22, wherein the universal sequence of each first-category primer comprises or consists of SEQ ID NO:1. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:15 and/or a primer consisting of the sequence shown in SEQ ID NO:19. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:16 and/or a primer consisting of the sequence shown in SEQ ID NO:20. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:15 or a primer consisting of the sequence shown in SEQ ID NO:19, together with a primer consisting of the sequence shown in SEQ ID NO:16 or a primer consisting of the sequence shown in SEQ ID NO:20. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:15, a primer consisting of the sequence shown in SEQ ID NO:19, a primer consisting of the sequence shown in SEQ ID NO:16, and a primer consisting of the sequence shown in SEQ ID NO:20.
In certain embodiments, the first-category primers comprise or consist of a sequence selected from SEQ ID NO:23-30, wherein the universal sequence of each first-category primer comprises or consists of SEQ ID NO:2. In certain embodiments, the first-category primers include one or both of a primer consisting of the sequence shown in SEQ ID NO:23 and a primer consisting of the sequence shown in SEQ ID NO:27. In certain embodiments, the first-category primers include one or both of a primer consisting of the sequence shown in SEQ ID NO:24 and a primer consisting of the sequence shown in SEQ ID NO:28. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:23 or a primer consisting of the sequence shown in SEQ ID NO:27, together with a primer consisting of the sequence shown in SEQ ID NO:24 or a primer consisting of the sequence shown in SEQ ID NO:28. In certain embodiments, the first-category primers include a primer consisting of the sequence shown in SEQ ID NO:23, a primer consisting of the sequence shown in SEQ ID NO:27, a primer consisting of the sequence shown in SEQ ID NO:24, and a primer consisting of the sequence shown in SEQ ID NO:28.
In some embodiments, the total concentration of the first and third primers in the first reaction mixture is 10-150 ng/μL. In some embodiments, the total concentration of the first and third primers in the first reaction mixture is 10-120 ng/μL, 10-100 ng/μL, 10-90 ng/μL, 10-80 ng/μL, 10-70 ng/μL, 10-60 ng/μL, 10-50 ng/μL, 10-40 ng/μL, 20-120 ng/μL, 20-100 ng/μL, 20-80 ng/μL, 20-70 ng/μL, 20-60 ng/μL, 20-50 ng/μL, 30-140 ng/μL, 30-120 ng/μL, 30-100 ng/μL, 30-80 ng/μL, 30-60 ng/μL, or 30-40 ng/μL. In some embodiments, the concentrations of the first and third primers in the first reaction mixture are each 10-140 ng/μL, 10-120 ng/μL, 10-100 ng/μL, 10-80 ng/μL, 10-60 ng/μL, 10-30 ng/μL, 10-20 ng/μL, 20-120 ng/μL, 20-100 ng/μL, 20-80 ng/μL, 20-60 ng/μL, 20-40 ng/μL, or 20-30 ng/μL. In some embodiments, the concentrations of the first and third primers in the first reaction mixture are each 15 ng/μL, 30 ng/μL, or 60 ng/μL. In some embodiments, the first primer and the third primer are present at the same concentration in the first reaction mixture. In some embodiments, the first and third primers in the first reaction mixture each amount to 100-800 pmol. In some embodiments, the first and third primers in the first reaction mixture together amount to 400-600 pmol.
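The text gives primer amounts both as mass concentrations (ng/μL) and as molar amounts (pmol). A rough conversion between the two is sketched below. It rests on assumptions that are not from this application: an average mass of about 330 g/mol per nucleotide of single-stranded DNA (a common rule of thumb), a 23-nucleotide primer, and the function name itself.

```python
AVG_NT_MASS = 330.0  # g/mol per nucleotide; rule-of-thumb assumption for ssDNA


def ng_per_ul_to_pmol_per_ul(conc_ng_per_ul, primer_length_nt, nt_mass=AVG_NT_MASS):
    """Approximate molar concentration of a primer from its mass concentration.

    ng divided by g/mol gives nmol, so the result is multiplied by 1000
    to obtain pmol. 1 pmol/uL is the same as 1 uM.
    """
    molar_mass = primer_length_nt * nt_mass  # approximate g/mol of the primer
    return conc_ng_per_ul / molar_mass * 1000.0


if __name__ == "__main__":
    # Hypothetical example: a 23-nt primer at 30 ng/uL.
    print(round(ng_per_ul_to_pmol_per_ul(30.0, 23), 2), "pmol/uL")
```

Multiplying the result by the reaction volume in μL gives the total pmol of primer in the mixture; the reaction volume is not stated in this passage, so it is left to the reader.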
iii. Other components
The first reaction mixture also includes the other components required for DNA amplification, such as a nucleic acid polymerase, a nucleotide monomer mixture, and the appropriate metal ions and buffer components required for enzyme activity. Reagents known in the art can be used for one or more of these components.
In this application, a nucleic acid polymerase refers to an enzyme capable of synthesizing a new nucleic acid strand. Any nucleic acid polymerase suitable for the method of this application can be used. A DNA polymerase is preferred. In certain embodiments, the method of this application uses a thermostable nucleic acid polymerase, for example one whose polymerase activity does not decrease, or decreases by less than 1%, 3%, 5%, 7%, 10%, 20%, 30%, 40%, or 50%, at PCR amplification temperatures (e.g., 95 degrees Celsius). In certain embodiments, the nucleic acid polymerase used in the method of this application has strand displacement activity. "Strand displacement activity" as used herein refers to an activity of a nucleic acid polymerase that separates a nucleic acid template from the complementary strand paired with it, where the separation proceeds in the 5' to 3' direction and is accompanied by synthesis of a new nucleic acid strand complementary to the template. Nucleic acid polymerases with strand displacement ability and their uses are known in the art; see, for example, U.S. Patent No. 5,824,517, the entire contents of which are incorporated herein by reference. Suitable nucleic acid polymerases include, but are not limited to, one or more of: Phi29 DNA polymerase, Bst DNA polymerase, Bst 2.0 DNA polymerase, Pyrophage 3137, Vent polymerases (e.g., Vent polymerase from Thermococcus litoralis, Deep Vent polymerase, Vent(-exo) polymerase, Deep Vent(-exo) polymerase), TOPOTaq DNA polymerase, 9°Nm polymerase, Klenow Fragment DNA polymerase I, MMLV reverse transcriptase, AMV reverse transcriptase, HIV reverse transcriptase, a T7 phage DNA polymerase variant (lacking 3'-5' exonuclease activity), super-fidelity DNA polymerase, Taq polymerase, Psp GBD (exo-) DNA polymerase, Bst DNA polymerase (full length), E. coli DNA polymerase, LongAmp Taq DNA polymerase, and OneTaq DNA polymerase.
In this application, the nucleotide monomer mixture refers to a mixture of dATP, dTTP, dGTP, and dCTP.
In certain embodiments, the first reaction mixture contains one or more of Vent polymerase from Thermococcus litoralis, Deep Vent polymerase, Vent(-exo) polymerase, or Deep Vent(-exo) polymerase. In certain embodiments, the reaction mixture contains Vent polymerase from Thermococcus litoralis, which refers to the natural polymerase isolated from Thermococcus litoralis. In certain embodiments, the reaction mixture contains Deep Vent polymerase, which refers to the natural polymerase isolated from Pyrococcus species GB-D. In certain embodiments, the reaction mixture contains Vent(-exo) polymerase, which refers to Vent polymerase from Thermococcus litoralis genetically engineered to carry the D141A/E143A modification. In certain embodiments, the reaction mixture contains Deep Vent(-exo) polymerase, which refers to Deep Vent polymerase genetically engineered to carry the D141A/E143A modification. The various Vent polymerases described in this application are commercially available, for example from New England Biolabs.
The first reaction mixture may further include appropriate metal ions required for the nucleic acid polymerase to exert its enzymatic activity (e.g., Mg²⁺ ions at an appropriate concentration, such as a final concentration of about 1.5 mM to about 8 mM), a nucleotide monomer mixture (e.g., dATP, dGTP, dTTP, and dCTP), bovine serum albumin (BSA), DTT (e.g., at a final concentration of about 2 mM to about 7 mM), pure water, and the like.
In certain embodiments, the first reaction mixture may further include a pH regulator that maintains the pH of the mixture between 7.0 and 9.0. Suitable pH regulators include, for example, Tris-HCl and Tris-SO₄. In certain embodiments, the first reaction mixture may further include one or more other components, such as a DNase inhibitor, RNase, SO₄²⁻, Cl⁻, K⁺, Ca²⁺, Na⁺, and/or (NH₄)⁺.
Step (b): Subjecting the mixture to the first temperature cycling program
The method provided in the present application includes step (b): subjecting the first reaction mixture to a first temperature cycling program, so that the variable sequence of the first-type primer (the first primer, or the first primer and the third primer) can bind to the genomic DNA through base pairing and the genomic DNA is replicated under the action of the nucleic acid polymerase.
In the present application, "amplification" means that, under the action of a nucleic acid polymerase, nucleotides complementary to a nucleic acid template are added to the 3' end of a primer, thereby synthesizing a new nucleic acid strand whose bases are complementary to the template. Any suitable nucleic acid amplification method can be used, such as the polymerase chain reaction (PCR), the ligase chain reaction (LCR), or another suitable amplification method. These methods are known in the art; see, for example, U.S. Patent Nos. 4,683,195 and 4,683,202, Innis et al., "PCR Protocols: A Guide to Methods and Applications", Academic Press, Incorporated (1990), and Wu et al. (1989) Genomics 4:560-569. The entire contents of these documents and patents are incorporated herein by reference.
During amplification, the reaction mixture is subjected to an appropriate temperature cycling program so that the double-stranded DNA template separates into single strands, the first/third primers hybridize to the single-stranded template, and the primers are then extended from their 3' ends by a DNA polymerase. A temperature cycling program therefore typically includes: a denaturation (melting) temperature, at which the double-stranded DNA template separates into single strands; an annealing temperature, at which the primers specifically hybridize to the single-stranded DNA template; and an extension temperature, at which the DNA polymerase adds nucleotides complementary to the template to the 3' end of the primer, extending the primer into a new DNA strand complementary to the template.
In the first cycle of the first temperature cycling program, the first reaction mixture is first subjected to a temperature program capable of opening the double strands of the genomic DNA (step (b1)). In this first cycle, to ensure that the double-stranded genomic DNA is completely separated into single strands (i.e., denatured/melted), a relatively high reaction temperature (e.g., 90°C-95°C) and a relatively long reaction time (e.g., 1-20 minutes at a temperature between 90°C and 95°C) can be used. In subsequent cycles, the double strands to be opened are those generated during amplification. In that case it is sufficient for the double-stranded half-amplicons or full amplicons to be denatured into single strands, so the melting time need not be long (e.g., 3-50 seconds at a temperature between 90°C and 95°C).
The first reaction mixture is then subjected to a temperature program that allows the first-type primer (the first primer, or the first primer and the third primer) to bind to the single-stranded DNA template (step (b2)). In this temperature program, the variable sequences of the first-type primers bind to complementary sequences at different positions in the genomic DNA through base pairing (i.e., annealing), thereby initiating replication at different positions of the genomic DNA. Because the variable sequences differ from one another in base composition and sequence, the optimal temperature for binding to the genomic DNA also varies considerably among them. At any single annealing temperature, therefore, only some of the primers may bind well to the genomic DNA, while the binding of the others may be suboptimal. In certain embodiments, step (b2) includes subjecting the reaction mixture to a program of more than one temperature, so that the first-type primers bind the DNA template as completely and effectively as possible. For example, the reaction mixture containing the denatured DNA can be rapidly cooled to a low temperature, such as about 10°C-20°C, and then warmed stepwise so that the mixture is held at several different annealing temperatures for an appropriate time each, ensuring that as many primers as possible pair with the genomic DNA. In certain embodiments, step (b2) includes reacting at a first annealing temperature between 10°C and 20°C (e.g., 15°C) for an appropriate time (e.g., 3-60 seconds), at a second annealing temperature between 20°C and 30°C (e.g., 25°C) for an appropriate time (e.g., 3-50 seconds), and at a third annealing temperature between 30°C and 50°C (e.g., 35°C) for an appropriate time (e.g., 3-50 seconds).
It is well known in the art that the annealing temperature of a primer is usually no more than 5°C below the primer's Tm, and that an annealing temperature that is too low causes nonspecific primer-primer binding, leading to primer multimers and nonspecific amplification products. Low temperatures such as 10°C-20°C are therefore not normally used for primer annealing. However, the inventors of the present application unexpectedly found that even when the stepwise temperature increase starts from a low temperature (e.g., 10°C-20°C), the pairing between the primers and the genomic DNA remains highly specific and the amplification results show very low variability, indicating that the results are accurate and reliable. At the same time, because the annealing temperatures extend into the low range, a wider range of primer sequences can bind to the genomic DNA, which provides better genome coverage and amplification depth.

After the annealing temperature program, the reaction mixture is subjected to a temperature program that allows the first-type primers bound to the single-stranded DNA template to be extended by the nucleic acid polymerase, producing amplification products (step (b3)).
The extension temperature is usually related to the optimal temperature of the DNA polymerase, and those skilled in the art can select it according to the specific reaction mixture. In certain embodiments, the DNA polymerase in the reaction mixture has strand displacement activity. If an extending primer then encounters a primer or amplicon bound to the template downstream, the strand displacement activity separates the downstream-bound strand from the template, so that the extending primer can continue and a longer amplified sequence is obtained. DNA polymerases with strand displacement activity include, but are not limited to, phi29 DNA polymerase, T5 DNA polymerase, SEQUENASE 1.0, and SEQUENASE 2.0. In certain embodiments, the DNA polymerase in the reaction mixture is a thermostable DNA polymerase. Thermostable DNA polymerases include, but are not limited to, Taq DNA polymerase, OmniBase™ sequenase, Pfu DNA polymerase, TaqBead™ hot start polymerase, Vent DNA polymerases (e.g., Vent polymerase from Thermococcus litoralis, Deep Vent polymerase, Vent (-exo) polymerase, and Deep Vent (-exo) polymerase), Tub DNA polymerase, TaqPlus DNA polymerase, Tfl DNA polymerase, Tli DNA polymerase, and Tth DNA polymerase. In certain embodiments, the DNA polymerase in the reaction mixture is both thermostable and has strand displacement activity. In certain embodiments, the DNA polymerase in the reaction mixture is one or more selected from: Phi29 DNA polymerase, Bst DNA polymerase, Pyrophage 3137, Vent polymerases (e.g., Vent polymerase from Thermococcus litoralis, Deep Vent polymerase, Vent (-exo) polymerase, and Deep Vent (-exo) polymerase), TOPO Taq DNA polymerase, 9°Nm polymerase, Klenow Fragment DNA polymerase I, MMLV reverse transcriptase, AMV reverse transcriptase, HIV reverse transcriptase, T7 phage DNA polymerase variant (lacking 3'-5' exonuclease activity), super-fidelity DNA polymerase, Taq polymerase, Bst DNA polymerase (full length), E. coli DNA polymerase, LongAmp Taq DNA polymerase, and OneTaq DNA polymerase.
In certain embodiments, step (b3) comprises reacting at an extension temperature between 60°C and 90°C (e.g., between 65-90°C, 70-90°C, 75-90°C, 80-90°C, 60-85°C, 60-80°C, 60-75°C, or 70-80°C, or at 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, or 75°C) for 10 seconds to 15 minutes (e.g., 1-14, 1-13, 1-12, 1-11, 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, 1-2, 2-14, 3-14, 5-14, 6-14, 7-14, 8-14, 9-14, 10-14, 11-14, 12-14, or 13-14 minutes, or 10-60, 10-50, 10-40, 10-30, 10-20, 20-60, 20-50, 20-40, 20-30, 30-60, 30-50, or 30-40 seconds). In certain embodiments, step (b3) comprises reacting at one or more temperatures between 60°C and 80°C for 30 seconds to 2 minutes. In certain embodiments, step (b3) comprises reacting at 65°C for 40 seconds. In certain embodiments, step (b3) comprises reacting at 75°C for 40 seconds. In certain embodiments, step (b3) comprises reacting at 65°C for 40 seconds and then at 75°C for 40 seconds.
After the primer extension program, steps (b1) to (b3) are repeated until the designated first cycle number is reached. As described above, in subsequent cycles the melting temperature in step (b1) is similar to that of the first cycle, but the reaction time can be somewhat shorter. In certain embodiments, in cycles after the first cycle, step (b1) comprises reacting at a temperature between 90°C and 95°C for 10-50 seconds.
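Steps (b1) to (b3) can be written down as a simple list of temperature holds. The following Python sketch is only an illustration: the annealing temperatures (15, 25, 35°C) and the 65°C/75°C extension holds of 40 seconds come from the examples in the text, whereas the 180-second initial denaturation, the 20-second melt, the 30-second annealing holds, and the names `Step`, `first_cycling_program`, and `total_minutes` are assumptions chosen from within the stated ranges. Ramp times between holds are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def first_cycling_program(n_cycles, initial_denature_s=180, melt_s=20):
    """Build the hold list for steps (b1)-(b3), repeated n_cycles times.

    Hold times are illustrative assumptions within the ranges in the text:
    1-20 min for the first denaturation, 10-50 s for later melting steps.
    """
    if n_cycles < 2:
        raise ValueError("the first cycle number is at least 2")
    anneal_and_extend = [
        Step("anneal-1", 15.0, 30),   # (b2) first annealing temperature
        Step("anneal-2", 25.0, 30),   # (b2) second annealing temperature
        Step("anneal-3", 35.0, 30),   # (b2) third annealing temperature
        Step("extend-1", 65.0, 40),   # (b3) 65 C for 40 s
        Step("extend-2", 75.0, 40),   # (b3) then 75 C for 40 s
    ]
    program = []
    for cycle in range(n_cycles):
        if cycle == 0:
            # (b1), first cycle: long hold to denature genomic DNA
            program.append(Step("denature", 95.0, initial_denature_s))
        else:
            # (b1), later cycles: short hold to melt amplification products
            program.append(Step("melt", 95.0, melt_s))
        program.extend(anneal_and_extend)
    return program


def total_minutes(program):
    """Sum of hold times in minutes (ramping between holds is not counted)."""
    return sum(step.seconds for step in program) / 60
```

With these placeholder values, `first_cycling_program(8)` yields 48 holds totalling 28 minutes, which makes the trade-off between cycle number and run time easy to inspect.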
The first cycle number described in the present application is at least 2. In the first cycle, the 3' end of the variable sequence of the first-type primer is extended, and the resulting amplification product carries the universal sequence at its 5' end and a sequence complementary to the single-stranded genomic template at its 3' end. Such an amplification product is also referred to as a half-amplicon. In the second cycle, a half-amplicon from the previous cycle can itself serve as a DNA template and bind the variable sequence of a first-type primer. Under the action of the nucleic acid polymerase, the primer is extended toward the 5' end of the half-amplicon until the universal sequence at that 5' end has been copied. The result is a genomic amplification product with the universal sequence at its 5' end and the complement of the universal sequence at its 3' end, which is also referred to as a full amplicon. The pre-amplification product described in the present application refers mainly to full amplicons, i.e., products with the universal sequence at the 5' end and the complement of the universal sequence at the 3' end.
In the amplification cycles that follow the first cycle, the single-stranded DNA in the reaction mixture includes not only the original genomic single strands but also newly synthesized single strands. Both the original genomic DNA template and the half-amplicons produced in earlier cycles can again serve as DNA templates, bind primers, and start a new round of DNA synthesis. Full amplicons, however, carry complementary sequences at their two ends (the universal sequence at the 5' end and its complement at the 3' end) and therefore fold into a hairpin structure, so they cannot serve as templates for a new round of DNA synthesis in the next reaction cycle.
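The distinction between half-amplicons and full amplicons can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the universal sequence used here, `GCTCTTCCGATCT`, is the 3' tail shared by SEQ ID NO: 35-38 and is taken only as an illustrative value; the genomic insert is a made-up placeholder; the variable portion of the primer is folded into the insert for brevity; and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_product(strand, universal):
    """Classify a single strand (5'->3') by its end sequences."""
    has_5p_universal = strand.startswith(universal)
    has_3p_complement = strand.endswith(reverse_complement(universal))
    if has_5p_universal and has_3p_complement:
        return "full amplicon"      # ends can pair with each other (hairpin)
    if has_5p_universal:
        return "half-amplicon"      # still usable as a template
    return "other"                  # e.g. original genomic DNA


def can_serve_as_template(strand, universal):
    """Full amplicons fold into hairpins and drop out of pre-amplification."""
    return classify_product(strand, universal) != "full amplicon"


UNIVERSAL = "GCTCTTCCGATCT"        # illustrative assumption, see lead-in
INSERT = "ATGGCATTACGGATCCAT"      # hypothetical genomic fragment

half_amplicon = UNIVERSAL + INSERT
full_amplicon = UNIVERSAL + INSERT + reverse_complement(UNIVERSAL)
```

In this model `classify_product(half_amplicon, UNIVERSAL)` gives "half-amplicon" and `classify_product(full_amplicon, UNIVERSAL)` gives "full amplicon", mirroring the behavior described above.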
In certain embodiments, the first cycle number is kept within an appropriate range, so that enough pre-amplification product is available for subsequent reactions without an excessive number of cycles lengthening the overall workflow. In certain embodiments, the first cycle number is 2-40 cycles (e.g., 2-40, 4-40, 6-40, 8-40, 10-40, 12-40, 14-40, 16-40, 18-40, 20-40, 15-40, 25-40, 30-40, 5-35, 10-35, 15-35, 20-35, 25-35, 30-35, 10-30, 15-30, 20-30, 25-30, 2-20, 2-18, 2-16, 2-14, 2-12, 2-10, 2-8, 2-6, 2-4, 4-20, 4-18, 4-16, 4-14, 4-12, 4-10, 4-8, 4-6, 6-20, 6-18, 6-16, 6-14, 6-12, 6-10, 6-8, 8-20, 8-18, 8-16, 8-14, 8-12, 8-10, 10-20, 10-18, 10-16, 10-14, 10-12, 12-20, 12-18, 12-16, 12-14, 14-20, 1-18, 14-16, 16-20, 16-18, or 18-20 cycles). For example, the first cycle number is at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20, or preferably no more than 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40. If the first cycle number is too low, little pre-amplification product is obtained, and the number of cycles in amplification step (d) must be increased to obtain sufficient product, which reduces the accuracy of the amplification results. If the first cycle number is too high, the cycles themselves take so long that the overall workflow becomes excessively lengthy.
In certain embodiments, step (b3) is followed by a further step (b3'), in which the reaction mixture is subjected to a suitable temperature program so that the 3' end and the 5' end of each full amplicon in the genomic pre-amplification product hybridize to form a loop structure. It was previously believed that step (b3') protects the ends of the full amplicons and thereby prevents head-to-tail joining of two or more full amplicons, which would bring together sequences that are not adjacent in the genome. Preventing such joining was thought to help improve the accuracy of the amplification results.
In certain embodiments, the method proceeds from step (b3) directly to the subsequent step (b1) or (c), without any intervening step such as step (b3'). In this case the full amplicons do not undergo a dedicated step that prevents head-to-tail joining, so in theory the amplification results should be somewhat less accurate. Unexpectedly, however, in the method of the present application the final amplification results remain highly accurate even without a dedicated looping step after step (b3), and are comparable to those obtained with step (b3'). This simplifies the reaction steps while maintaining the specificity of the reaction.
Step (c): Providing a second reaction mixture
In step (c), the second reaction mixture comprises the pre-amplification product obtained in step (b), the second primer, a nucleotide monomer mixture, and a nucleic acid polymerase, where the second primer comprises, from the 5' end to the 3' end, a specific sequence and the universal sequence. Because the universal sequence is substantially non-complementary to genomic sequence, a second-type primer whose remaining parts are also designed to be substantially non-complementary to genomic sequence will not pair directly with genomic DNA and initiate its replication. In certain embodiments, therefore, the second reaction mixture can be obtained simply by adding the second primer to the reaction mixture obtained at the end of step (b). In other embodiments, the reaction mixture obtained at the end of step (b) is purified before step (c), and the purified pre-amplification product is then mixed with the second primer, the nucleotide monomer mixture, the nucleic acid polymerase, and optionally any other reagent known in the art for amplification reactions, to give the second reaction mixture.
i. Second-type primers
The "second primer" described herein is a second-type primer as described above. Second-type primers contain the universal sequence of the first-type primers, so they can bind the complement of the universal sequence at the 3' end of a full amplicon and replicate that full amplicon further, greatly increasing its copy number.
In certain embodiments, the second-type primer comprises or consists of, from 5' to 3', a specific sequence and the universal sequence. The second-type primer can be selected to suit the sequencing platform. In certain embodiments, the second-type primer is selected for a second-generation sequencing platform. In certain embodiments, the second-type primer is selected for an Illumina NGS platform (such as, but not limited to, HiSeq or MiSeq) or for the Ion Torrent NGS platform of Life Technologies. In certain embodiments, the second-type primer includes a sequence that is complementary or identical to part or all of a sequencing primer. In certain embodiments, this sequence that is complementary or identical to part or all of a sequencing primer comprises or consists of the universal sequence.
The second primer described in this application can be a primer pair in which both primers have the structural features of second-type primers, or a single primer of one structure and sequence. In some embodiments, the specific sequence of the second primer includes, at its 3' end, a sequence that is complementary or identical to part or all of a sequencing primer. In some embodiments, this sequence comprises or consists of SEQ ID NO: 31 [ACACTCTTTCCCTACACGAC] or SEQ ID NO: 32 [GTGACTGGAGTTCAGACGTGT]. In some embodiments, the specific sequence of the second primer further includes, at its 5' end, a sequence that is complementary or identical to part or all of a capture sequence of the sequencing platform. A capture sequence is a sequence present on the sequencing plate of the platform that captures the fragments to be sequenced. In some embodiments, the sequence that is complementary or identical to part or all of a capture sequence comprises or consists of SEQ ID NO: 33 [AATGATACGGCGACCACCGAGATCT] or SEQ ID NO: 34 [CAAGCAGAAGACGGCATACGAGAT]. In some embodiments, the specific sequence of the second primer further includes an identifier sequence (barcode sequence) located between the capture-related sequence and the sequencing-primer-related sequence. The identifier sequence marks a specific set of fragments to be sequenced: when the sequencing platform sequences several sets of fragments at the same time, the sequencing data can be separated by screening the results for the identifier sequence carried by each set.
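Separating sequencing data by identifier sequence can be sketched as a simple grouping step. The Python example below is a minimal illustration, not a production demultiplexer: it assumes that each read has already been paired with the barcode observed for it, that the observed barcode is in the same orientation as the barcode in the lookup table, and that exact matching is sufficient. The function name `demultiplex`, the sample labels, and the reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by the sample that their barcode identifies.

    reads: iterable of (barcode, sequence) pairs.
    barcode_to_sample: dict mapping barcode string -> sample label.
    Reads with an unknown barcode are collected under "undetermined".
    """
    by_sample = defaultdict(list)
    for barcode, sequence in reads:
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


# Hypothetical data: one known barcode, one unknown barcode.
reads = [("CGTGAT", "ACGTACGT"), ("AAAAAA", "TTTTCCCC"), ("CGTGAT", "GGCCGGCC")]
groups = demultiplex(reads, {"CGTGAT": "cell_1"})
```

Real pipelines typically also tolerate a small number of barcode mismatches; that refinement is left out here to keep the idea visible.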
In some embodiments, the second primer is a primer pair whose members have the same universal sequence and different specific sequences, where the different specific sequences respectively comprise sequences complementary or identical to part or all of the two members of a capture sequence pair used on the same sequencing platform, and/or respectively comprise sequences complementary or identical to part or all of the different primers of a sequencing primer pair used in the same sequencing run. In some embodiments, the second primer comprises a mixture of the sequences set forth in SEQ ID NO: 35 [AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT] and [CAAGCAGAAGACGGCATACGAGATX…XGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT], where X…X is the identifier sequence; those skilled in the art can choose its length and specific sequence according to actual needs. In some embodiments, the second primer comprises a mixture of the sequences set forth in SEQ ID NO: 35 [AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT] and SEQ ID NO: 36 [CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT]. In some embodiments, the second primer comprises a mixture of the sequences set forth in SEQ ID NO: 37 [CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGATGCTCTTCCGATCT] and SEQ ID NO: 38 [CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGATGCTCTTCCGATCT]. In some embodiments, the second primer comprises a mixture of the sequences set forth in SEQ ID NO: 39 [CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGATTTGGTAGTGAGTG] and SEQ ID NO: 40 [CCATCTCATCCCTGCGTGTCTCCGACTCAGCTAAGGTAACGATTTGGTAGTGAGTG].
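The modular layout of the second primer (capture-related sequence, optional identifier sequence, sequencing-primer-related sequence, universal sequence, from 5' to 3') can be checked against the sequences listed above. In the Python sketch below, the four component sequences are SEQ ID NO: 31-34 as given in the text; the universal sequence `GCTCTTCCGATCT` is an inference from the 3' tail shared by SEQ ID NO: 35 and 36, and the barcode `CGTGAT` is read off SEQ ID NO: 36 at the X…X position. The function and constant names are hypothetical.

```python
SEQ_PRIMER_A = "ACACTCTTTCCCTACACGAC"        # SEQ ID NO: 31
SEQ_PRIMER_B = "GTGACTGGAGTTCAGACGTGT"       # SEQ ID NO: 32
CAPTURE_A = "AATGATACGGCGACCACCGAGATCT"      # SEQ ID NO: 33
CAPTURE_B = "CAAGCAGAAGACGGCATACGAGAT"       # SEQ ID NO: 34
UNIVERSAL = "GCTCTTCCGATCT"                  # inferred shared 3' tail (assumption)

SEQ_ID_35 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
SEQ_ID_36 = ("CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGT"
             "GCTCTTCCGATCT")


def build_second_primer(capture, seq_primer, universal, barcode=""):
    """Concatenate primer modules 5'->3': capture, barcode, seq primer, universal."""
    primer = capture + barcode + seq_primer + universal
    if set(primer) - set("ACGT"):
        raise ValueError("primer modules must contain only A, C, G, T")
    return primer


forward = build_second_primer(CAPTURE_A, SEQ_PRIMER_A, UNIVERSAL)
reverse = build_second_primer(CAPTURE_B, SEQ_PRIMER_B, UNIVERSAL, barcode="CGTGAT")

# Both assembled primers reproduce the listed sequences exactly.
assert forward == SEQ_ID_35
assert reverse == SEQ_ID_36
```

Swapping in a different `barcode` value gives a primer for another sample set while the other modules stay unchanged.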
In some embodiments, the concentration of the second primer in the second reaction mixture is 1-15 ng/μL. In some embodiments, the concentration of the second primer in the second reaction mixture is 1-12 ng/μL, 1-10 ng/μL, 1-8 ng/μL, 1-7 ng/μL, 1-6 ng/μL, 1-5 ng/μL, 1-4 ng/μL, 2-3 ng/μL, 2-12 ng/μL, 2-10 ng/μL, 2-8 ng/μL, 2-6 ng/μL, 2-5 ng/μL, 2-4 ng/μL, 3-12 ng/μL, 3-10 ng/μL, 3-8 ng/μL, 3-6 ng/μL, or 3-4 ng/μL. In some embodiments, the concentration of the second primer in the second reaction mixture is 2-3 ng/μL. In some embodiments, the amount of the second primer in the second reaction mixture is 5-50 pmol. In some embodiments, the amount of the second primer in the second reaction mixture is 10 pmol, 15 pmol, or 20 pmol.
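The text gives primer amounts both as a mass concentration (ng/μL) and as a molar amount (pmol). The two can be related with a rough conversion, sketched below in Python. The sketch relies on assumptions that are not in the text: an average mass of about 330 g/mol per nucleotide for single-stranded DNA (a common rule of thumb, not an exact molecular weight), a hypothetical reaction volume of 50 μL, and a 58-nucleotide primer (the length of SEQ ID NO: 35). The function name is hypothetical.

```python
AVG_NT_MASS_G_PER_MOL = 330.0   # rule-of-thumb average for ssDNA (assumption)


def primer_pmol(conc_ng_per_ul, volume_ul, length_nt):
    """Approximate molar amount (pmol) of a single-stranded primer.

    mass [ng] / molecular weight [g/mol] gives nmol; multiply by 1000 for pmol.
    """
    mass_ng = conc_ng_per_ul * volume_ul
    mol_weight = length_nt * AVG_NT_MASS_G_PER_MOL
    return mass_ng * 1000.0 / mol_weight


# Hypothetical example: 2.5 ng/uL of a 58-nt primer in a 50 uL reaction.
amount = primer_pmol(2.5, 50, 58)   # roughly 6.5 pmol
```

Under these assumptions, a concentration in the 2-3 ng/μL range corresponds to a molar amount near the lower end of the 5-50 pmol range; a different reaction volume or primer length shifts the result proportionally.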
ii. Other components
In certain embodiments, the nucleic acid polymerase in the second reaction mixture is one or more selected from Vent polymerase from Thermococcus litoralis, Deep Vent polymerase, Vent (-exo) polymerase, and Deep Vent (-exo) polymerase. In certain embodiments, the second reaction mixture contains Vent polymerase from Thermococcus litoralis. In certain embodiments, the second reaction mixture contains Deep Vent polymerase. In certain embodiments, the second reaction mixture contains Vent (-exo) polymerase. In certain embodiments, the second reaction mixture contains Deep Vent (-exo) polymerase. The various polymerases described in this application are all commercially available, for example from New England Biolabs.
In certain embodiments, the second reaction mixture may further include appropriate metal ions required for the enzymatic activity of the nucleic acid polymerase (e.g., Mg²⁺ ions at an appropriate concentration, such as a final concentration of about 1.5 mM to about 8 mM), a nucleotide monomer mixture (e.g., dATP, dGTP, dTTP, and dCTP), bovine serum albumin (BSA), DTT (e.g., at a final concentration of about 2 mM to about 7 mM), pure water, appropriate buffer components (e.g., pH regulators such as Tris-HCl and Tris-SO₄), or one or more other components commonly used in the art (e.g., a DNase inhibitor, RNase, SO₄²⁻, Cl⁻, K⁺, Ca²⁺, Na⁺, and/or (NH₄)⁺).
Step (d): Subjecting to the second temperature cycling program
The method provided in the present application further includes step (d): subjecting the second reaction mixture obtained in step (c) to a second temperature cycling program, such that the universal sequence of the second-type primer can pair with the 3' end of the genomic pre-amplification product and amplify the genomic pre-amplification product to obtain an expanded genomic amplification product.
Because the genomic pre-amplification product obtained in step (b), i.e., the full amplicon, carries the complement of the universal sequence at its 3' end, it can base-pair with the universal sequence of the second-type primer. Under the action of the nucleic acid polymerase, the second-type primer is extended and copies the full length of the full amplicon.
In the second temperature cycling program, the reaction mixture is first subjected to a temperature program capable of opening double-stranded DNA (step (d1)). The double-stranded DNA here mainly refers to the double strands of the genomic pre-amplification product obtained in step (b) (i.e., the full amplicon), including single-stranded hairpin molecules of the full amplicon. Although original genomic DNA may still be present in the second reaction mixture at this point, the second-type primer substantially does not pair with genomic DNA, so the original genomic DNA is not the template to be amplified in step (d). A relatively high temperature (e.g., 90-95 °C) can be applied for an appropriate time so that the double-stranded or hairpin full amplicons to be amplified are denatured into linear single strands. In certain embodiments, the temperature program in step (d1) holds the reaction mixture at a temperature capable of opening double-stranded DNA for long enough to ensure that all double-stranded or hairpin template DNA is denatured into single strands; this temperature program comprises reacting at a denaturation temperature between 90 °C and 95 °C (e.g., 95 °C) for 5 seconds to 20 minutes (e.g., 30 seconds or 3 minutes). After step (d1), the reaction mixture is subjected to a temperature program (step (d2)) capable of melting the double-stranded amplification products generated in the x-th round of amplification (x being an integer ≥ 1) into single-stranded templates, i.e., reacting at a melting temperature between 90 °C and 95 °C (e.g., 95 °C) for 3-50 seconds (e.g., 20 seconds). It should be understood that step (d2) is not strictly necessary in the first cycle; however, because the denaturation and melting steps use similar temperatures and the melting time is very short relative to the denaturation time, step (d2) in the first cycle can be regarded as an extension of step (d1).
After step (d2), the reaction mixture is subjected to a temperature program that allows the second-type primer to bind to the single-stranded DNA obtained in step (d1) or (d2) (step (d3)). The Tm of the second-type primer can be calculated from its base composition, and a suitable annealing temperature can be chosen based on that Tm. In certain embodiments, the temperature program in step (d3) comprises reacting at an annealing temperature between 45 °C and 65 °C (e.g., 63 °C) for 3-50 seconds (e.g., 40 seconds). In certain embodiments, the second-type primer is a mixture of SEQ ID NO:35 and SEQ ID NO:36, and the temperature program in step (d3) comprises reacting at 63 °C for 3-50 seconds. In certain embodiments, the annealing temperature in step (d3) is higher than the annealing temperature in step (b2). During step (d3), the reaction mixture may still contain first-type primers left unreacted from step (b). The variable sequences of these first-type primers may pair with the single-stranded DNA templates obtained in step (d3), producing incomplete amplified sequences. When the annealing temperature in step (d3) is higher than the annealing temperature suitable for the first-type primers, binding of first-type primers to the single-stranded templates is reduced or avoided, so that amplification by the second-type primer is selectively favored.
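The Tm-based choice of annealing temperature in step (d3) can be sketched in Python. This is a minimal illustration only: the two formulas are common textbook approximations (the Wallace rule and a GC-content formula), not a calculation prescribed by this application, and the sequence used is a hypothetical placeholder, not SEQ ID NO:35 or SEQ ID NO:36.

```python
def _count_bases(seq: str) -> tuple[int, int]:
    """Return (A+T count, G+C count); reject anything other than A/T/G/C."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if not s or at + gc != len(s):
        raise ValueError("sequence must be non-empty and contain only A, T, G, C")
    return at, gc


def wallace_tm(seq: str) -> float:
    """Wallace rule, a rough estimate for short oligos: Tm = 2(A+T) + 4(G+C)."""
    at, gc = _count_bases(seq)
    return 2.0 * at + 4.0 * gc


def gc_content_tm(seq: str) -> float:
    """GC-content approximation often used for longer oligos."""
    at, gc = _count_bases(seq)
    return 64.9 + 41.0 * (gc - 16.4) / (at + gc)


# Hypothetical 20-nt primer, for illustration only (assumption, not from the source).
primer = "ATGCATGCATGCATGCATGC"
print(wallace_tm(primer), round(gc_content_tm(primer), 2))
```

The two estimates differ noticeably for the same sequence, which is one reason an annealing temperature is usually confirmed empirically within the stated 45-65 °C window.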
After primer annealing is complete, the reaction mixture is subjected to a temperature program that allows the second-type primer bound to the single-stranded amplification product to be extended by the nucleic acid polymerase (step (d4)). In certain embodiments, the temperature program in step (d4) comprises reacting at an extension temperature between 60 °C and 80 °C (e.g., 72 °C) for 10 seconds to 15 minutes (e.g., 40 seconds or 3 minutes).
Steps (d2) to (d4) can be repeated for a second number of cycles to obtain the desired expanded genomic amplification product. In this process, the genomic amplification product obtained in step (b) is further replicated and amplified, and its quantity is greatly increased, providing sufficient genomic DNA sequence for subsequent study or manipulation. In certain embodiments, the second number of cycles in step (d5) is greater than the first number of cycles in step (b4). In certain embodiments, the second number of cycles is kept within an appropriate range so that enough DNA is produced without an excessive number of cycles compromising amplification accuracy. In certain embodiments, the second number of cycles is 2-40 cycles (e.g., 2-40, 4-40, 6-40, 8-40, 10-40, 12-40, 14-40, 16-40, 18-40, 20-40, 15-40, 25-40, 30-40, 5-35, 10-35, 15-35, 20-35, 25-35, 30-35, 10-30, 15-30, 20-30, 25-30, 15-28, 15-26, 15-24, 15-22, 15-20, 15-18, 15-17, 16-30, 17-30, 18-30, 22-30, 24-30, 26-30, 28-30, 32-40, 32-38, 32-36, or 32-34 cycles).
In certain embodiments, step (d) further comprises, after the second temperature cycling program, subjecting the reaction mixture to the same temperature as in step (d4) (e.g., 72 °C) for an appropriate time (e.g., 40 seconds). The reaction mixture is then placed at 4 °C to terminate the reaction. In certain embodiments, the reaction mixture is placed at 4 °C directly after the reaction in step (d) is completed to terminate the reaction.
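The second temperature cycling program described for step (d) can be written down as a small data structure. The sketch below uses the example values quoted in the text (95 °C, 63 °C, 72 °C and the example hold times) and checks each step against the stated ranges. The names `Step`, `RANGES`, and `build_second_program`, and the choice of 17 cycles, are illustrative assumptions rather than part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Allowed (min °C, max °C, min s, max s) for each step, as stated in the text.
RANGES = {
    "d1": (90, 95, 5, 20 * 60),   # denaturation
    "d2": (90, 95, 3, 50),        # melting
    "d3": (45, 65, 3, 50),        # annealing
    "d4": (60, 80, 10, 15 * 60),  # extension
}


def check(step: Step) -> Step:
    """Raise if a step falls outside the ranges given for step (d)."""
    lo_t, hi_t, lo_s, hi_s = RANGES[step.name]
    if not (lo_t <= step.temp_c <= hi_t and lo_s <= step.seconds <= hi_s):
        raise ValueError(f"step {step.name} is outside the stated range")
    return step


def build_second_program(cycles: int) -> list[Step]:
    """(d1) once, then (d2)-(d4) repeated `cycles` times, then a final extension."""
    if not 2 <= cycles <= 40:
        raise ValueError("second number of cycles must be 2-40")
    one_cycle = [
        check(Step("d2", 95, 20)),
        check(Step("d3", 63, 40)),
        check(Step("d4", 72, 40)),
    ]
    program = [check(Step("d1", 95, 180))] + one_cycle * cycles
    program.append(Step("final_extension", 72, 40))  # optional step, then hold at 4 °C
    return program


def total_seconds(program: list[Step]) -> int:
    """Sum of hold times only; ramp times and the 4 °C hold are not included."""
    return sum(step.seconds for step in program)


program = build_second_program(cycles=17)  # 17 is an arbitrary example value
print(len(program), total_seconds(program))
```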
In certain specific embodiments, the present application also provides a method for amplifying the genome of a cell, the method comprising:
(a) providing a first reaction mixture, wherein the first reaction mixture comprises the genomic DNA, a first primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the first primer comprises, from the 5' end to the 3' end, a universal sequence and a first variable sequence, the first variable sequence comprising a first random sequence, wherein the first random sequence is, from the 5' end to the 3' end, Xa1Xa2…Xan, and every Xai (i = 1 to n) of the first random sequence belongs to the same set, the set being selected from B, D, H, or V, where B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}, wherein Xai denotes the i-th nucleotide from the 5' end of the first random sequence and n is a positive integer selected from 3-20, wherein the universal sequence and the first variable sequence are directly connected, or the universal sequence and the first variable sequence are connected via a first spacer sequence, the first spacer sequence being Ya1…Yam, where Yaj (j = 1 to m) ∈ {A, T, G, C} and Yaj denotes the j-th nucleotide from the 5' end of the spacer sequence,
optionally, wherein the first reaction mixture further comprises a third primer, wherein the third primer comprises, from the 5' end to the 3' end, the universal sequence and a third variable sequence, the third variable sequence comprising a third random sequence, wherein the third random sequence is, from the 5' end to the 3' end, Xb1Xb2…Xbn, and every Xbi (i = 1 to n) of the third random sequence belongs to the same set, the set being selected from B, D, H, or V, where B = {T, G, C}, D = {A, T, G}, H = {T, A, C}, and V = {A, C, G}, and Xbi (i = 1 to n) and Xai (i = 1 to n) belong to different sets, wherein Xbi denotes the i-th nucleotide from the 5' end of the third random sequence and n is a positive integer selected from 3-20, wherein the universal sequence and the third variable sequence are directly connected, or the universal sequence and the third variable sequence are connected via a third spacer sequence, the third spacer sequence being Yb1…Ybm, where Ybj (j = 1 to m) ∈ {A, T, G, C}, Ybj denotes the j-th nucleotide from the 5' end of the spacer sequence, and m is a positive integer selected from 1-3;
(b) subjecting the first reaction mixture to a first temperature cycling program, such that the variable sequences of the first primer and the third primer can pair with the genomic DNA and amplify it to obtain a genomic amplification product, wherein the 5' end of the genomic amplification product comprises the universal sequence and the 3' end comprises the complement of the universal sequence; wherein the first temperature cycling program comprises:
(b1) reacting at a first denaturation temperature between 90 °C and 95 °C for 1-10 minutes (in the first cycle) or 10-50 seconds (in subsequent cycles);
(b2) reacting at a first annealing temperature between 5 °C and 15 °C for 3-50 seconds, at a second annealing temperature between 15 °C and 25 °C for 3-50 seconds, and at a third annealing temperature between 30 °C and 50 °C for 3-50 seconds;
(b3) reacting at one or more first extension temperatures between 60 °C and 80 °C for 10 seconds to 15 minutes;
(b4) repeating steps (b1) to (b3) for 2-40 cycles;
(c) providing a second reaction mixture, the second reaction mixture comprising the genomic pre-amplification product obtained in step (b), a second primer, a nucleotide monomer mixture, and a nucleic acid polymerase, wherein the second primer comprises, from the 5' end to the 3' end, a specific sequence and the universal sequence;
(d) subjecting the second reaction mixture to a second temperature cycling program, such that the universal sequence of the second primer can pair with the 3' end of the genomic pre-amplification product and amplify the genomic pre-amplification product to obtain an expanded genomic amplification product, wherein the second temperature cycling program comprises:
(d1) reacting at a second denaturation temperature between 90 °C and 95 °C for 5 seconds to 20 minutes;
(d2) reacting at a second melting temperature between 90 °C and 95 °C for 3-50 seconds;
(d3) reacting at a fourth annealing temperature between 45 °C and 65 °C for 3-50 seconds;
(d4) reacting at a second extension temperature between 60 °C and 80 °C for 10 seconds to 15 minutes;
(d5) repeating steps (d2) to (d4) for 2-40 cycles to obtain the genomic amplification product.
In certain embodiments, the genomic DNA in the reaction mixture of step (a) is present inside a cell; that is, the reaction mixture contains cells, and the cells contain the genomic DNA to be amplified. In certain embodiments, the reaction mixture of step (a) contains cells and further contains components capable of lysing cells, such as surfactants and/or lytic enzymes. Suitable surfactants can be used, for example one or more of NP-40, Tween, SDS, Triton X-100, EDTA, and guanidine isothiocyanate. Suitable lytic enzymes can also be chosen, for example one or more of proteinase K, pepsin, and papain. In such embodiments, the above method for amplifying the genome of a cell further comprises, after step (a) and before step (b), subjecting the reaction mixture to a lysis temperature program (e.g., holding the reaction mixture at 50 °C for 20 minutes and then at 80 °C for 10 minutes) so that the cells are lysed and the genomic DNA is released.
Applications
In certain embodiments, the products amplified by the method of the present application can be further used for sequencing, such as whole-genome sequencing. Because sequencing and analysis platforms such as next-generation sequencing (NGS), microarrays, and real-time quantitative PCR all require a relatively large starting amount of sample (more than 100 ng), whole-genome amplification is needed to obtain enough nucleic acid for analysis from a single human cell (about 6 pg) or from other samples with a small starting amount. Genomic DNA in a biological sample (e.g., a single cell) can be amplified by the method of the present application, and the amplified product can then be sequenced by an appropriate sequencing method known in the art. Exemplary sequencing methods include sequencing by hybridization (SBH), sequencing by ligation (SBL), quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS), stepwise ligation and cleavage, molecular beacons, pyrosequencing, fluorescent in situ sequencing (FISSEQ), fluorescence resonance energy transfer (FRET), multiplex sequencing (U.S. patent application 12/027039; Porreca et al. (2007) Nat. Methods 4:931), polymerized colony (POLONY) sequencing (U.S. 6,432,360, U.S. 6,485,944 and PCT/US05/06425), wobble sequencing (PCT/US05/27695), TaqMan reporter probe digestion, nanogrid rolling circle sequencing (ROLONY) (U.S. patent application 12/120541), FISSEQ beads (U.S. 7,425,431), and allele-specific oligonucleotide ligation assays.
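The gap between the roughly 6 pg of DNA in a single human cell and the 100 ng or more needed by downstream platforms can be put into numbers. The short sketch below computes the required fold amplification and the minimum number of perfect doublings; it assumes ideal two-fold amplification per cycle, which real reactions do not achieve, so the result is only a lower bound. The function names are illustrative.

```python
import math


def required_fold(target_ng: float, input_pg: float) -> float:
    """Fold amplification needed to go from input_pg picograms to target_ng nanograms."""
    return target_ng * 1000.0 / input_pg


def min_doublings(fold: float) -> int:
    """Smallest number of perfect doublings that reaches at least `fold`."""
    return math.ceil(math.log2(fold))


fold = required_fold(target_ng=100, input_pg=6)
print(round(fold), min_doublings(fold))
```

A single cell therefore needs on the order of a 17,000-fold increase in DNA mass before it meets a 100 ng input requirement.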
In certain embodiments, the amplified products of the method of the present application can be sequenced by a high-throughput method. High-throughput methods typically fragment the nucleic acid molecules to be sequenced (e.g., by enzymatic digestion or mechanical shearing) to form a large number of short fragments tens to hundreds of bp in length. Sequencing tens of thousands, hundreds of thousands, millions, tens of millions, or even hundreds of millions of such short fragments in parallel in a single sequencing run greatly increases sequencing throughput and shortens the time required. The sequences of the short fragments are then processed by software and can be assembled into a complete sequence. Various high-throughput sequencing platforms are known in the art, such as Roche 454, Illumina Solexa, AB-SOLiD, Helicos, and Polonator platform technologies. A variety of light-based sequencing technologies are also known in the art, such as those described in Landegren et al. (1998) Genome Res. 8:769-76, Kwok (2000) Pharmacogenomics 1:95-100, and Shi (2001) Clin. Chem. 47:164-172.
In certain embodiments, the products amplified by the method of the present application can also be used to analyze genotypes or genetic polymorphisms in genomic DNA, for example by single nucleotide polymorphism (SNP) analysis, short tandem repeat (STR) analysis, restriction fragment length polymorphism (RFLP) analysis, variable number tandem repeat (VNTR) analysis, complex tandem repeat (CTR) analysis, or microsatellite analysis. See, for example, Krebs, J.E., Goldstein, E.S. and Kilpatrick, S.T. (2009). Lewin’s Genes X (Jones & Bartlett Publishers), the disclosure of which is incorporated herein by reference in its entirety.
In certain embodiments, the amplified product obtained by the method of the present application can also be used for medical and/or diagnostic analysis. For example, a biological sample from an individual can be amplified by the method of the present application, and the amplified product can be analyzed for abnormalities in a gene or DNA sequence of interest, such as mutations, deletions, insertions, or fusions between chromosomes, in order to assess the individual's risk of developing a disease, the stage of disease progression, the genotype of the disease, the severity of the disease, or the likelihood that the individual will respond to a given therapy. The gene or DNA sequence of interest can be analyzed by appropriate methods known in the art, including but not limited to nucleic acid probe hybridization, primer-specific amplification, sequencing of the sequence of interest, and single-strand conformation polymorphism (SSCP) analysis.
In certain embodiments, the method of the present application can be used to compare genomes derived from different single cells, in particular different single cells from the same individual. For example, when the genomes of different single cells from the same individual differ, as between tumor cells and normal cells, the method of the present application can be used to amplify the genomic DNA of each single cell separately, and the amplified products can be analyzed further, for example by sequencing and comparison or by comparative genomic hybridization (CGH). See Fan, H.C., Wang, J., Potanina, A. and Quake, S.R. (2011). Whole-genome molecular haplotyping of single cells. Nature Biotechnology 29, 51–57, and Navin, N., Kendall, J., Troge, J., Andrews, P., Rodgers, L., McIndoo, J., Cook, K., Stepansky, A., Levy, D., Esposito, D. et al. (2011). Tumour evolution inferred by single-cell sequencing. Nature 472, 90–94, the disclosures of which are incorporated herein by reference in their entirety.
In certain embodiments, the method of the present application can be used to identify haplotype structure or haplotypes on homologous chromosomes. A haplotype is the combination of alleles at multiple loci that are inherited together on the same chromosome. A biological sample (e.g., a single diploid cell from an individual) can be divided into enough portions that the DNA sequences of the two homologous haplotypes are, statistically, separated into different portions. Each portion is made up into a reaction mixture, each reaction mixture is amplified by the method of the present application, and the amplified products are then sequenced and aligned against a reference genome sequence (e.g., the published human reference genome sequence; see International Human Genome Sequencing Consortium, Nature 431, 931-945 (2004)) to identify single nucleotide variants. If no reference genome sequence is available, a region of suitable length can instead be assembled from multiple genomic fragment sequences by de novo genome assembly for use in the comparison.
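How many portions count as "enough" can be estimated with a simple model. The sketch below assumes that the two homologous copies of a locus are distributed independently and uniformly among equal portions, so the chance that both copies land in the same portion is 1/N. This model, the function names, and the example numbers are assumptions made for illustration and are not taken from the application.

```python
import random


def same_portion_probability(portions: int) -> float:
    """Probability that both homologous copies of a locus fall in one portion."""
    if portions < 1:
        raise ValueError("portions must be at least 1")
    return 1.0 / portions


def min_portions(max_probability: float) -> int:
    """Smallest number of portions keeping co-occurrence at or below the limit."""
    if not 0 < max_probability <= 1:
        raise ValueError("max_probability must be in (0, 1]")
    n = 1
    while same_portion_probability(n) > max_probability:
        n += 1
    return n


def simulate(portions: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo check of the 1/N model using a seeded generator."""
    rng = random.Random(seed)
    hits = sum(rng.randrange(portions) == rng.randrange(portions)
               for _ in range(trials))
    return hits / trials


print(min_portions(0.05), same_portion_probability(24), simulate(24))
```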
In certain embodiments, the product amplified by the method of the present application can be further used for analyses such as gene cloning and real-time quantitative PCR.
In certain embodiments, the method of the present application can further include analyzing the amplified product to identify sequence features associated with a disease or phenotype. In some embodiments, analyzing the amplified product includes genotyping the amplified DNA. In other embodiments, analyzing the amplified product includes identifying polymorphisms in the amplified DNA, such as by single nucleotide polymorphism (SNP) analysis. SNPs can be detected by well-known methods, such as the oligonucleotide ligation assay (OLA), single-base extension, allele-specific primer extension, and mismatch hybridization. Disease can be diagnosed by comparing the detected SNPs with their known associations with disease phenotypes.
In some embodiments, the sequence features associated with the disease or phenotype include chromosome-level abnormalities, chromosomal translocations, aneuploidy, deletion or duplication of part or all of a chromosome, fetal HLA haplotypes, and paternally derived mutations.
In some embodiments, the disease or phenotype can be beta-thalassemia, Down syndrome, cystic fibrosis, sickle cell disease, Tay-Sachs disease, fragile X syndrome, spinal muscular atrophy, hemoglobinopathy, alpha-thalassemia, X-linked diseases (diseases governed by genes on the X chromosome), spina bifida, anencephaly, congenital heart disease, obesity, diabetes, cancer, fetal sex, or fetal RHD.
Kits
In another aspect, the present application provides a kit for genomic DNA amplification, the kit comprising the first primer. In certain embodiments, the kit comprises both the first primer and the third primer. In certain embodiments, the kit further comprises a nucleic acid polymerase, wherein the nucleic acid polymerase is selected from: Phi29 DNA polymerase, Bst DNA polymerase, Pyrophage 3137, Vent polymerase, TOPOTaq DNA polymerase, 9°Nm polymerase, Klenow Fragment DNA polymerase I, MMLV reverse transcriptase, AMV reverse transcriptase, HIV reverse transcriptase, T7 phage DNA polymerase variants, high-fidelity DNA polymerase, Taq polymerase, E. coli DNA polymerase, LongAmp Taq DNA polymerase, OneTaq DNA polymerase, Deep Vent DNA polymerase, Vent (exo-) DNA polymerase, Deep Vent (exo-) DNA polymerase, and any combination thereof. In certain embodiments, the kit further comprises one or more components selected from the group consisting of a nucleotide monomer mixture (e.g., dATP, dGTP, dTTP, and dCTP, e.g., at a total concentration of 1 mmol-8 mmol/μL), DTT (e.g., at a concentration of 1 mmol-7 mmol/μL), a Mg²⁺ solution (e.g., at a concentration of 2 mmol-8 mmol/μL), bovine serum albumin (BSA), a pH adjuster (e.g., Tris-HCl), a DNase inhibitor, RNase, SO₄²⁻, Cl⁻, K⁺, Ca²⁺, Na⁺, and/or (NH₄)⁺. In certain embodiments, the kit further comprises a component capable of lysing cells, e.g., one or more surfactants (e.g., NP-40, Tween, SDS, Triton X-100, EDTA, guanidine isothiocyanate) and/or one or more lytic enzymes (e.g., proteinase K, pepsin, papain). In some embodiments, the kit further comprises a second-type primer (i.e., the second primer). It should be understood that the first primer, the second primer, and the third primer in the kit all have the structural and sequence features specifically described above.
In some embodiments, each component of the kit is stored in its own separate container. In some embodiments, all components of the kit are stored together in the same container. In some embodiments, each primer in the kit is stored in its own separate container, while all components other than the primers are stored together in one container. When the kit includes a nucleic acid polymerase, the polymerase can be stored in a separate container in substantially pure form, or alternatively can be combined with other components as a mixture.
In some embodiments, the kit may include a mixture containing all reactants required for the linear amplification reaction other than the genomic DNA. When such a kit is used for the linear amplification reaction described herein, the sample containing genomic DNA can be mixed directly with the mixture in the kit, optionally with an appropriate amount of pure water added to reach the required reaction volume, to obtain the first reaction mixture of step (a) of the method of the present application. In some embodiments, the kit may include a mixture containing all reactants required for the exponential amplification reaction other than the amplification template. When such a kit is used for the exponential amplification reaction described herein, the DNA template sample containing the amplification product of step (b) can be mixed directly with this mixture, optionally with an appropriate amount of pure water added to reach the required reaction volume, to obtain the second reaction mixture of step (c) of the method of the present application. In other embodiments, the kit may include both a mixture containing all reactants required for the linear amplification reaction other than the genomic DNA and a mixture containing all reactants required for the exponential amplification reaction other than the amplification template; these may be provided as two separate mixtures or as a single combined mixture.
In another aspect, the present application also provides a kit for genomic DNA amplification. The kit comprises a first type of primer (for example, the first primer and/or the third primer) and a second type of primer (for example, the second primer), and further comprises instructions for use that describe the step of mixing the primers and other components to obtain the first/third reaction mixture before the amplification is started. In other embodiments, the instructions also describe how to carry out the amplification described in the present application. The first type of primer and the second type of primer in the kit may be placed in different containers, but the instructions may include a step of mixing the two in the same container before amplification is started.
Specific Embodiments
Example 1: Preliminary verification of the amplification effect of different linear amplification primer mixtures
a) Verification using standard genomic DNA samples
The standard genomic DNA was genomic DNA extracted in advance from human cells. It was diluted with nuclease-free water to 50 pg/μL, and 1 μL of this solution (the genomic DNA source) was added to a PCR tube. To each experimental group, the primer mixture shown in Table 1 and the other required reagents were added to obtain the first reaction mixture (containing Na⁺, Mg²⁺, Cl⁻, Tris-Cl, Triton X-100, dNTPs, Vent polymerase, and the primer mixture).
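As a rough check on the template input, the dilution described above places 50 pg of genomic DNA in each tube. The sketch below converts that mass into approximate diploid genome equivalents. The figure of about 6.6 pg per diploid human genome is a commonly cited approximation and is an assumption here, not a value stated in this document.

```python
# Template input per reaction, from the dilution described in the text.
CONC_PG_PER_UL = 50.0   # diluted standard genomic DNA, pg/uL (from the text)
VOLUME_UL = 1.0         # volume added to the PCR tube, uL (from the text)

# Assumption (not from this document): one diploid human genome weighs ~6.6 pg.
PG_PER_DIPLOID_GENOME = 6.6


def input_mass_pg(conc_pg_per_ul: float, volume_ul: float) -> float:
    """Mass of DNA added to the reaction, in picograms."""
    return conc_pg_per_ul * volume_ul


def genome_equivalents(mass_pg: float,
                       pg_per_genome: float = PG_PER_DIPLOID_GENOME) -> float:
    """Approximate number of diploid genome copies in the given mass."""
    return mass_pg / pg_per_genome


mass = input_mass_pg(CONC_PG_PER_UL, VOLUME_UL)
print(f"input mass: {mass:.0f} pg")
print(f"approx. diploid genome equivalents: {genome_equivalents(mass):.1f}")
```

Under that assumption the standard sample corresponds to the DNA content of only a handful of cells, which is why it serves as a stand-in for low-input material before single cells are tested in part b).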
Linear amplification
Each primer combination/mixture was run in two parallel experimental groups to ensure accuracy, and the reaction mixture of each experimental group was subjected to the following first temperature-control program:
The primer mixtures used in each experimental group are shown in Table 1 below.
Table 1: Primers used in each experimental group
The total amount of first-type primer used in each experimental group was 600 pmol (where a group contained several primers, they were present in equal amounts totalling 600 pmol). Different universal sequences can be designed for different sequencing platforms. In experimental groups 1-12 of this experiment, SEQ ID NO: 6 was chosen as the universal sequence for the Illumina platform.
After the linear amplification program, 1 μL of the second primer at 10 pmol/μL was added to each reaction to obtain 12 second reaction mixtures, and the reaction mixture of each experimental group was subjected to the following second temperature-control program:
Different second primer sequences can be designed for different sequencing platforms. In experimental groups 1-12 of this experiment, the following second primer mixture was used for the Illumina platform:
Second primer-1 (SEQ ID NO: 35): GCT CTT CCG ATCT
Second primer-2 (SEQ ID NO: 36): GCT CTT CCG ATCT
In the second primers, the double-underlined bases include the part corresponding to the capture sequence of the sequencing platform, the italicized bases are the part corresponding to the sequencing sequence of the sequencing platform, the dotted part is the identification sequence, which can be replaced with other identification sequences as needed, and the single-underlined part is the universal sequence.
Each experimental group yielded an amplification product after completing the above temperature-control program.
Gel electrophoresis (qualitative)
For each experimental group in Example 1, 5 μL of unpurified amplification product was mixed with 1 μL of 6× DNA loading buffer (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW0610A) for loading. A 1% agarose gel was used, with DM2000 as the marker (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW0632C). The electrophoresis image is shown in Figure 3, in which, from left to right, lane 1 is the molecular weight marker, lanes 2-13 are the amplified genomic DNA samples, and lane 14 is the molecular weight marker.
As shown in the electrophoresis image in Figure 3, the products of experimental groups 1-4 show a distinct band near 100 bp, with relatively little product between 100 and 500 bp. The products of experimental groups 5-12 are concentrated around 500 bp; groups 5-10 show only a faint band near 100 bp, at a much lower intensity than in groups 1-4. Figure 3 therefore indicates that groups 1-4 contain more primer polymers and have a relatively low reaction efficiency, and that first-type primers whose variable sequence contains the fixed sequence TGGG or GTTT (groups 3-4) react more efficiently than primers whose fixed sequence is GGG or TTT (groups 1-2). The amount of amplification product differs little among groups 5-12 but is significantly higher than in groups 1-4, indicating that primer polymers are generated to a lesser degree than in groups 1-4.
Purified product (quantitative)
50 μL of unpurified amplification product was purified with a magnetic-bead DNA purification and recovery kit (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW2508), following the kit instructions, and eluted with 20 μL of EB. After purification, 2 μL of the purified product was used to measure the concentration on a Nanodrop (AOSHENG, NANO-100). The concentrations are shown in Table 2.
Table 2: Concentrations of the samples in Figure 3 after recovery
The amplification efficiency of each experimental group was inferred from the concentration of its purified amplification product. The concentrations in Table 2 show that groups 1-2 had the lowest amplification efficiency; groups 3-4 were less efficient than groups 5-12 but more efficient than groups 1-2. Apart from groups 1-4, the total amount of amplification product was comparable across the experimental groups, with no significant difference.
Sequencing (qualitative)
The purified amplification products of the 12 experimental groups were sequenced at shallow depth on an Illumina HiSeq 2500 NGS sequencer, and the resulting reads were aligned to the human reference genome.
Table 3 lists the metrics of the high-throughput sequencing results.
Of the metrics in Table 3, the unique alignment ratio of the raw data (unique_mapped_of_raw, the proportion of raw data that aligns to a single position in the human genome) is the most important. As shown in Table 3, unique_mapped_of_raw in experimental groups 5-12 was between 83% and 86%, with little difference between groups, whereas in groups 1-4 it was lower, between 67% and 79%.
Another important metric is the alignment ratio of the raw data (mapped_of_raw, the proportion of raw data that aligns to some position in the human genome). Similarly, as shown in Table 3, mapped_of_raw in groups 5-12 was between 89% and 93%, with little difference between groups, whereas in groups 1-4 it was lower, between 73% and 86%.
In addition, the data in Table 3 show that the read quality of groups 1-4 was also lower than that of the other groups. For example, the proportion of high-quality data in the raw data of groups 3 and 4 was only 77.08% and 76.99%, respectively, whereas in groups 5-12 it was between 94% and 96%.
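The two alignment metrics defined above are simple ratios over the raw read count. The following sketch shows how they could be computed from read counts. The class name, field names and the example counts are hypothetical and are not taken from Table 3.

```python
from dataclasses import dataclass


@dataclass
class AlignmentCounts:
    raw_reads: int            # all reads produced by the sequencer
    mapped_reads: int         # reads aligned to at least one genome position
    unique_mapped_reads: int  # reads aligned to exactly one genome position


def check(counts: AlignmentCounts) -> None:
    """Uniquely mapped reads are a subset of mapped reads, which are a subset of raw reads."""
    if counts.raw_reads <= 0:
        raise ValueError("raw_reads must be positive")
    if not 0 <= counts.unique_mapped_reads <= counts.mapped_reads <= counts.raw_reads:
        raise ValueError("expected unique_mapped <= mapped <= raw")


def mapped_of_raw(counts: AlignmentCounts) -> float:
    check(counts)
    return counts.mapped_reads / counts.raw_reads


def unique_mapped_of_raw(counts: AlignmentCounts) -> float:
    check(counts)
    return counts.unique_mapped_reads / counts.raw_reads


# Hypothetical counts, for illustration only.
example = AlignmentCounts(raw_reads=2_000_000,
                          mapped_reads=1_820_000,
                          unique_mapped_reads=1_700_000)
print(f"mapped_of_raw        = {mapped_of_raw(example):.2%}")
print(f"unique_mapped_of_raw = {unique_mapped_of_raw(example):.2%}")
```

Because uniquely mapped reads are a subset of mapped reads, unique_mapped_of_raw can never exceed mapped_of_raw, which matches the ranges quoted for both sets of experimental groups.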
Figure 4 shows the nucleotide composition at the start positions of the sequence reads in the sequencing libraries. As can be seen from Figure 4, the read-start regions of experimental groups 1-4 and 9-10 contain all four bases A, T, C and G. The read-start regions of groups 5-7 lack A or C, and those of groups 11-12 lack both A and C. Those skilled in the art will appreciate that sequencing, and SBS sequencing in particular, requires high randomness in the first few bases sequenced. When the first few bases of the samples as a whole have low randomness, a certain amount of positive quality-control material must be added to each loading well when a whole plate is loaded in order to increase base randomness, and this inevitably wastes part of the data output. For example, when the libraries prepared in groups 11-12 are used for whole-plate sequencing, positive quality-control material must be added because the read-start region lacks A and C; in our experience, positive control amounting to at least 20% of the loading quantity is generally needed for SBS sequencing to proceed smoothly.
Table 3: Main quality metrics of the high-throughput sequencing results of experimental groups 1-12
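The read-start composition discussed for Figure 4 can be tabulated directly from the reads. The sketch below counts the fraction of each base at the first few read positions and reports bases that are effectively absent there. The function names, the 1% threshold and the toy reads are hypothetical illustrations, not data from this document.

```python
from collections import Counter

BASES = "ACGT"


def start_base_composition(reads, n_positions=4):
    """For each of the first n_positions, return the fraction of A/C/G/T."""
    composition = []
    for pos in range(n_positions):
        counts = Counter(read[pos] for read in reads if len(read) > pos)
        total = sum(counts[b] for b in BASES)
        composition.append(
            {b: (counts[b] / total if total else 0.0) for b in BASES}
        )
    return composition


def missing_bases(composition, threshold=0.01):
    """Bases whose fraction stays below the threshold at every examined position."""
    return [b for b in BASES if all(col[b] < threshold for col in composition)]


# Toy reads (hypothetical) whose first three bases are only G or T,
# mimicking a library whose read starts lack A and C.
toy_reads = ["GTTGACCA", "TGGTCAAC", "GGTTACGA", "TTGGCATC"]
comp = start_base_composition(toy_reads, n_positions=3)
for i, col in enumerate(comp, start=1):
    print(i, col)
print("bases missing at read start:", missing_bases(comp))
```

A library flagged this way would be the kind that needs additional positive quality-control material at loading, as described for experimental groups 11-12.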
b) Verification using AFP single-cell samples
The samples tested were AFP single cells. Human epidermal fibroblasts (AFP) in good culture condition were digested with trypsin, and the digested cells were collected into a 1.5 mL EP tube. The collected cells were centrifuged, rinsed with 1× PBS, and then resuspended in 1× PBS. A portion of the cell suspension was taken with a pipette, and single cells were picked with a mouth pipette under a 10× microscope, aspirating no more than 1 μL of PBS. Each picked cell was transferred into a PCR tube containing 5 μL of lysis buffer (containing Tris-Cl, KCl, EDTA, Triton X-100 and Qiagen Protease). After brief centrifugation, the PCR tube was placed on a PCR instrument and the lysis program shown in Table 4 was run.
Table 4: Lysis program
The lysate was used as the genomic DNA source in place of the standard genomic DNA of Example 1a); the other components and the temperature-control programs for linear and exponential amplification were the same as in Example 1a). The amplification products were examined by gel electrophoresis under the conditions described above. The electrophoresis image is shown in Figure 5, in which, from left to right, lane 1 is the molecular weight marker, lanes 2-13 are the amplified genomic DNA samples, and lane 14 is the molecular weight marker.
As shown in the electrophoresis image in Figure 5, the results are similar to those in Figure 3. The products of experimental groups 1-4 show a distinct band near 100 bp, with relatively little product between 100 and 500 bp. The products of groups 5-12 are concentrated around 500 bp, and groups 5-12 show no distinct band near 100 bp. Figure 5 therefore indicates that groups 1-4 contain more primer polymers and have a relatively low reaction efficiency, and that first-type primers whose variable sequence contains the fixed sequence TGGG or GTTT (groups 3-4) react more efficiently than primers whose fixed sequence is GGG or TTT (groups 1-2). The yield of amplification product differs little among groups 5-10; it is higher than in groups 1-4, indicating that primer polymers are generated to a lesser degree than in groups 1-4, but slightly lower than in groups 11-12.
The amplification products were purified and sequenced as described above; the metrics of the sequencing results are shown in Table 5 below. In experimental groups 5-12, unique_mapped_of_raw was between 78% and 85%, with little difference between groups, whereas in groups 1-4 it was lower, between 63% and 70%. As shown in Table 5, mapped_of_raw in groups 5-12 was between 84% and 92%, with little difference between groups, whereas in groups 1-4 it was lower, between 73% and 86%. In addition, the data in Table 5 show that the read quality of groups 1-4 was also lower than that of the other groups. For example, the proportion of high-quality data in the raw data of groups 3 and 4 was only 68.97% and 72.29%, respectively, whereas in groups 5-12 it was between 94% and 96%.
Table 5: Main quality metrics of the high-throughput sequencing results of experimental groups 1-12
These results show that the linear amplification primer mixtures used in experimental groups 1-4 produce more primer polymers, which greatly reduces amplification efficiency and correspondingly lowers the data output under the same amplification conditions. Although A, T, C and G are evenly distributed in the read-start region in groups 1-4, the alignment ratios of the data, and the unique alignment ratio in particular, are low, which makes the sequencing data harder to use in downstream processing.
Example 2: Further verification of the amplification effect of different linear amplification primer mixtures
Human epidermal fibroblasts were isolated and lysed by the method described in Example 1b) to obtain single-cell genomic DNA, which was amplified with the primer mixture used in experimental groups 11/12 in Table 1 and with the primer mixture used in experimental groups 9/10, respectively. Ten parallel experiments were run for each primer mixture (denoted 1_1, 1_2 … 1_10 and 2_1, 2_2 … 2_10, respectively). Amplification was carried out according to the procedure described in Example 1a), and the amplification products were examined by gel electrophoresis; the results are shown in Figure 6. The concentration of amplification product in groups 2_1, 2_2 … 2_10 was slightly lower than in groups 1_1, 1_2 … 1_10.
The amplification products were purified according to the procedure described in Example 1a), and equal volumes of the purified products were sequenced. The metrics of the sequencing results are shown in Tables 6 and 7 below.
Table 6: Main quality metrics of the high-throughput sequencing results of experimental groups 1_1, 1_2 … 1_10
Table 7: Main quality metrics of the high-throughput sequencing results of experimental groups 2_1, 2_2 … 2_10
The data in Tables 6 and 7 show that in experimental groups 2_1, 2_2 … 2_10, unique_mapped_of_raw was about 83%-84% and mapped_of_raw about 90%-91%, while in groups 1_1, 1_2 … 1_10, unique_mapped_of_raw was about 84%-85% and mapped_of_raw about 91%-92%, with little difference between the two sets. The data volume obtained for each sample is summarized in Figure 7. In groups 1_1, 1_2 … 1_10, apart from the anomalous group 1_1 (whose data volume was extremely low, possibly because of an error during recovery), the data volume was between 1.5 M and 2 M, and the mean for groups 1_2, 1_3 … 1_10 was about 1.7 M. In groups 2_1, 2_2 … 2_10, the data volume was between 1.5 M and 2.5 M with a mean of about 1.8 M, slightly higher than in the first set.
The coefficient of variation (CV) of copy number in the sequencing data is summarized in Figure 8. After excluding the clearly anomalous group 1_1, the mean copy-number CV of groups 1_2, 1_3 … 1_10 was about 0.046 and that of groups 2_1, 2_2 … 2_10 was about 0.049, with no obvious difference between the two sets. Figures 9A-9D show the copy number plot of each experimental group individually; the ordinate is the chromosome copy number (2 in a normal human) and the abscissa covers chromosomes 1-22 and the sex chromosomes. As the figures show, in every experimental group chromosomes 1-22 are present at roughly two copies apart from a few individual data points, while the sex chromosomes X and Y are each present at roughly one copy.
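The document does not spell out how the copy-number CV is calculated. One plausible reading, sketched here as an assumption, is to scale per-bin read counts so that the average bin corresponds to two copies and then take the standard deviation divided by the mean. The bin counts below are hypothetical.

```python
from statistics import mean, pstdev


def copy_number_per_bin(bin_counts, expected_ploidy=2.0):
    """Scale read counts so that the average bin equals the expected ploidy."""
    avg = mean(bin_counts)
    return [expected_ploidy * count / avg for count in bin_counts]


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


# Hypothetical read counts in five equally sized autosomal bins.
toy_counts = [100, 104, 96, 98, 102]
cn = copy_number_per_bin(toy_counts)
cv = coefficient_of_variation(cn)
print("copy number per bin:", [round(x, 2) for x in cn])
print(f"CV = {cv:.4f}")
```

A smaller CV means the bins scatter less around two copies, that is, the amplification is more uniform across the genome; this is the sense in which values of about 0.046 and 0.049 are described as not obviously different.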
Example 3: Detection of pathogenic sites and quality-control primers
Pathogenic site detection
Thirty-five pathogenic sites were selected at random and primers were designed for them. The selected sites and their corresponding primers are shown in Tables 8 and 9, respectively.
Table 8: The 35 randomly selected pathogenic sites
Table 9: Primers corresponding to the pathogenic sites in Table 8
The amplification products of experimental groups 1_1, 1_2, 2_1 and 2_2 from Example 2 were selected at random as template DNA. PCR was performed on the template DNA with 2x Goldstar Master Mix (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW0960). The composition of the amplification system is shown in Table 10 and the amplification program in Table 11.
Table 10: PCR reaction system for pathogenic site detection
Table 11: PCR amplification program for pathogenic site detection
The amplification results are shown in the gel electrophoresis image in Figure 10. In samples 1_1 and 1_2, pathogenic sites 4 and 13 were not amplified in either sample, site 21 was not amplified in sample 1_1, and sites 20, 29 and 31 were not amplified in sample 1_2. In samples 2_1 and 2_2, site 31 was not amplified in either sample, sites 18, 21, 32 and 35 were not amplified in sample 2_1, and sites 8 and 22 were not amplified in sample 2_2. The results show no significant difference between the samples of the two primer sets (1_1, 1_2 and 2_1, 2_2) in amplification accuracy or in the amount of amplification product.
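The dropout lists in the paragraph above can be turned into per-sample detection rates for a quick comparison of the two primer sets. In this sketch the failed sites are taken from that paragraph, with sites 8 and 22 attributed to sample 2_2 (the second sample of that pair); treat that attribution as an assumption. The variable and function names are illustrative.

```python
TOTAL_SITES = 35  # number of pathogenic sites tested (from the text)

# Sites that failed to amplify in each sample.
failed_sites = {
    "1_1": {4, 13, 21},
    "1_2": {4, 13, 20, 29, 31},
    "2_1": {18, 21, 31, 32, 35},
    "2_2": {8, 22, 31},  # assumption: sites 8 and 22 belong to sample 2_2
}


def detection_rate(failed, total=TOTAL_SITES):
    """Fraction of tested sites that were amplified."""
    return (total - len(failed)) / total


rates = {sample: detection_rate(failed) for sample, failed in failed_sites.items()}


def group_mean(samples):
    """Mean detection rate over the named samples."""
    return sum(rates[s] for s in samples) / len(samples)


for sample, rate in rates.items():
    print(f"sample {sample}: {TOTAL_SITES - len(failed_sites[sample])}/{TOTAL_SITES} = {rate:.1%}")
print(f"primer set 1 mean: {group_mean(['1_1', '1_2']):.1%}")
print(f"primer set 2 mean: {group_mean(['2_1', '2_2']):.1%}")
```

Under this reading each primer set detects 62 of the 70 site-sample combinations, which is consistent with the statement that the two sets do not differ significantly.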
q-PCR detection with quality-control primers
The amplification products of experimental groups 1_1, 1_2, 2_1 and 2_2 from Example 2, a positive control (gDNA at the same concentration) and a negative control (no template) were each used as template DNA. q-PCR was performed on the template DNA with the six pairs of quality-control primers shown in Table 12, each targeting a DNA sequence on a different chromosome. 2x Fast SYBR Mixture (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW0955) was used for the fluorescent quantitative PCR. The composition of the amplification system is shown in Table 13 and the amplification program in Table 14.
Table 12: Information on the 6 pairs of random primers used for quality control
Table 13: Reaction system for qPCR detection
Table 14: Amplification program for qPCR detection
The amplification results are shown in Table 15, which lists the q-PCR data obtained with primer pairs CH1, CH2, CH4, CH5, CH6 and CH7 for each template DNA. A larger Ct value indicates fewer template copies for the corresponding primer pair, and hence poorer amplification of that region during gDNA amplification. The results show that CH1, CH2, CH4, CH5, CH6 and CH7 were all amplified with high efficiency in sample 2_1 and likewise in sample 2_2, with no essential difference from samples 1_1 and 1_2.
Table 15: Amplification efficiency of the four samples in Figure 6, measured by qPCR with the six primer pairs in Table 12
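The relationship between Ct and template amount described above can be made quantitative. The sketch below uses the standard exponential model of PCR, in which the product grows by a factor of (1 + E) per cycle; the default of 100% efficiency (E = 1) and the Ct values are assumptions for illustration and do not come from Table 15.

```python
def relative_template_amount(ct_sample: float,
                             ct_reference: float,
                             efficiency: float = 1.0) -> float:
    """Template amount of the sample relative to the reference.

    With per-cycle amplification factor (1 + efficiency), a sample that
    crosses the threshold d cycles later than the reference started with
    (1 + efficiency) ** (-d) times as much template.
    """
    return (1.0 + efficiency) ** (ct_reference - ct_sample)


# Hypothetical Ct values for one primer pair.
ct_positive_control = 24.0   # gDNA control
ct_amplified_sample = 26.0   # whole-genome amplification product
ratio = relative_template_amount(ct_amplified_sample, ct_positive_control)
print(f"sample/control template ratio: {ratio:.2f}")
```

In this model each additional cycle of Ct corresponds to roughly half as much starting template, so comparing Ct values across the six primer pairs shows whether any chromosome region is under-represented after amplification.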
Example 4: Use of the amplification method of the present application with the Ion Torrent sequencing platform
Fresh blood was drawn and lymphocytes were isolated with lymphocyte separation medium. A portion of the cell suspension was taken with a pipette, and about 3 white blood cells were picked with a mouth pipette under a 10× microscope, aspirating no more than 1 μL of PBS. The picked cells were transferred into a PCR tube containing 4 μL of lysis buffer (containing Tris-Cl, KCl, EDTA, Triton X-100 and Qiagen Protease) and lysed according to the steps described in Example 1b). The genomic DNA was linearly amplified with the primer mixture used in experimental groups 9/10 in Table 1, and exponential amplification was then carried out with the following second primer mixture to obtain the amplification product:
Second primer-1 (SEQ ID NO: 37): CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATG CTC TTC CGA TCT
Second primer-2 (SEQ ID NO: 38): T CAGGA TGC TCT TCC GAT CT
In the second primers, the double-underlined bases include the part corresponding to the capture sequence of the sequencing platform, the dotted part is the identification sequence, which can be replaced with other identification sequences as needed, and the single-underlined part is the universal sequence. Four experiments were run in parallel. All other reaction conditions were the same as described in Example 1b). The amplification results are shown in the gel electrophoresis image in Figure 11.
Uniformity testing
Two of the samples amplified in Example 4 (shown as sample 1 and sample 2 in Figure 11) were then selected at random as template DNA. PCR was performed on the template DNA with 2x Goldstar Master Mix (purchased from Beijing Kangwei Century Biotechnology Co., Ltd., Cat. No. CW0960), using the primers shown in Table 9 to amplify the 35 pathogenic sites shown in Table 8. The composition of the amplification system is shown in Table 10 and the amplification program in Table 11.
The amplification results are shown in Figure 12. All 35 pathogenic sites were amplified well in both amplification product samples, and the two samples did not differ significantly in amplification accuracy or in the amount of amplification product.
Gene sequencing
After purification, equal volumes of the four samples shown in Figure 11 were sequenced on a PGM™ sequencer of the Life Technologies Ion Torrent sequencing platform, and the resulting reads were aligned to the human reference genome. The sequencing results are shown in Table 16 below and in Figure 13.
Table 16: Semiconductor sequencing (PGM platform) results for the samples in Figure 11
The data in Table 16 show that for samples 1, 2, 3 and 4, unique_mapped_of_raw was about 68%, mapped_of_raw about 72%-73%, the data volume between 0.38 M and 0.53 M, and the copy-number coefficient of variation (CV) of the sequencing data about 0.06. The copy number plots of the sequencing reads are shown in Figure 13: in every sample, chromosomes 1-22 are present at roughly two copies apart from a few individual data points, while the sex chromosomes X and Y are each present at roughly one copy.
Notably, the Ion Torrent sequencing platform does not require a universal sequence designed specifically for the platform. In principle, any sequence of 6-60 bp that essentially does not bind to genomic DNA and give rise to amplification can be chosen as the universal sequence. We also tested two further primer combinations on the Ion Torrent platform: 1) a first-type primer mixture similar to that used in experimental groups 9/10 in Table 1, namely SEQ ID NO: 15, 16, 19 and 20 (but with SEQ ID NO: 1 as the universal sequence), with the corresponding exponential amplification primer mixture SEQ ID NO: 39 CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATT TGG TAG TGA GTG and SEQ ID NO: 40 T CAGGA TTT GGT AGT GAG TG; 2) a first-type primer mixture similar to that used in experimental groups 9/10 in Table 1, namely SEQ ID NO: 23, 24, 27 and 28 (but with SEQ ID NO: 2 as the universal sequence), with the corresponding exponential amplification primer mixture SEQ ID NO: 41 CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATG AGG TGT GAT GGA and SEQ ID NO: 42 T CAGGA TGA GGT GTG ATG GA. The amplification and sequencing results did not differ appreciably from those shown in Figure 11 and Table 16 (specific data not provided here).
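The modular design of these second primers, a platform-specific part followed by an interchangeable universal sequence, can be seen by comparing the three long primers SEQ ID NO: 37, 39 and 41 as they are printed in this document. The sketch below finds their shared 5' portion and the part that differs. Reading the differing 3' part as the universal sequence is an inference made here; the underlining that marked the boundary in the original is not preserved in the text.

```python
import os

# Long second primers as printed in this document (spaces are cosmetic).
long_primers = {
    "SEQ ID NO: 37": "CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATG CTC TTC CGA TCT",
    "SEQ ID NO: 39": "CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATT TGG TAG TGA GTG",
    "SEQ ID NO: 41": "CCA CTA CGC CTC CGC TTT CCT CTC TAT GGG CAG TCG GTG ATG AGG TGT GAT GGA",
}


def clean(seq: str) -> str:
    """Remove whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


sequences = {name: clean(seq) for name, seq in long_primers.items()}

# os.path.commonprefix compares strings character by character,
# so it also works for plain nucleotide strings.
shared_prefix = os.path.commonprefix(list(sequences.values()))

# Inferred, not stated in the text: the remainder is the universal sequence part.
variable_suffix = {name: seq[len(shared_prefix):] for name, seq in sequences.items()}

print("shared 5' part:", shared_prefix, f"({len(shared_prefix)} nt)")
for name, suffix in variable_suffix.items():
    print(f"{name}: 3' part {suffix} ({len(suffix)} nt)")
```

Swapping the universal sequence therefore only changes the 3' end of each second primer, while the platform-specific 5' portion stays the same.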
Example 5: Application of the amplification method of the present application to preimplantation chromosomal testing of blastocyst culture medium
Fertilized eggs were cultured in vitro, and at the blastocyst stage (day 3 of in vitro culture) several trophectoderm cells (about 3 cells) were taken for detection of chromosome copy number abnormalities. Blastocyst trophectoderm cells can be collected by any method known to those skilled in the art, for example, but not limited to, the method described in Wang L, Cram DS, et al. Validation of copy number variation sequencing for detecting chromosome imbalances in human preimplantation embryos. Biol Reprod, 2014, 91(2):37. The isolated trophoblast cells were washed three times with PBS and resuspended in 25 μl of PBS. A portion of the cell suspension was drawn up with a pipette, and about 3 leukocytes were picked with a mouth pipette under a 10x microscope, with the volume of PBS aspirated not exceeding 1 μl. The approximately 3 leukocytes picked were transferred into a PCR tube containing 4 μl of lysis buffer (containing Tris-Cl, KCl, EDTA, Triton X-100, and Qiagen Protease) and lysed according to the steps described in Example 1b). Genomic DNA was then amplified using the primer mixture used in experimental groups 9/10 in Table 1 (four experiments were performed in parallel). The amplification products were purified and sequenced following the steps described in Example 1. The sequencing results are shown in Table 17 below.
Table 17: Sequencing results of blastocyst culture samples
The data in Table 17 show that, for samples 1, 2, and 3, unique_mapped_of_raw was approximately 66%-73%, mapped_of_raw was approximately 71%-78%, and the data volume ranged from 1.4 M to 2.6 M. In addition, the copy number variation plots of the sequencing reads are shown in Figure 14: in each experimental group, chromosomes 1-22 were present at approximately two copies, apart from a few individual data points, while the sex chromosomes X and Y were each present at approximately one copy.
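The reading of the copy number plots described above (about two copies for chromosomes 1-22 and about one copy each for X and Y) can be expressed as a simple comparison against expected values. The sketch below is illustrative only: the tolerance, the function name, and the observed values (including the deviating chromosome) are hypothetical and do not come from Table 17 or Figure 14.

```python
# Expected copy numbers for the pattern described in the text:
# two copies of each autosome, one copy each of X and Y.
EXPECTED = {**{str(c): 2 for c in range(1, 23)}, "X": 1, "Y": 1}


def flag_abnormal(observed: dict, tolerance: float = 0.3) -> dict:
    """Return chromosomes whose estimated copy number deviates from the
    expected value by more than the (hypothetical) tolerance."""
    return {
        chrom: cn
        for chrom, cn in observed.items()
        if abs(cn - EXPECTED[chrom]) > tolerance
    }


# Hypothetical per-chromosome estimates for one sample.
observed = {**{str(c): 2.0 for c in range(1, 23)}, "X": 1.05, "Y": 0.95}
observed["21"] = 2.9  # hypothetical gain, included only to show a flagged result

flagged = flag_abnormal(observed)
print(flagged)
```

A sample matching the description in the text would return an empty dictionary; in practice the threshold would need to be chosen with the coverage uniformity of the data in mind.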
Although various aspects and embodiments of the present invention have been disclosed, other aspects and embodiments will be apparent to those skilled in the art. The aspects and embodiments disclosed herein are for illustration only and are not intended to limit the present invention; the actual scope of protection of the present invention is determined by the claims.
Claims (41)
Publications (3)
| Publication Number | Publication Date |
|---|---|
| HK1228461A (en) | 2017-11-03 |
| HK1228461A1 (en) | 2017-11-03 |
| HK1228461B (en) | 2021-01-15 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN105925675B (en) | Methods of Amplifying DNA | |
| AU2022202625B2 (en) | Methods and compositions for rapid nucleic acid library preparation | |
| TWI742000B (en) | DNA amplification method | |
| US12442038B2 (en) | Methods and materials for assessing nucleic acids | |
| KR20190034164A (en) | Single cell whole genomic libraries and combinatorial indexing methods for their production | |
| KR20170020704A (en) | Methods of analyzing nucleic acids from individual cells or cell populations | |
| EP3098324A1 (en) | Compositions and methods for preparing sequencing libraries | |
| CN110603326A (en) | Method for amplifying target nucleic acid | |
| CN107083427B (en) | DNA ligase-mediated DNA amplification technology | |
| US20250230485A1 (en) | Barcoding of nucleic acids | |
| HK1228461B (en) | A method for dna amplification | |
| CN114015751A (en) | Method and kit for amplifying genome DNA and method for obtaining amplification primer | |
| WO2010064040A1 (en) | Method for use in polynucleotide sequencing | |
| HK1228461A1 (en) | A method for dna amplification | |
| HK1228461A (en) | A method for dna amplification | |
| HK40064558A (en) | Compositions for rapid nucleic acid library preparation | |
| HK1237376A1 (en) | Universal blocking oligo system and improved hybridization capture methods for multiplexed capture reactions |