HK1230651B - Probe set for analyzing a DNA sample and method for using the same

Publication number
HK1230651B
Authority
HK
Hong Kong
Prior art keywords
sequence
sequences
oligonucleotide
probe
labeled
Application number
HK17104284.8A
Other languages
Chinese (zh)
Other versions
HK1230651A1 (en)
Inventor
C.O.F.达尔
O.J.埃里克松
F.卡尔松
F.罗斯
Original Assignee
苏州新波生物技术有限公司
Application filed by 苏州新波生物技术有限公司

Description

Probe sets for analyzing DNA samples and methods of using the same

Cross-references

This application claims the benefit of provisional application serial number 62/220,746, filed September 18, 2015, which is incorporated herein by reference in its entirety.

Background

Cell-free DNA ("cfDNA") can be analyzed to provide a prognosis or diagnosis of, or to predict the response to treatment for, a variety of diseases and conditions, including various cancers, transplant failure or success, inflammatory diseases, infectious diseases, and fetal aneuploidy.

Cell-free fetal DNA (cffDNA) is present in the blood of pregnant women. This discovery has made it possible to perform non-invasive prenatal testing (NIPT) of a fetus using a blood sample from the pregnant woman. Invasive prenatal testing (e.g., amniocentesis or chorionic villus sampling (CVS)) can be stressful for the mother, and some believe that such procedures may increase the risk of miscarriage. NIPT can provide information related to a variety of genetic defects, including Down syndrome (trisomy 21), Patau syndrome (trisomy 13), and Edwards syndrome (trisomy 18). Such methods should be highly robust, because a false positive can lead to unnecessary medical procedures and a false negative may deprive an expectant mother of an understanding of the medical options available to her.

There are many technical obstacles associated with performing non-invasive prenatal testing on a clinical scale. For example, many NIPT efforts have focused on analyzing cffDNA to identify copy number changes in particular sequences (e.g., sequences from chromosome 21). However, such methods are difficult to implement robustly, in part because the vast majority of cfDNA in a blood sample is of maternal origin and, in many cases, only a very small fraction (e.g., about 10% on average and as low as about 3%) comes from the fetus. For example, the presence or absence of an extra copy of a chromosome (such as chromosome 21) in the fetus can be determined by comparing the copy number of sequences corresponding to chromosome 21 with the copy number of sequences corresponding to another autosome. Although such methods sound attractive, they are challenging in practice because the fractional concentration of fetal DNA relative to maternal DNA in maternal blood can be as low as 3%. Thus, of every 1,000 sequences corresponding to chromosome 21 in the maternal bloodstream, only a small percentage (e.g., 30 sequences if the fetal fraction is 3%) come from the fetus. Therefore, an extra copy of a chromosome in the fetus results in only a relatively small increase in the number of sequences corresponding to that chromosome in the maternal bloodstream. For example, if the fetal fraction is 3%, fetal trisomy 21 results in an increase of only 1.5% in the number of fragments corresponding to chromosome 21 in the maternal bloodstream. Because of this problem, statistical rigor can be achieved only by counting a large number of sequences corresponding to the chromosomal region suspected of having a copy number difference (e.g., at least 1,000 and sometimes at least 5,000 or more sequences) and comparing that number with a similar number for another chromosomal region that is not suspected of having a copy number difference. The ability to count fragments consistently and accurately is paramount to the success of many NIPT methods.
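By way of illustration only, the arithmetic in the preceding paragraph can be checked with a short sketch. It assumes a simple model that is not stated in the text: every euploid genome contributes two copies of chromosome 21 and a trisomic fetus contributes three. The function names are hypothetical.

```python
def fetal_derived(total_sequences: float, fetal_fraction: float) -> float:
    """Number of sequences expected to come from the fetus."""
    return total_sequences * fetal_fraction


def relative_increase(fetal_fraction: float) -> float:
    """Fractional excess of chromosome 21 fragments when the fetus is trisomic.

    Assumed model: maternal DNA contributes 2 copies of the chromosome and a
    trisomic fetus contributes 3, so the excess over a euploid pregnancy is
    fetal_fraction / 2.
    """
    euploid_copies = 2.0
    trisomic_copies = 2.0 * (1.0 - fetal_fraction) + 3.0 * fetal_fraction
    return trisomic_copies / euploid_copies - 1.0


if __name__ == "__main__":
    for fraction in (0.03, 0.10):
        print(
            f"fetal fraction {fraction:.0%}: "
            f"{fetal_derived(1000, fraction):.0f} of 1,000 sequences are fetal, "
            f"trisomy raises the count by {relative_increase(fraction):.2%}"
        )
```

Under this assumed model, a 3% fetal fraction corresponds to 30 fetal sequences per 1,000 and a 1.5% increase, matching the figures given above.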

Some NIPT methods use the polymerase chain reaction (PCR) to amplify DNA. PCR is widely used, but it suffers from several limitations that can adversely affect the accuracy of the results. PCR can introduce sequence artifacts into a sample and produce amplification bias. PCR sequence artifacts are errors that the PCR reaction introduces into the DNA sequence of the amplification product. They can arise from a variety of events, such as the formation of chimeric molecules (e.g., two different pieces of DNA joined end to end), the formation of heteroduplex DNA (e.g., two different DNA molecules hybridized to each other), and errors made by the amplification enzyme (e.g., a mismatched nucleotide placed opposite the DNA template by Taq DNA polymerase). Sequence bias from PCR is a skewing of the distribution of PCR products relative to the original sample. PCR sequence bias can arise from a variety of events, such as intrinsic differences in the amplification efficiency of templates or inhibition of amplification by self-annealing of the DNA template. PCR errors result in unequal amplification of different DNA molecules, so that the amplified sample no longer represents the original sample. PCR is also well known to be sensitive to contamination by exogenous DNA from the environment. Because DNA is amplified exponentially during PCR, even a very small amount of exogenous DNA contamination in a PCR reaction can lead to highly inaccurate results. Exogenous DNA contamination can be introduced by aerosolized droplets floating in the air or can be carried into the reaction from contaminated equipment.
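By way of illustration only, the effect of unequal amplification efficiency can be shown numerically. The sketch below assumes an idealized model, not taken from the text, in which each cycle multiplies the copy number by (1 + efficiency). The efficiencies and the cycle count are made-up values, and the function names are hypothetical.

```python
def amplified_copies(start_copies: float, efficiency: float, cycles: int) -> float:
    """Copies after `cycles` rounds, assuming a constant per-cycle efficiency in [0, 1]."""
    return start_copies * (1.0 + efficiency) ** cycles


def bias_ratio(efficiency_a: float, efficiency_b: float, cycles: int) -> float:
    """Ratio of template A to template B after amplification from equal starting amounts."""
    return amplified_copies(1.0, efficiency_a, cycles) / amplified_copies(
        1.0, efficiency_b, cycles
    )


if __name__ == "__main__":
    # Hypothetical efficiencies of 0.95 and 0.90 over 30 cycles.
    print(f"A:B ratio after 30 cycles = {bias_ratio(0.95, 0.90, 30):.2f}")
```

With these assumed values, two templates that start at equal abundance end up differing by roughly twofold, which illustrates how an amplified sample can stop representing the original sample.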

Using rolling circle amplification (RCA) to analyze cfDNA in maternal blood avoids many of the problems associated with PCR. However, RCA products are not easily quantified in a way that provides statistical robustness. In practice, although the absolute number of products in an RCA reaction may be high enough to provide statistical robustness, different RCA products can be amplified and detected with different efficiencies. As a result, consistently and uniformly detecting tens of thousands or hundreds of thousands of RCA products has been a challenge.

Summary

Described herein, among other things, is a probe system for analyzing a nucleic acid sample. The probes can be designed so that they can be ligated to target fragments of genomic DNA (also referred to herein as "target sequences" or simply "fragments") from different loci (e.g., different chromosomes) to produce circular DNA molecules. The circular DNA molecules all contain the same "backbone" sequence, even if they contain fragments from different chromosomes. In addition, in some embodiments, all circular DNA molecules that contain a fragment from the same locus contain the same locus-specific identifier sequence, i.e., a locus-specific barcode. In these embodiments, the circular DNA molecules can be amplified using a primer that hybridizes to a sequence in the backbone, and the locus from which a cloned fragment is derived can be detected by hybridizing the RCA products to a labeled oligonucleotide that hybridizes to the locus-specific identifier sequence. As will be apparent, this embodiment of the method can be multiplexed by using multiple locus-specific identifier sequences and distinguishably labeled oligonucleotides that hybridize to those sequences. Because all of the circular products have the same backbone and differ from one another only in the sequence of the cloned fragment and the locus-specific barcode, the RCA products amplified from these circular products are amplified consistently, and the loci corresponding to these RCA products can be detected accurately. Methods that use the probe system, and kits for performing the methods, are also provided.

As will be discussed in more detail below, in certain cases the method can be used to detect a chromosomal abnormality (e.g., trisomy 21) in a fetus using a sample of cfDNA from the pregnant woman carrying the fetus.

A probe system for analyzing a nucleic acid sample is provided. In some embodiments, the probe system can comprise: (a) a set of identifier oligonucleotides of sequence B; (b) a set of splint oligonucleotides of the formula X'-A'-B'-Z', wherein, within the set: (i) sequences A' and B' vary, and (ii) sequences X' and Z' are different from each other and do not vary; and, within each splint oligonucleotide: (i) sequence A' is complementary to a genomic fragment of the nucleic acid sample and (ii) sequence B' is complementary to at least one member of the set of identifier oligonucleotides; and (c) one or more probe sequences comprising X and Z, wherein sequences X and Z do not vary and hybridize to sequences X' and Z'; wherein each splint oligonucleotide is capable of hybridizing to: (i) the probe sequences, (ii) a member of the set of identifier oligonucleotides, and (iii) a genomic fragment, thereby producing a ligatable complex of the formula X-A-B-Z. In some embodiments, different identifier oligonucleotides and their complementary sequences B' identify different chromosomes, for example, chromosomes 21, 18, and 13.
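By way of illustration only, the relationship between a splint oligonucleotide and the parts it brings together can be sketched as a small data model. The sketch simplifies strand orientation by storing each primed segment as the reverse complement of its partner, and every sequence shown is a made-up toy value. The names `Splint`, `make_splint`, and `ligatable_complex` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Splint:
    """Splint of the formula X'-A'-B'-Z' (each field complements one part)."""
    x_prime: str  # invariant, pairs with probe sequence X
    a_prime: str  # varies, pairs with genomic fragment A
    b_prime: str  # varies, pairs with identifier oligonucleotide B
    z_prime: str  # invariant, pairs with probe sequence Z


def make_splint(x: str, fragment: str, identifier: str, z: str) -> Splint:
    return Splint(revcomp(x), revcomp(fragment), revcomp(identifier), revcomp(z))


def ligatable_complex(
    splint: Splint, x: str, fragment: str, identifier: str, z: str
) -> Optional[str]:
    """Return the X-A-B-Z product if every part pairs with the splint, else None."""
    pairs = [
        (x, splint.x_prime),
        (fragment, splint.a_prime),
        (identifier, splint.b_prime),
        (z, splint.z_prime),
    ]
    if all(revcomp(part) == partner for part, partner in pairs):
        return x + fragment + identifier + z
    return None


if __name__ == "__main__":
    X, Z = "ACGTAC", "TTGCAA"              # toy invariant probe sequences
    chr21_id, chr18_id = "GGATCC", "CCTAGA"  # toy identifier sequences B
    fragment = "ATGCCGTA"                  # toy genomic fragment A
    splint = make_splint(X, fragment, chr21_id, Z)
    print(ligatable_complex(splint, X, fragment, chr21_id, Z))
    print(ligatable_complex(splint, X, fragment, chr18_id, Z))
```

In this toy model a complex forms only when the identifier oligonucleotide matches the B' segment of the splint, which is how sequence B comes to indicate the origin of fragment A.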

In some embodiments, the set of identifier oligonucleotides can comprise at least two (e.g., 2, 3, or 4 or more) different identifier oligonucleotides of sequence B, and the set of splint oligonucleotides contains at least 100 different A' sequences and at least two different B' sequences that are complementary to the at least two different identifier oligonucleotides.

In some embodiments, each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, can correspond to a genomic fragment.

In some embodiments, each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, can indicate the locus in the genome from which the genomic fragment is derived.

In some embodiments, each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, can indicate the chromosome from which the genomic fragment is derived.

In some embodiments, the genomic fragments are from a mammalian genome.

In some embodiments, each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, can identify one or more of chromosome 21, chromosome 18, and chromosome 13.

In some embodiments, the genomic fragments can be restriction fragments.

In some embodiments, the one or more probe sequences of (c) can further comprise an oligonucleotide comprising sequence Y, wherein the ligatable complex is linear.

In some embodiments, the probe system can further comprise a pair of PCR primers that hybridize to the one or more probe sequences of (c).

In some embodiments, the one or more probe sequences of (c) can comprise a backbone probe of the formula X-Y-Z, wherein Y comprises an oligonucleotide sequence, such that the ligatable complex is a circular ligatable complex of the formula X-A-B-Z-Y, wherein sequence Y joins sequences X and Z.

In some embodiments, the probe system can further comprise a rolling circle amplification primer that hybridizes to a sequence in the backbone probe.

In some embodiments, the probe system can further comprise (A) a rolling circle amplification primer that hybridizes to a sequence in the backbone probe; and (B) up to four distinguishably labeled detection oligonucleotides, wherein each distinguishably labeled detection oligonucleotide hybridizes to a B' sequence.

A method for analyzing a sample is also provided. In some embodiments, the method can comprise: (a) hybridizing any embodiment of the probe system summarized above with a test genomic sample comprising genomic fragments to produce ligatable complexes of the formula X-A-B-Z; (b) ligating the ligatable complexes to produce product DNA molecules of the formula X-A-B-Z; and (c) counting the product DNA molecules that correspond to each locus identifier of sequence B.

In some embodiments, the counting can be done by sequencing the product DNA molecules, or amplification products thereof, to produce sequence reads and counting the number of sequence reads that comprise each sequence B or its complement.
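By way of illustration only, counting sequence reads by identifier can be sketched as follows. The sketch assumes exact substring matching, which a practical pipeline would likely relax to tolerate sequencing errors, and the identifier sequences and reads are made-up toy values. The function name `count_reads_by_identifier` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_reads_by_identifier(
    reads: Iterable[str], identifiers: Dict[str, str]
) -> Counter:
    """Count reads that contain each sequence B or its complement.

    `identifiers` maps a label (e.g., a chromosome name) to its sequence B.
    A read is assigned to the first identifier it matches.
    """
    counts = Counter({label: 0 for label in identifiers})
    for read in reads:
        for label, b_seq in identifiers.items():
            if b_seq in read or revcomp(b_seq) in read:
                counts[label] += 1
                break
    return counts


if __name__ == "__main__":
    identifiers = {"chr21": "GGATCCAT", "chr18": "CCTAGATG"}  # toy sequences
    reads = [
        "ACGTGGATCCATTTGC",   # contains the chr21 identifier
        "GCAAATGGATCCACGT",   # contains the complement of the chr21 identifier
        "ACGTCCTAGATGTTGC",   # contains the chr18 identifier
        "ACGTACGTACGTACGT",   # no identifier
    ]
    print(count_reads_by_identifier(reads, identifiers))
```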

In some embodiments, the product DNA molecules can be circular, and the counting can comprise amplifying the product DNA molecules by rolling circle amplification and counting the number of amplification products that comprise each sequence B or its complement. In these embodiments, the method can comprise labeling the RCA products with distinguishably labeled probes that hybridize to sequence B', and the counting is done by counting the number of RCA products for each distinguishable label.

In some embodiments, the method can comprise: i. depositing the RCA products on a planar support; and ii. counting the number of individually labeled RCA products in an area of the support. In these embodiments, the support can be, for example, a glass slide or a porous transparent capillary membrane.

In some embodiments, different sequences B and their complementary sequences B' identify different chromosomes, and the method further comprises comparing the number of product DNA molecules that comprise a first sequence of B or B' with the number of product DNA molecules that comprise a second sequence of B or B' to determine whether the genomic sample has an aneuploidy.

In some embodiments, the method can comprise comparing the counting results of step (c) with counting results obtained from one or more reference samples.
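By way of illustration only, one way to compare counts between two identifiers and against reference samples is a z-score on the count ratio. This is a commonly used statistic, not one stated in the text, and all counts below are made-up values. The function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def count_ratio(test_count: int, control_count: int) -> float:
    """Ratio of counts for a test identifier to counts for a control identifier."""
    return test_count / control_count


def z_score(sample_ratio: float, reference_ratios: Sequence[float]) -> float:
    """Standard deviations by which a sample ratio departs from the reference mean."""
    return (sample_ratio - mean(reference_ratios)) / stdev(reference_ratios)


if __name__ == "__main__":
    # Hypothetical (chromosome 21, control chromosome) counts in reference samples.
    reference = [count_ratio(a, b) for a, b in [(9900, 10000), (10000, 10000), (10100, 10000)]]
    test = count_ratio(10150, 10000)
    print(f"ratio = {test:.4f}, z = {z_score(test, reference):.2f}")
```

Any decision threshold applied to such a score would have to be established empirically; none is implied here.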

In some embodiments, the test genomic sample can be from a patient who is suspected of having, or is at risk of having, a disease or condition, and the counting results of step (c) provide an indication of whether the patient, or the patient's fetus, has the disease or condition.

In some embodiments, the disease or condition can be a cancer, an infectious disease, an inflammatory disease, transplant rejection, or a trisomy.

In some embodiments, the fragments are restriction fragments.

Brief Description of the Drawings

The skilled artisan will understand that the drawings described below are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.

FIG. 1 schematically illustrates some features of the present probe system.

FIG. 2 schematically illustrates how sequence B serves to identify the locus of sequence A.

FIG. 3 schematically illustrates some exemplary probe system layouts.

FIG. 4 schematically illustrates some features of an embodiment of the subject method.

FIG. 5 schematically illustrates some features of one embodiment of the subject method.

FIG. 6 schematically illustrates the design of a probe system.

FIG. 7 shows data obtained using two different probe systems.

FIG. 8 shows data obtained from the analysis of clinical samples.

Definitions

Before describing exemplary embodiments in greater detail, the following definitions are set forth to illustrate and define the meaning and scope of the terms used in the description.

Numerical ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation and amino acid sequences are written left to right in amino to carboxyl orientation, respectively.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY, 2nd ed., John Wiley and Sons, New York (1994) and Hale and Markham, THE HARPER COLLINS DICTIONARY OF BIOLOGY, Harper Perennial, N.Y. (1991) provide one of skill in the art with the general meaning of many of the terms used herein. In addition, certain terms are defined below for clarity and ease of reference.

It must be noted that, as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. For example, the term "a primer" refers to one or more primers, i.e., a single primer and multiple primers. It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for the use of exclusive terminology such as "solely," "only," and the like in connection with the recitation of claim elements, or for the use of a "negative" limitation.

The term "nucleotide" is intended to include those moieties that contain not only the known purine and pyrimidine bases but also other heterocyclic bases that have been modified. Such modifications include methylated purines or pyrimidines, acylated purines or pyrimidines, alkylated riboses, or other heterocycles. In addition, the term "nucleotide" includes those moieties that contain hapten or fluorescent labels and that may contain not only conventional ribose and deoxyribose sugars but other sugars as well. Modified nucleosides or nucleotides also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen atoms or aliphatic groups, or are functionalized as ethers, amines, or the like.

The terms "nucleic acid" and "polynucleotide" are used interchangeably herein to describe a polymer of any length (e.g., greater than about 2 bases, greater than about 10 bases, greater than about 100 bases, greater than about 500 bases, greater than 1000 bases, up to about 10,000 or more bases) that is composed of nucleotides (e.g., deoxyribonucleotides or ribonucleotides), that may be produced enzymatically or synthetically (e.g., PNA as described in U.S. Pat. No. 5,948,902 and the references cited therein), and that can hybridize with naturally occurring nucleic acids in a sequence-specific manner analogous to that of two naturally occurring nucleic acids, e.g., can participate in Watson-Crick base pairing interactions. Naturally occurring nucleotides include guanine, cytosine, adenine, thymine, and uracil (G, C, A, T, and U, respectively). DNA and RNA have a deoxyribose and a ribose sugar backbone, respectively, whereas the backbone of PNA is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. In PNA, the various purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds. A locked nucleic acid (LNA), often referred to as inaccessible RNA, is a modified RNA nucleotide. The ribose moiety of an LNA nucleotide is modified with an extra bridge connecting the 2' oxygen and the 4' carbon. The bridge "locks" the ribose in the 3'-endo (North) conformation, which is often found in A-form duplexes. LNA nucleotides can be mixed with DNA or RNA residues in an oligonucleotide whenever desired. The term "unstructured nucleic acid," or "UNA," refers to a nucleic acid containing non-natural nucleotides that bind to each other with reduced stability. For example, an unstructured nucleic acid may contain a G' residue and a C' residue, where these residues correspond to non-naturally occurring forms, i.e., analogs, of G and C that base pair with each other with reduced stability but retain the ability to base pair with naturally occurring C and G residues, respectively. Unstructured nucleic acids are described in US 20050233340, which is incorporated by reference herein for its disclosure of UNA.

As used herein, the term "oligonucleotide" refers to a single-stranded multimer of nucleotides of about 2 to 200 nucleotides, up to 500 nucleotides, in length. Oligonucleotides may be synthetic or may be made enzymatically and, in some embodiments, are 30 to 150 nucleotides in length. Oligonucleotides may contain ribonucleotide monomers (i.e., may be oligoribonucleotides) or deoxyribonucleotide monomers. An oligonucleotide may be, for example, 10 to 20, 21 to 30, 31 to 40, 41 to 50, 51 to 60, 61 to 70, 71 to 80, 80 to 100, 100 to 150, or 150 to 200 nucleotides in length.

As used herein, the term "primer" refers to an oligonucleotide that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product complementary to a nucleic acid strand is induced, i.e., in the presence of nucleotides and an inducing agent (such as a DNA polymerase) and at a suitable temperature and pH. The primer may be single-stranded and must be long enough to prime the synthesis of the desired extension product in the presence of the inducing agent. The exact length of the primer will depend on many factors, including temperature, the source of the primer, and the use of the method. For example, for diagnostic applications, depending on the complexity of the target sequence or fragment, an oligonucleotide primer typically contains 15 to 25 or more nucleotides, although it may contain fewer. The primers herein are selected to be substantially complementary to the different strands of a particular target DNA sequence. This means that the primers must be sufficiently complementary to hybridize with their respective strands. Therefore, the primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence being complementary to the strand. Alternatively, non-complementary bases or longer sequences can be interspersed in the primer, provided that the primer sequence has sufficient complementarity with the sequence of the strand to hybridize with it and thereby form a template for the synthesis of the extension product.

The term "hybridization" refers to a process in which a nucleic acid strand anneals to a second, complementary nucleic acid strand under normal hybridization conditions and forms a stable duplex (either a homoduplex or a heteroduplex), and does not form a stable duplex with unrelated nucleic acid molecules under the same normal hybridization conditions. The formation of a duplex is accomplished by annealing two complementary nucleic acid strands in a hybridization reaction. The hybridization reaction can be made highly specific by adjusting the hybridization conditions (often referred to as hybridization stringency) under which the reaction takes place, such that two nucleic acid strands will not form a stable duplex, e.g., a duplex that retains a region of double-strandedness under normal stringency conditions, unless the two strands contain a certain number of nucleotides in specific sequences that are substantially or completely complementary. "Normal hybridization or normal stringency conditions" are readily determined for any given hybridization reaction. See, for example, Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, or Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press. As used herein, the term "hybridization" refers to any process by which a strand of a nucleic acid molecule binds to a complementary strand through base pairing.

A nucleic acid is considered to be "selectively hybridizable" to a reference nucleic acid sequence if the two sequences specifically hybridize to one another under high stringency hybridization and wash conditions. Moderate and high stringency hybridization conditions are known (see, e.g., Ausubel et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons 1995 and Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd ed., 2001, Cold Spring Harbor, N.Y.). One example of high stringency conditions includes hybridization at about 42°C in 50% formamide, 5×SSC, 5×Denhardt's solution, 0.5% SDS, and 100 μg/ml denatured carrier DNA, followed by two washes in 2×SSC and 0.5% SDS at room temperature and two additional washes in 0.1×SSC and 0.5% SDS at 42°C.

As used herein, the term "barcode sequence" or "molecular barcode" refers to a unique sequence of nucleotides used to a) identify and/or track the source of a polynucleotide in a reaction and/or b) count how many times an initial molecule is sequenced (e.g., in cases where substantially every molecule in a sample is tagged with a different sequence and the sample is then amplified). A barcode sequence may be at the 5' end, at the 3' end, or in the middle of an oligonucleotide. Barcode sequences may vary widely in size and composition; the following references provide guidance for selecting sets of barcode sequences appropriate for particular embodiments: Casbon (Nuc. Acids Res. 2011, 22e81); Brenner, U.S. Pat. No. 5,635,400; Brenner et al., Proc. Natl. Acad. Sci., 97: 1665-1670 (2000); Shoemaker et al., Nature Genetics, 14: 450-456 (1996); Morris et al., European Patent Publication 0799897A1; Wallace, U.S. Pat. No. 5,981,179; and the like. In particular embodiments, a barcode sequence may have a length in the range of 4 to 36 nucleotides, or 6 to 30 nucleotides, or 8 to 20 nucleotides.
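By way of illustration only, one common criterion for choosing a set of barcode sequences is a minimum pairwise Hamming distance, so that a small number of base errors cannot turn one barcode into another. This criterion and the greedy search below are assumptions made for the example and are not taken from the cited references. The function names are hypothetical.

```python
from itertools import product
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_barcodes(length: int, min_distance: int, how_many: int) -> List[str]:
    """Greedily collect barcodes whose pairwise Hamming distance is at least min_distance.

    Candidates are enumerated in lexicographic order, so the result is
    deterministic but not necessarily the largest possible set.
    """
    chosen: List[str] = []
    for candidate in map("".join, product("ACGT", repeat=length)):
        if all(hamming(candidate, kept) >= min_distance for kept in chosen):
            chosen.append(candidate)
            if len(chosen) == how_many:
                break
    return chosen


if __name__ == "__main__":
    # Four barcodes of length 6 (within the 4 to 36 nucleotide range given above).
    print(pick_barcodes(length=6, min_distance=3, how_many=4))
```

A practical design would typically add further filters, for example on GC content or homopolymer runs, which are omitted here.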

如本文所用,术语“测序”指借以获得多核苷酸的至少10个连续核苷酸的身份(例如,至少20、至少50、至少100个或至少200个或更多个连续核苷酸的身份)的方法。As used herein, the term "sequencing" refers to methods by which the identity of at least 10 consecutive nucleotides of a polynucleotide is obtained (e.g., the identity of at least 20, at least 50, at least 100, or at least 200 or more consecutive nucleotides).

The term "next-generation sequencing" refers to the so-called parallelized sequencing-by-synthesis or sequencing-by-ligation platforms currently used by, for example, Illumina, Life Technologies, and Roche. Next-generation sequencing methods may also include nanopore sequencing methods or methods based on electronic detection, such as the Ion Torrent technology commercialized by Life Technologies.

As used herein, the terms "duplex" and "duplexed" describe two complementary polynucleotides that are base-paired (i.e., hybridized together).

The terms "determining," "measuring," "evaluating," "assessing," "assaying," and "analyzing" are used interchangeably herein to refer to various forms of measurement, and include determining whether an element is present or absent. These terms encompass quantitative and/or qualitative determinations. Assessing may be relative or absolute.

As used herein, the term "affinity tag" refers to a moiety that can be used to separate a molecule to which the affinity tag is attached from other molecules that do not contain the affinity tag. An "affinity tag" is a member of a specific binding pair (i.e., two molecules in which one molecule specifically binds to the other by chemical or physical means). The complementary member of the specific binding pair (referred to herein as a "capture agent") can be immobilized (e.g., on a chromatography support, a bead, or a planar surface) to produce an affinity chromatography support that specifically binds the affinity tag. In other words, an "affinity tag" binds to a "capture agent," and this specific binding facilitates the separation of molecules attached to the affinity tag from other molecules that do not contain it.

As used herein, the term "biotin moiety" refers to an affinity agent comprising biotin or a biotin analog, such as desthiobiotin, oxybiotin, 2'-iminobiotin, diaminobiotin, biotin sulfoxide, biocytin, and the like. The biotin moiety binds to streptavidin with an affinity of at least 10⁻⁸ M. The biotin affinity agent may further comprise a linker, for example, ─LC-biotin, ─LC-LC-biotin, ─SLC-biotin, or ─PEGn-biotin, where n is 3-12.

如本文所用,术语“末端核苷酸”,指在核酸分子5’末端或3’末端的核苷酸。核酸分子可以处于双链形式(即,双链体)或处于单链形式。As used herein, the term "terminal nucleotide" refers to the nucleotide at the 5' end or the 3' end of a nucleic acid molecule. Nucleic acid molecules can be in double-stranded form (i.e., duplex) or in single-stranded form.

如本文所用,术语“连接”指第一DNA分子5'末端处的末端核苷酸与第二DNA分子3'末端处的末端核苷酸的酶促催化连接。As used herein, the term "ligation" refers to the enzymatically catalyzed joining of the terminal nucleotide at the 5' end of a first DNA molecule to the terminal nucleotide at the 3' end of a second DNA molecule.

The terms "plurality," "collection," and "population" are used interchangeably to refer to an entity comprising at least 2 members. In some cases, a plurality can have at least 10, at least 100, at least 1,000, at least 10,000, at least 100,000, at least 10⁶, at least 10⁷, at least 10⁸, or at least 10⁹ or more members.

术语“消化”意在指示核酸由限制性酶切割的过程。为了消化核酸,限制性酶和含有该限制性酶的识别位点的核酸在适于限制性酶发挥作用的条件下接触。适于市售限制性酶活性的条件是已知的并且当购买时与这些酶提供。The term "digestion" is intended to indicate the process by which a nucleic acid is cleaved by a restriction enzyme. To digest a nucleic acid, a restriction enzyme and a nucleic acid containing a recognition site for the restriction enzyme are contacted under conditions suitable for the activity of the restriction enzyme. Conditions suitable for the activity of commercially available restriction enzymes are known and are provided with these enzymes when purchased.

“寡核苷酸结合位点”指在靶多核苷酸或片段中与寡核苷酸杂交的位点。如果寡核苷酸“提供”引物的结合位点,随后则该引物可以与寡核苷酸或其互补物杂交。An "oligonucleotide binding site" refers to a site in a target polynucleotide or fragment to which an oligonucleotide hybridizes. If an oligonucleotide "provides" a binding site for a primer, then the primer can hybridize to the oligonucleotide or its complement.

As used herein, the term "separating" refers to the physical separation of two elements (e.g., by size or affinity, etc.) as well as to the degradation of one element, leaving the other element intact.

如本文所用,术语“参比染色体区域”指核苷酸序列已知的染色体区域,例如其序列例如保藏于NCBI Genbank数据库或其他数据库的染色体区域。As used herein, the term "reference chromosomal region" refers to a chromosomal region whose nucleotide sequence is known, such as a chromosomal region whose sequence is deposited, for example, in the NCBI Genbank database or other databases.

As used herein, the term "strand" refers to a nucleic acid composed of nucleotides linked together by covalent bonds (e.g., phosphodiester bonds).

In a cell, DNA usually exists in double-stranded form and therefore has two complementary strands of nucleic acid, referred to herein as the "top" and "bottom" strands. In some cases, the complementary strands of a chromosomal region may be referred to as "plus" and "minus" strands, "first" and "second" strands, "coding" and "non-coding" strands, "Watson" and "Crick" strands, or "sense" and "antisense" strands. The assignment of a strand as the top or bottom strand is arbitrary and does not imply any particular orientation, function, or structure. The nucleotide sequences of the first strand of several exemplary mammalian chromosomal regions (e.g., BACs, assemblies, chromosomes, etc.) are known and can be found, for example, in the NCBI Genbank database.

如本文所用,术语“顶部链”指核酸的任一条链,而不是核酸的两条链。当寡核苷酸或引物仅与顶部链结合或复性时,它仅与一条链结合,但是不与另一条链结合。如本文所用,术语“底部链”指与“顶部链”互补的链。当寡核苷酸仅与一条链结合或复性时,它仅与一条链(例如,第一链或第二链)结合,但是不与另一条链结合。As used herein, the term "top strand" refers to either strand of a nucleic acid, not both strands of a nucleic acid. When an oligonucleotide or primer binds or anneals only to the top strand, it binds only to one strand, but not to the other strand. As used herein, the term "bottom strand" refers to the strand that is complementary to the "top strand." When an oligonucleotide binds or anneals only to one strand, it binds only to one strand (e.g., the first strand or the second strand), but not to the other strand.

The term "covalently linking" refers to the creation of a covalent bond between two separate molecules (e.g., the top and bottom strands of a double-stranded nucleic acid). Ligating is a type of covalent linking.

如本文所用,术语“变性”指将双链体置于合适的变性条件,使核酸双链体的至少一部分碱基对分开。变性条件是本领域熟知的。在一个实施方案中,为了使核酸双链体变性,双链体可以暴露于双链体解链温度的温度,从而令双链体的一条链从另一条链释放。在某些实施方案中,核酸可以通过使核酸暴露于至少90℃的温度持续适量的时间(例如,至少30秒,直至30分钟)而变性。核酸也可以化学地变性(例如,使用脲或NaOH)。As used herein, the term "denaturation" refers to subjecting a duplex to suitable denaturing conditions to separate at least a portion of the base pairs of the nucleic acid duplex. Denaturing conditions are well known in the art. In one embodiment, to denature a nucleic acid duplex, the duplex can be exposed to a temperature at the melting temperature of the duplex, thereby releasing one strand of the duplex from the other. In certain embodiments, the nucleic acid can be denatured by exposing the nucleic acid to a temperature of at least 90° C. for an appropriate amount of time (e.g., at least 30 seconds up to 30 minutes). The nucleic acid can also be denatured chemically (e.g., using urea or NaOH).

As used herein, the term "label" refers to any atom or molecule that can be used to provide a detectable (preferably quantifiable) effect and that can be attached to a nucleic acid or protein. Labels include, but are not limited to, dyes and radiolabels such as ³²P, alone or in combination with moieties that can suppress or shift emission spectra by fluorescence resonance energy transfer (FRET); binding moieties such as biotin; haptens such as digoxigenin; luminescent, phosphorescent, or fluorescent moieties; and fluorescent dyes. Labels can provide a signal detectable by fluorescence, radioactivity, colorimetry, gravimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like. A label can be a charged moiety (positively or negatively charged) or, alternatively, can be charge-neutral. Labels can comprise or consist of a nucleic acid or protein sequence, as long as the sequence comprising the label is detectable.

如本文所用,术语“标记的寡核苷酸”和“标记的探针”指具有亲和标签(例如,生物素部分)的寡核苷酸、用使得分离或检测成为可能(例如,赋予不同密度的溴-脱氧尿苷或胶体金粒子)的原子或基团修饰的寡核苷酸,和用可光学检测的标记物(例如,荧光或另一个类型的光发射标记物)修饰的寡核苷酸。仅含有天然存在的核苷酸的寡核苷酸不是标记的寡核苷酸。As used herein, the terms "labeled oligonucleotide" and "labeled probe" refer to oligonucleotides having affinity tags (e.g., biotin moieties), oligonucleotides modified with atoms or groups that allow separation or detection (e.g., bromo-deoxyuridine or colloidal gold particles that impart different densities), and oligonucleotides modified with optically detectable labels (e.g., fluorescent or another type of light-emitting label). Oligonucleotides containing only naturally occurring nucleotides are not labeled oligonucleotides.

如本文所用,术语“延伸”指使用聚合酶通过添加核苷酸而延伸引物。如果与核酸复性的引物延伸,核酸充当延伸反应的模板。As used herein, the term "extension" refers to extending a primer by adding nucleotides using a polymerase. If a primer annealed to a nucleic acid is extended, the nucleic acid serves as a template for the extension reaction.

如本文所用,在短语“连接第一和第二寡核苷酸至片段的相应末端”中,术语“相应末端”意指将一个寡核苷酸添加至该片段的一个末端并将另一个寡核苷酸添加至靶片段的另一端。As used herein, in the phrase "ligating a first and a second oligonucleotide to corresponding ends of a fragment," the term "corresponding ends" means adding one oligonucleotide to one end of the fragment and adding another oligonucleotide to the other end of the target fragment.

如本文所用,在彼此可连接相邻的两个寡核苷酸序列的语境下,术语“可连接相邻的”意指在两个寡核苷酸之间不存在间插性核苷酸并且它们可以彼此连接。As used herein, the term "ligatably adjacent" in the context of two oligonucleotide sequences that are ligatably adjacent to each other means that there are no intervening nucleotides between the two oligonucleotides and they can be ligated to each other.

As used herein, the term "splint oligonucleotide" refers to an oligonucleotide that, when hybridized to two or more other polynucleotides, acts as a "splint" to position those polynucleotides adjacent to each other so that they can be ligated together, as shown in Figure 1.

As used herein, the term "circular nucleic acid molecule" refers to a strand in the form of a closed circle with no free 3' or 5' end.

如本文所用,术语“对应于”及语法等同物,例如,“相应”,指本术语所指的各要素之间的特定关系。例如,对应于基因组中某序列的RCA含有与基因组中该序列相同的核苷酸序列。As used herein, the term "corresponding to" and grammatical equivalents, such as "corresponding to", refer to a specific relationship between the elements to which the term refers. For example, an RCA corresponding to a sequence in a genome contains the same nucleotide sequence as the sequence in the genome.

Certain polynucleotides described herein may be referred to by a formula (e.g., "X'-A'-B'-Z'"). Unless otherwise indicated, a polynucleotide defined by a formula may be oriented in the 5' to 3' direction or the 3' to 5' direction. For example, a polynucleotide defined by the formula "X'-A'-B'-Z'" may be "5'-X'-A'-B'-Z'-3'" or "3'-X'-A'-B'-Z'-5'". The components of a formula, e.g., "A", "X" and "B", each refer to a separately definable sequence of nucleotides within a polynucleotide. Unless the context indicates otherwise (e.g., in the context of a ligatable complex of a particular formula), the sequences are covalently linked together, so that a polynucleotide described by a formula is a single molecule. In many cases, the components of a formula are immediately adjacent to one another in the single molecule. Following convention, the complement of a sequence shown in a formula is indicated with a prime ('), so that the complement of sequence "A" is "A'". In addition, unless otherwise indicated or implied by the context, a polynucleotide defined by a formula may have additional sequence, such as a primer binding site, a molecular barcode, a promoter, or a spacer, at its 3' end, its 5' end, or both. If a polynucleotide defined by a formula is described as circular, the ends of the molecule are joined together, directly or indirectly. For example, in the case of a circular complex of formula X-A-B-Z-Y, the 5' end of the molecule is joined, directly or indirectly, to the 3' end of the molecule to produce a circle. As will be apparent, the various component sequences of a polynucleotide (e.g., A, B, C, X, Y, Z, etc.) may independently be of any desired length, as long as they can perform the desired function (e.g., hybridize to another sequence). For example, the various component sequences may independently have a length in the range of 8-80 nucleotides (e.g., 10-50 nucleotides or 12-30 nucleotides).
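The relationship between a formula such as X-A-B-Z and its primed counterpart X'-A'-B'-Z' can be illustrated with a short Python sketch. It assumes plain A/C/G/T sequences and uses made-up 8-nucleotide components; the names `reverse_complement` and `splint_for` are hypothetical helpers, not part of the disclosure. The sketch shows why the two orientations of the formula matter: the strand that pairs with 5'-X-A-B-Z-3' reads Z'-B'-A'-X' when written 5' to 3', which is the same molecule as 3'-X'-A'-B'-Z'-5'.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def splint_for(components):
    """Given component sequences (X, A, B, Z) of a strand written 5' to 3',
    return the fully complementary strand, also written 5' to 3'."""
    return reverse_complement("".join(components))


# Hypothetical component sequences, chosen only for illustration.
X, A, B, Z = "ACCTGAGT", "GGATTCCA", "TTAGCATG", "CGTAACGG"
splint = splint_for((X, A, B, Z))

# Read 5' to 3', the splint is Z'-B'-A'-X', i.e. 3'-X'-A'-B'-Z'-5'.
assert splint == "".join(reverse_complement(s) for s in (Z, B, A, X))
print(splint)
```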

术语(例如,式X-A-B-Z的)“可连接复合物”指其中多种寡核苷酸(以环状或线状形式)彼此可连接地相邻,由夹板寡核苷酸结合在一起的复合物,如图1中所示。The term "ligatable complex" (e.g., of formula X-A-B-Z) refers to a complex in which multiple oligonucleotides (in circular or linear form) are ligatably adjacent to each other, held together by a splint oligonucleotide, as shown in Figure 1.

术语(例如,式X-A-B-Z-Y的)“可连接环状复合物”指其中多种寡核苷酸彼此在环中可连接地相邻,由夹板寡核苷酸结合在一起的环状复合物。The term "ligatable circular complex" (e.g., of the formula X-A-B-Z-Y) refers to a circular complex in which multiple oligonucleotides are ligatably adjacent to each other in a loop, held together by a splint oligonucleotide.

如本文所用的术语“基因座”、“基因组座位”指基因组(例如,动物或植物基因组如人、猴、大鼠、鱼或昆虫或植物的基因组)的限定区域。基因座可以是短到100kb的染色体区域,并且可以长达一个染色体臂或整个染色体。As used herein, the terms "locus" and "genomic locus" refer to a defined region of a genome (e.g., an animal or plant genome such as a human, monkey, rat, fish, insect, or plant genome). A locus can be a chromosomal region as short as 100 kb and can be as long as a chromosome arm or an entire chromosome.

The terms "first locus" and "second locus" refer to different loci, i.e., different regions in the genome, e.g., different chromosome arms or different chromosomes.

The term "fragments of a locus" refers to a population of defined fragments of a particular locus (which can be generated using a restriction enzyme or a reprogrammable RNA-guided endonuclease such as Cas9). Not all fragments of a locus need to be analyzed. Because the sequences of many genomes have been published, designing oligonucleotides that hybridize to a given fragment of a locus is routine.

术语“与片段互补的”指与某片段的链(顶部链或底部链)互补的序列。The term "complementary to a fragment" refers to a sequence that is complementary to a strand (top strand or bottom strand) of a fragment.

如本文所用,术语“基因组序列”指基因组中存在的序列。As used herein, the term "genomic sequence" refers to a sequence present in a genome.

在可变的两个或更多个核酸序列的语境下,术语“可变”指相对于彼此具有不同的核苷酸序列的两个或更多个核酸。换句话说,如果某群体的多核苷酸具有可变序列或特定序列“变化”,则该群体的多核苷酸分子的核苷酸序列在各分子之间变动。术语“可变”不得解读为要求群体中的每个分子具有与群体中其他分子不同的序列。In the context of two or more variable nucleic acid sequences, the term "variable" refers to two or more nucleic acids that have different nucleotide sequences relative to each other. In other words, if a population of polynucleotides has variable sequences or a particular sequence "varies," the nucleotide sequence of the polynucleotide molecules in the population varies from molecule to molecule. The term "variable" should not be interpreted as requiring that each molecule in the population have a sequence that is different from that of other molecules in the population.

如果两个核酸(例如,序列A和A’)是“互补的”,则它们在高严格性条件下彼此杂交。在许多情况下,互补的两个序列具有至少10个,例如,至少12、至少15、至少20或至少25个核苷酸的互补性并且在某些情况下可以具有一个、两个或三个非互补碱基。If two nucleic acids (e.g., sequences A and A') are "complementary", they hybridize to each other under high stringency conditions. In many cases, two sequences that are complementary have at least 10, e.g., at least 12, at least 15, at least 20, or at least 25 nucleotides of complementarity and in some cases may have one, two, or three non-complementary bases.

In the context of a sequence that identifies a locus, the term "identifies" refers to a molecular barcode that is unique to the locus. Such a sequence does not come from the locus itself. Rather, it is a molecular barcode that is added to the fragments of the locus being analyzed and that identifies those fragments as coming from that locus; it typically has a sequence that is not present in the sample being analyzed. For example, if fragments from a first locus are ligated to a first marker sequence and fragments from a second locus are ligated to a second marker sequence, the source of the fragments (the locus to which they correspond) can be determined by detecting which marker sequence has been ligated to them.

In the context of two sequences that hybridize to another sequence in reverse orientation, the term "reverse" refers to a structure in which the 5' and 3' ends of one of the sequences hybridize to the other sequence such that the ends face each other, as shown at the top of Figure 3B.

As used herein, the term "rolling circle amplification" or "RCA" for short refers to an isothermal amplification that generates linear concatemerized copies of a circular nucleic acid template using a strand-displacing polymerase. RCA is well known in the field of molecular biology and is described in a variety of publications, including but not limited to Lizardi et al. (Nat. Genet. 1998 19:225-232), Schweitzer et al. (Proc. Natl. Acad. Sci. 2000 97:10113-10119), Wiltshire et al. (Clin. Chem. 2000 46:1990-1993) and Schweitzer et al. (Curr. Opin. Biotech 2001 12:21-27), which are incorporated herein by reference.

如本文所用,术语“滚环扩增产物”指滚环扩增反应的连环化体产物。如本文所用,术语“荧光标记的滚环扩增产物”指已经例如通过荧光标记的寡核苷酸与滚环扩增产物杂交或其他手段(例如,通过扩增期间荧光核苷酸掺入产物中)被荧光标记的滚环扩增产物。As used herein, the term "rolling circle amplification product" refers to the concatemer product of a rolling circle amplification reaction. As used herein, the term "fluorescently labeled rolling circle amplification product" refers to a rolling circle amplification product that has been fluorescently labeled, for example, by hybridization of a fluorescently labeled oligonucleotide to the rolling circle amplification product or other means (e.g., by incorporation of fluorescent nucleotides into the product during amplification).

As used herein, the term "region", in the context of a region of a support or a region of an image, refers to a contiguous or non-contiguous area. For example, if a method involves counting the number of labeled RCA products within a region, the region in which the RCA products are counted may be a single contiguous space or multiple non-contiguous spaces.

As used herein, the term "imaging" refers to a process by which optical signals from the surface of an object are detected and stored as data in association with a location (i.e., a "pixel"). A digital image of the object can be reconstructed from this data. A region of a support may be imaged using one or more images.

如本文所用,术语“各个标记的RCA产物”指标记的各个RCA分子。As used herein, the term "individual labelled RCA products" refers to individual RCA molecules that are labelled.

如本文所用,术语“计数”指确定更大集合中各个对象的数目。“计数”需要检测多个对象中来自各个对象的独立信号(并非来自多个对象的集体信号)并且随后通过计数各个信号确定多个对象中存在多少个对象。在本发明方法的情况下,通过确定信号阵列中各个信号的数目,进行“计数”。As used herein, the term "counting" refers to determining the number of individual objects in a larger collection. "Counting" requires detecting individual signals from each object in a plurality of objects (not the collective signal from the plurality of objects) and then determining how many objects are present in the plurality of objects by counting the individual signals. In the context of the present method, "counting" is performed by determining the number of individual signals in a signal array.
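As a rough illustration of counting individual signals rather than measuring a collective signal, the Python sketch below counts connected groups of "on" pixels in a small binary image. It is a minimal sketch under assumptions that are not in the text: the image has already been thresholded to 0/1 values, each connected group of pixels corresponds to one labeled object, and 4-connectivity is used. The function name `count_signals` and the example image are hypothetical.

```python
def count_signals(image):
    """Count connected groups of nonzero pixels (4-connectivity) in a binary image.

    `image` is a list of equal-length rows of 0/1 values. Each connected
    group is treated as one individual signal.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                count += 1  # found a pixel belonging to a new signal
                stack = [(r, c)]
                seen[r][c] = True
                while stack:  # flood-fill the rest of this signal
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


# Hypothetical thresholded image containing three separate spots.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
]
print(count_signals(image))
```

Summing all pixel intensities would instead give a collective signal, which is what the definition above excludes.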

As used herein, when referring to an array of RCA products, the term "array" refers to a collection of individual RCA products on a planar surface, where the RCA products are spatially separated from one another in the plane of the surface (to the extent allowed by a Poisson distribution if the array is truly random). A "random" array is one in which the elements (e.g., RCA products) are distributed on the surface of the substrate at positions that are not predetermined. In some cases, the distribution of RCA products on a random array can be described by Poisson statistics, such that, for example, the distribution of distances between the RCA products of a random array approximates a Poisson distribution.
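One consequence of a random (Poisson) arrangement is that some products will land too close together to be resolved. The Python sketch below estimates how often that happens under a simplified model that is an assumption of this example, not something stated in the text: the surface is divided into equal resolvable cells, and the number of RCA products per cell follows a Poisson distribution with mean `lam`. The function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def fraction_overlapping(lam):
    """Among occupied cells, the fraction that hold two or more products."""
    p0 = poisson_pmf(0, lam)  # empty cells
    p1 = poisson_pmf(1, lam)  # cells with exactly one product
    return (1.0 - p0 - p1) / (1.0 - p0)


# Compare a sparse and a dense array (mean products per resolvable cell).
for lam in (0.1, 1.0):
    print(lam, round(fraction_overlapping(lam), 3))
```

Under this model, a sparser array gives a smaller fraction of overlapping products, which is why spatial separation holds only to the extent the Poisson distribution allows.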

其他术语定义可以在本说明书通篇范围内出现。Other term definitions may appear throughout this specification.

Description of Exemplary Embodiments

在描述多种实施方案之前,应当理解本公开的教导内容不限于所述的具体实施方案,并且本身当然可以变动。还应当理解本文所用的术语其目的仅在于描述具体实施方案,并且不意图是限制性的,因为本发明教导内容的范围将仅由所附的权利要求限制。Before describing various embodiments, it should be understood that the teachings of the present disclosure are not limited to the specific embodiments described and may, of course, vary. It should also be understood that the terminology used herein is for the purpose of describing specific embodiments only and is not intended to be limiting, as the scope of the present teachings will be limited only by the appended claims.

本文所用的章节标题仅出于组织目的并且不得以任何方式解释为限制所描述的主题。尽管结合多种实施方案描述了本发明教导内容,本发明的教导内容不意图限于这类实施方案。相反,本发明教导内容涵盖各种备选物、修饰和等同物,如本领域技术人员将领会。The section headings used herein are for organizational purposes only and are not to be construed in any way as limiting the subject matter described. Although the present teachings are described in conjunction with various embodiments, the present teachings are not intended to be limited to such embodiments. On the contrary, the present teachings encompass various alternatives, modifications, and equivalents, as will be appreciated by those skilled in the art.

除非另外定义,本文中所用的全部技术术语和科学术语具有与本公开所属领域的普通技术人员通常所理解的相同意义。尽管与本文所述的那些方法和材料相似或等同的任意方法和材料也可以用于本发明教导内容的实施或检验,然而现在描述一些示例性方法和材料。Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which this disclosure pertains. Although any methods and materials similar or equivalent to those described herein may also be used for the implementation or testing of the present teachings, some exemplary methods and materials are now described.

对任何出版物的引用是因其在申请日前的披露并且不应当解释为承认因在先发明而不认为本发明权利要求早于这种出版物。此外,所提供的公开日可能不同于需要独立核实的实际公开日。The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Furthermore, the dates of publication provided may be different from the actual publication dates which need to be independently verified.

如本领域技术人员阅读本公开时将显而易见,本文所描述和展示的各个变型的每一者都具有可以轻易地与其他几种实施方案的任一者的特征分离或与之组合的独立组分和特征,而不脱离本发明教导内容的范围或精神。任何所述的方法可以按列举的事件顺序或按逻辑上可能的任何其他顺序实施。As will be apparent to those skilled in the art upon reading this disclosure, each of the various variations described and illustrated herein has independent components and features that can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the teachings of the present invention. Any of the methods described can be practiced in the order of events recited or in any other order that is logically possible.

本文提到的全部专利和出版物,包括这些专利和出版物内公开的所有序列在内,均明确地通过引用的方式并入。All patents and publications mentioned herein, including all sequences disclosed within such patents and publications, are expressly incorporated by reference.

Probe Composition

Some embodiments of the probe system can include: (a) a set of marker oligonucleotides of sequence B; (b) a set of splint oligonucleotides of formula X'-A'-B'-Z', wherein, within the set: (i) sequences A' and B' vary, and (ii) sequences X' and Z' are different from each other and are not variable; and, within each splint oligonucleotide: (i) sequence A' is complementary to a genomic fragment of a nucleic acid sample and (ii) sequence B' is complementary to at least one member of the set of marker oligonucleotides; and (c) one or more probe sequences comprising X and Z, wherein sequences X and Z are not variable and hybridize to sequences X' and Z'; wherein each splint oligonucleotide is capable of hybridizing to: (i) the probe sequences, (ii) a member of the set of marker oligonucleotides and (iii) a genomic fragment, thereby producing a ligatable complex of formula X-A-B-Z. As will be described in more detail below, in some embodiments, different marker oligonucleotides and their complementary sequences B' identify different chromosomes, for example, chromosomes 21, 18, and 13.

Figure 1 shows a ligatable complex of formula X-A-B-Z, whose structure characterizes the present probe system. As shown in Figure 1, in the complex, sequences X, A, B and Z are ligatably adjacent to each other and are held in place by the splint oligonucleotide. As shown in Figure 1, sequence A is a target fragment of the genome (e.g., a strand of a restriction fragment), and sequence B identifies the locus from which the adjacent sequence A is derived (e.g., a particular region on a chromosome, a particular chromosome arm, or a particular chromosome, etc.). The relationship between sequences A and B is shown in Figure 2, which shows a simple probe set hybridized to various genomic fragments (A1 to A6). As shown in Figure 2, the genomic fragments in the top three complexes (sequences A1, A2 and A3) are from a first locus (e.g., chromosome 21) and the genomic fragments in the bottom three complexes (sequences A4, A5 and A6) are from a second locus (e.g., chromosome 18). The locus from which the genomic fragments in the top three complexes are derived is identified by a single sequence (B1), and the locus from which the genomic fragments in the bottom three complexes are derived is identified by a different sequence (B2). Sequences X and Z are the same in all of the complexes shown.
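The role of sequence B as a locus marker in the Figure 2 arrangement can be sketched in Python as follows. Everything concrete here is hypothetical: the X, Z, A and B sequences are made up, B is assumed to have a fixed length of 8 nucleotides, and the marker is read out by parsing the ligated X-A-B-Z string, which stands in for whatever detection method is actually used. The point is only that six different fragments collapse onto two marker sequences, so products can be tallied per locus without reading sequence A.

```python
from collections import Counter

# Hypothetical marker sequences (B) and the loci they stand for.
LOCUS_BY_MARKER = {
    "TTAGCATG": "chromosome 21",  # B1
    "GACCTTAC": "chromosome 18",  # B2
}
X = "ACCTGAGT"  # hypothetical constant sequence X
Z = "CGTAACGG"  # hypothetical constant sequence Z
B_LEN = 8       # assumed fixed length of sequence B


def locus_of(product):
    """Return the locus encoded by sequence B of an X-A-B-Z product, or None."""
    if not (product.startswith(X) and product.endswith(Z)):
        return None
    end_of_b = len(product) - len(Z)
    marker = product[end_of_b - B_LEN:end_of_b]
    return LOCUS_BY_MARKER.get(marker)


def count_by_locus(products):
    """Tally ligated products per locus, ignoring anything unrecognized."""
    loci = (locus_of(p) for p in products)
    return Counter(locus for locus in loci if locus is not None)


# Hypothetical genomic fragments: A1-A3 from one locus, A4-A6 from another.
fragments_first = ["GGATTCCA", "CATGGTAC", "TTGACCGA"]
fragments_second = ["AGGCTTAA", "CCGTATGC", "GTCAAGGT"]
products = (
    [X + a + "TTAGCATG" + Z for a in fragments_first]
    + [X + a + "GACCTTAC" + Z for a in fragments_second]
)

counts = count_by_locus(products)
print(counts["chromosome 21"], counts["chromosome 18"])
```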

As will be apparent, the set of splint oligonucleotides can be as complex as desired. In some embodiments, sequence A' can have a complexity of at least 100, at least 1,000, at least 5,000, at least 10,000, or at least 50,000 or more, meaning that the splint oligonucleotides can collectively hybridize to at least 100, at least 1,000, at least 5,000, at least 10,000, or at least 50,000 or more fragments of genomic DNA. Sequence B' in the set of splint oligonucleotides can have much less diversity because it serves only as a locus marker. Thus, in the set of splint oligonucleotides, sequence B' can have a complexity of at least 2 (e.g., 3 or 4), although in some embodiments sequence B' can have a complexity of at least 10, at least 100, or at least 1,000. As will be apparent, because sequence B' is complementary to sequence B, the complexity of the set of locus-specific oligonucleotides can be the same as the complexity of sequence B'. For example, if there are three marker oligonucleotides, there can be three different B' sequences. The number of splint oligonucleotides in a set can vary widely, depending on the length of the locus and the number of target fragments. In some embodiments, each set of splint oligonucleotides can contain at least 10, at least 50, at least 100, at least 500, at least 1,000, at least 5,000, at least 10,000, or at least 50,000 different splint oligonucleotides.

For example, in some embodiments, a set of splint oligonucleotides may contain: (i) a first sub-population of splint oligonucleotides containing at least 100 A' sequences (e.g., the set A1,x', x = 1 to 100+) that are complementary to different fragments of a first locus (e.g., fragments of chromosome 21, i.e., the set A1,x, x = 1 to 100+), wherein every member of this sub-population has the same B' sequence, e.g., B1'; (ii) a second sub-population of splint oligonucleotides containing at least 100 A' sequences (e.g., the set A2,x', x = 1 to 100+) that are complementary to different fragments of a second locus (e.g., fragments of chromosome 18, i.e., the set A2,x, x = 1 to 100+), wherein every member of this sub-population has the same B' sequence, e.g., B2', which is different from the B' sequence of the first (or any other) sub-population; (iii) a third sub-population of splint oligonucleotides containing at least 100 A' sequences (e.g., the set A3,x', x = 1 to 100+) that are complementary to different fragments of a third locus (e.g., fragments of chromosome 18, i.e., the set A3,x, x = 1 to 100+), wherein every member of this sub-population has the same B' sequence, e.g., B3', which is different from the B' sequence of any other sub-population; and (iv) an optional fourth sub-population of splint oligonucleotides containing at least 100 A' sequences (e.g., the set A4,x', x = 1 to 100+) that are complementary to different fragments of a fourth locus (e.g., fragments of another chromosome, i.e., the set A4,x, x = 1 to 100+), wherein every member of this sub-population has a B' sequence, e.g., B4', that is different from the B' sequence of any other sub-population.
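The organization of such a splint set can be sketched in a few lines of Python. This is an illustration only, not a design procedure from the source: the function names and all sequences are hypothetical placeholders, and the sketch simply assumes that each splint is the reverse complement of the strand X-A-B-Z it is meant to hold together, with one shared marker B per locus.

```python
# Illustrative sketch; every sequence used with these helpers is an invented placeholder.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_splint(x: str, a: str, b: str, z: str) -> str:
    """Return a splint (5'->3') complementary to the strand X-A-B-Z.

    Read 5'->3', the splint therefore contains Z', B', A' and X' in that order.
    """
    return reverse_complement(x + a + b + z)


def build_splint_set(x, z, fragments_by_locus, marker_by_locus):
    """Build one sub-population of splints per locus.

    fragments_by_locus: dict mapping a locus name to its target fragments (A sequences).
    marker_by_locus:    dict mapping a locus name to its single marker (B sequence).
    All splints for a locus share the same B', while A' varies per fragment.
    """
    splints = {}
    for locus, fragments in fragments_by_locus.items():
        b = marker_by_locus[locus]
        splints[locus] = [build_splint(x, a, b, z) for a in fragments]
    return splints
```

In this sketch the complexity of A' equals the total number of target fragments, whereas the complexity of B' equals the number of loci.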

As shown in Figure 3, the probe system can be configured in a number of different ways, depending on how it will be used. For example, as shown in Figures 3A, 3C and 3D, sequences X and Z may be in different molecules, in which case the ligatable complex is linear. In these embodiments, the one or more probes that contain sequences X and Z may comprise a first oligonucleotide comprising sequence X and a second oligonucleotide comprising sequence Z. In these embodiments, the first and second oligonucleotides need not be tailed, as shown in Figure 3A. After ligation, the ligation products may be amplified using, for example, tailed PCR primers that hybridize to sequences X and Z. In some embodiments (as shown in Figures 3C and 3D), the first and/or second oligonucleotides may themselves have a tail that provides a primer binding site to facilitate amplification and counting. In some embodiments, the tail may contain a molecular indexer (e.g., a random sequence) that allows the number of original ligation products to be counted after those molecules have been amplified and sequenced. In alternative embodiments, and as shown in Figure 3B, the one or more probes that contain sequences X and Z may be a single backbone probe of the formula X-Y-Z. In these embodiments, and as shown, the ligatable complex is a circular ligatable complex of the formula X-A-B-Z-Y, in which sequence Y joins sequences X and Z. In another embodiment, shown in Figure 3E, the one or more probes that contain sequences X and Z may themselves be part of the splint oligonucleotide. In these embodiments, the ligation product may be dumbbell-shaped, as shown in Figure 3E.

In these embodiments, the probe system may further comprise a pair of PCR primers that hybridize to the one or more probes comprising sequences X and Z, thereby allowing amplification of the central portion of the ligation product (i.e., the portion containing sequences A and B). In some embodiments, e.g., the embodiment shown in Figure 3B, the probe system may further comprise a rolling circle amplification primer that hybridizes to a sequence in the backbone probe, thereby allowing these products to be amplified by rolling circle amplification. In some embodiments, the probe system may comprise a rolling circle amplification primer that hybridizes to a sequence in the backbone probe, and up to four distinguishably labeled oligonucleotides, each of which hybridizes to the complement of a sequence B'. This is explained in more detail below.

从而,探针系统的一些实施方案可以包含夹板寡核苷酸、主链探针和一个或多个基因座特异性寡核苷酸。探针系统还可以包含一个或多个扩增引物,如杂交主链探针中序列的滚环扩增引物或与主链探针中位点杂交的PCR引物对,和,任选地,一个或多个与基因座特异性寡核苷酸的互补物杂交的标记探针。Thus, some embodiments of the probe system can include a splint oligonucleotide, a backbone probe, and one or more locus-specific oligonucleotides. The probe system can also include one or more amplification primers, such as a rolling circle amplification primer that hybridizes to a sequence in the backbone probe or a PCR primer pair that hybridizes to a site in the backbone probe, and, optionally, one or more label probes that hybridize to the complement of the locus-specific oligonucleotide.

As noted above, sequence A' varies between the different members of the set, and each A' sequence is designed to be complementary to a different target fragment of the genome. The A' sequences may independently vary in length and sequence and, in some cases, may be in the range of 8 to 80 nucleotides, e.g., 10 to 60 nucleotides, in length, depending on the length and sequence of the target fragment. Sequence B' identifies the locus (e.g., a particular chromosome such as chromosome 18 or 21, etc.) from which the adjacent fragment is derived. Sequence B' may be of any suitable length, but in some embodiments it is in the range of 8 to 30 nucleotides in length. In any single assay, sequences X' and Z' are different from each other and are not variable. Sequences X' and Z' may be of any suitable length, but in some embodiments they are independently in the range of 8 to 30 nucleotides in length, although longer or shorter sequences may be used. The overall length of a splint oligonucleotide may be in the range of 50 to 200 nucleotides. In some embodiments, the splint oligonucleotides may be biotinylated, thereby allowing the ligation products (discussed below) to be separated from unligated products prior to amplification. As will be apparent, sequences X and Z (which may be of any suitable length, but in some embodiments are independently in the range of 8 to 30 nucleotides in length) are not variable and hybridize to sequences X' and Z'. The locus-specific oligonucleotide has sequence B which, again, may be of any suitable length, e.g., in the range of 8 to 30 nucleotides in length.

如上文所示,使用上述探针系统产生的复合物可以是线状或环状(如图3中所示)。图4显示图3B中所示的环状实施方案的一些特征。As indicated above, the complexes generated using the above probe system can be linear or circular (as shown in Figure 3). Figure 4 shows some features of the circular embodiment shown in Figure 3B.

如图4中所示,在一些实施方案中,探针系统可以包含夹板寡核苷酸集合2(式X’-A’-B’-Z’,其可以处于5’至3’或3’至5’方向)、式X-Y-Z的主链探针6,其中序列X和Z不是可变的并且以反向与序列X’和Z’杂交(即,从而,主链的末端指向彼此,如所显示),和具有序列B的基因座特异性寡核苷酸集合8。主链探针中的序列Y可以是任何便利长度,例如,20至100个核苷酸。主链探针6的总长度可以处于50至300个核苷酸长度范围内,或在某些情况下更长。As shown in Figure 4, in some embodiments, the probe system can include a set of splint oligonucleotides 2 (formula X'-A'-B'-Z', which can be in the 5' to 3' or 3' to 5' direction), a backbone probe 6 of the formula X-Y-Z, wherein sequences X and Z are not variable and hybridize to sequences X' and Z' in reverse orientation (i.e., so that the ends of the backbone point to each other, as shown), and a set of locus-specific oligonucleotides 8 having sequence B. Sequence Y in the backbone probe can be any convenient length, for example, 20 to 100 nucleotides. The total length of the backbone probe 6 can be in the range of 50 to 300 nucleotides in length, or in some cases longer.

As shown in Figure 4, the various oligonucleotides of the probe set are characterized in that they can hybridize with genomic fragments to produce a first set of ligatable circular complexes 10 (i.e., complexes in which the ends of the backbone probe 6, the locus-specific oligonucleotide 8 and the genomic fragment 4 are ligatably adjacent to one another and are held in that position by the splint oligonucleotide 2). In the example shown, the backbone probe 6, the locus-specific oligonucleotide 8 and the fragment 4 hybridize to the first splint oligonucleotide 2 to produce a set of ligatable circular complexes 10 of the formula X-A-B-Z-Y, in which sequence Y joins sequences X and Z. The fragments 4 present in this set of ligatable circular complexes 10 may be from at least 2, at least 5, at least 10, or at least 50 or more different loci (e.g., different chromosomes), and the identity of the locus (e.g., the particular chromosome) from which the adjacent fragment is derived is provided by the locus-specific oligonucleotide 8, whose sequence is the same for all fragments of a given locus. In this example, sequences A and A' (which correspond to the different genomic fragments) vary, B and B' (the locus markers) vary, and sequences X, Y and Z do not vary.

As will be described in more detail below, in this embodiment the probe system (which comprises a first set of splint oligonucleotides 2, a backbone probe 6 and a locus-specific oligonucleotide 8) can be hybridized with a sample comprising fragments 4 of a genome to produce a first set of ligatable circular complexes 10 of the formula X-A-B-Z-Y, as shown. After the ligatable circular complexes have been ligated to produce a first set of circular DNA molecules 12 of the formula X-A-B-Z-Y, the first set of circular DNA molecules can be amplified by rolling circle amplification (RCA) to produce a first set of RCA products 16. The amplification can be performed using a rolling circle amplification primer 14 that hybridizes to a sequence in the backbone probe 6 (as shown in Figure 4) or using PCR primers that hybridize to sites flanking the ligated fragment. As such, in certain embodiments, the probe system may additionally comprise a rolling circle amplification primer 14 that hybridizes to a sequence in the backbone probe 6, or a pair of PCR primers that hybridize to sites flanking the ligated fragment. After RCA, the "source" of the cloned fragment in a particular RCA product 16 (i.e., the locus, e.g., the particular chromosome, from which the cloned genomic fragment is derived) can then be determined by hybridizing a first labeled oligonucleotide 18 to the complement of sequence B (i.e., B'), or by sequencing. As will be apparent, the labeled oligonucleotide 18 may comprise at least some of sequence B. Thus, in certain embodiments, the probe system may additionally comprise a labeled oligonucleotide that hybridizes to the complement of the first locus-specific oligonucleotide 8.

如将显而易见,如果来自两个或更多个不同基因座的序列将在相同反应中检出,则探针系统可以包含额外的可区分地标记的寡核苷酸,每个基因座标示物B各一个,从而可以同时鉴定两个RCA产物集合。在这些实施方案中,探针系统还可以包含至多四个可区分地标记的寡核苷酸(例如,B1、B2、B3、B4),其中每个可区分地标记的寡核苷酸与序列B’的互补物(例如,B1’、B2’、B3’、B4’)杂交。As will be apparent, if sequences from two or more different loci are to be detected in the same reaction, the probe system may comprise additional distinguishably labelled oligonucleotides, one for each locus marker B, so that two sets of RCA products can be identified simultaneously. In these embodiments, the probe system may also comprise up to four distinguishably labelled oligonucleotides (e.g., B1 , B2 , B3 , B4 ), wherein each distinguishably labelled oligonucleotide hybridises to the complement of sequence B' (e.g., B1 ', B2 ', B3 ', B4 ').

如将显而易见,与夹板寡核苷酸杂交的片段是正在分析的基因组的限制性片段。另外,上文描述的探针、寡核苷酸或引物的任一者(例如,主链探针)可以含有分子条形码(例如,索引序列如随机或半随机序列),从而每个环状DNA分子能够依据克隆片段和条形码的结合区分,从而允许计数有多少初始分子被测序,甚至在分子已经扩增后也是如此(参见,例如,Casbon等人)。As will be apparent, the fragments that hybridize to the splint oligonucleotides are restriction fragments of the genome being analyzed. Additionally, any of the probes, oligonucleotides, or primers described above (e.g., backbone probes) can contain a molecular barcode (e.g., an index sequence such as a random or semi-random sequence) so that each circular DNA molecule can be distinguished based on the combination of cloned fragments and barcodes, thereby allowing counting of how many initial molecules were sequenced, even after the molecules have been amplified (see, e.g., Casbon et al.).

Method

本文中还提供一种方法,所述方法包括:(a)将如上文所述的探针系统与包含基因组的片段的测试基因组样品杂交,以产生式X-A-B-Z的可连接复合物;(b)连接可连接复合物以产生式X-A-B-Z的产物DNA分子;并且(c)计数与序列B的每个基因座标示物相对应的产物DNA分子。在一些实施方案中,可以通过以下方式进行计数:对产物DNA分子或其扩增产物测序,以产生序列读出结果,并且计数包含每个序列B的序列读出结果的数目。Also provided herein is a method comprising: (a) hybridizing a probe system as described above to a test genomic sample comprising a fragment of a genome to produce a ligatable complex of the formula X-A-B-Z; (b) ligating the ligatable complex to produce product DNA molecules of the formula X-A-B-Z; and (c) counting the product DNA molecules corresponding to each locus marker of sequence B. In some embodiments, counting can be performed by sequencing the product DNA molecules or amplified products thereof to produce sequence reads, and counting the number of sequence reads comprising each sequence B.
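The read-counting variant of step (c) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the method of the source: it assumes error-free reads, assumes that each locus marker B appears verbatim in a read, and uses a hypothetical function name and invented marker sequences.

```python
from collections import Counter


def count_reads_by_marker(reads, markers):
    """Count sequence reads per locus marker.

    reads:   iterable of read sequences (strings).
    markers: dict mapping a locus name to its marker sequence B.
    A read is counted only if exactly one marker occurs in it; reads with
    no marker or with several markers are ignored as ambiguous.
    """
    counts = Counter({name: 0 for name in markers})
    for read in reads:
        hits = [name for name, seq in markers.items() if seq in read]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts
```

A practical implementation would also need to tolerate sequencing errors in the marker, which this exact-match sketch does not attempt.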

In embodiments in which the product DNA molecules are circular, the counting may comprise amplifying the product DNA molecules by rolling circle amplification and counting the number of amplification products that comprise each sequence B. In these embodiments, the method may comprise labeling the RCA products using distinguishably labeled probes that hybridize to sequence B, and the counting is done by counting the number of RCA products carrying each distinguishable label. The general principle of one implementation of this method is shown in Figure 4. As will be apparent, the fragments that hybridize to the splint oligonucleotides may (independently) be top-strand or bottom-strand restriction fragments of the genome being analyzed. These fragments can be produced by digesting the genome with one or more restriction enzymes (e.g., a combination of enzymes that have four-base recognition sequences) and then denaturing the digested sample. The fragments being cloned therefore have defined ends, which allows splint oligonucleotides to be designed to clone them. There are other ways of producing fragments with defined ends (e.g., methods that use flap endonucleases, exonucleases, gap filling, etc.).
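Why restriction digestion yields fragments with defined ends can be illustrated with a simple in silico digest. The sketch below is a single-strand simplification offered under assumptions: the function name, the recognition site and the cut offset are hypothetical examples, and overhang geometry and the complementary strand are ignored.

```python
def digest(sequence: str, site: str, cut_offset: int):
    """Split a sequence at every occurrence of a recognition site.

    cut_offset is the position within the site at which the strand is cut
    (0 cuts immediately before the site). Because the cut positions are fixed
    by the sequence itself, every copy of the genome yields the same fragment
    ends, which is what allows a splint to be designed for each fragment.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        if cut > start:
            fragments.append(sequence[start:cut])
            start = cut
        pos = sequence.find(site, pos + 1)
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments
```

For example, `digest("AAGATCTTGATCCC", "GATC", 0)` returns `["AA", "GATCTT", "GATCCC"]`.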

As noted above, this method can be multiplexed to provide a way of analyzing two or more different loci, as shown in Figure 5. With reference to Figure 5, a sample containing fragments of genomic DNA 40 may be: (a) hybridized with a probe system 42 comprising: (i) a first set of splint probes as described above; (ii) a first locus-specific oligonucleotide as described above; (iii) a second set of splint probes as described above; (iv) a second locus-specific oligonucleotide as described above; and (v) a backbone probe as described above, to produce a mixture 44 comprising ligatable circular complexes of the formula X-A-B-Z-Y, the ligatable circular complexes containing fragments from the first locus (e.g., a first chromosome) as well as fragments from the second locus (e.g., a second chromosome). Next, the method comprises (b) ligating the ligatable circular complexes to produce a mixture of circular DNA molecules 46 (the mixture containing first and second sets of circular DNA molecules) and, after treating the sample with an exonuclease to remove linear nucleic acid molecules, (c) amplifying the circular DNA molecules 46 by rolling circle amplification using a single primer that hybridizes to the backbone probe, to produce RCA products 48. The locus from which the fragment contained in each RCA product is derived can then be identified by hybridizing the RCA products with distinguishably labeled first and second oligonucleotide probes that hybridize to the complement of the locus-specific oligonucleotide present in each product, to produce a labeled sample 50. In these embodiments, the method may comprise: (d) separately (i) detecting the RCA products that contain fragments from the first locus using a labeled probe that hybridizes to the first locus marker sequence and (ii) detecting the RCA products that contain fragments from the second locus using a labeled probe that hybridizes to the second locus marker sequence, wherein the labeled probes are distinguishably labeled. As noted above, if the splint oligonucleotides are biotinylated, the circular products can be separated from unligated products after ligation using, e.g., streptavidin beads. In either case, the ligated sample may be treated with an exonuclease to remove linear DNA molecules from the reaction. This principle can be extended to counting the number of ligation products produced for any number of loci (e.g., 3, 4, up to 10, or up to 100 or more loci).

In some embodiments, the detecting step (d) may comprise: (i) depositing the RCA products on a support; and (ii) separately counting, in an area of the support, the number of individual labeled RCA products that are labeled with one label and the number of individual labeled RCA products that are labeled with the other label. As will be appreciated, hybridization of the labeled oligonucleotides may be done before or after the RCA products are distributed on the support.

也就是说,可以通过例如以下方式估计与每个基因座相对应的滚环扩增产物的数目:在支持物(载片或多孔膜)的表面上分布RCA产物、使用标记的寡核苷酸(例如,荧光标记的寡核苷酸)杂交RCA产物并且随后例如,使用荧光读数仪,计数支持物的区域中离散信号的数目。标记可以在产物已经于支持物上分布之前或之后进行,并且,因为每个RCA产物含有上千个副本的相同序列,应当存在标记寡核苷酸的上千个结合位点,从而增加信号。在多重实施方案(例如,其中计数与两个不同基因座相对应的RCA产物)中,与一个基因座相对应的RCA产物可以用一种荧光团标记并且与另一个基因座相对应的RCA产物可以用不同荧光团标记,从而允许分别计数不同的RCA产物。That is, the number of rolling circle amplification products corresponding to each locus can be estimated, for example, by distributing the RCA products on the surface of a support (slide or porous membrane), hybridizing the RCA products using labeled oligonucleotides (e.g., fluorescently labeled oligonucleotides) and then, for example, using a fluorescence reader, counting the number of discrete signals in the region of the support. Labeling can be performed before or after the product has been distributed on the support, and, because each RCA product contains thousands of copies of the same sequence, there should be thousands of binding sites for the labeled oligonucleotides, thereby increasing the signal. In a multiplex embodiment (e.g., wherein the RCA products corresponding to two different loci are counted), the RCA product corresponding to one locus can be labeled with one fluorophore and the RCA product corresponding to the other locus can be labeled with a different fluorophore, thereby allowing different RCA products to be counted separately.

在某些实施方案中,该方法包括:(a)经多孔透明毛细管膜过滤含有滚环扩增(RCA)产物的液体样品,从而在该膜上浓缩RCA产物并产生RCA产物的阵列;(b)在步骤(a)之前或之后将RCA产物荧光标记;并且,(c)计数膜区域中各个标记的RCA产物的数目,从而提供对样品中标记的RCA产物数目的评估。在一些实施方案中,多孔透明毛细管膜可以是多孔阳极氧化铝膜。在这些实施方案中,可以通过在步骤(a)之前或之后将荧光标记的寡核苷酸与RCA产物杂交,进行步骤(b)。在某些实施方案中,该方法可以包括对膜区域成像以产生一幅或多幅图像并计数一幅或多幅图像中各个标记的RCA产物的数目。这类方法的例子在2016年5月2日提交的PCT/IB2016/052495中描述,所述文献通过引用的方式并入本文。In certain embodiments, the method comprises: (a) filtering a liquid sample containing rolling circle amplification (RCA) products through a porous transparent capillary membrane, thereby concentrating the RCA products on the membrane and producing an array of RCA products; (b) fluorescently labeling the RCA products before or after step (a); and, (c) counting the number of each labeled RCA product in the membrane area, thereby providing an assessment of the number of labeled RCA products in the sample. In some embodiments, the porous transparent capillary membrane can be a porous anodic aluminum oxide membrane. In these embodiments, step (b) can be performed by hybridizing a fluorescently labeled oligonucleotide to the RCA product before or after step (a). In certain embodiments, the method can include imaging the membrane area to produce one or more images and counting the number of each labeled RCA product in the one or more images. Examples of such methods are described in PCT/IB2016/052495, filed May 2, 2016, which is incorporated herein by reference.

定量来自各个RCA产物的信号是有意义的,因为在许多应用(例如,依据cfDNA分析的非侵入性产前诊断)中,与特定染色体(例如,染色体21)相对应的片段的数目需要相当准确地和在无偏倚的情况下测定。常见分析方法使用PCR,如熟知,所述PCR是一种非常有偏倚的方法,因为一些序列的扩增效率比其他序列高得多。对于许多诊断工作而言,这使得基于PCR的策略不切实际。Quantifying the signal from each RCA product is meaningful because in many applications (e.g., non-invasive prenatal diagnosis based on cfDNA analysis), the number of fragments corresponding to a specific chromosome (e.g., chromosome 21) needs to be determined quite accurately and without bias. Common analytical methods use PCR, which, as is well known, is a very biased method because some sequences amplify much more efficiently than others. For many diagnostic tasks, this makes PCR-based strategies impractical.

In particular embodiments, the sample may contain multiple populations of RCA products (e.g., 2, 3, or 4 or more populations of RCA products, such as a first population of labeled RCA products and a second population of RCA products), wherein the different populations of RCA products are distinguishably labeled, meaning that the individual members of each labeled population can be independently detected and counted even when the populations are mixed. Suitable pairs of distinguishable fluorescent labels that can be used in the subject method include, e.g., Cy-3 and Cy-5 (Amersham Inc., Piscataway, NJ), Quasar 570 and Quasar 670 (Biosearch Technology, Novato, CA), Alexafluor555 and Alexafluor647 (Molecular Probes, Eugene, OR), BODIPY V-1002 and BODIPY V1005 (Molecular Probes, Eugene, OR), POPO-3 and TOTO-3 (Molecular Probes, Eugene, OR), and POPRO3 and TOPRO3 (Molecular Probes, Eugene, OR). Other suitable distinguishable detectable labels can be found in, e.g., Kricka et al. (Ann Clin Biochem. 39:114-29, 2002). For example, the RCA products may be labeled with any combination of ATTO, ALEXA, CY, or dimeric cyanine dyes such as YOYO, TOTO, etc. Other labels may also be used.

In some cases, a population of RCA products may be distinguishably labeled by labeling it with more than one label, thereby increasing the possibilities for multiplexing. For example, in some cases a population may be labeled with two distinguishable dyes (e.g., Cy3 and Cy5) and, when read, that population will be distinguishable from populations labeled with a single dye (e.g., Cy3 or Cy5). In some embodiments, the first population of RCA products represents a "test" population of labeled RCA products and the second population of RCA products represents a "reference" population of RCA products against which the number of first RCA products can be compared. For example, in some embodiments, the first population of RCA products may correspond to a first chromosomal region (e.g., a first chromosome such as chromosome 21) and the second population of RCA products may correspond to a second chromosomal region (e.g., a second chromosome such as chromosome 13 or 18, or a different region of the first chromosome), and the numbers of RCA products in the first and second populations can be counted and compared to determine whether there is a difference in the copy number of the regions (indicating a duplication or deletion of the test region). In some embodiments, the sample contains at least a first population of RCA products and a second population of RCA products, wherein the first and second populations of RCA products are distinguishably labeled in the labeling step (step (b)). In these embodiments, the method comprises counting the number of first labeled RCA products in an area of the membrane and counting the number of second labeled RCA products in an area of the membrane (the same area or a different area), thereby providing an estimate of the number of RCA products of the first and second populations in the sample. This embodiment may further involve comparing the number of first RCA products in the sample with the number of second RCA products in the sample.

在该方法的这些实施方案中的某些实施方案内,该方法可以包括对第一和第二标记的RCA产物群体成像以产生一幅或多幅图像(例如,分别是第一图像和第二图像)并且,任选地,(i)计数一幅或多幅图像中标记的RCA产物的数目,从而提供对样品中第一和第二标记的RCA产物群体数目的评估。可以使用已知的方法(例如,使用适宜的滤器等)分别检测第一和第二标记的RCA产物群体。该方法的这些实施方案还可以包括将样品中第一标记的RCA产物的数目与样品中第二标记的RCA产物的数目比较。该方法的这个步骤可以涉及计数第一群体中至少1,000个(例如,至少5,000、至少10,000、至少20,000、至少50,000、至少100,000、至少500,000个直至1百万个或更多个)标记的RCA产物并且计数膜区域中至少1,000个(例如、至少5,000、至少10,000、至少20,000或至少50,000、至少100,000、至少500,000个直至1百万个或更多个)标记的RCA产物,从而确保可以按统计严格性描述拷贝数的差异。In certain of these embodiments of the method, the method may comprise imaging the first and second labeled RCA product populations to produce one or more images (e.g., a first image and a second image, respectively) and, optionally, (i) counting the number of labeled RCA products in the one or more images, thereby providing an assessment of the number of first and second labeled RCA product populations in the sample. The first and second labeled RCA product populations may be detected separately using known methods (e.g., using suitable filters, etc.). These embodiments of the method may further comprise comparing the number of first labeled RCA products in the sample with the number of second labeled RCA products in the sample. This step of the method may involve counting at least 1,000 (e.g., at least 5,000, at least 10,000, at least 20,000, at least 50,000, at least 100,000, at least 500,000 up to 1 million or more) labelled RCA products in the first population and counting at least 1,000 (e.g., at least 5,000, at least 10,000, at least 20,000 or at least 50,000, at least 100,000, at least 500,000 up to 1 million or more) labelled RCA products in a region of the membrane to ensure that differences in copy number can be described with statistical rigour.
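The reason for counting large numbers of products can be illustrated with a short calculation. The sketch below is not taken from the source: it assumes that the test and reference counts behave as independent Poisson counts and that both loci are captured with equal efficiency, so that equal copy number implies equal expected counts. The function name `compare_counts` is hypothetical.

```python
import math


def compare_counts(test_count: int, reference_count: int):
    """Return (ratio, z) for a test count versus a reference count.

    Under the assumed null hypothesis (two independent Poisson counts with the
    same mean), the difference of the counts has a variance approximately equal
    to their sum, so z is approximately standard normal.
    """
    if test_count < 0 or reference_count <= 0:
        raise ValueError("counts must be non-negative and the reference must be positive")
    ratio = test_count / reference_count
    z = (test_count - reference_count) / math.sqrt(test_count + reference_count)
    return ratio, z
```

Under these assumptions, 102,500 test counts against 100,000 reference counts give a ratio of 1.025 and a z-score of about 5.6, whereas the same ratio obtained from 1,025 and 1,000 counts gives a z-score of only about 0.56. A ratio of 1.025 is roughly what a trisomy would produce at a fetal fraction of 5%, since the expected ratio is then 1 + f/2 for fetal fraction f.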

在可选实施方案中,可以使用与那些序列翼侧的位点杂交或与之相同的PCR引物,通过PCR扩增DNA分子中的克隆片段(和,任选地,环状DNA分子中的任何索引序列)。在这个实施方案中,PCR产物可以使用所述引物扩增。在这个实施方案中,可以通过任何合适的qPCR测定法(例如,Taqman测定法)等对产物的数量定量。在另一个实施方案中,产物可以测序(在扩增或不扩增的情况下)。在这些实施方案中,与每个基因座相对应的环状分子的数量可以通过计数与基因座相对应的序列读出结果的数目(例如,计数有多少序列读出结果具有特定的基因座特异性条形码序列)来估计。在一些实施方案中,如果使用索引序列,可以通过确定有多少不同分子条形码序列与每个基因座特异性条形码序列相关,计数与每个基因座相对应的环状分子的数目。In an optional embodiment, PCR primers that hybridize to or are identical to the sites flanking those sequences can be used to amplify cloned fragments (and, optionally, any index sequence in the circular DNA molecule) in the DNA molecule by PCR. In this embodiment, the PCR products can be amplified using the primers. In this embodiment, the quantity of the product can be quantified by any suitable qPCR assay (e.g., Taqman assay) or the like. In another embodiment, the product can be sequenced (with or without amplification). In these embodiments, the number of circular molecules corresponding to each locus can be estimated by counting the number of sequence reads corresponding to the locus (e.g., counting how many sequence reads have specific locus-specific barcode sequences). In some embodiments, if an index sequence is used, the number of circular molecules corresponding to each locus can be counted by determining how many different molecular barcode sequences are associated with each locus-specific barcode sequence.
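Counting by molecular barcode, as described above, amounts to counting the distinct molecular barcodes seen with each locus-specific barcode, so that amplification duplicates are counted once. The following sketch assumes that each read has already been parsed into a (locus barcode, molecular barcode) pair and that barcodes are read without error; the function name is hypothetical.

```python
from collections import defaultdict


def count_unique_molecules(read_tags):
    """Count distinct molecular barcodes per locus-specific barcode.

    read_tags: iterable of (locus_barcode, molecular_barcode) pairs, one per read.
    Reads that share both barcodes are treated as copies of one original molecule.
    """
    seen = defaultdict(set)
    for locus_barcode, molecular_barcode in read_tags:
        seen[locus_barcode].add(molecular_barcode)
    return {locus: len(barcodes) for locus, barcodes in seen.items()}
```

For example, three reads tagged `("B1", "AAC")`, `("B1", "AAC")` and `("B1", "GTT")` count as two original molecules for locus B1.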

As will be apparent, the primers used in this embodiment can contain sequences that are compatible with use in, e.g., Illumina's reversible terminator method, Roche's pyrosequencing method (454), Life Technologies' sequencing by ligation (the SOLiD platform), or Life Technologies' Ion Torrent platform. Examples of such methods are described in the following references: Margulies et al. (Nature 2005 437:376–80); Ronaghi et al. (Analytical Biochemistry 1996 242:84–9); Shendure (Science 2005 309:1728); Imelfort et al. (Brief Bioinform. 2009 10:609-18); Fox et al. (Methods Mol Biol. 2009; 553:79-108); Appleby et al. (Methods Mol Biol. 2009; 513:19-39); and Morozova (Genomics. 2008 92:255-64), which are incorporated by reference for their general descriptions of the methods and the particular steps of the methods, including all starting products, reagents, and final products for each step.

The test genomic sample may be from a patient who is suspected of having, or is at risk of having, a disease or condition, and the results of step (c) provide an indication of whether the patient, or a fetus thereof, has the disease or condition. In some embodiments, the disease or condition may be a cancer, an infectious disease, an inflammatory disease, a transplant rejection, or a chromosomal defect such as a trisomy.

As noted above, in some cases the sample analyzed using this method may be a sample of cfDNA obtained from blood (e.g., from the blood of a pregnant woman). In these embodiments, the method may be used to detect chromosomal abnormalities in the developing fetus (as described above) or, for example, to calculate the fraction of fetal DNA in the sample.

Exemplary copy number abnormalities that can be detected using the method include, but are not limited to, trisomy 21, trisomy 13, trisomy 18, trisomy 16, XXY, XYY, XXX, monosomy X, monosomy 21, monosomy 22, monosomy 16, and monosomy 15. Other copy number abnormalities that can be detected using the present method are listed in the following table.

Chromosome: Abnormality (disease association)

X: XO (Turner syndrome)
Y: XXY (Klinefelter syndrome)
Y: XYY (double Y syndrome)
Y: XXX (trisomy X syndrome)
Y: XXXX (four X syndrome)
Y: Xp21 deletion (Duchenne/Becker syndrome, congenital adrenal hypoplasia, chronic granulomatous disease)
Y: Xp22 deletion (steroid sulfatase deficiency)
Y: Xq26 deletion (X-linked lymphoproliferative disease)
1: 1p, somatic (neuroblastoma)
1: monosomy (neuroblastoma)
1: trisomy (neuroblastoma)
2: monosomy (growth retardation, developmental delay, intellectual disability, and minor physical abnormalities)
2: trisomy 2q (growth retardation, developmental delay, intellectual disability, and minor physical abnormalities)
3: monosomy (non-Hodgkin lymphoma)
3: trisomy, somatic (non-Hodgkin lymphoma)
4: monosomy (acute non-lymphocytic leukemia (ANLL))
4: trisomy, somatic (acute non-lymphocytic leukemia (ANLL))
5: 5p (cri du chat; Lejeune syndrome)
5: 5q, somatic (myelodysplastic syndrome)
5: monosomy (myelodysplastic syndrome)
5: trisomy (myelodysplastic syndrome)
6: monosomy (clear cell sarcoma)
6: trisomy, somatic (clear cell sarcoma)
7: 7q11.23 deletion (Williams syndrome)
7: monosomy (monosomy 7 syndrome of childhood; somatic: renal cortical adenoma; myelodysplastic syndrome)
7: trisomy (monosomy 7 syndrome of childhood; somatic: renal cortical adenoma; myelodysplastic syndrome)
8: 8q24.1 deletion (Langer-Giedion syndrome)
8: monosomy (myelodysplastic syndrome; Warkany syndrome; somatic: chronic myelogenous leukemia)
8: trisomy (myelodysplastic syndrome; Warkany syndrome; somatic: chronic myelogenous leukemia)
9: monosomy 9p (Alfi's syndrome)
9: monosomy 9p (Rethore syndrome)
9: partial trisomy (Rethore syndrome)
9: trisomy (complete trisomy 9 syndrome; mosaic trisomy 9 syndrome)
10: monosomy (ALL or ANLL)
10: trisomy, somatic (ALL or ANLL)
11: 11p- (aniridia; Wilms tumor)
11: 11q- (Jacobsen syndrome)
11: monosomy (myeloid lineages affected (ANLL, MDS))
11: trisomy, somatic (myeloid lineages affected (ANLL, MDS))
12: monosomy (CLL, juvenile granulosa cell tumor (JGCT))
12: trisomy, somatic (CLL, juvenile granulosa cell tumor (JGCT))
13: 13q- (13q- syndrome; Orbeli syndrome)
13: 13q14 deletion (retinoblastoma)
13: monosomy (Patau syndrome)
13: trisomy (Patau syndrome)
14: monosomy (myeloid disorders (MDS, ANLL, atypical CML))
14: trisomy, somatic (myeloid disorders (MDS, ANLL, atypical CML))
15: 15q11-q13 deletion (Prader-Willi, Angelman syndrome)
15: monosomy (Prader-Willi, Angelman syndrome)
15: trisomy, somatic (myeloid and lymphoid lineages affected, e.g., MDS, ANLL, ALL, CLL)
16: 16q13.3 deletion (Rubinstein-Taybi)
16: monosomy (papillary renal cell carcinoma (malignant))
16: trisomy, somatic (papillary renal cell carcinoma (malignant))
17: 17p-, somatic (17p syndrome in myeloid malignancies)
17: 17q11.2 deletion (Smith-Magenis)
17: 17q13.3 (Miller-Dieker)
17: monosomy (renal cortical adenoma)
17: trisomy, somatic (renal cortical adenoma)
17: 17p11.2-12 (Charcot-Marie-Tooth syndrome type 1; HNPP)
17: trisomy (Charcot-Marie-Tooth syndrome type 1; HNPP)
18: 18p- (18p partial monosomy syndrome or Grouchy Lamy Thieffry syndrome)
18: 18q- (Grouchy Lamy Salmon Landry syndrome)
18: monosomy (Edwards syndrome)
18: trisomy (Edwards syndrome)
19: monosomy (Edwards syndrome)
19: trisomy (Edwards syndrome)
20: 20p- (trisomy 20p syndrome)
20: 20p11.2-12 deletion (Alagille)
20: 20q- (somatic: MDS, ANLL, polycythemia vera, chronic neutrophilic leukemia)
20: monosomy (papillary renal cell carcinoma (malignant))
20: trisomy, somatic (papillary renal cell carcinoma (malignant))
21: monosomy (Down syndrome)
21: trisomy (Down syndrome)
22: 22q11.2 deletion (DiGeorge syndrome, velocardiofacial syndrome, conotruncal anomaly face syndrome, autosomal dominant Opitz G/BBB syndrome, Caylor cardiofacial syndrome)
22: monosomy (complete trisomy 22 syndrome)
22: trisomy (complete trisomy 22 syndrome)

The method described herein can be employed to analyze genomic DNA from virtually any organism, including, but not limited to, plants, animals (e.g., reptiles, mammals, insects, worms, fish, etc.), tissue samples, bacteria, fungi (e.g., yeast), phage, viruses, cadaveric tissue, archaeological/ancient samples, etc. In certain embodiments, the genomic DNA used in the method may be derived from a mammal, wherein in certain embodiments the mammal is a human. In exemplary embodiments, the genomic sample may contain genomic DNA from a mammalian cell, such as a human, mouse, rat, or monkey cell. The sample may be made from cultured cells or from cells of a clinical sample, e.g., a tissue biopsy, scrape, or lavage, or from cells of a forensic sample (i.e., cells of a sample collected at a crime scene). In particular embodiments, the nucleic acid sample may be obtained from a biological sample such as cells, tissues, bodily fluids, and stool. Bodily fluids of interest include, but are not limited to, serum, plasma, saliva, mucus, phlegm, cerebrospinal fluid, pleural fluid, tears, lactal duct fluid, lymph, sputum, synovial fluid, urine, amniotic fluid, and semen. In particular embodiments, a sample may be obtained from a subject, e.g., a human. In some embodiments, the sample analyzed may be a sample of cfDNA obtained from blood, e.g., from the blood of a pregnant woman.

For example, in some embodiments, a sample of DNA may be obtained and digested with one or more restriction enzymes (or an RNA-guided endonuclease such as Cas9) to produce predictable fragments (the median size of which may be in the range of 20-100 bases). The method described above may be performed on the digested DNA and, using the method described herein, the number of fragments corresponding to one locus (e.g., one chromosome) can be compared with the number of fragments corresponding to another locus (e.g., another chromosome). As noted, the method can be used to identify copy number differences that are associated with a disease or condition, e.g., chromosomal aneuploidy.
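The idea that restriction digestion yields predictable fragments can be illustrated with a small in silico digest. This is only a sketch under stated assumptions: it handles a single strand, ignores methylation and incomplete digestion, and uses the EcoRI recognition site (GAATTC, cut after the first G) purely as an example, since the disclosure does not name a particular enzyme here. The function name `digest` and the example sequence are hypothetical.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut
    falls (e.g., 1 for a G^AATTC pattern). Only the given strand is modeled.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        if cut > start:
            fragments.append(sequence[start:cut])
            start = cut
        pos = sequence.find(site, pos + 1)
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing fragment after the last cut
    return fragments


if __name__ == "__main__":
    # Hypothetical 18-base sequence containing two example sites.
    example = "AAGAATTCGGGAATTCTT"
    for fragment in digest(example, site="GAATTC", cut_offset=1):
        print(len(fragment), fragment)
```

Because the cut positions depend only on the sequence, the same reference sequence always gives the same fragment ends, which is what allows splint sequences complementary to those fragments to be designed in advance.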

As noted above, in some cases the sample analyzed may be a sample of cfDNA obtained from blood (e.g., from the blood of a pregnant woman). In these embodiments, the method may be used to detect chromosomal abnormalities in the developing fetus or, for example, to calculate the fraction of fetal DNA in the sample.

Kits

The present disclosure also provides kits for practicing the subject method, as described above. In certain embodiments, a kit may comprise: (a) a set of splint oligonucleotides of formula X'-A'-B'-Z', wherein, within the set: (i) the sequences of A' and B' vary, and (ii) the sequences of X' and Z' are different from each other and are not variable; and, within each molecule: (i) sequence A' is complementary to a fragment of a genome and (ii) sequence B' identifies the locus from which the genomic fragment that hybridizes to the adjacent A' sequence is derived; (b) one or more probes comprising sequences X and Z, wherein sequences X and Z are not variable and hybridize to sequences X' and Z'; and (c) a set of locus-specific oligonucleotides of sequence B; wherein each splint oligonucleotide of (a) is capable of hybridizing to (i) the probe sequences of (b), (ii) a locus-specific oligonucleotide of (c), and (iii) a genomic fragment, to produce a ligatable complex of formula X-A-B-Z, in which sequence B identifies the locus of the adjacent sequence A. In some embodiments, the one or more probes of (b) comprise a first oligonucleotide comprising sequence X and a second oligonucleotide comprising sequence Y. In some embodiments, the kit may further comprise a pair of PCR primers that hybridize to the one or more probes comprising sequences X and Y. In certain embodiments, the one or more probe sequences of (b) are a backbone probe of formula X-Y-Z, and the ligatable complex is a circular ligatable complex of formula X-A-B-Z-Y, in which sequence Y joins sequences X and Z and sequence B identifies the locus of the adjacent sequence A. In these embodiments, the kit may further comprise a rolling circle amplification primer that hybridizes to a sequence in the backbone probe. In these embodiments, the kit may comprise a plurality of distinguishably labeled oligonucleotides, wherein each of the distinguishably labeled oligonucleotides hybridizes to the complement of a B' sequence. The kit may additionally contain a ligase and/or a strand-displacing polymerase for performing rolling circle amplification.

The various components of the kit may be present in separate containers, or certain compatible components (e.g., the first and second sets of splint probes and the first and second locus-specific probes) may be pre-combined into a single container, as desired.

In addition to the components mentioned above, the subject kits may further include instructions for using the components of the kit to practice the subject method.

Examples

The following examples are put forth so as to provide those of ordinary skill in the art with additional disclosure and description of how to make and use the present invention. They are not intended to limit the scope of what the inventors regard as their invention, nor are they intended to represent that the experiments below are all or the only experiments performed.

Example I

Initial data validating the method

The purpose of this experiment was to compare a method that uses chromosome-specific backbone oligonucleotides (e.g., one in which the backbone oligonucleotide used to capture fragments from a first chromosome (e.g., chromosome 21) differs from the backbone oligonucleotide used to capture fragments from a second chromosome (e.g., chromosome 18), as described in WO 2015083001 and WO 2015083002) with a method in which the same backbone oligonucleotide is used for all chromosomes tested. This is shown in Figure 6. As shown, the "new" design uses a chromosome-specific sequence (e.g., A or B), cloned into the same circular product as the target fragment, to identify the origin of the cloned fragment. In the new method, a single backbone oligonucleotide is used (rather than the multiple backbone oligonucleotides of the previous method), and the cloned fragments from all chromosomes can be amplified using the same RCA primer or a single pair of PCR primers.

Cell line DNA (10 ng) was digested, denatured, and hybridized to the "old" design probes and the "new" design probes. After hybridization and ligation, the ligation reactions were treated with exonuclease to remove any non-circularized DNA from the solution. The remaining circular products served as templates in an RCA reaction, which produces concatemeric copies of the circular products. These RCA products were labeled with fluorescently labeled oligonucleotides complementary to the "splint" sequence and deposited onto a solid support for detection.

Thirteen cfDNA samples from pregnant women were subjected to the same reactions as described above.

For all reactions, the number of individual objects (RCA products) was counted in each color. For each sample, the ratio of the number of objects in color A to the number in color B was calculated, and the coefficient of variation (CV) of that ratio was calculated as a measure of assay precision. A low coefficient of variation makes it possible to measure samples with a low fetal fraction accurately. This was demonstrated by including samples containing a low spike-in amount of a trisomy 21 cell line sample.
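The ratio and coefficient of variation described above can be computed as in the following minimal sketch. The function name `ratio_cv` and the object counts are hypothetical and are not data from this example; the CV is taken here as the sample standard deviation of the per-sample ratios divided by their mean, which is the usual definition but is an assumption about the exact calculation used.

```python
import statistics


def ratio_cv(counts_a, counts_b):
    """Return per-sample A/B ratios and their coefficient of variation.

    `counts_a` and `counts_b` hold the number of objects counted in
    color A and color B for the same samples, in the same order.
    """
    ratios = [a / b for a, b in zip(counts_a, counts_b)]
    mean_ratio = statistics.mean(ratios)
    sd_ratio = statistics.stdev(ratios)  # sample standard deviation (n - 1)
    return ratios, sd_ratio / mean_ratio


if __name__ == "__main__":
    # Hypothetical object counts for four replicate samples.
    color_a = [100_250, 99_800, 100_900, 99_400]
    color_b = [100_000, 100_100, 100_300, 99_900]
    ratios, cv = ratio_cv(color_a, color_b)
    print([round(r, 4) for r in ratios])
    print(f"CV = {100 * cv:.2f}%")
```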

Based on the data shown in Figure 7, the new design yields a lower CV for both cell line DNA and cfDNA, making it possible to measure chromosomally abnormal fetal DNA more precisely.

Without wishing to be bound by any particular theory, it is believed that this method may be less sensitive to impurities in the sample.

Example II

Analysis of clinical samples

cfDNA samples were prepared from 26 individuals with normal pregnancies and 4 individuals carrying a fetus with trisomy 21. Blood (10 ml) from each patient was centrifuged to separate the plasma from the red blood cells and buffy coat. The corresponding plasma (approximately 3-5 ml per patient) was subjected to a bead-based DNA extraction protocol, yielding extracted cfDNA diluted in 50 μl of buffer.

The cfDNA was then subjected to the method described above and analyzed by digitally counting rolling circle products using a fluorescence microscope. All four positive cases were detected with a z-score greater than 3. The CV for the normal samples was calculated to be 0.49%, demonstrating the high precision of the assay.
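A z-score of the kind reported above can be sketched as follows. The disclosure does not state how the z-score was computed, so this sketch assumes the common convention of scoring each test ratio against the mean and sample standard deviation of the ratios from normal reference samples. The function name `z_scores` and all numeric values are hypothetical, except the threshold of 3, which is the value mentioned in the text.

```python
import statistics


def z_scores(test_ratios, reference_ratios):
    """Score each test ratio against a set of normal reference ratios."""
    mu = statistics.mean(reference_ratios)
    sigma = statistics.stdev(reference_ratios)  # sample standard deviation
    return [(ratio - mu) / sigma for ratio in test_ratios]


if __name__ == "__main__":
    # Hypothetical chromosome 21 / reference ratios from normal samples.
    normal = [0.995, 1.002, 0.998, 1.004, 1.001, 0.999, 1.003, 0.998]
    # Hypothetical test samples: the second has a slightly elevated ratio.
    tests = [1.001, 1.020]
    for ratio, z in zip(tests, z_scores(tests, normal)):
        call = "above threshold" if z > 3 else "within normal range"
        print(f"ratio={ratio:.3f} z={z:.1f} {call}")
```

With a reference CV well under 1%, as reported for the normal samples, even a small relative increase in the ratio produces a large z-score under this convention.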

Claims (23)

1. A probe system for analyzing a nucleic acid sample, comprising:
(a) a set of identifier oligonucleotides of sequence B;
(b) a set of splint oligonucleotides of formula X'-A'-B'-Z', wherein:
within the set: (i) sequences A' and B' vary, and (ii) sequences X' and Z' are different from each other and are not variable; and
within each splint oligonucleotide: (i) sequence A' is complementary to a genomic fragment of the nucleic acid sample, wherein the genomic fragment is a genomic fragment of sequence A, and (ii) sequence B' is complementary to at least one member of the set of identifier oligonucleotides;
wherein each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, identifies (i) the locus in the genome from which the genomic fragment is derived or (ii) the chromosome from which the genomic fragment is derived; and
(c) one or more probe sequences comprising X and Z, wherein sequences X and Z are not variable and are capable of hybridizing to sequences X' and Z';
wherein each splint oligonucleotide is capable of hybridizing to: (i) the probe sequences, (ii) a member of the set of identifier oligonucleotides, and (iii) the genomic fragment, thereby producing a ligatable complex of formula X-A-B-Z.

2. The probe system of claim 1, wherein the set of identifier oligonucleotides comprises at least two different identifier oligonucleotides of sequence B, and wherein the set of splint oligonucleotides contains: at least 100 different A' sequences; and at least two different B' sequences that are complementary to the at least two different identifier oligonucleotides.

3. The probe system of claim 1, wherein each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, corresponds to a genomic fragment.

4. The probe system of claim 1, wherein the genomic fragment is from a mammalian genome.

5. The probe system of claim 1, wherein each identifier oligonucleotide, or its complementary B' sequence in a splint oligonucleotide, identifies one or more of chromosome 21, chromosome 18, and chromosome 13.

6. The probe system of claim 1, wherein the genomic fragment is a restriction fragment.

7. The probe system of claim 1, wherein the one or more probe sequences of (c) further comprise an oligonucleotide comprising sequence Y, and wherein the ligatable complex is linear.

8. The probe system of claim 1, further comprising a pair of PCR primers that hybridize to the one or more probes of (c).

9. The probe system of any one of claims 1-6, wherein the one or more probe sequences of (c) comprise a backbone probe of formula X-Y-Z, wherein Y comprises an oligonucleotide sequence, such that the ligatable complex is a circular ligatable complex of formula X-A-B-Z-Y, in which sequence Y joins sequences X and Z.

10. The probe system of claim 9, further comprising a rolling circle amplification primer that hybridizes to a sequence in the backbone probe.

11. The probe system of claim 9, further comprising:
(A) a rolling circle amplification primer that hybridizes to a sequence in the backbone probe; and
(B) up to four distinguishably labeled detection oligonucleotides, wherein each of the distinguishably labeled detection oligonucleotides hybridizes to a B' sequence.

12. Use of the probe system of any one of claims 1-11 in the preparation of a kit for a method of determining whether a genomic sample has an aneuploidy, wherein the method comprises:
(a) hybridizing the probe system of any one of claims 1-11 with a test genomic sample comprising genomic fragments, to produce ligatable complexes of formula X-A-B-Z;
(b) ligating the ligatable complexes to produce product DNA molecules of formula X-A-B-Z; and
(c) counting the product DNA molecules corresponding to each sequence B or B'.

13. The use of claim 12, wherein the counting is done by sequencing the product DNA molecules, or amplification products thereof, to produce sequence reads, and counting the number of sequence reads comprising each sequence B or its complement.

14. The use of claim 12, wherein the product DNA molecules are circular, and the counting comprises amplifying the product DNA molecules by rolling circle amplification and counting the number of amplification products comprising each sequence B or its complement.

15. The use of claim 14, wherein the method comprises labeling the RCA products using distinguishably labeled probes that hybridize to sequence B', and the counting is done by counting the number of RCA products for each distinguishable label.

16. The use of claim 15, wherein the method further comprises: i. depositing the RCA products on a planar support; and ii. counting the number of individual labeled RCA products in an area of the support.

17. The use of claim 16, wherein the support is a glass slide.

18. The use of claim 16, wherein the support is a porous transparent capillary membrane.

19. The use of any one of claims 12-18, wherein different sequences B and their complementary sequences B' identify different chromosomes, and the method further comprises comparing the number of product DNA molecules comprising a first sequence of B or B' with the number of product DNA molecules comprising a second sequence of B or B', to determine whether the genomic sample has an aneuploidy.

20. The use of any one of claims 12-18, wherein the method comprises comparing the counting results of step (c) with counting results obtained from one or more reference samples.

21. The use of any one of claims 12-18, wherein the test genomic sample is from a patient who is suspected of having, or is at risk of having, a disease or condition, and the counting results of step (c) provide an indication of whether the patient, or a fetus thereof, has the disease or condition.

22. The use of claim 21, wherein the disease or condition is a cancer, an infectious disease, an inflammatory disease, a transplant rejection, or a trisomy.

23. The use of any one of claims 12-18, wherein the fragments are restriction fragments.
HK17104284.8A 2015-09-18 2016-09-16 Probe set for analyzing a dna sample and method for using the same HK1230651B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US62/220,746 2015-09-18

Publications (2)

Publication Number Publication Date
HK1230651A1 HK1230651A1 (en) 2017-12-08
HK1230651B true HK1230651B (en) 2021-01-29
