WO2025222410A1

WO2025222410A1 - Method for synthesis and assembly of large fragment genomic DNA in vitro

Info

Publication number: WO2025222410A1
Application number: PCT/CN2024/089597
Authority: WO
Inventors: 戴俊彪; 林继伟; 马英新
Original assignee: Bgi Geneland Scientific Co Ltd; Shenzhen Institute of Advanced Technology of CAS
Current assignee: Bgi Geneland Scientific Co Ltd; Shenzhen Institute of Advanced Technology of CAS
Priority date: 2024-04-24
Filing date: 2024-04-24
Publication date: 2025-10-30
Anticipated expiration: 2026-10-24

Abstract

Provided is a method for the synthesis and assembly of large fragment genomic DNA in vitro. A double-stranded DNA sticky-end generation system is used in the method. The generation system comprises DNA assembly sequences, single-stranded guide DNA, nicking endonuclease A, nicking endonuclease B, and nicking endonuclease C. The DNA assembly sequence comprises, from the 5' end to the 3' end: a first nicking endonuclease recognition site unit, a target DNA sequence, and a second nicking endonuclease recognition site unit. The method comprises: dividing a genomic sequence into n target DNA sequences, preparing DNA assembly sequences, designing and synthesizing single-stranded guide DNA, and adding nicking endonuclease A, nicking endonuclease B, and nicking endonuclease C to each DNA assembly sequence and the single-stranded guide DNA for enzyme digestion; and performing in vitro DNA assembly on the enzyme-digested DNA assembly sequences. The method overcomes the limitation of short overhangs generated by restriction endonucleases and improves the success rate of large DNA molecule assembly.

Description

A method for synthesizing and assembling large genomic DNA fragments in vitro

Technical Field

The present invention belongs to the technical field of synthetic biology and specifically relates to a method for synthesizing and assembling large-fragment genomic DNA in vitro.

Background Art

The ability to synthesize DNA lets researchers control the composition of DNA sequences, removing the reliance on natural sequences isolated from organisms and providing a versatile tool for engineering living systems. As the need to design complex metabolic pathways, genetic circuits, and even whole genomes grows, demand for synthetic DNA, particularly large DNA fragments, has risen steadily. Several methods use restriction endonucleases to join small DNA fragments into larger molecules, such as BioBrick, Golden Gate, and YeastFab. However, restriction endonucleases typically produce overhangs of only 0-4 nucleotides, which are too short to provide sufficient specificity and affinity for multi-fragment assembly, so most of these techniques are suited only to producing DNA molecules below 10 kbp. As the size and complexity of the desired DNA sequence increase, the assembly success rate drops sharply, making it difficult to scale to larger assemblies.

Several methods that do not depend on restriction endonucleases have also been developed, such as sequence- and ligation-independent cloning (SLIC), Gibson assembly, and polymerase cycling assembly (PCA). Gibson assembly is widely adopted because it is simple and efficient and leaves no restriction sites or scar sequences in the final construct. However, Gibson assembly is suitable only for DNA constructs up to a certain length, typically around 10-15 kb, and is not suitable for assembling whole chromosomes. In addition, the homologous recombination capability of Saccharomyces cerevisiae has been used to assemble synthetic DNA constructs ranging from tens to hundreds of kb, and even entire bacterial genomes. In recent years, several synthetic yeast chromosomes have been built in vivo using a method called SwAP-In (switching auxotrophies progressively for integration), which iteratively replaces native sequence with synthetic sequence. Although synthetic chromosomes can be assembled and integrated successfully in yeast cells, transferring an assembled chromosome from yeast to other organisms, such as mammalian cells, remains very difficult.

Summary of the Invention

To address the shortcomings of the prior art, the present invention aims to provide a method for synthesizing and assembling large-fragment genomic DNA in vitro.

The specific technical solution of the present invention is as follows:

The present invention provides the use of a system for generating double-stranded DNA sticky ends in the in vitro synthesis and assembly of large-fragment genomic DNA. The system includes a DNA assembly sequence, single-stranded guide DNA, nicking endonuclease A, nicking endonuclease B, and nicking endonuclease C.

From the 5' end to the 3' end, the DNA assembly sequence includes a first nicking endonuclease recognition site unit, a target DNA sequence, and a second nicking endonuclease recognition site unit. The first nicking endonuclease recognition site unit includes 2 or 3 nicking endonuclease A recognition sites and spacer bases located between those sites. The second nicking endonuclease recognition site unit includes 2 or 3 nicking endonuclease B recognition sites and spacer bases located between those sites. Nicking endonuclease A and nicking endonuclease B belong to the same class of nicking endonuclease, and their cleavage sites are located on the upper and lower strands of the DNA. After the DNA assembly sequence is digested with nicking endonuclease A and nicking endonuclease B, a single-stranded region is formed at each end of the DNA assembly sequence.

The single-stranded guide DNA includes a first single-stranded guide DNA and a second single-stranded guide DNA. Each single-stranded guide DNA consists of a recognition sequence for a target sequence and a short DNA sequence folded into a stem-loop structure. The target sequence is a single-stranded DNA sequence that includes a partial sequence of the single-stranded region and a partial sequence of the adjacent target DNA sequence; the former is referred to as the first target sequence and the latter as the second target sequence. The stem-loop structure formed by the short DNA sequence has a nicking endonuclease C recognition site but lacks a sequence that can be cleaved by nicking endonuclease C. After the recognition sequence hybridizes with the first target sequence, the DNA strand paired with the second target sequence is separated, and the recognition sequence further hybridizes with the second target sequence, so that the target sequence and the recognition sequence form a double-stranded structure. This double-stranded structure can be recognized by nicking endonuclease C, and it places a predetermined position of the second target sequence where nicking endonuclease C can cleave it through the enzyme's recognition of the recognition site on the single-stranded guide DNA. The recognition sequences of the first and second single-stranded guide DNAs hybridize, respectively, with the partial sequences of the single-stranded regions formed at the two ends of the DNA assembly sequence after digestion with nicking endonuclease A and nicking endonuclease B, together with the partial sequences of the adjacent target DNA sequences, to form double-stranded structures.

Further, nicking endonuclease A and nicking endonuclease B are each selected from one of Nt.BbvCI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, or from one of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, and Nb.BtsI;

Preferably, the first nicking endonuclease recognition site unit includes three nicking endonuclease A recognition sites, and the second nicking endonuclease recognition site unit includes three nicking endonuclease B recognition sites;

Preferably, nicking endonuclease A and nicking endonuclease B are the same nicking endonuclease;

Preferably, nicking endonuclease A and nicking endonuclease B are both Nt.BspQI;

Preferably, the first nicking endonuclease recognition site unit includes three nicking endonuclease A recognition sites, and the second nicking endonuclease recognition site unit includes three nicking endonuclease B recognition sites.

Further, the spacer bases between the nicking endonuclease A recognition sites satisfy the following condition: the cleavage site of nicking endonuclease A does not fall within another nicking endonuclease A recognition sequence;

The spacer bases between the nicking endonuclease B recognition sites satisfy the following condition: the cleavage site of nicking endonuclease B does not fall within another nicking endonuclease B recognition sequence;

Preferably, the spacer bases between the nicking endonuclease A recognition sites and the spacer bases between the nicking endonuclease B recognition sites are 4 to 10 bases of A, C, G, or T in any order, and the spacer bases do not form a nicking endonuclease recognition site.
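The spacer conditions above lend themselves to a simple sequence check. The Python sketch below is illustrative only and is not part of the disclosed method: the function names are hypothetical, the example recognition sequence GCTCTTC and the one-base downstream nick offset for Nt.BspQI are assumptions taken from the supplier's published specificity rather than from this document, and strand orientation is ignored for simplicity.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site` on either strand (overlaps included)."""
    seq = seq.upper()
    total = 0
    for motif in {site.upper(), reverse_complement(site)}:
        pos = seq.find(motif)
        while pos != -1:
            total += 1
            pos = seq.find(motif, pos + 1)
    return total


def spacer_is_valid(spacer: str, site: str, cut_offset: int = 1) -> bool:
    """Check one spacer: 4-10 bases of A/C/G/T, long enough that a nick
    `cut_offset` bases past the upstream site (assumed value) stays out of
    the next site, and containing no recognition site itself."""
    spacer = spacer.upper()
    if not 4 <= len(spacer) <= 10:
        return False
    if set(spacer) - set("ACGT"):
        return False
    if len(spacer) < cut_offset:
        return False
    return count_sites(spacer, site) == 0


def build_site_unit(site: str, spacers: list[str], cut_offset: int = 1) -> str:
    """Join len(spacers) + 1 copies of `site` with the given spacers."""
    for spacer in spacers:
        if not spacer_is_valid(spacer, site, cut_offset):
            raise ValueError(f"invalid spacer: {spacer}")
    unit = site.upper()
    for spacer in spacers:
        unit += spacer.upper() + site.upper()
    # Junctions between spacer and site must not create additional sites.
    if count_sites(unit, site) != len(spacers) + 1:
        raise ValueError("spacer junctions create an extra recognition site")
    return unit


if __name__ == "__main__":
    # Three sites separated by two 6-base spacers (spacers chosen arbitrarily).
    print(build_site_unit("GCTCTTC", ["ACGTAC", "TGCATG"]))
```

Passing two spacers yields a three-site unit, which corresponds to the preferred configuration; a single spacer would give the two-site variant.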

Further, nicking endonuclease C is selected from one of Nt.BbvCI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, Nt.CviPII, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, and Nb.BtsI;

Preferably, nicking endonuclease C is Nt.BstNBI.

Further, the 5' end of the first nicking endonuclease recognition site unit and the 3' end of the second nicking endonuclease recognition site unit of the DNA assembly sequence each include an auxiliary sequence of 500-1500 bp; or, the DNA assembly sequence is cloned into a plasmid.

Further, the length of the single-stranded region is 5-100 bases.

Further, the length of the second target sequence is 5-100 bases, preferably 5-30 bases, and more preferably 10-30 bases.

Further, the length of the sticky end is 5-30 bases, preferably 5-20 bases.

Further, the large-fragment genomic DNA is a yeast chromosome or a T7 phage genome.

The present invention also provides a method for synthesizing and assembling large-fragment genomic DNA in vitro using the system described above, comprising the following steps:

(1) Divide the genomic sequence into n target DNA sequences, where n is a positive integer, prepare n DNA assembly sequences, and design and synthesize the single-stranded guide DNA for each DNA assembly sequence;

(2) Add nicking endonuclease A, nicking endonuclease B, and nicking endonuclease C to each DNA assembly sequence and its corresponding single-stranded guide DNA and carry out enzyme digestion; the sticky ends produced by digesting the DNA assembly sequences of adjacent target DNA sequences are complementary;

(3) Assemble the digested DNA assembly sequences from step (2) in vitro.
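Step (1) can be pictured as cutting the genomic sequence into n pieces whose neighbours share a short junction region, which becomes the pair of complementary sticky ends after digestion. The following Python sketch shows one way to compute such coordinates; it is an illustration under stated assumptions, not the procedure used in this document. The function names are hypothetical, fragments are made nearly equal in length, and the default 20-base junction is simply a value inside the 5-30 base sticky-end range given above.

```python
def split_with_overhangs(seq: str, n: int, overhang_len: int = 20):
    """Return n (start, end) spans (0-based, end-exclusive) covering `seq`,
    with each pair of neighbouring spans sharing `overhang_len` bases."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    total = len(seq) + (n - 1) * overhang_len  # sum of all fragment lengths
    base, extra = divmod(total, n)
    if n > 1 and base <= 2 * overhang_len:
        raise ValueError("fragments too short for the requested overhang")
    spans = []
    start = 0
    for i in range(n):
        end = start + base + (1 if i < extra else 0)
        spans.append((start, end))
        start = end - overhang_len  # next fragment re-uses the junction
    return spans


def reassemble(seq: str, spans, overhang_len: int = 20) -> str:
    """Join the fragments, counting each shared junction only once."""
    first_start, first_end = spans[0]
    joined = seq[first_start:first_end]
    for start, end in spans[1:]:
        joined += seq[start + overhang_len:end]
    return joined


if __name__ == "__main__":
    genome = "ACGT" * 250  # 1000-base toy sequence
    spans = split_with_overhangs(genome, n=4, overhang_len=20)
    print(spans)
    print(reassemble(genome, spans, 20) == genome)
```

In a real design the junction positions would also be chosen so that every overhang is unique across the assembly; that check is omitted here.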

Further, the target DNA sequence in step (1) is 300 bp to 50 kb in size;

Preferably, the DNA assembly sequence in step (1) is prepared as follows: the target DNA sequence is cloned into a plasmid containing the first nicking endonuclease recognition site unit and the second nicking endonuclease recognition site unit, with the target DNA sequence located between the two units; or, the first nicking endonuclease recognition site unit and the second nicking endonuclease recognition site unit are inserted at the two ends of each target DNA sequence, and the DNA assembly sequence is obtained by amplification;

Preferably, the plasmid is pCCE, and its nucleotide sequence is shown in SEQ ID NO.13;

Preferably, the 5' end of the first nicking endonuclease recognition site unit and the 3' end of the second nicking endonuclease recognition site unit of the DNA assembly sequence each include an auxiliary sequence of 500-1500 bp, and the DNA assembly sequence is obtained by amplification.

Further, the enzyme digestion in step (2) is carried out at 30-75°C, preferably at 45-65°C, and more preferably at 50-60°C;

Preferably, the enzyme digestion system in step (2) is: 4 μg DNA assembly sequence, 1x CutSmart buffer, 1 mM DTT, 8% (v/v) PEG4000, 3 μM single-stranded guide DNA, 12 U nicking endonuclease A, 12 U nicking endonuclease B, and 12 U nicking endonuclease C; the digestion condition is incubation at 50°C for 2 h.

Further, the in vitro DNA assembly in step (3) is carried out in multiple rounds according to the size of the genomic DNA fragment, thereby obtaining large-fragment genomic DNA;

Preferably, the in vitro DNA assembly in step (3) includes two steps carried out in sequence, a hybridization reaction and an assembly reaction:

1) Hybridization reaction: adjacent digested DNA assembly sequences are added to the hybridization buffer and hybridized; the hybridization conditions are: 55±5°C, 5±3 min; 45±5°C, 10±5 min; 40±5°C, 10±5 min; 35±5°C, 10±5 min; 20±5°C, 10±5 min; the hybridization buffer consists of 100 mM NaCl, 20 mM Tris-Ac, 1 mM EDTA, and 8%-16% (v/v) PEG4000, pH 7.5;

2) Assembly reaction: after the hybridization reaction, assembly buffer and DNA ligase are added to the hybridization reaction system; the assembly conditions are: 30±5°C, 10±5 min; then 50-100 cycles of (30±5°C, 1±0.5 min; 42±5°C, 1±0.5 min); the assembly buffer consists of 30 mM Tris-HCl, 4 mM MgCl₂, 26 μM NAD, 1 mM DTT, and 50 μg/mL BSA;

Preferably, the concentration of PEG4000 in the hybridization buffer is 8% (v/v);

Preferably, Escherichia coli DNA ligase or T7 DNA ligase is added to the hybridization reaction system, more preferably Escherichia coli DNA ligase;

Preferably, the number of adjacent digested DNA assembly sequences added to the hybridization buffer is an integer from 3 to 12;

Preferably, the number of adjacent digested DNA assembly sequences added to the hybridization buffer is 6 or 7.
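The hybridization and assembly programs described above can be written down as data so that the hands-off incubation time is easy to estimate. The Python sketch below uses only the nominal set points (the ± tolerances are dropped) and ignores thermocycler ramp times; the class and function names are hypothetical and are introduced purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # nominal set point in degrees Celsius
    minutes: float  # nominal hold time in minutes


# Nominal hybridization program: stepwise cooling from 55 to 20 degrees C.
HYBRIDIZATION = [Step(55, 5), Step(45, 10), Step(40, 10), Step(35, 10), Step(20, 10)]

# Nominal assembly program: one initial hold, then repeated two-step cycles.
ASSEMBLY_HOLD = Step(30, 10)
ASSEMBLY_CYCLE = [Step(30, 1), Step(42, 1)]


def hybridization_minutes() -> float:
    """Total nominal hold time of the hybridization program."""
    return sum(step.minutes for step in HYBRIDIZATION)


def assembly_minutes(cycles: int) -> float:
    """Total nominal hold time of the assembly program for 50-100 cycles."""
    if not 50 <= cycles <= 100:
        raise ValueError("cycle count outside the stated 50-100 range")
    per_cycle = sum(step.minutes for step in ASSEMBLY_CYCLE)
    return ASSEMBLY_HOLD.minutes + cycles * per_cycle


if __name__ == "__main__":
    print(hybridization_minutes())  # 45 minutes
    print(assembly_minutes(50))     # 110 minutes
    print(assembly_minutes(100))    # 210 minutes
```

With these nominal values, one round of hybridization plus assembly takes roughly 2.6 to 4.3 hours of incubation, depending on the cycle count.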

The beneficial effects of the present invention are as follows:

The present invention provides a system for generating double-stranded DNA sticky ends that can produce ultra-long, precise overhangs at both ends of a double-stranded DNA fragment. The overhangs are produced by programmable DNA-guided strand cleavage (DSC) and can be designed freely, with no length or sequence preference.

The present invention also provides an in vitro iterative assembly strategy. Small DNA fragments are treated with DNA-guided strand cleavage (DSC) to produce ultra-long, precise overhangs at both ends, and the resulting fragments are then used for in vitro assembly. A series of tests was used to optimize the assembly efficiency of DNA fragments carrying the target overhangs, thereby achieving in vitro synthesis and assembly of large-fragment genomic DNA. The invention overcomes the limitation that restriction endonucleases (REs) produce only short overhangs and improves the success rate of assembling large DNA molecules. Taking T7 phage as an example, a complete T7 phage genome was assembled from PCR amplicons and was able to produce infectious viral particles. Finally, a telomere-to-telomere synthetic yeast chromosome was constructed in vitro from 62 synthetic DNA fragments, isolated, and transferred into yeast cells. The technique is generally applicable to the seamless construction of large DNA molecules and can accelerate future genome synthesis.

Brief Description of the Drawings

Figure 1 illustrates the in vitro iterative DNA assembly strategy based on DNA-guided strand cleavage. (A) Schematic of DSC recognition and cleavage of ssDNA. Gray in the ssDNA strand indicates the target ssDNA. The scissors indicate the cleavage site of the nicking endonuclease in the target ssDNA. (B) Schematic of using DSC to cut dsDNA and produce designed sticky ends. The scissors indicate nicking endonuclease cleavage sites, and gray on the dsDNA strand bearing the scissors indicates the target dsDNA and the desired 5' overhang. (C) Schematic of synI assembly. DSC-treated DNA fragments of ~5 kb carry the desired overhangs at both ends. Adjacent fragments are ligated to give DNA fragments of about 40 kb. The ligation process is repeated to give DNA fragments of about 200 kb in the next assembly stage, and finally a linear chromosome of 1 Mb is assembled. (D) PAGE analysis of the target DNA after DSC treatment. Lane 1 shows the target ssDNA before DSC treatment. Lanes 2 to 12 show target ssDNA treated by DSC with different sgDNAs (sgDNA-1 to sgDNA-11). Asterisks indicate the full-length target ssDNA. (E) Direct Sanger sequencing analysis of the generated overhangs. TO denotes the target overhang. Sequencing signals are shown on the right, with the corresponding nucleotides marked by dashed boxes; they match the designed overhangs exactly. The extra A peak at the end comes from the terminal transferase activity of the sequencing enzyme.

Figure 2 shows the SnapGene map of the test plasmid.

Figure 3 shows the effect of different reaction conditions on DNA assembly efficiency. (A) Effect of different ligases on DNA assembly efficiency. Asterisks indicate the target product. (B) Effect of the amount of PEG4000 on DNA assembly efficiency. Asterisks indicate the target product. (C) Effect of the number of fragments on DNA assembly efficiency. Two parallel experiments were performed for each reaction condition.

Figure 4 shows the assembly and characterization of the T7 genome and synI. (A) Electrophoretic analysis of the PCR-amplified T7 DNA fragment TP15a. (B) Electrophoretic analysis of the DSC-treated PCR fragment TP6a. (C) Electrophoretic analysis after the first stage of T7 DNA assembly. (D) Electrophoretic analysis after the second stage of T7 DNA assembly. (E) Phage plaque assay of E. coli transformed with the assembled T7 genome. (F) Electrophoretic analysis of homologous DNA fragments with the desired overhangs generated by DSC. Synthetic DNA was supplied as 2-4 kb fragments in plasmids and digested using DSC (top). The released DNA fragments were purified and verified (bottom). (G) Electrophoretic analysis after the first stage of synI DNA assembly. Target products are indicated by colored asterisks. +/- indicates the presence or absence of DNA ligase. (H) Electrophoretic analysis after the second stage of synI DNA assembly. Colored asterisks point to the assembled synI. +/- indicates the presence or absence of DNA ligase. (I) Pulsed-field gel electrophoresis analysis (left) and Southern blot analysis (middle: YAL003C-specific probe; right: URA3-specific probe) of four yeast clones containing synI. BY4742 was used as a control. P1 and P2 are clones transformed with purified synI. M1 and M2 are clones transformed with the mixed ligation product after the second assembly stage. (J) Nanopore sequencing analysis of the chromosome isoforms of the four yeast clones. Each circle represents a genomic fragment. Open, filled, and half-filled circles represent wild-type, synthetic, and hybrid fragments, respectively. P1 and P2 are clones transformed with purified synI. M1 and M2 are clones transformed with the mixed ligation product after the second assembly stage.

Figure 5 shows a schematic of the structure and assembly of synI. (A) The two synI markers, HIS3 and URA3, are located near the telomeres at the two ends of the chromosome. (B) synI is divided into 9 chunks, each containing 6-7 fragments.

Figure 6 shows the PCRTag analysis of the four yeast clones containing synI. M1 (A) and M2 (B) are clones transformed with the mixed ligation product of the second assembly stage. P1 (C) and P2 (D) are clones transformed with purified synI. SYN: synthetic PCRTags; WT: wild-type PCRTags.

Figure 7 shows the structures of the synI and chrI isoforms. Sequencing evidence for the synI and chrI isoform structures in strains M1 (A), M2 (B), P1 (C), and P2 (D) is plotted as arrow diagrams and stacked histograms. Breakpoints are identified from hybrid reads that contain both synthetic and wild-type PCRTags. All breakpoint-spanning reads longer than 10 kb are plotted in the arrow diagrams. The stacked histograms show the sequencing depth at the PCRTag sites. SYN: synthetic PCRTags, light gray; WT: wild-type PCRTags, dark gray.
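The chunk layout described for Figure 5 (nine chunks of 6-7 fragments) is arithmetically consistent with the 62 synthetic fragments mentioned in the summary of beneficial effects. The short Python sketch below checks this by distributing fragments as evenly as possible; the function names are hypothetical, and the even split is only an assumption about how fragments might be allocated, since the actual allocation per chunk is not stated here.

```python
def chunk_sizes(n_fragments: int, n_chunks: int) -> list[int]:
    """Split n_fragments into n_chunks groups whose sizes differ by at most 1."""
    if n_chunks < 1 or n_fragments < n_chunks:
        raise ValueError("need at least one fragment per chunk")
    base, extra = divmod(n_fragments, n_chunks)
    return [base + 1] * extra + [base] * (n_chunks - extra)


def assign_chunks(n_fragments: int, n_chunks: int) -> list[list[int]]:
    """Assign consecutive 1-based fragment numbers to chunks."""
    chunks = []
    start = 1
    for size in chunk_sizes(n_fragments, n_chunks):
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


if __name__ == "__main__":
    sizes = chunk_sizes(62, 9)
    print(sizes)       # eight chunks of 7 fragments and one chunk of 6
    print(sum(sizes))  # 62
```

Both chunk sizes fall within the preferred 6 or 7 fragments per hybridization reaction.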

Detailed Description of the Embodiments

For a clearer understanding of the present invention, it is further described below with reference to the following examples and the accompanying drawings. The examples are for explanation only and do not limit the invention in any way. In the examples, all starting reagents and materials are commercially available, and experimental methods for which no specific conditions are given follow the conventional methods and conditions well known in the art or the conditions recommended by the instrument manufacturer.

Example 1: Development of an in vitro iterative DNA assembly strategy based on DNA-guided strand cleavage

DNA-guided strand cleavage is abbreviated as DSC. DSC produces a single-strand break in a target single-stranded DNA by means of a DSC system, which comprises a nicking endonuclease and a single-stranded guide DNA (sgDNA). Acting together with the nicking endonuclease, the sgDNA anchors the target sequence, and the target single-stranded DNA is cut precisely at a specific site to form a nick. Based on DSC, the present invention produces ultra-long, precise overhangs at both ends of double-stranded DNA fragments for subsequent DNA assembly. The in vitro iterative DNA assembly strategy based on DNA-guided strand cleavage is described in detail below.

1. Programmable DSC for precise cleavage of single-stranded DNA

Programmable DSC achieves precise cleavage of single-stranded DNA using a DSC system that comprises a nicking endonuclease and a single-stranded guide DNA (sgDNA); the nicking endonuclease produces a single-strand break in the target DNA (Figure 1A). The key component of DSC is the sgDNA, which consists of a target recognition sequence (TRS) at the 5' end and a short DNA (shDNA) sequence folded into a stem-loop structure at the 3' end. The TRS contains approximately 20 nucleotides that are complementary to the target single-stranded DNA (ssDNA) and binds it through DNA-DNA base pairing. The stem-loop structure provides a recognition site for the nicking endonuclease, forming a DNA-enzyme complex that produces a nick within the target ssDNA. DSC can therefore be programmed to cut any sequence, with no recognition site required in the target sequence, which allows seamless DNA assembly without introducing additional sequences.
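Because the sgDNA is described as a 5' TRS followed by a 3' stem-loop, composing one from a target sequence is mostly a reverse-complement operation. The Python sketch below is a minimal illustration under stated assumptions: the function names, the toy target, and in particular the `PLACEHOLDER_HAIRPIN` sequence are hypothetical and are not taken from this document, whose actual sgDNA sequences are listed in Table 1.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Arbitrary placeholder for the stem-loop (shDNA) part; NOT a validated design.
PLACEHOLDER_HAIRPIN = "GACTCTTTTGAGTC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def build_sgdna(target: str, start: int, trs_len: int, hairpin: str) -> str:
    """Compose an sgDNA written 5'->3' as TRS + hairpin.

    `start` is the 1-based position of the first target nucleotide that the
    TRS should pair with; the TRS is the reverse complement of that region
    because the two strands pair in antiparallel orientation.
    """
    region = target.upper()[start - 1:start - 1 + trs_len]
    if start < 1 or len(region) != trs_len:
        raise ValueError("TRS region falls outside the target sequence")
    return reverse_complement(region) + hairpin.upper()


if __name__ == "__main__":
    toy_target = "ACGTTGCAAGCTTAGGCTAACGGATCCTAG"  # hypothetical 30-nt ssDNA
    print(build_sgdna(toy_target, start=1, trs_len=20, hairpin=PLACEHOLDER_HAIRPIN))
```

A real design would additionally check the TRS melting temperature and confirm that the hairpin presents the enzyme's recognition site without a cleavable position, as the description requires.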

In this example, the system and reaction conditions for DSC cleavage of the target ssDNA were as follows: the target DNA molecule to be cleaved (4 μg) was digested in a mixture of 1x CutSmart buffer (NEB), 1 mM DTT, 8% PEG4000 (Thermo), 3 μM sgDNA, and 12 U Nt.BstNBI (NEB), and incubated at 50°C for 2 hours.

To verify that DSC can cleave the target ssDNA precisely, a series of sgDNAs (shown in Table 1) was synthesized in this example to guide the nicking endonuclease Nt.BstNBI to cut a 50-nt ssDNA from the ampicillin resistance gene (nucleotide sequence shown in SEQ ID NO.12). Each sgDNA contains a TRS of 10-14 nt, depending on its calculated melting temperature, and the sgDNAs pair with the target ssDNA at 4-nt intervals starting from the 5' end. For sgDNA-1, the TRS binds nt 1-10 and Nt.BstNBI cuts the target ssDNA after the 4th nucleotide, producing two oligonucleotides of 4 nt and 46 nt. Similarly, sgDNA-11 binds the 10 nt at the 3' end of the target ssDNA, producing oligonucleotides of 44 nt and 6 nt. Each sgDNA was combined with Nt.BstNBI to cleave the target ssDNA, and the products were analyzed by PAGE (Figure 1D). As expected, lane 1, the negative control, contained only the full-length target ssDNA and Nt.BstNBI, without sgDNA. In lanes 2-11, for each sgDNA, the target ssDNA was digested into two fragments of the expected sizes. In lanes 2 and 11, the smaller fragments were 4 nt and 6 nt, respectively, which were too short to be observed.

Table 1
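The tiling arithmetic described for the 50-nt target (sgDNAs spaced 4 nt apart, sgDNA-1 cutting after nucleotide 4, sgDNA-11 giving 44-nt and 6-nt products) can be reproduced with a few lines of Python. This is a sketch for checking the expected fragment sizes, not a design tool: the function name is hypothetical, and the rule that the cut always falls after the 4th nucleotide of the bound region is inferred from the two worked cases in the text.

```python
def tile_guides(target_len: int = 50, step: int = 4,
                min_trs: int = 10, cut_after: int = 4):
    """List (name, TRS start, cut position, (5' product, 3' product)).

    Positions are 1-based on the target ssDNA. A guide is placed every
    `step` nt as long as at least `min_trs` target nucleotides remain for
    the TRS to bind. The cut is assumed to fall after the `cut_after`-th
    nucleotide of the bound region.
    """
    guides = []
    start = 1
    index = 1
    while start + min_trs - 1 <= target_len:
        cut = start + cut_after - 1
        guides.append((f"sgDNA-{index}", start, cut, (cut, target_len - cut)))
        start += step
        index += 1
    return guides


if __name__ == "__main__":
    for name, start, cut, products in tile_guides():
        print(f"{name}: TRS starts at nt {start}, cut after nt {cut}, "
              f"products {products[0]} nt + {products[1]} nt")
```

Under these assumptions the function yields eleven guides, with product pairs running from 4 + 46 nt for sgDNA-1 to 44 + 6 nt for sgDNA-11, matching the sizes stated in the text.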

2、DSC辅助生成所需的dsDNA粘性末端2. DSC-assisted generation of the required sticky ends of dsDNA

基于程序化DSC用于单链DNA精确切割的策略，将DSC用于双链DNA(dsDNA)切割产生双链断裂(DSB)。然而，与ssDNA不同的是，dsDNA中的靶序列与其互补链杂交，阻止了TRS的结合并阻断了酶的裂解。因此，为了克服这一限制，本发明设计了一种特殊的载体pCCE(核苷酸序列如SEQ ID NO.13所示)。将靶DNA片段连接于载体pCCE中，Nt.BspQI的三个相邻的识别位点被设计在靶DNA片段的侧面，在酶切后暴露出所需的5'悬臂的第一个核苷酸，并在这些缺口之间释放短寡核苷酸，形成一个单链DNA(ssDNA)区域(图1B)。接下来，在DSC存在的情况下，TRS序列与ssDNA区域配对，分离设计的5'悬垂，Nt.BstNBI切割目标链。DNA片段的另一端经过类似的处理，在另一端产生一个设计的悬臂，用于随后的组装。Based on the strategy of using programmed DSC for precise single-stranded DNA cleavage, DSC is used to cleave double-stranded DNA (dsDNA) to generate double-strand breaks (DSBs). However, unlike ssDNA, the target sequence in dsDNA hybridizes with its complementary strand, preventing TRS binding and blocking enzyme cleavage. Therefore, to overcome this limitation, this invention designs a special vector pCCE (nucleotide sequence shown in SEQ ID NO. 13). The target DNA fragment is ligated into the vector pCCE, and three adjacent recognition sites of Nt.BspQI are designed on the sides of the target DNA fragment, exposing the first nucleotide of the desired 5' cantilever after enzyme digestion, and releasing short oligonucleotides between these gaps to form a single-stranded DNA (ssDNA) region (Figure 1B). Next, in the presence of DSC, the TRS sequence pairs with the ssDNA region, separating the designed 5' cantilever, and Nt.BstNBI cleaves the target strand. The other end of the DNA fragment is similarly treated, generating a designed cantilever at the other end for subsequent assembly.

In this example, the system and reaction conditions for DSC-assisted generation of the required dsDNA sticky ends were as follows: the target DNA molecule to be cleaved (4 μg) was digested in a mixture of 1× CutSmart buffer (NEB), 1 mM DTT, 8% PEG4000 (Thermo), 3 μM sgDNA, 12 U Nt.BspQI (NEB), and 12 U Nt.BstNBI (NEB), and incubated at 50°C for 2 hours.

This example uses DSC to cut the two strands at opposing positions, producing 5' overhangs at both ends of the dsDNA. Since most DNA polymerases possess exonuclease activity, generating defined 5' overhangs minimizes potential environmental contamination. It also makes it possible to check the integrity and precision of the overhangs directly by Sanger sequencing, which is a prerequisite for error-free and efficient assembly. Sanger sequencing incorporates dideoxynucleoside triphosphates (ddNTPs) during DNA synthesis to produce DNA fragments of different lengths, which are separated by size in gel electrophoresis to read out the DNA sequence. Where a 5' overhang is present, the shorter strand of the dsDNA is extended with ddNTPs to give the expected signal.

A 761-bp DNA fragment was inserted into pMV to construct the test plasmid (Figure 2; the nucleotide sequence of the test plasmid is shown in SEQ ID NO.14), and DSC treatment released a fragment with 5' overhangs at both ends. To sequence both ends, the fragment was further digested into two smaller fragments with an additional restriction endonuclease, BtsI. As expected, fluorescence signals were detected at 276-290 nt and 487-501 nt (Figure 1E). The DNA sequences corresponding to the 5' overhangs were read from the fluorescence signals, and each 5' overhang matched the design exactly. These results show that DSC generates the designed sticky ends with high fidelity and that the overhangs remain intact during DNA extraction, which supports high-quality DNA assembly in subsequent steps.

Nucleotide sequence of the vector pCCE (SEQ ID NO.13):

Nucleotide sequence of the test plasmid (SEQ ID NO.14):

3. DNA assembly

Using the DNA fragments generated with the desired overhangs, the present invention develops a stepwise chromosome assembly strategy, termed DSC-assisted iterative (Dai) assembly, for constructing chromosome-sized linear DNA molecules in vitro (Figure 1C). The linear chromosome is first divided into multiple DNA fragments, which are obtained from a DNA synthesis company and either cloned into the designed vector pCCE or provided with three adjacent Nt.BspQI sites at both ends of the synthesized DNA fragment. During assembly, these fragments are first treated with DSC to generate the desired overhang at each end and are then purified by agarose gel electrophoresis. Next, several adjacent fragments are mixed and ligated to give a DNA fragment of about 40 kb, which is gel-purified again to remove unligated fragments. The process is repeated in the next round of assembly to generate larger DNA fragments (~200 kb) and finally the entire chromosome (Figure 1C). Longer chromosomes can be built with additional rounds; in principle, the assembly process can be repeated as many times as needed.
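Because each round mixes several fragments in one tube, every designed 5' overhang should anneal only to its intended neighbor. The sketch below is an illustrative design check under stated assumptions: overhangs are written 5'->3', two overhangs anneal when one is the exact reverse complement of the other, and the junction sequences shown are hypothetical rather than taken from the tables.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMP)[::-1]

def junction_problems(fragments):
    """Check an ordered list of (left_overhang, right_overhang) pairs.

    Overhangs are 5'->3'; use None for an outer end without an overhang.
    Returns a list of human-readable problems (empty if the design is clean).
    """
    problems = []
    for i, (_, right) in enumerate(fragments):
        if right is None:
            continue
        for j, (left, _) in enumerate(fragments):
            if left is None:
                continue
            anneals = revcomp(right) == left
            if j == i + 1 and not anneals:
                problems.append(f"fragments {i} and {j} do not anneal")
            elif j != i + 1 and anneals:
                problems.append(f"fragment {i} can also anneal to fragment {j}")
    return problems

# Hypothetical junction sequences for three adjacent fragments.
j1, j2 = "AGGTCCATTG", "CTTGAGACGA"
design = [(None, revcomp(j1)), (j1, revcomp(j2)), (j2, None)]
print(junction_problems(design))
```

A real design would also screen for partial complementarity and for palindromic overhangs, which this exact-match sketch does not cover.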

Example 2: Construction of the T7 genome using Dai assembly

This example uses Dai assembly to construct the T7 phage genome (about 30 kb in length). The entire genome was randomly divided into 15 fragments of about 2 kb each. Because some of the DNA fragments are toxic to E. coli, they cannot be cloned into the designed pCCE vector. Therefore, three adjacent Nt.BspQI sites were inserted at both ends of the amplified DNA fragments, and the required 5' overhangs were generated directly by DSC for subsequent assembly. The present invention uses a "quasi-plasmid method" to splice and transform the T7 genome; the whole process requires no cloning of the sequences into a plasmid and comprises the following steps:

1. Primers were designed for three rounds of splicing PCR, which yield the quasi-plasmids. A quasi-plasmid is a PCR product extended at both ends so that its digestion products can be distinguished by electrophoresis.

(1) The T7 genome was divided into 15 segments (denoted seg_3k), each 2-3 kb in length, as shown in Table 2 below; together with the vector pTEF, this gives 16 fragments in total.

Table 2

Nucleotide sequence of the vector pTEF (SEQ ID NO.15):

(2) Each seg_3k requires an auxiliary sequence of about 1 kb at each end. The benefit of this design is that, after digestion, incompletely digested PCR product differs in length from fully digested PCR product by 1 kb or more, so the two are easily separated by electrophoresis. This ensures that a high proportion of the recovered fragments carry the expected ends. The present invention refers to a PCR product carrying these long auxiliary sequences as a "quasi-plasmid"; fragments obtained by digesting a quasi-plasmid are ligated during splicing with an efficiency close to that of fragments digested from a plasmid.
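The size separation that motivates the quasi-plasmid design can be illustrated with a short calculation. This is a simplified sketch: it assumes exactly 1,000 bp of auxiliary sequence on each side, ignores the few bases contributed by the nicking-site units and the overhangs, and uses a hypothetical 2,500-bp segment.

```python
def digest_products(segment_bp, aux_left_bp=1000, aux_right_bp=1000):
    """Approximate sizes (bp) of the species present after DSC digestion."""
    return {
        "uncut": aux_left_bp + segment_bp + aux_right_bp,
        "cut_left_only": segment_bp + aux_right_bp,
        "cut_right_only": aux_left_bp + segment_bp,
        "fully_cut": segment_bp,
    }

sizes = digest_products(2500)  # hypothetical seg_3k of 2.5 kb
# Smallest gap between the desired product and any partially digested species.
min_gap = min(v - sizes["fully_cut"] for k, v in sizes.items() if k != "fully_cut")
print(sizes, min_gap)
```

With these assumed lengths, every incompletely digested species is at least 1 kb longer than the desired fragment, which is the difference the gel purification step relies on.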

(3) Three rounds of PCR are required to obtain a quasi-plasmid (all PCRs in the present invention used Phanta Max polymerase to keep the sequences as accurate as possible). The PCR primers for the 16 fragments are shown in Table 3.

Each seg_3k fragment corresponds to three PCR products, so three primer pairs must be designed. The three rounds of PCR are described in detail below, taking amplification of fragment TP1a as an example:

First-round PCR: the three primer pairs are CCE1671F and TX1vR, TX1c24F and TX1c2224R, and TX1vF and CCE4357R. The pairs CCE1671F/TX1vR and TX1vF/CCE4357R each amplify sequence from the pCCE vector, namely the left and right flanking sequences containing the Nt.BspQI sites, respectively. Primers CCE1671F and CCE4357R are the same for all fragments, whereas the four middle primers differ for each seg_3k. Primers TX1c24F and TX1c2224R amplify from the T7 genome as template. The shaded portions of TX1vR and TX1vF are reverse complementary to TX1c24F and TX1c2224R, respectively, so that the three PCR fragments can be joined by PCA.
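The overlap requirement between the vector-side and insert-side primers can be checked programmatically before ordering oligonucleotides. The following sketch uses hypothetical sequences, not the primers of Table 3, and assumes that the tail of the vector-side primer must be the exact reverse complement of the 5' end of the insert-side primer.

```python
_COMP = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMP)[::-1]

def tail_overlaps(tail, partner_primer):
    """True if tail is the reverse complement of the partner primer's 5' end."""
    return (len(tail) <= len(partner_primer)
            and revcomp(tail) == partner_primer[:len(tail)])

# Hypothetical stand-ins for an insert primer (such as TX1c24F) and the
# shaded 5' tail of the matching vector primer (such as TX1vR).
partner = "ATGGCTAGCAAAGGAGAAGA"
tail = revcomp(partner[:15])
print(tail_overlaps(tail, partner))
```

When the check passes, the end of the vector-derived product and the start of the insert-derived product share the same sequence, which is what allows the second-round PCA to join them.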

Second-round PCR: the three recovered first-round fragments were mixed and used as template for three-fragment PCA amplification with primers CCE1671F and CCE4357R.

Third-round PCR: the recovered second-round product was used as template and amplified with primers CCE1722F and CCE6819R. This primer pair lies inside CCE1671F and CCE4357R, so the third round is essentially a nested PCR, which is used to obtain a cleaner target fragment. The third-round PCR product must be methylated with M.MlyI methylase before use. Methylation can be carried out before or after gel recovery of the PCR product; in the present invention it was carried out before recovery. These fragments are readily obtained by PCR amplification; Figure 4A shows the electrophoretic analysis of PCR-amplified fragment TP15a.

Table 3

2. The recovered third-round PCR products were digested using the designed sgDNAs; the sgDNA sequences designed for the 16 fragments are shown in Table 4. The digestion system and conditions were as follows: the target DNA molecule to be cleaved (4 μg) was digested in a mixture of 1× CutSmart buffer (NEB), 1 mM DTT, 8% PEG4000 (Thermo), 3 μM sgDNA, 12 U Nt.BspQI (NEB), and 12 U Nt.BstNBI (NEB), and incubated at 50°C for 2 hours.

Figure 4B shows the electrophoresis of the third-round PCR product of fragment TP6a after digestion. The main band is the target digested fragment. The minor band above it is incompletely digested PCR product, which is clearly resolved from the main band, and the band below is the excised auxiliary sequence of about 1 kb.

Table 4

3. T7 genome assembly

(1) Optimization of assembly conditions

The DSC-assisted digestion described above generated the required dsDNA sticky ends, giving the desired overhangs at both ends of each of the 15 DNA fragments for subsequent assembly. In vitro DNA assembly comprises two steps, a hybridization reaction and an assembly reaction. This example examined how the choice of ligase, the amount of PEG4000, and the number of fragments affect the efficiency of the assembly reaction. Hybridization reaction system: adjacent digested DNA fragments, 15 ng each, were combined in 50 μL of hybridization buffer (100 mM NaCl, 20 mM Tris-Ac, 1 mM EDTA, 8% (v/v) PEG4000, pH 7.5). Hybridization conditions: 55°C, 5 min; 45°C, 10 min; 40°C, 10 min; 35°C, 10 min; 20°C, 10 min. After hybridization, the assembly reaction system was added to the hybridization reaction to carry out the assembly reaction. Assembly reaction system: 0.1 U DNA ligase in 5 μL of assembly buffer (30 mM Tris-HCl, 4 mM MgCl₂, 26 μM NAD, 1 mM DTT, 50 μg/mL BSA). Assembly conditions: 30°C, 10 min; then cycling between 30°C, 1 min and 42°C, 1 min, with a cycling time of 3 h.
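The thermal program can be written out as data to make the total run time and the implied cycle count explicit. This sketch assumes that the 3 h "cycling time" refers to the total duration of the 30°C/42°C cycling phase; under that reading, the cycle count it gives falls within the 50-100 cycles recited in the claims.

```python
# (temperature in °C, minutes) for each hybridization step, from the text.
HYBRIDIZATION = [(55, 5), (45, 10), (40, 10), (35, 10), (20, 10)]

INITIAL_HOLD_MIN = 10              # 30°C hold before cycling
CYCLE_STEPS = [(30, 1), (42, 1)]   # one ligation cycle
CYCLING_TOTAL_MIN = 3 * 60         # assumption: 3 h is the total cycling time

hybridization_min = sum(minutes for _, minutes in HYBRIDIZATION)
cycle_min = sum(minutes for _, minutes in CYCLE_STEPS)
n_cycles = CYCLING_TOTAL_MIN // cycle_min
total_min = hybridization_min + INITIAL_HOLD_MIN + n_cycles * cycle_min

print(f"hybridization: {hybridization_min} min")
print(f"assembly: {INITIAL_HOLD_MIN} min hold + {n_cycles} cycles")
print(f"total: {total_min} min")
```

Expressing the program as a list of steps also makes it straightforward to transcribe into a thermocycler protocol.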

1) Effect of different ligases on DNA assembly efficiency

Seven adjacent digested DNA fragments (TP1a-TP7a), 15 ng each, were assembled under the in vitro DNA assembly conditions described above. The enzymes tested were Taq DNA ligase (NEB), E. coli DNA ligase (NEB), T4 DNA ligase (NEB), T7 DNA ligase (NEB), and 9°N™ DNA ligase (NEB), with two parallel experiments per enzyme.

2) Effect of PEG4000 amount on DNA assembly efficiency

Seven adjacent digested DNA fragments (TP1a-TP7a), 15 ng each, were assembled under the in vitro DNA assembly conditions described above, using E. coli DNA ligase. Assembly was carried out at PEG4000 concentrations of 0%, 4%, 8%, 12%, and 16% (v/v) to test the effect of PEG4000 concentration on assembly efficiency, with two parallel experiments per concentration.

3) Effect of fragment number on DNA assembly efficiency

Several adjacent digested DNA fragments, 15 ng each, were assembled under the in vitro DNA assembly conditions described above, using E. coli DNA ligase. Assembly efficiency was tested for 3 (TP1a-TP3a), 6 (TP1a-TP6a), 9 (TP1a-TP9a), 12 (TP1a-TP12a), and 15 (TP1a-TP15a) adjacent DNA fragments, with two parallel experiments per condition.

Figure 3 shows the effects of the different reaction conditions on the efficiency of the DNA assembly reaction.

Based on the optimization above, the optimized system and conditions for in vitro DNA assembly were determined. Hybridization reaction system: 6 or 7 adjacent digested DNA fragments, 90 ng each, were combined in 50 μL of hybridization buffer (100 mM NaCl, 20 mM Tris-Ac, 1 mM EDTA, 8% (v/v) PEG4000, pH 7.5). Hybridization conditions: 55°C, 5 min; 45°C, 10 min; 40°C, 10 min; 35°C, 10 min; 20°C, 10 min. After hybridization, the assembly reaction system was added to the hybridization reaction to carry out the assembly reaction. Assembly reaction system: 10 U E. coli DNA ligase in 5 μL of assembly buffer (30 mM Tris-HCl, 4 mM MgCl₂, 26 μM NAD, 1 mM DTT, 50 μg/mL BSA). Assembly conditions: 30°C, 10 min; then cycling between 30°C, 1 min and 42°C, 1 min, with a cycling time of 3 h.

(2) Full-length splicing

To produce a complete T7 genome, this example carried out two rounds of assembly. In the first round, the 15 DSC-treated DNA fragments and the pTEF vector were assembled in four groups to give larger DNA fragments (Figure 4C): TP3a, TP4a, TP5a, and TP6a; TP7a, TP8a, TP9a, and TP10a; TP11a, TP12a, TP13a, and TP14a; and TP1a, TP2a, TP15a, and pTEF. In the second round, the fragments recovered from the first round were assembled (Figure 4D).
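The grouping above can be checked for consistency: each first-round reaction should contain only pieces that are adjacent in the final construct, and every piece should be used exactly once. The sketch below assumes a circular final order in which pTEF joins TP15a to TP1a; this is inferred from the fourth group and is not stated explicitly in the text.

```python
# Assumed circular order of the 16 pieces (pTEF closes the circle).
ORDER = [f"TP{i}a" for i in range(1, 16)] + ["pTEF"]

GROUPS = [
    ["TP3a", "TP4a", "TP5a", "TP6a"],
    ["TP7a", "TP8a", "TP9a", "TP10a"],
    ["TP11a", "TP12a", "TP13a", "TP14a"],
    ["TP1a", "TP2a", "TP15a", "pTEF"],
]

def is_adjacent_run(group, order=ORDER):
    """True if the group occupies consecutive positions on the circular order."""
    n, k = len(order), len(group)
    return any({order[(s + j) % n] for j in range(k)} == set(group)
               for s in range(n))

def covers_once(groups, order=ORDER):
    """True if every piece appears in exactly one group."""
    return sorted(x for g in groups for x in g) == sorted(order)

print(all(is_adjacent_run(g) for g in GROUPS), covers_once(GROUPS))
```

Under the assumed order, the fourth group is the run TP15a, pTEF, TP1a, TP2a, which wraps around the junction formed by the vector.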

(3) Chemical transformation of the spliced product

The assembled product was transformed directly into E. coli for a phage plaque assay (Figure 4E). Genomic DNA was isolated from the plaques and sequenced, which showed that the T7 genome had been assembled successfully.

Example 3: Construction of a complete yeast chromosome in vitro using Dai assembly

This example applies Dai assembly to test whether an entire linear chromosome can be assembled de novo in vitro. Synthetic chromosome I (synI) of budding yeast, about 180 kb in length, was used as the example. In Sc2.0, synI was synthesized as a fusion with synthetic chromosome 13 (synVIII), and its ability to function as an independent chromosome had never been tested. Here, the inventors restored synI as an independent chromosome by placing the native centromere of chromosome I at its original locus and adding UTC (universal telomere cap) sequences at both ends. To facilitate selection of synI in yeast, two markers, HIS3 and URA3, were inserted near the left and right telomeres (Figure 5A); the remaining sequence is consistent with the published design. The designed synI sequence was divided into 62 fragments of about 3 kb each (YST1-YST62); the sequences of YST1-YST62, the corresponding sgDNAs, and the resulting overhangs are shown in Tables 5 and 6. These fragments were chemically synthesized and cloned into pCCE for subsequent assembly. SynI was assembled in two stages (Figure 5B). In the first stage, each plasmid was treated with DSC to release the DNA fragment with 5' overhangs of 8-17 nt at both ends (Figure 4F and Table 5). Nine assembly reactions were set up in parallel, each containing six or seven adjacent fragments, which were annealed and ligated to produce nine large DNA fragments of ~20 kb (Figure 4G). As a control, the same reactions were performed without ligase. In the next stage, the nine fragments were assembled into full-length synI, indicated by the asterisk in Figure 4H.
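The first-stage grouping follows from dividing 62 fragments among nine reactions of six or seven adjacent fragments each. The sketch below shows one even split; which reaction receives six fragments is an assumption made for illustration, since the text gives only the group sizes.

```python
def split_adjacent(names, n_groups):
    """Split an ordered list into n_groups runs of adjacent items,
    with sizes differing by at most one (larger groups first)."""
    base, extra = divmod(len(names), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(names[start:start + size])
        start += size
    return groups

fragments = [f"YST{i}" for i in range(1, 63)]   # YST1-YST62
reactions = split_adjacent(fragments, 9)

for n, group in enumerate(reactions, start=1):
    approx_kb = 3 * len(group)                  # about 3 kb per fragment
    print(f"reaction {n}: {group[0]}-{group[-1]} "
          f"({len(group)} fragments, ~{approx_kb} kb)")
```

The only split of 62 into nine groups of six or seven is eight groups of seven and one group of six, giving intermediates of roughly 18 to 21 kb, in line with the ~20 kb fragments described.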

Table 5

Table 6

Example 4: Transformation and characterization of synI

The assembled synI was introduced into BY4742 by yeast protoplast transformation, and transformants were grown on synthetic complete medium lacking histidine and uracil (SC-His-Ura). Clones M1 and M2 (transformed with the mixed ligation product of the second assembly round) and P1 and P2 (transformed with purified synI) were isolated and confirmed with all synthetic and wild-type PCR tags (Figure 6 and Table 7). Next, pulsed-field gel electrophoresis (PFGE) was performed on the four strains to assess the presence of synI and native chromosome I (chrI). As shown in the left panel of Figure 4I, chrI, marked with a colored asterisk, is the smallest chromosome in BY4742 and migrates at the front of the PFGE gel. All four clones carried chromosomes smaller than chrI, indicating the presence of synI. Multiple bands of different sizes suggested that different forms of synI were present. Given the high sequence similarity between synI and chrI, we speculate that these bands are recombinants of the two chromosomes.

Table 7

Southern blotting was used to distinguish the multiple chrI/synI forms. The inventors used two probes, one specific to the YAL003C gene present in both chrI and synI, and the other specific to the URA3 gene in synI (Table 8). As shown in the middle and right panels of Figure 4I, URA3 was present in all four clones, and P1, P2, and M1 showed a band of the expected size, indicating the presence of assembled synI or its recombinants. In P2, M1, and M2, the YAL003C probe detected bands intermediate in size between synI and chrI, indicating recombinants. In M1 and M2, the URA3 probe detected no intermediate band, indicating that these recombinants do not contain the URA3 gene. Interestingly, in M2 the URA3 probe detected a band of the same size as chrI, possibly because a small region including URA3 had recombined into native chrI. In clone P1, the inventors observed that both chrI and synI were intact. The presence of synI and chrI was further verified by Oxford Nanopore Technologies (ONT) sequencing. Consistent with the Southern blot results, intact synI and chrI were found only in clone P1, whereas the other clones contained more chimeric regions arising from recombination between synI and chrI (Figure 4J and Figure 7). Taken together, these data show that synI can be delivered intact into yeast cells and maintained as an independent chromosome.

Table 8

The above examples are given only to illustrate the invention clearly and do not limit its embodiments. Those of ordinary skill in the art can make other variations or modifications in different forms on the basis of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Obvious variations or modifications derived from the above remain within the scope of protection of the present invention.

Claims

The use of a double-stranded DNA sticky end generation system in the in vitro synthesis and assembly of large genomic DNA fragments, characterized in that the system comprises a DNA assembly sequence, a single-stranded guide DNA, nicking endonuclease A, nicking endonuclease B, and nicking endonuclease C;

The DNA assembly sequence includes a first nicking endonuclease recognition site unit, a target DNA sequence, and a second nicking endonuclease recognition site unit from the 5' end to the 3' end. The first nicking endonuclease recognition site unit includes 2 or 3 nicking endonuclease A recognition sites and spacer bases located between the nicking endonuclease A recognition sites. The second nicking endonuclease recognition site unit includes 2 or 3 nicking endonuclease B recognition sites and spacer bases located between the nicking endonuclease B recognition sites. The nicking endonuclease A and nicking endonuclease B belong to the same type of nicking endonuclease. The cleavage sites of the nicking endonuclease A and nicking endonuclease B are located on the upper and lower strands of the DNA. After the DNA assembly sequence is digested by nicking endonuclease A and nicking endonuclease B, single-stranded regions are formed at both ends of the DNA assembly sequence.

The single-stranded guide DNA includes a first single-stranded guide DNA and a second single-stranded guide DNA. Each single-stranded guide DNA consists of a recognition sequence for a target sequence and a short DNA sequence folded into a stem-loop structure. The target sequence is a single-stranded DNA sequence comprising a portion of the single-stranded region and a portion of the adjacent target DNA sequence; the portion of the single-stranded region is referred to as the first target sequence, and the portion of the adjacent target DNA sequence is referred to as the second target sequence. The stem-loop structure formed by the short DNA sequence has a nicking endonuclease C recognition site but lacks a sequence that can be cleaved by the nicking endonuclease C. After the recognition sequence hybridizes with the first target sequence, the other DNA strand corresponding to the second target sequence is separated, and the recognition sequence further hybridizes with the second target sequence. The target sequence and the recognition sequence hybridize to form a double-stranded structure that can be recognized by the nicking endonuclease C, so that, through recognition of the recognition site on the single-stranded guide DNA, the predetermined position of the second target sequence is placed where it can be cleaved by the nicking endonuclease C. The recognition sequences of the first single-stranded guide DNA and the second single-stranded guide DNA hybridize, respectively, with partial sequences of the single-stranded regions formed at the two ends of the DNA assembly sequence after digestion by nicking endonuclease A and nicking endonuclease B, together with partial sequences of the adjacent target DNA sequences, to form double-stranded structures.

According to the use described in claim 1, the nicking endonuclease A and nicking endonuclease B are respectively selected from one of Nt.BbvCI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, or one of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, and Nb.BtsI;

Preferably, the first nicking endonuclease recognition site unit includes three nicking endonuclease A recognition sites, and the second nicking endonuclease recognition site unit includes three nicking endonuclease B recognition sites;

Preferably, the nicking endonuclease A and nicking endonuclease B are the same nicking endonuclease;

Preferably, the nicking endonuclease A and nicking endonuclease B are Nt.BspQI.

According to the use described in claim 1, the first nicking endonuclease recognition site unit includes three nicking endonuclease A recognition sites, and the second nicking endonuclease recognition site unit includes three nicking endonuclease B recognition sites.

According to the use of claim 1, the spacer bases between the nicking endonuclease A recognition sites satisfy the following condition: the nicking endonuclease A cleavage site is not located in other nicking endonuclease A recognition sequences;

The spacer bases between the nicking endonuclease B recognition sites satisfy the following condition: the nicking endonuclease B cleavage site is not located in other nicking endonuclease B recognition sequences.

Preferably, the spacer bases between the nicking endonuclease A recognition sites and the spacer bases between the nicking endonuclease B recognition sites are 4 to 10 randomly arranged A, C, G or T bases, and the spacer bases do not form nicking endonuclease recognition sites.

According to the use described in claim 1, the nicking endonuclease C is selected from one of Nt.BbvCI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, Nt.CviPII, Nb.BbvCI, Nb.BsmI, Nb.BsrDI, and Nb.BtsI;

Preferably, the nicking endonuclease C is selected from Nt.BstNBI.

According to the use described in claim 1, the 5' end of the first nicking endonuclease recognition site unit and the 3' end of the second nicking endonuclease recognition site unit of the DNA assembly sequence each include an auxiliary sequence of 500-1500 bp in size; or, the DNA assembly sequence is cloned into a plasmid.

According to the use described in claim 1, the single-stranded region has a length of 5-100 bases.

According to the use of claim 1, the second target sequence is characterized in that its length is between 5 and 100 bases, preferably between 5 and 30 bases, and more preferably between 10 and 30 bases.

According to the use described in claim 1, the sticky end has a length of 5-30 bases, preferably 5-20 bases.

According to the use described in claim 1, the large fragment of genomic DNA is a yeast chromosome or a T7 phage genome.

A method for synthesizing and assembling large genomic DNA fragments in vitro, characterized by employing the double-stranded DNA sticky end generation system described in claim 1, comprising the following steps:

(1) Divide the genome sequence into n target DNA sequences, where n is a positive integer, prepare n DNA assembly sequences, and design and synthesize single-stranded guide DNA for each DNA assembly sequence;

(2) Add nicking endonuclease A, nicking endonuclease B and nicking endonuclease C to each DNA assembly sequence and its corresponding single-stranded guide DNA for enzyme digestion. The sticky ends generated by the enzyme digestion of adjacent target DNA sequences are complementary.

(3) The DNA assembly sequence after enzyme digestion in step (2) is assembled into DNA in vitro.

According to the synthesis and assembly method of claim 11, the target DNA sequence in step (1) is 300bp-50kb in size;

Preferably, the method for preparing the DNA assembly sequence in step (1) is as follows: cloning the target DNA sequence into a plasmid containing a first nicking endonuclease recognition site unit and a second nicking endonuclease recognition site unit, wherein the target DNA sequence is located between the first nicking endonuclease recognition site unit and the second nicking endonuclease recognition site unit; or, inserting the first nicking endonuclease recognition site unit and the second nicking endonuclease recognition site unit at both ends of each target DNA sequence, and amplifying to obtain the DNA assembly sequence.

The synthesis and assembly method according to claim 12, characterized in that the plasmid is pCCE, and its nucleotide sequence is as shown in SEQ ID NO.13;

The DNA assembly sequence obtained by amplification includes, at the 5' end of the first nicking endonuclease recognition site unit and at the 3' end of the second nicking endonuclease recognition site unit, an auxiliary sequence of 500-1500 bp in size.

According to the synthesis and assembly method of claim 11, the enzymatic digestion in step (2) is carried out at 30-75°C, preferably at 45-65°C, and more preferably at 50-60°C.

Preferably, the enzyme digestion system in step (2) is: 4 μg DNA assembly sequence, 1x cutsmart buffer, 1 mM DTT, 8% PEG4000 (v/v), 3 μM single-stranded guide DNA, 12 U nicking endonuclease A, 12 U nicking endonuclease B and 12 U nicking endonuclease C; the enzyme digestion conditions are incubation at 50°C for 2 h.

According to the synthesis and assembly method of claim 11, the DNA in vitro assembly in step (3) is performed in multiple rounds according to the size of the genomic DNA fragment, thereby obtaining a large fragment of genomic DNA;

Preferably, the DNA in vitro assembly in step (3) includes two steps: a hybridization reaction and an assembly reaction performed sequentially.

1) Hybridization reaction: Adjacent DNA assembly sequences after enzyme digestion were added to the hybridization buffer for hybridization reaction; the hybridization reaction conditions were: 55±5℃, 5±3min; 45±5℃, 10±5min; 40±5℃, 10±5min; 35±5℃, 10±5min; 20±5℃, 10±5min; the hybridization buffer composition was 100mM NaCl, 20mM Tris-Ac, 1mM EDTA, PEG4000 with a volume percentage concentration of 8%-16%, and pH 7.5.

2) Assembly reaction: After the hybridization reaction, add assembly buffer and DNA ligase to the hybridization reaction system; the assembly reaction conditions are: 30±5℃, 10±5min; 50-100 cycles × (30±5℃, 1±0.5min; 42±5℃, 1±0.5min); the assembly buffer components are 30mM Tris-HCl, 4mM MgCl₂, 26μM NAD, 1mM DTT, and 50μg/mL BSA.

The synthesis and assembly method according to claim 15 is characterized in that the volume percentage concentration of PEG4000 in the hybridization buffer component is 8%;

Preferably, Escherichia coli DNA ligase or T7 DNA ligase is added to the hybridization reaction system, and more preferably Escherichia coli DNA ligase is added.

Preferably, the number of adjacent DNA assembly sequences after enzyme digestion added to the hybridization buffer is an integer from 3 to 12.

Preferably, the number of adjacent post-digestion DNA assembly sequences added to the hybridization buffer is 6 or 7.