CN114703203B

CN114703203B - Baculovirus vector and its use

Info

Publication number: CN114703203B
Application number: CN202210127524.1A
Authority: CN
Inventors: 潘雨堃; 王天天
Original assignee: Shanghai Boyin Biotechnology Co ltd
Current assignee: Shanghai Boyin Biotechnology Co ltd
Priority date: 2022-02-11
Filing date: 2022-02-11
Publication date: 2024-08-06
Anticipated expiration: 2042-02-11
Also published as: CN114703203A

Abstract

The present application relates to the preparation, using an insect cell-baculovirus system, of a linear, double-stranded, no-end DNA (neDNA) expression vector that contains AAV inverted terminal repeats (ITRs) and a gene expression cassette. The insect cell-baculovirus system of the present application produces neDNA at high yield, with an average 2- to 3-fold increase. Compared with other Bac-Rep constructs, expression of the Rep protein (Rep78) is more stable after three consecutive baculovirus passages.

Description

Baculovirus vector and its use

Technical Field

The present application relates to the field of biomedicine, and specifically to a baculovirus vector and its use.

Background

Gene therapy is a new therapeutic approach that treats disease at the nucleic acid level by regulating gene expression. An ideal gene therapy vector combines high delivery efficiency, high safety, scalable production, and low cost. Recombinant adeno-associated virus (rAAV) vectors are efficient and safe, and are currently the main gene delivery vectors used in gene therapy for genetic diseases. However, rAAV is still difficult to scale up and expensive to produce. Non-viral, DNA-based gene therapy is expected to solve this problem. In 2013, Robert Kotin et al. used the insect cell-baculovirus expression system to prepare a linear, double-stranded, no-end DNA (neDNA) whose two ends are closed by the inverted terminal repeats (ITRs) of the AAV genome. Kotin's preparation method was adapted directly from the AAV baculovirus production system developed by his laboratory in 2002, and its yield was not optimized for neDNA. In addition, when the recombinant baculovirus Bac-Rep of that method expresses the Rep proteins, homologous recombination occurs between the Rep78 and Rep52 sequences, so the system is unstable.

There therefore remains a need to overcome these serious limitations on large-scale (commercial) production of neDNA in insect cells. It is an object of the present invention to provide means and methods for stable, high-yield (large-scale) production of neDNA in insect cells.

Summary of the Invention

The purpose of the present application is to provide an efficient insect cell-baculovirus expression system and a method of using it to prepare neDNA. The Rep protein expression vector is optimized so that Rep protein expression is more stable, which increases both the yield of neDNA and the stability of the production system. The features of the present invention include: 1) use of a strong promoter (the p10 promoter) to increase Rep78 expression; and 2) codon optimization of the Rep52 sequence to avoid homologous recombination between the Rep78 and Rep52 sequences. The effects of the present invention are: 1) a 2- to 3-fold increase in yield compared with Robert Kotin's preparation method; and 2) a Rep protein baculovirus expression vector that remains stable after 3 passages.
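The idea behind feature 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the optimization actually used to obtain SEQ ID NO:12: the toy open reading frame, the `recode` rule (pick the synonymous codon that differs most from the original) and all function names are hypothetical. The sketch shows that synonymous recoding leaves the encoded protein unchanged while lowering nucleotide identity with the original sequence, which is the property relied on to reduce homologous recombination.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate a coding sequence whose length is a multiple of 3."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def recode(seq):
    """Replace each codon with the synonymous codon that differs most from it."""
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        synonyms = sorted(c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon])
        out.append(max(synonyms, key=lambda c: mismatches(c, codon)))
    return "".join(out)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return 1 - mismatches(a, b) / len(a)


wild_type = "ATGGCTAAGCTGTCTGAA"  # toy ORF (M-A-K-L-S-E), NOT a Rep sequence
recoded = recode(wild_type)
print(translate(wild_type) == translate(recoded), round(identity(wild_type, recoded), 2))
```

A real design would also weigh insect-cell codon usage and avoid introducing unwanted motifs; the sketch ignores both.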

In one aspect, the present application provides an isolated nucleic acid molecule comprising the nucleotide sequence shown in SEQ ID NO:12.

In certain embodiments, the isolated nucleic acid molecule encodes an adeno-associated virus (AAV) Rep52 protein.

In another aspect, the present application provides an isolated nucleic acid molecule comprising a nucleotide sequence encoding an adeno-associated virus (AAV) Rep78 protein and a nucleotide sequence encoding a Rep52 protein, wherein the nucleotide sequence encoding the Rep52 protein comprises the nucleotide sequence shown in SEQ ID NO:12.

In certain embodiments, the nucleotide sequence encoding the Rep78 protein is wild type.

In certain embodiments, the nucleotide sequence encoding the Rep78 protein comprises the nucleotide sequence shown in SEQ ID NO:11.

In certain embodiments, the nucleic acid molecule further comprises a first promoter that initiates transcription of the nucleotide sequence encoding the AAV Rep78 protein and a second promoter that initiates transcription of the nucleotide sequence encoding the AAV Rep52 protein, wherein the first promoter is the same as or different from the second promoter.

In certain embodiments, the first promoter and the second promoter comprise insect cell promoters.

In certain embodiments, the first promoter comprises a strong promoter.

In certain embodiments, the first promoter has a transcription initiation activity equal to or higher than that of the second promoter.

In certain embodiments, the first promoter and the second promoter are each independently selected from the group consisting of: the p10 promoter, the polyhedrin (polh) promoter and the IE1 promoter.

In certain embodiments, the transcription directions of the first promoter and the second promoter are the same or opposite.

In certain embodiments, the first promoter is operably linked to the nucleotide sequence encoding the Rep78 protein, and the second promoter is operably linked to the nucleotide sequence encoding the Rep52 protein.

In certain embodiments, when the transcription directions of the first promoter and the second promoter are the same, the nucleic acid molecule comprises, in order, the first promoter, the nucleotide sequence encoding the Rep78 protein, the second promoter, and the nucleotide sequence encoding the Rep52 protein.

In certain embodiments, a polyA (pA) sequence is further included downstream of each of the nucleotide sequence encoding the Rep78 protein and the nucleotide sequence encoding the Rep52 protein.

In certain embodiments, when the transcription directions of the first promoter and the second promoter are the same, the nucleic acid molecule comprises, in order, the first promoter, the nucleotide sequence encoding the Rep78 protein, a first pA, the second promoter, the nucleotide sequence encoding the Rep52 protein, and a second pA.

In certain embodiments, when the transcription directions of the first promoter and the second promoter are opposite, the nucleic acid molecule comprises, in order, the nucleotide sequence encoding Rep78, the first promoter, the second promoter, and the nucleotide sequence encoding Rep52, wherein the first promoter initiates transcription of the nucleotide sequence encoding the Rep78 protein, and the second promoter initiates transcription of the nucleotide sequence encoding the Rep52 protein.

In certain embodiments, the 5' end of the first promoter is directly or indirectly connected to the 5' end of the second promoter.

In certain embodiments, the 3' end of the first promoter is directly or indirectly connected to the 5' end of the nucleotide sequence encoding Rep78.

In certain embodiments, the nucleic acid molecule further comprises a first pA, wherein the 3' end of the nucleotide sequence encoding the Rep78 protein is directly or indirectly connected to the 5' end of the first pA.

In certain embodiments, the 3' end of the second promoter is directly or indirectly connected to the 5' end of the nucleotide sequence encoding Rep52.

In certain embodiments, the nucleic acid molecule further comprises a second pA, wherein the 3' end of the nucleotide sequence encoding the Rep52 protein is directly or indirectly connected to the 5' end of the second pA.

In certain embodiments, the pA is either SV40 polyA or HSV TK polyA.

In certain embodiments, the isolated nucleic acid molecule comprises the nucleotide sequence shown in SEQ ID NO:8.

In another aspect, the present application provides an isolated nucleic acid molecule comprising, in order, a first polyA (pA), a nucleotide sequence encoding a Rep78 protein, a first promoter, a second promoter, a nucleotide sequence encoding a Rep52 protein, and a second polyA (pA), wherein the first promoter drives transcription of the nucleotide sequence encoding the Rep78 protein and the first pA, and the second promoter drives transcription of the nucleotide sequence encoding the Rep52 protein and the second pA; wherein the nucleotide sequence encoding the Rep52 protein and/or the nucleotide sequence encoding the Rep78 protein is codon-optimized to avoid homologous recombination; the first promoter and the second promoter comprise insect cell promoters; and the first promoter is a strong promoter.

In certain embodiments, the first promoter comprises a p10 promoter, a polh promoter or an IE1 promoter.

In certain embodiments, the p10 promoter comprises the nucleotide sequence shown in SEQ ID NO:9.

In certain embodiments, the second promoter comprises a p10 promoter, a polh promoter or an IE1 promoter.

In certain embodiments, the polh promoter comprises the nucleotide sequence shown in SEQ ID NO:10.
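The same-direction and opposite-direction arrangements described above can be written down as ordered lists of elements with a strand annotation. The following Python sketch is illustrative only: the `Element` class, the `transcription_units` helper, the generic names `pA1`/`pA2`, and the pairing of p10 with Rep78 and polh with Rep52 are assumptions chosen for the example (the text permits p10, polh or IE1 for either promoter). It checks that both layouts yield the same two promoter-to-polyA transcription units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str    # "promoter", "cds" or "polyA"
    strand: int  # +1: read left to right; -1: read right to left


def transcription_units(elements):
    """Return (promoter, cds, polyA) name tuples, each listed in its own 5'->3' order."""
    units = []
    for strand in (1, -1):
        # Elements on the reverse strand are read from right to left.
        ordered = elements if strand == 1 else list(reversed(elements))
        current = None
        for el in ordered:
            if el.strand != strand:
                continue
            if el.kind == "promoter":
                current = [el.name]
            elif current is not None:
                current.append(el.name)
                if el.kind == "polyA":
                    units.append(tuple(current))
                    current = None
    return units


# Same-direction layout: P1 - Rep78 - pA1 - P2 - Rep52 - pA2
tandem = [
    Element("p10", "promoter", 1), Element("Rep78", "cds", 1), Element("pA1", "polyA", 1),
    Element("polh", "promoter", 1), Element("Rep52", "cds", 1), Element("pA2", "polyA", 1),
]

# Opposite-direction layout: pA1 - Rep78 - P1 - P2 - Rep52 - pA2
divergent = [
    Element("pA1", "polyA", -1), Element("Rep78", "cds", -1), Element("p10", "promoter", -1),
    Element("polh", "promoter", 1), Element("Rep52", "cds", 1), Element("pA2", "polyA", 1),
]

print(transcription_units(tandem))
print(transcription_units(divergent))
```

In the opposite-direction layout the two promoters sit back to back (5' end to 5' end), so each coding sequence is still read promoter first and polyA last on its own strand.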

In another aspect, the present application provides a vector comprising the isolated nucleic acid molecule described in the present application.

In certain embodiments, the vector comprises a viral vector.

In certain embodiments, the vector comprises a baculovirus vector.

In certain embodiments, the vector comprises a pFastBac vector.

In certain embodiments, the vector comprises the nucleotide sequence shown in SEQ ID NO:14.

In another aspect, the present application provides a cell comprising the isolated nucleic acid molecule described in the present application or the vector described in the present application.

In certain embodiments, the cell comprises an insect cell.

In certain embodiments, the cell comprises a Spodoptera frugiperda (Sf9) cell.

In another aspect, the present application provides a baculovirus expression system comprising a first baculovirus vector and a second baculovirus vector that comprises a nucleic acid sequence encoding a gene of interest, wherein the first baculovirus vector is the baculovirus vector described in the present application.

In certain embodiments, the nucleic acid sequence encoding the gene of interest comprises, in order from the 5' end to the 3' end, a first parvovirus inverted terminal repeat (ITR), the gene of interest, and a second ITR.

In certain embodiments, at least one promoter is further included between the first ITR and the gene of interest.

In certain embodiments, at least one eukaryotic promoter is further included between the first ITR and the gene of interest.

In certain embodiments, at least one mammalian cell promoter is further included between the first ITR and the gene of interest.

In certain embodiments, a mammalian cell promoter and an insect cell promoter are further included between the first ITR and the gene of interest.

In certain embodiments, the mammalian cell promoter includes ubiquitous promoters and tissue-specific promoters.

In certain embodiments, the ubiquitous promoter comprises a CMV, SV40, EF1a, CAG or UBC promoter.

In certain embodiments, the tissue-specific promoter comprises an ALB, hAAT, TBG, TTR, GFAP, MHCK7 or hSyn promoter.

In certain embodiments, the mammalian cell promoter comprises a CMV promoter.

In certain embodiments, the insect cell promoter comprises a p10 promoter.

In certain embodiments, the promoters comprise the CMV and p10 promoters.

In another aspect, the present application provides an insect cell containing a first nucleotide sequence encoding a first amino acid sequence and a second nucleotide sequence encoding a second amino acid sequence, wherein the first nucleotide sequence comprises a nucleotide sequence encoding a Rep78 protein and the second nucleotide sequence comprises a nucleotide sequence encoding a Rep52 protein, and wherein the first nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:11 and the second nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:12.

In certain embodiments, the first and second nucleotide sequences are part of one nucleic acid construct, and each is operably linked to an expression control sequence for expression in insect cells.

In certain embodiments, the insect cell further comprises a third nucleotide sequence containing at least one parvovirus inverted terminal repeat (ITR) nucleotide sequence.

In certain embodiments, the third nucleotide sequence further contains at least one nucleotide sequence encoding a gene of interest.

In certain embodiments, the third nucleotide sequence contains two parvovirus ITR nucleotide sequences, and the at least one nucleotide sequence encoding a gene of interest is located between the two parvovirus ITR nucleotide sequences.

In certain embodiments, the parvovirus comprises an adeno-associated virus.

In certain embodiments, the third nucleotide sequence is part of another nucleic acid construct, wherein each nucleotide sequence encoding a gene of interest is operably linked to an expression control sequence for expression in mammalian cells.

In certain embodiments, the nucleic acid construct is an insect cell-compatible vector.

In certain embodiments, the nucleic acid construct is a baculovirus vector.

In certain embodiments, the insect cell comprises the baculovirus vector described in the present application.

In another aspect, the present application provides use of the baculovirus expression system described in the present application, or of the insect cell described in the present application, in preparing a target nucleic acid molecule.

In certain embodiments, the target nucleic acid molecule is a linear DNA molecule with covalently closed ends (neDNA).

In another aspect, the present application provides a method for preparing a target nucleic acid molecule, comprising culturing the insect cell described in the present application.

In certain embodiments, the preparation method comprises:

1) providing the baculovirus expression system described in the present application;

2) inserting the sequence of the gene of interest into the second baculovirus vector;

3) co-transfecting the first baculovirus vector and the second baculovirus vector into insect cells;

4) growing the insect cells under conditions that allow replication and release of the DNA containing the gene of interest;

5) collecting the target nucleic acid molecule.

In certain embodiments, the method further comprises isolating the target nucleic acid molecule.

In another aspect, the present application provides a kit comprising the isolated nucleic acid molecule described in the present application, the baculovirus expression system described in the present application and/or the insect cell described in the present application.

Those skilled in the art will readily perceive other aspects and advantages of the present application from the detailed description below, which shows and describes only exemplary embodiments. As those skilled in the art will appreciate, the content of the present application enables them to modify the specific embodiments disclosed without departing from the spirit and scope of the invention to which the present application relates. Accordingly, the drawings and the description in the specification of the present application are merely exemplary and not restrictive.

Brief Description of the Drawings

The specific features of the invention to which the present application relates are set out in the appended claims. The features and advantages of the invention can be better understood by reference to the exemplary embodiments described in detail below and to the drawings, which are briefly described as follows:

FIG. 1A shows the plasmid map of pFastBac-ITR-EGFP described in the present application.

FIG. 1B shows the plasmid map of pFastBac-p10Rep described in the present application.

FIG. 2 shows a schematic diagram of transcription of the p10Rep, RepWT, inRep and CORep genes described in the present application.

FIG. 3 shows the results of Western blot analysis of Rep protein expression and stability; Ctr: uninfected Sf9 cells; 1: Sf9 cells infected with BacV-RepWT; 2: Sf9 cells infected with BacV-inRep; 3: Sf9 cells infected with BacV-CORep; 4: Sf9 cells infected with BacV-p10Rep.

FIG. 4 shows the identification results of the neDNA-ITR-EGFP gene expression vectors driven by the different Rep proteins described in the present application; M: DNA marker; 1-4: electrophoresis patterns of the neDNA-ITR-EGFP gene expression vectors driven by RepWT, inRep, CORep and p10Rep, respectively.

FIGS. 5A-5B show the results of restriction digestion identification of the neDNA-ITR-EGFP gene expression vector described in the present application.

FIG. 6 shows fluorescence images of HEK293 cells transfected with the neDNA-ITR-EGFP gene expression vectors driven by RepWT, inRep, CORep and p10Rep, respectively, taken under a fluorescence microscope 72 h after transfection.

FIG. 7 shows luciferase expression in C57BL/6 mice after delivery of neDNA-ITR-FLuc by lipid nanoparticles.

FIG. 8 shows a sequence alignment of Rep52 before and after optimization as described in the present application (Query: Rep52WT, Sbjct: Rep52-CO).

Detailed Description

The implementation of the invention of the present application is illustrated below by specific embodiments. Those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification.

Definition of Terms

Viruses of the family Parvoviridae are small DNA animal viruses. The family can be divided into two subfamilies: the Parvovirinae, which infect vertebrates, and the Densovirinae, which infect insects. Members of the subfamily Parvovirinae are referred to herein as parvoviruses and include the genus Dependovirus. As can be inferred from the genus name, members of Dependovirus are unique in that they usually require co-infection with a helper virus, such as an adenovirus or a herpes virus, for productive infection in cell culture. The genus Dependovirus includes AAV, which is very common in humans and other primates, and several serotypes have been isolated from various tissue samples. Serotypes 2, 3, 5 and 6 were found in human cells, and AAV serotypes 1, 4 and 7-11 in non-human primate samples. Further information on parvoviruses and other members of the Parvoviridae is given in Kenneth I. Berns, "Parvoviridae: The Viruses and Their Replication," Chapter 69 in Fields Virology (3d Ed. 1996). It should be understood that the present invention is not limited to AAV but is equally applicable to other parvoviruses.

The genome of AAV is a linear single-stranded DNA molecule less than about 5000 nucleotides (nt) in length. Inverted terminal repeats (ITRs) flank the unique nucleotide sequences that encode the nonstructural replication (Rep) proteins and the structural (VP) proteins. The VP proteins (VP1, VP2 and VP3) form the capsid. The terminal 145 nt are self-complementary and are organized so that an energetically stable intramolecular duplex forming a T-shaped hairpin can be formed. These hairpin structures function as origins of viral DNA replication, serving as primers for the cellular DNA polymerase complex. After wtAAV infects mammalian cells, the Rep genes (i.e., Rep78 and Rep52) are expressed from the P5 promoter and the P19 promoter, respectively, and both Rep proteins play a role in replication of the viral genome. A splicing event in the Rep ORF in fact results in the expression of four Rep proteins (i.e., Rep78, Rep68, Rep52 and Rep40). However, the unspliced mRNAs encoding Rep78 and Rep52 are sufficient for AAV vector production in mammalian cells. Rep78 and Rep52 proteins are likewise sufficient for AAV vector production in insect cells.
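Self-complementarity of a terminal sequence can be checked computationally. The Python sketch below is a simplified illustration, not a model of the real T-shaped ITR: the toy sequence is invented, the helper names are hypothetical, and only a single stem closed by an unpaired loop is considered. It counts how many bases at the 5' end of a strand can pair with the 3' end of the same strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hairpin_stem_length(seq):
    """Number of 5'-terminal bases that can pair with the 3' end of the same strand."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = 0
    # Base i pairs with base len-1-i exactly when seq[i] equals rc[i].
    for a, b in zip(seq, rc):
        if a != b:
            break
        n += 1
    return min(n, len(seq) // 2)


# Toy sequence (NOT an AAV ITR): 6-nt stem, 4-nt loop, 6-nt complementary stem.
toy = "GGCCAC" + "TTTT" + "GTGGCC"
print(hairpin_stem_length(toy))
```

Under this simplification, a strand whose two ends are reverse complements of each other can fold back on itself, which is the property that lets ITRs close the ends of a linear duplex.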

In the present application, the term "AAV vector" or "rAAV vector" generally refers to a vector comprising one or more polynucleotide sequences of interest, genes of interest or "transgenes" flanked by parvoviral or AAV inverted terminal repeats (ITRs).

In the present application, the term "operably linked" refers to the linkage of polynucleotide (or polypeptide) elements in a functional relationship. A nucleic acid is "operably linked" when it is placed in a functional relationship with another nucleic acid sequence. For example, a transcription regulatory sequence is operably linked to a coding sequence if it affects the transcription of that coding sequence. Operably linked means that the linked DNA sequences are usually contiguous and, where two protein coding regions need to be joined, contiguous and in reading frame.
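The "contiguous and in reading frame" condition for joining two protein coding regions can be illustrated with a small check. This Python sketch is a hypothetical example under simplifying assumptions: the function names and toy sequences are invented, and "in frame" is reduced to two tests, namely that the upstream region is a whole number of codons and that no stop codon occurs before the final codon of the joined sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(upstream, downstream):
    """True if joining the two coding regions keeps the downstream one in frame."""
    if len(upstream) % 3 != 0:
        return False  # a partial codon would shift the downstream reading frame
    joined = (upstream + downstream).upper()
    internal = codons(joined)[:-1]  # every codon except the last one
    return not any(c in STOP_CODONS for c in internal)


print(is_in_frame_fusion("ATGGCT", "AAGTAA"))   # whole codons, stop only at the end
print(is_in_frame_fusion("ATGGC", "AAGTAA"))    # upstream region is not a multiple of 3
```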

In the present application, the term "expression control sequence" generally refers to a nucleic acid sequence that regulates the expression of a nucleotide sequence to which it is operably linked. An expression control sequence is "operably linked" to a nucleotide sequence when it controls and regulates the transcription and/or translation of that nucleotide sequence. An expression control sequence may therefore include promoters, enhancers, internal ribosome entry sites (IRES), transcription terminators, a start codon in front of a protein-coding gene, splicing signals for introns, and stop codons. The term "expression control sequence" is intended to include, at a minimum, a sequence whose presence is designed to influence expression, and may also include additional advantageous components. For example, leader sequences and fusion partner sequences are expression control sequences. The term may also cover nucleic acid sequence designs in which undesirable potential start codons, in frame or out of frame, are removed from the sequence, and designs in which undesirable potential splice sites are removed. It includes sequences that direct the addition of a polyA tail, i.e., a string of adenine residues at the 3' end of an mRNA; such sequences are referred to as polyA sequences or polyadenylation sequences (pA). It may also be designed to enhance mRNA stability. Expression control sequences that affect transcription and translation stability, such as promoters, and sequences that effect translation, such as Kozak sequences, are known in insect cells. Expression control sequences can modulate the nucleotide sequence to which they are operably linked so that lower or higher expression levels are achieved.
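The removal of undesirable potential start codons presupposes that they can be located. The following Python sketch shows one way to list every ATG in a sequence and classify it as in frame or out of frame relative to an intended reading frame. It is an illustration only: the function name, the `reference_frame` parameter and the toy sequence are assumptions, and real designs would also consider sequence context such as Kozak-like motifs.

```python
def find_start_codons(seq, reference_frame=0):
    """List (position, frame label) for every ATG, relative to the intended frame."""
    s = seq.upper()
    hits = []
    pos = s.find("ATG")
    while pos != -1:
        label = "in-frame" if (pos - reference_frame) % 3 == 0 else "out-of-frame"
        hits.append((pos, label))
        pos = s.find("ATG", pos + 1)  # continue the search, allowing overlapping hits
    return hits


# Toy sequence; the intended open reading frame is assumed to start at index 2.
print(find_start_codons("CCATGGATGA", reference_frame=2))
```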

在本申请中，术语"启动子(promoter)"或"转录调控序列"通常是指这样一段核酸片段，所述核酸片段具有控制一种或多种编码序列的转录的作用，它位于所述编码序列的转录起始位点的转录方向上游，并且可通过存在的DNA依赖性RNA聚合酶结合位点、转录起始位点和任何其他DNA序列来进行结构性识别，所述其他DNA序列包括但不限于转录因子结合位点、抑制子和激活子蛋白结合位点和本领域技术人员已知的可直接或间接起到调控启动子转录量作用的任何其他核苷酸序列。"组成型"启动子是一种在大多数生理和发育条件下在大多数组织中具有活性的启动子。"可诱导"启动子为例如通过使用化学诱导物来进行生理或发育调控的启动子。"组织特异性"启动子仅在特定的组织或细胞类型中具有活性。In the present application, term " promoter (promoter) " or " transcriptional regulatory sequence " generally refers to such a nucleic acid fragment, described nucleic acid fragment has the effect of controlling the transcription of one or more coding sequences, it is located at the transcription direction upstream of the transcription start site of described coding sequence, and can carry out structural recognition by the DNA dependent RNA polymerase binding site, transcription start site and any other DNA sequence that exist, described other DNA sequence includes but is not limited to transcription factor binding site, repressor and activator protein binding site and any other nucleotide sequence that can directly or indirectly play the effect of regulating promoter transcription amount known to those skilled in the art. " Constitutive " promoter is a kind of promoter that has activity in most tissues under most physiological and developmental conditions. " Inducible " promoter is for example, by using chemical inducer to carry out the promoter of physiological or developmental regulation. " Tissue-specific " promoter is only active in specific tissue or cell type.

在本申请中，术语“强启动子(strong promoter)”通常指对RNA聚合酶有很高亲和力的启动子，它能指导合成大量的mRNA，即可以高效启动DNA转录的启动子。哺乳动物强启动子包括CMV启动子、CAG启动子、EF1a启动子、SV40启动子等。昆虫细胞强启动子包括p10启动子、polh启动子、IE1启动子等。以说明书中昆虫细胞--杆状病毒表达系统中p10启动子为例：p10启动子(p10 promoter)，是来自于苜蓿银纹夜蛾核多角体病毒(AcMNPV)中晚期表达的p10启动子，是一段可以驱动目的基因在昆虫细胞中高表达的序列。In this application, the term "strong promoter" generally refers to a promoter with a high affinity for RNA polymerase, which can guide the synthesis of a large amount of mRNA, that is, a promoter that can efficiently initiate DNA transcription. Mammalian strong promoters include CMV promoter, CAG promoter, EF1a promoter, SV40 promoter, etc. Insect cell strong promoters include p10 promoter, polh promoter, IE1 promoter, etc. Take the p10 promoter in the insect cell-baculovirus expression system in the specification as an example: the p10 promoter (p10 promoter) is the p10 promoter expressed in the middle and late stages of Autographa californica nuclear polyhedrosis virus (AcMNPV), and is a sequence that can drive the high expression of the target gene in insect cells.

In this application, the term "linear double-stranded no-end DNA (neDNA)" generally refers to a linear, double-stranded, closed-ended DNA vector whose two ends are closed by the inverted terminal repeats (ITRs) of the AAV genome. The ends of the neDNA vector are covalently closed, so the vector is resistant to exonucleases (such as exonuclease I or exonuclease III). neDNA can adopt a variety of configurations, such as monomers, dimers, trimers and higher multimers.

在本申请中，术语“载体”或“构建体”通常是用于将过客核酸序列(即DNA或RNA)转移至宿主细胞的核酸分子(一般是DNA或RNA)。三种常见类型的载体包括质粒、噬菌体和病毒。载体优选为病毒。同时含启动子和多核苷酸可以可操作地连接于其中的克隆位点的载体是本领域中公知的。这类载体能够在体外或体内转录RNA,可市购自诸如Stratagene(LaJolla,Calif.)和PromegaBiotech(Madison,ffis.)等供应商。为了优化表达和/或体外转录，需要去除、添加或改变所述克隆的5'和/或3'非翻译部分以除去额外的可能不合适的可变翻译起始密码子或可在转录或翻译水平干涉或减少表达的其他序列。In this application, the term "vector" or "construct" is generally a nucleic acid molecule (generally DNA or RNA) used to transfer a passenger nucleic acid sequence (i.e., DNA or RNA) to a host cell. Three common types of vectors include plasmids, bacteriophages, and viruses. The vector is preferably a virus. Vectors containing promoters and polynucleotides that can be operably connected to cloning sites therein are well known in the art. Such vectors are capable of transcribing RNA in vitro or in vivo and are commercially available from suppliers such as Stratagene (LaJolla, Calif.) and Promega Biotech (Madison, Wis.). In order to optimize expression and/or in vitro transcription, it is necessary to remove, add, or change the 5' and/or 3' non-translated portions of the clone to remove additional, possibly inappropriate variable translation initiation codons or other sequences that can interfere or reduce expression at the transcription or translation level.

In this application, the term "viral vector" generally refers to a vector comprising some or all of the following: viral genes encoding gene products, control sequences, and viral packaging sequences.

在本申请中，“细小病毒载体”可定义为包含将送递至宿主细胞(在体内(invivo)、离体(ex vivo)或在体外(in vitro))中的多核苷酸的重组产生的细小病毒或细小病毒颗粒。细小病毒载体的实例包括例如腺相关病毒载体。In this application, "parvoviral vector" may be defined as a recombinantly produced parvovirus or parvoviral particle comprising a polynucleotide to be delivered to a host cell (in vivo, ex vivo or in vitro). Examples of parvoviral vectors include, for example, adeno-associated virus vectors.

In this application, the term "baculovirus insect cell expression system" or "BEVS" (baculovirus expression vector system) generally refers to a eukaryotic expression system for expressing foreign proteins. The most widely used baculovirus expression system is based on the lytic virus Autographa californica multiple nucleopolyhedrovirus (AcMNPV), referred to simply as baculovirus. Such vectors have the following characteristics: the baculovirus expression system relies entirely on the protein modification, processing and transport machinery of higher eukaryotic cells and is therefore a eukaryotic expression system; AcMNPV is a helper-independent virus that can replicate extensively in insect cells adapted to suspension culture without any helper factors, which makes it convenient to express recombinant proteins in large quantities; the expression products are obtained in soluble form; and the baculovirus genome is large (130 kb), which makes it suitable for cloning large foreign gene fragments. Baculoviruses do not infect vertebrates, and their promoters are inactive in mammalian cells.

在本申请中，术语“Bacmid”或“杆粒”通常是指能在昆虫细胞和大肠杆菌之间穿梭的杆状病毒重组DNA，该名称取自Baculovirus和plasmid。In this application, the term "Bacmid" or "bacmid" generally refers to a baculovirus recombinant DNA that can shuttle between insect cells and E. coli. The name is derived from Baculovirus and plasmid.

In the present application, the terms "substantially identical" and "substantially similar" generally mean that two peptide sequences or two nucleotide sequences, when optimally aligned, for example by the GAP or BESTFIT program using default parameters, share at least a certain percentage of sequence identity, as defined elsewhere in the specification. GAP uses the Needleman and Wunsch global alignment algorithm to align two sequences over their entire length, maximizing the number of matches and minimizing the number of gaps. The GAP default parameters are generally used, with a gap creation penalty of 50 (nucleotides)/8 (proteins) and a gap extension penalty of 3 (nucleotides)/2 (proteins). The default scoring matrix is nwsgapdna for nucleotides and Blosum62 for proteins (Henikoff & Henikoff, 1992, PNAS 89, 915-919). When an RNA sequence is said to be substantially similar to, or to have a certain degree of sequence identity with, a DNA sequence, thymine (T) in the DNA sequence is considered equivalent to uracil (U) in the RNA sequence. Percent sequence identity can be determined by sequence alignment and scoring with a computer program. Alternatively, percent similarity or identity can be determined by searching databases with programs such as FASTA and BLAST.
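The alignment-based identity described above can be illustrated with a minimal sketch. It is not the GAP program: it assumes a simple match/mismatch score instead of the nwsgapdna or Blosum62 matrices, a single linear gap penalty instead of separate creation and extension penalties, and percent identity computed over the alignment length (one of several conventions). The function names and default scores are hypothetical and chosen only for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global (Needleman-Wunsch) alignment of a and b with a linear gap penalty.

    Returns the two aligned strings, with '-' marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Percent identity over the alignment length, treating U in RNA as T in DNA."""
    a = a.upper().replace("U", "T")
    b = b.upper().replace("U", "T")
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        raise ValueError("at least one sequence must be non-empty")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("ACGT", "ACGU")` treats the RNA and DNA sequences as identical, in line with the T/U equivalence stated above.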

在本申请中，术语“包含(comprise)”及其变形如“包含(comprises)”和“包含(comprising)”通常是指包括所述整数或步骤或者整数或步骤的组，但不排除任意其它整数或步骤或者整数或步骤的组。本文所用术语“包含”又可用术语“含有”或“包括”替代，或本文所用术语有时可用术语“具有”替代。In this application, the term "comprise" and its variations such as "comprises" and "comprising" generally refer to including the stated integers or steps or groups of integers or steps, but not excluding any other integers or steps or groups of integers or steps. The term "comprises" used herein may also be replaced by the term "containing" or "including", or the term used herein may sometimes be replaced by the term "having".

必须注意的是，除非上下文另外清楚地指出，否则如本说明书和所附权利要求书中所使用，单数形式“一个/一种(a/an)”和“该”包括复数指示物。术语“一个/一种”以及术语“一个或多个/一种或多种”和“至少一个/至少一种”可以在此互换使用。It must be noted that, as used in this specification and the appended claims, the singular forms "a/an" and "the" include plural referents unless the context clearly dictates otherwise. The terms "a/an" and the terms "one or more" and "at least one" may be used interchangeably herein.

发明详述DETAILED DESCRIPTION OF THE INVENTION

分离的核酸分子Isolated nucleic acid molecules

The Rep52 coding sequence of the present application is codon-optimized so that homologous recombination between Rep78 and Rep52 is avoided. Codon optimization can be based on the codon usage of Spodoptera frugiperda (fall armyworm) found in a codon usage database (see, for example, http://www.kazusa.or.jp/codon/); alternatively, transcriptome sequencing data for Sf9 insect cells can be retrieved from the NCBI database and parameters such as the codon adaptation index can be derived from it manually. Suitable computer programs for codon optimization are available to those skilled in the art (see, for example, Anders Fuglsang, 2003, Protein Expression and Purification 31: 247-249; Jayaraj et al., 2005, Nucl. Acids Res. 33(9): 3011-3016; Chin et al., 2014, Bioinformatics 30(15): 2210-2212; and on the Internet). Alternatively, the optimization can be performed manually using the same codon usage database. To avoid homologous recombination, candidate sequences were selected in which no more than 30 consecutive nucleotides are identical and the homology is no more than 85%; these candidates were then verified experimentally, which finally yielded the codon-optimized Rep52 sequence.
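The candidate-selection criteria above (at most 30 consecutive identical nucleotides, at most 85% homology) can be sketched as a simple screen. The sketch rests on assumptions not stated in the specification: the candidate is compared with the wild-type Rep52 coding sequence, the two sequences have equal length and are compared position by position (reasonable when only synonymous codons are exchanged), and "homology" is read as position-wise percent identity. All function names are hypothetical.

```python
def longest_identical_run(ref: str, cand: str) -> int:
    """Longest stretch of consecutive positions where both sequences carry the same base."""
    if len(ref) != len(cand):
        raise ValueError("sequences must have equal length")
    best = run = 0
    for a, b in zip(ref.upper(), cand.upper()):
        run = run + 1 if a == b else 0  # a mismatch resets the current run
        best = max(best, run)
    return best


def positional_identity(ref: str, cand: str) -> float:
    """Percentage of positions at which the two equal-length sequences are identical."""
    if len(ref) != len(cand):
        raise ValueError("sequences must have equal length")
    if not ref:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(ref.upper(), cand.upper()))
    return 100.0 * matches / len(ref)


def passes_recombination_screen(
    ref: str, cand: str, max_run: int = 30, max_identity: float = 85.0
) -> bool:
    """True if the candidate meets both thresholds (defaults taken from the text above)."""
    return (
        longest_identical_run(ref, cand) <= max_run
        and positional_identity(ref, cand) <= max_identity
    )
```

Candidates that pass such a screen would still need the experimental verification described above; the screen only narrows the set of sequences to test.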

The nucleic acid sequence of Rep52-WT is shown in SEQ ID NO: 13, and the nucleic acid sequence of Rep52-CO is shown in SEQ ID NO: 12; a comparison of Rep52-WT and Rep52-CO, before and after codon optimization, is shown in FIG. 8.

在某些实施方式中，其中所述第一启动子包括p10启动子。In certain embodiments, the first promoter comprises the p10 promoter.

在某些实施方式中，其中所述第一启动子包括全长p10启动子。In certain embodiments, the first promoter comprises the full-length p10 promoter.

在某些实施方式中，其中所述第一启动子包含SEQ ID NO:9所示的核苷酸序列。In certain embodiments, the first promoter comprises the nucleotide sequence shown in SEQ ID NO:9.

在某些实施方式中，其中所述第二启动子包括Polyhedrin(polh)启动子。In certain embodiments, the second promoter comprises a Polyhedrin (polh) promoter.

在某些实施方式中，其中所述第二启动子包含SEQ ID NO:10所示的核苷酸序列。In certain embodiments, the second promoter comprises the nucleotide sequence shown in SEQ ID NO:10.

In another aspect, the present application provides an isolated nucleic acid molecule comprising, in order, a first polyA (pA), a nucleotide sequence encoding a Rep78 protein, a first promoter, a second promoter, a nucleotide sequence encoding a Rep52 protein, and a second polyA (pA), wherein the first promoter drives transcription of the nucleotide sequence encoding the Rep78 protein and the first pA, the second promoter drives transcription of the nucleotide sequence encoding the Rep52 protein and the second pA, the nucleotide sequence encoding the Rep52 protein and/or the nucleotide sequence encoding the Rep78 protein is codon-optimized to avoid homologous recombination, and the transcription initiation ability of the first promoter is the same as or higher than that of the second promoter.
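The element order described above, with the two promoters placed back to back and transcribing in opposite directions, can be made concrete with a small data model. This is only an illustrative sketch: the `Element` class and `unit_from` helper are hypothetical, and the choice of p10 as the first promoter and polh as the second reflects one of the embodiments rather than a requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str    # "promoter", "cds" or "polyA"
    strand: int  # +1: transcribed left to right; -1: transcribed right to left


# Order on the molecule: pA1 - Rep78 - first promoter - second promoter - Rep52 - pA2
CASSETTE = [
    Element("pA1", "polyA", -1),
    Element("Rep78", "cds", -1),
    Element("p10", "promoter", -1),    # first promoter, reads leftwards into Rep78
    Element("polh", "promoter", +1),   # second promoter, reads rightwards into Rep52
    Element("Rep52-CO", "cds", +1),
    Element("pA2", "polyA", +1),
]


def unit_from(elements, index):
    """Walk from the promoter at `index` in its direction of transcription.

    Returns the element names of the transcription unit, ending at the first polyA
    or at the first element that lies on the other strand.
    """
    promoter = elements[index]
    if promoter.kind != "promoter":
        raise ValueError("index must point at a promoter")
    names = [promoter.name]
    i = index + promoter.strand
    while 0 <= i < len(elements) and elements[i].strand == promoter.strand:
        names.append(elements[i].name)
        if elements[i].kind == "polyA":
            break
        i += promoter.strand
    return names
```

Reading each unit from its promoter shows that the layout yields two independent transcription units, one for Rep78 and one for Rep52, each terminated by its own pA.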

与所述第二启动子相比，具有相同或更高的转录启动能力的第一启动子可被定义如下。所述启动子的强度可通过在用于本申请方法的条件下获得的表达确定。Compared to the second promoter, a first promoter having the same or higher transcription initiation ability can be defined as follows: The strength of the promoter can be determined by the expression obtained under the conditions used in the method of the present application.

在某些实施方式中，所述第一启动子或第二启动子选自polh启动子、p10启动子、碱性蛋白启动子、一种诱导型启动子或IE1启动子，或任何其他晚期或极晚期杆状病毒基因启动子。In certain embodiments, the first promoter or the second promoter is selected from the group consisting of the polh promoter, the p10 promoter, the basic protein promoter, an inducible promoter or the IE1 promoter, or any other late or very late baculovirus gene promoter.

在一个实施方式中，第一启动子为p10启动子，第二启动子为polh启动子。在在另一个实施方案中，本发明核酸构建体中的第一启动子为polh启动子，第二启动子为IE1启动子。在另一个实施方案中，本发明核酸构建体中的第一启动子为pl0启动子，第二启动子为IE1启动子。在另一个实施方案中，本发明核酸构建体中的第一启动子为polh启动子，第二启动子为polh启动子。In one embodiment, the first promoter is a p10 promoter and the second promoter is a polh promoter. In another embodiment, the first promoter in the nucleic acid construct of the present invention is a polh promoter and the second promoter is an IE1 promoter. In another embodiment, the first promoter in the nucleic acid construct of the present invention is a p10 promoter and the second promoter is an IE1 promoter. In another embodiment, the first promoter in the nucleic acid construct of the present invention is a polh promoter and the second promoter is a polh promoter.

在某些实施方式中，其中所述第二启动子包括Polyhedrin启动子。In certain embodiments, the second promoter comprises a Polyhedrin promoter.

载体和表达系统Vectors and expression systems

在某些实施方式中，所述载体适用于在昆虫细胞中表达和/或复制。In certain embodiments, the vector is suitable for expression and/or replication in insect cells.

在某些实施方式中，所述载体包括杆状病毒载体，例如，所述载体可以是pFastBac载体In certain embodiments, the vector comprises a baculovirus vector, for example, the vector may be a pFastBac vector.

在某些实施方式中，所述的载体包含SEQ ID NO:14所示的核苷酸序列或与SEQ IDNO:14具有至少50％、60％、70％、80％、81％、82％、85％、90％、95％、97％、98％或99％的序列同一性的核苷酸序列。In some embodiments, the vector comprises the nucleotide sequence shown in SEQ ID NO:14 or a nucleotide sequence having at least 50%, 60%, 70%, 80%, 81%, 82%, 85%, 90%, 95%, 97%, 98% or 99% sequence identity with SEQ ID NO:14.

另一方面，本申请提供一种细胞，其包含本申请所述的分离的核酸分子或本申请所述的载体。In another aspect, the present application provides a cell comprising the isolated nucleic acid molecule described in the present application or the vector described in the present application.

在某些实施方式中，所述细胞包括昆虫细胞。例如，所述细胞可以是Sf9细胞。In certain embodiments, the cell comprises an insect cell. For example, the cell may be an Sf9 cell.

在某些实施方式中，所述目的基因包含至少一个编码在哺乳动物细胞中表达的目的基因产物的核苷酸序列。In certain embodiments, the gene of interest comprises at least one nucleotide sequence encoding a gene product of interest expressed in mammalian cells.

在某些实施方式中，其中所述第一ITR与目的基因之间包含至少一个启动子。In certain embodiments, at least one promoter is included between the first ITR and the target gene.

在某些实施方式中，其中所述第一ITR与目的基因之间包含至少一个真核启动子。In certain embodiments, at least one eukaryotic promoter is included between the first ITR and the target gene.

在某些实施方式中，其中所述第一ITR与目的基因之间包含至少一个哺乳动物细胞启动子。In certain embodiments, at least one mammalian cell promoter is included between the first ITR and the target gene.

In some embodiments, at least one nucleotide sequence encoding a gene product of interest for expression in mammalian cells is operably linked to at least one mammalian cell-compatible expression control sequence, such as a promoter. Many such promoters are known in the art. Constitutive promoters that are broadly expressed in many cell types, such as the CMV promoter, can be used. In other embodiments, the promoter is inducible, tissue-specific, cell type-specific or cell cycle-specific. For example, for liver-specific expression, the promoter can be selected from the α1-antitrypsin promoter, the thyroid hormone-binding globulin promoter, the albumin promoter, the LPS (thyroxine-binding globulin) promoter, the HCR-ApoCII hybrid promoter, the HCR-hAAT hybrid promoter and the apolipoprotein E promoter. Other examples include the E2F promoter for tumor-selective, in particular neuronal tumor-selective, expression (Parr et al., 1997, Nat. Med. 3: 1145-9) and the IL-2 promoter for use in mononuclear blood cells (Hagenbaugh et al., 1997, J Exp Med; 185: 2101-10).

在某些实施方式中，其中所述第一ITR与目的基因之间包含一个哺乳动物细胞启动子和一个昆虫细胞启动子。In certain embodiments, a mammalian cell promoter and an insect cell promoter are included between the first ITR and the target gene.

昆虫细胞Insect cells

In another aspect, the present application provides an insect cell containing a first nucleotide sequence encoding a first amino acid sequence and a second nucleotide sequence encoding a second amino acid sequence, wherein the first nucleotide sequence comprises a nucleotide sequence encoding a Rep78 protein and the second nucleotide sequence comprises a nucleotide sequence encoding a Rep52 protein, wherein the first nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:11, or a nucleotide sequence having at least 50%, 60%, 70%, 80%, 81%, 82%, 85%, 90%, 95%, 97%, 98% or 99% sequence identity with SEQ ID NO:11, and the second nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:12, or a nucleotide sequence having at least 50%, 60%, 70%, 80%, 81%, 82%, 85%, 90%, 95%, 97%, 98% or 99% sequence identity with SEQ ID NO:12.

For example, the cell line used can be derived from Spodoptera frugiperda, a Drosophila cell line, or a mosquito cell line, such as an Aedes albopictus-derived cell line. Preferred insect cells or cell lines are cells from insect species susceptible to baculovirus infection, including, for example, Se301, SeIZD2109, SeUCRU, Sf9, Sf900+, Sf21, BT1-TN-5B1-4, MG-1, Tn368, HzAml, Ha2302, Hz2E5, High Five (Invitrogen, CA, USA) and (US 6,103,526; Protein Sciences Corp., CT, USA).

在某些实施方式中，其中所述昆虫细胞还包含：一段含有至少一段细小病毒反向末端重复(ITR)核苷酸序列的第三核酸序列。In certain embodiments, the insect cell further comprises: a third nucleic acid sequence comprising at least one parvovirus inverted terminal repeat (ITR) nucleotide sequence.

In this application, the term "parvovirus ITR" is generally understood to mean a palindromic sequence containing mostly complementary, symmetrically arranged sequences, also referred to as the "A" and "C" regions. The ITR functions as an origin of replication, a site that acts in cis during replication: it serves as the recognition site for trans-acting replication proteins such as Rep78 (or Rep68), which recognize the palindrome and specific sequences within it. One exception to the symmetry of the ITR sequence is the "D" region of the ITR, which is unique (it has no complement within a single ITR). Nicking of single-stranded DNA occurs at the junction between the A and D regions, which is where new DNA synthesis starts. The D region normally lies on one side of the palindrome and provides directionality to the nucleic acid replication step. A parvovirus replicating in a mammalian cell typically has two ITR sequences. However, an ITR can be designed so that the binding sites are distributed symmetrically on both strands of the A and D regions, one on each side of the palindrome. On a double-stranded circular DNA template (such as a plasmid), Rep78- or Rep68-assisted nucleic acid replication then proceeds in both directions, and a single ITR sequence suffices for parvoviral replication of a circular vector. Thus, one ITR nucleotide sequence can be used in the context of the present invention. Preferably, however, two or another even number of regular ITRs are used. Most preferably, two ITR sequences are used. In one embodiment, the parvoviral ITR is an AAV ITR.
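The palindromic ("mostly complementary") character of an ITR described above can be quantified with a short sketch that compares a sequence with its own reverse complement. The helper names are hypothetical, and the sequences in the usage note are short illustrative strings, not ITR sequences from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement (a perfect DNA palindrome)."""
    s = seq.upper()
    return s == reverse_complement(s)


def self_complementarity(seq: str) -> float:
    """Fraction of positions that agree with the reverse complement.

    1.0 indicates a perfect palindrome; lower values indicate that only part of
    the sequence is self-complementary.
    """
    s = seq.upper()
    if not s:
        raise ValueError("sequence must be non-empty")
    rc = reverse_complement(s)
    return sum(a == b for a, b in zip(s, rc)) / len(s)
```

For instance, `is_palindromic("GAATTC")` holds, whereas a sequence with a unique, non-complemented stretch (analogous to the D region) scores below 1.0 in `self_complementarity`.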

在本申请中，术语"昆虫细胞相容载体"通常被理解为一种能够生产性转化或转染昆虫或昆虫细胞的核酸分子。生物载体的实例包括质粒、线性核酸分子和重组病毒。只要与昆虫细胞相容，任何载体都可被使用。所述载体可整合到所述昆虫细胞基因组中但所述载体也可为游离的。所述载体在所述昆虫细胞中并不需永久存在，也包括瞬时游离载体。可通过任何已知的方法例如通过所述细胞的化学处理、电穿孔或感染引入所述载体。在一些实施方案中，所述载体是一种杆状病毒、一种病毒载体或一种质粒。例如，所述载体可以是一种杆状病毒，即所述构建体是一种杆状病毒载体。In the present application, the term "insect cell compatible vector" is generally understood to be a nucleic acid molecule capable of productive transformation or transfection of insects or insect cells. Examples of biological vectors include plasmids, linear nucleic acid molecules and recombinant viruses. Any vector can be used as long as it is compatible with insect cells. The vector can be integrated into the insect cell genome but the vector can also be free. The vector does not need to be permanently present in the insect cell, and also includes transient free vectors. The vector can be introduced by any known method, such as by chemical treatment, electroporation or infection of the cell. In some embodiments, the vector is a baculovirus, a viral vector or a plasmid. For example, the vector can be a baculovirus, that is, the construct is a baculovirus vector.

在某些实施方式中，所述昆虫细胞包含本申请所述的杆状病毒载体。In certain embodiments, the insect cell comprises a baculovirus vector described herein.

For example, the insect cell may contain two different baculoviruses: ① Bac-Rep, containing Rep52 and Rep78 of rAAV under the control of the polyhedrin promoter and the p10 promoter, respectively; and ② Bac-Transgene, containing a transgene flanked by the terminal repeats of rAAV and under the control of the mammalian cytomegalovirus (CMV) promoter.

As another example, the insect cell may contain two different baculoviruses: ① Bac-Rep, containing Rep52 and Rep78 of rAAV under the control of the polyhedrin promoter and the p10 promoter, respectively; and ② Bac-Transgene, containing a transgene flanked by the terminal repeats of rAAV and jointly under the control of the insect p10 promoter and the mammalian cytomegalovirus (CMV) promoter.

在某些实施方式中，所述编码目的基因的核苷酸序列，其位置使其可整合到在昆虫细胞中复制的neDNA中。任何核苷酸序列都可被整合，以随后在用根据本发明产生的neDNA转染的哺乳动物细胞中表达。可编码例如一个可表达出RNAi试剂的核苷酸序列，即能够进行RNA干扰的RNA分子，例如shRNA(短发夹RNA)或siRNA(短干扰RNA)。In certain embodiments, the nucleotide sequence encoding the gene of interest is positioned so that it can be integrated into the neDNA replicated in insect cells. Any nucleotide sequence can be integrated for subsequent expression in mammalian cells transfected with the neDNA produced according to the invention. For example, a nucleotide sequence that can express an RNAi agent, i.e., an RNA molecule capable of RNA interference, such as shRNA (short hairpin RNA) or siRNA (short interfering RNA) can be encoded.

在某些实施方式中，所述编码目的基因的核苷酸序列，可编码转座子系统的转座酶或缺陷转座子，转座子系统包括但不限于Sleeping Beauty转座子和piggyBac转座子。In certain embodiments, the nucleotide sequence encoding the target gene may encode a transposase or a defective transposon of a transposon system, and the transposon system includes but is not limited to the Sleeping Beauty transposon and the piggyBac transposon.

在某些实施方式中，所述编码目的基因的核苷酸序列，可编码基因编辑器或用于基因编辑介导同源重组的DNA模版，基因编辑系统包括但不限于CRISPR、TALEN和各类单碱基基因编辑器。In certain embodiments, the nucleotide sequence encoding the target gene may encode a gene editor or a DNA template for gene editing-mediated homologous recombination, and the gene editing system includes but is not limited to CRISPR, TALEN, and various single-base gene editors.

The product of interest expressed in mammalian cells can be a therapeutic gene product. A therapeutic gene product can be a polypeptide, an RNA molecule (siRNA), or another gene product that, when expressed in a target cell, provides a desired therapeutic effect, such as eliminating an unwanted activity (for example, removing infected cells) or complementing a genetic defect (for example, a defect that causes loss of an enzyme activity). Examples of therapeutic polypeptide gene products include CFTR, Factor IX, Factor VIII, PAH, lipoprotein lipase (LPL, preferably LPLS447X; see WO 01/00220), apolipoprotein A1, uridine diphosphate glucuronosyltransferase (UGT), retinitis pigmentosa GTPase regulator interacting protein (RP-GRIP), and cytokines or interleukins such as IL-10. In some embodiments, examples of the therapeutic gene product include polypeptide gene products that are therapeutic antibodies. In some embodiments, examples of the therapeutic gene product include antigens capable of inducing and activating humoral or cellular immune responses in vivo, for the treatment of infectious diseases and tumors.

In addition, the third nucleotide sequence may contain a nucleotide sequence encoding a polypeptide that serves as a marker protein for assessing cell transformation and expression. Suitable marker proteins for this purpose include, for example, the fluorescent protein GFP, luciferase, and the selectable marker genes HSV thymidine kinase (for selection on HAT medium), bacterial hygromycin B phosphotransferase (for selection on hygromycin B), Tn5 aminoglycoside phosphotransferase (for selection on G418) and dihydrofolate reductase (DHFR) (for selection on methotrexate), as well as CD20 (low-affinity nerve growth factor gene). Sources for obtaining these marker genes and methods for their use are provided in Sambrook and Russell (2001), "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, New York.

In addition, the third nucleotide sequence defined above may contain a nucleotide sequence encoding a polypeptide that can serve as a fail-safe mechanism, making it possible, if deemed necessary, to rid a subject of cells transduced with the neDNA of the present invention. Such a nucleotide sequence, often referred to as a suicide gene, encodes a protein capable of converting a prodrug into a toxic substance that kills the transgenic cells in which the protein is expressed. Suitable examples of such suicide genes include the Escherichia coli (E. coli) cytosine deaminase gene or one of the thymidine kinase genes of herpes simplex virus, cytomegalovirus and varicella zoster virus, in which case ganciclovir can be used as the prodrug to kill the transgenic cells in the subject (see, for example, Clair et al., 1987, Antimicrob. Agents Chemother. 31: 844-849).

In another embodiment, a gene product of interest can be an AAV protein, in particular a Rep protein such as Rep78 and/or Rep52, or a functional fragment thereof. Expression of Rep78 and/or Rep52 in mammalian cells transduced or infected with neDNA can be advantageous for certain applications of the vector, because it allows long-term or permanent expression of any other gene product of interest introduced into the cell by the recombinant parvoviral (rAAV) vector.

Use

5)收集目的核酸分子。5) Collect the target nucleic acid molecules.

In a specific procedure, Spodoptera frugiperda (Sf9) cells are first infected with each of the two baculoviruses separately to amplify the viruses; insect production cells are then co-infected with the two purified baculoviruses and grown in suspension culture. During production, parameters such as the number of infected cells, cell viability and neDNA yield are measured at regular intervals in order to optimize the production process. The cells are harvested within the optimal time window, the neDNA is extracted and purified, and its yield and quality are then determined.

The present application also provides the following specific embodiments:

1.分离的核酸分子，其包含SEQ ID NO:12所示的核苷酸序列。1. An isolated nucleic acid molecule comprising the nucleotide sequence shown in SEQ ID NO:12.

2.根据实施方式1所述的分离的核酸分子，其编码腺相关病毒(AAV)Rep52蛋白。2. An isolated nucleic acid molecule according to embodiment 1, which encodes adeno-associated virus (AAV) Rep52 protein.

3.分离的核酸分子，其包含编码所述编码腺相关病毒(AAV)Rep78蛋白的核苷酸序列和编码Rep52蛋白的核苷酸序列，其中所述编码Rep52蛋白的核酸分子包含SEQ ID NO:12所示的核苷酸序列。3. An isolated nucleic acid molecule comprising a nucleotide sequence encoding the adeno-associated virus (AAV) Rep78 protein and a nucleotide sequence encoding the Rep52 protein, wherein the nucleic acid molecule encoding the Rep52 protein comprises the nucleotide sequence shown in SEQ ID NO:12.

4.根据实施方式3所述的分离的核酸分子，其中所述编码Rep78蛋白的核苷酸序列为野生型。4. The isolated nucleic acid molecule according to embodiment 3, wherein the nucleotide sequence encoding the Rep78 protein is wild type.

5.根据实施方式3所述的分离的核酸分子，其中所述编码Rep78蛋白的核酸分子包含SEQ ID NO:11所示的核苷酸序列。5. The isolated nucleic acid molecule according to embodiment 3, wherein the nucleic acid molecule encoding the Rep78 protein comprises the nucleotide sequence shown in SEQ ID NO:11.

6. The isolated nucleic acid molecule according to any one of embodiments 3-5, further comprising a first promoter that initiates transcription of the nucleotide sequence encoding the AAV Rep78 protein and a second promoter that initiates transcription of the nucleotide sequence encoding the AAV Rep52 protein, wherein the first promoter and the second promoter are the same or different.

7. The isolated nucleic acid molecule according to embodiment 6, wherein the first promoter and the second promoter comprise insect cell promoters.

8. The isolated nucleic acid molecule according to any one of embodiments 6-7, wherein the first promoter comprises a strong promoter.

9. The isolated nucleic acid molecule according to any one of embodiments 6-8, wherein the transcription initiation ability of the first promoter is the same as or higher than that of the second promoter.

10.根据实施方式6-9中任一项所述的分离的核酸分子，其中所述第一启动子和第二启动子各自独立地选自：p10启动子、Polyhedrin(polh)启动子和IE1启动子。10. The isolated nucleic acid molecule of any one of embodiments 6-9, wherein the first promoter and the second promoter are each independently selected from the group consisting of: p10 promoter, Polyhedrin (polh) promoter, and IE1 promoter.

11. The isolated nucleic acid molecule according to any one of embodiments 6-10, wherein the transcription directions of the first promoter and the second promoter are the same or opposite.

12. The isolated nucleic acid molecule according to any one of embodiments 6-11, wherein the first promoter is operably linked to the nucleotide sequence encoding the Rep78 protein, and the second promoter is operably linked to the nucleotide sequence encoding the Rep52 protein.

13.根据实施方式11-12中任一项所述的分离的核酸分子，当第一启动子和第二启动子的转录方向相同时，其依次包含第一启动子、编码Rep78蛋白的核苷酸序列、第二启动子和编码Rep52蛋白的核苷酸序列。13. The isolated nucleic acid molecule according to any one of embodiments 11-12, when the transcription directions of the first promoter and the second promoter are the same, comprises, in sequence, a first promoter, a nucleotide sequence encoding a Rep78 protein, a second promoter and a nucleotide sequence encoding a Rep52 protein.

14.根据实施方式13所述的分离的核酸分子，其中所述编码Rep78蛋白的核苷酸序列和所述编码Rep52蛋白的核苷酸序列下游还分别包含编码polyA的核苷酸序列(pA)。14. The isolated nucleic acid molecule according to embodiment 13, wherein the nucleotide sequence encoding the Rep78 protein and the nucleotide sequence encoding the Rep52 protein further comprise a nucleotide sequence encoding polyA (pA) downstream thereof.

15. The isolated nucleic acid molecule according to embodiment 14, which, when the transcription directions of the first promoter and the second promoter are the same, comprises, in sequence, the first promoter, the nucleotide sequence encoding the Rep78 protein, a first pA, the second promoter, the nucleotide sequence encoding the Rep52 protein, and a second pA.

16. The isolated nucleic acid molecule according to any one of embodiments 11-12, which, when the transcription directions of the first promoter and the second promoter are opposite, comprises, in sequence, the nucleotide sequence encoding Rep78, the first promoter, the second promoter, and the nucleotide sequence encoding Rep52, wherein the first promoter initiates transcription of the nucleotide sequence encoding the Rep78 protein and the second promoter initiates transcription of the nucleotide sequence encoding the Rep52 protein.

17. The isolated nucleic acid molecule according to embodiment 16, wherein the 5' end of the first promoter is directly or indirectly connected to the 5' end of the second promoter.

18. The isolated nucleic acid molecule according to embodiment 17, wherein the 3' end of the first promoter is directly or indirectly connected to the 5' end of the nucleotide sequence encoding Rep78.

19. The isolated nucleic acid molecule according to embodiment 18, further comprising a first pA, wherein the 3' end of the nucleotide sequence encoding the Rep78 protein is directly or indirectly connected to the 5' end of the first pA.

20. The isolated nucleic acid molecule according to embodiment 16, wherein the 3' end of the second promoter is directly or indirectly connected to the 5' end of the nucleotide sequence encoding Rep52.

21. The isolated nucleic acid molecule according to embodiment 20, further comprising a second pA, wherein the 3' end of the nucleotide sequence encoding the Rep52 protein is directly or indirectly connected to the 5' end of the second pA.

22. The isolated nucleic acid molecule according to any one of embodiments 14-21, wherein the pA is selected from either of SV40 polyA and HSV TK polyA.

23. The isolated nucleic acid molecule according to any one of embodiments 3-22, comprising the nucleotide sequence shown in SEQ ID NO:8.

24. An isolated nucleic acid molecule comprising, in sequence, a first polyA (pA), a nucleotide sequence encoding a Rep78 protein, a first promoter, a second promoter, a nucleotide sequence encoding a Rep52 protein, and a second polyA (pA), wherein the first promoter is the transcriptional promoter for the nucleotide sequence encoding the Rep78 protein and the first pA, and the second promoter is the transcriptional promoter for the nucleotide sequence encoding the Rep52 protein and the second pA, wherein the nucleotide sequence encoding the Rep52 protein and/or the nucleotide sequence encoding the Rep78 protein is codon-optimized to avoid homologous recombination, the first promoter and the second promoter comprise insect cell promoters, and the first promoter is a strong promoter.

25. The isolated nucleic acid molecule according to embodiment 24, wherein the first promoter comprises a p10 promoter, a polh promoter, or an IE1 promoter.

26. The isolated nucleic acid molecule according to embodiment 25, wherein the p10 promoter comprises the nucleotide sequence shown in SEQ ID NO:9.

27. The isolated nucleic acid molecule according to embodiment 24, wherein the second promoter comprises a p10 promoter, a polh promoter, or an IE1 promoter.

28. The isolated nucleic acid molecule according to embodiment 27, wherein the polh promoter comprises the nucleotide sequence shown in SEQ ID NO:10.

29. A vector comprising the isolated nucleic acid molecule according to any one of embodiments 1-28.

30. The vector according to embodiment 29, wherein the vector comprises a viral vector.

31. The vector according to embodiment 29, wherein the vector comprises a baculovirus vector.

32. The vector according to embodiment 29, wherein the vector comprises a pFastBac vector.

33. The vector according to any one of embodiments 29-32, comprising the nucleotide sequence shown in SEQ ID NO:14.

34. A cell comprising the isolated nucleic acid molecule according to any one of embodiments 1-28 or the vector according to any one of embodiments 29-33.

35. The cell according to embodiment 34, wherein the cell comprises an insect cell.

36. The cell according to embodiment 35, wherein the cell comprises a Spodoptera frugiperda (Sf9) cell.

37. A baculovirus expression system comprising a first baculovirus vector and a second baculovirus vector that comprises a nucleic acid sequence encoding a target gene, wherein the first baculovirus vector is the baculovirus vector according to any one of embodiments 31-33.

38. The baculovirus expression system according to embodiment 37, wherein, from the 5' end to the 3' end, the nucleic acid sequence encoding the target gene comprises, in sequence, a first parvovirus inverted terminal repeat (ITR), the target gene, and a second ITR.

39. The baculovirus expression system according to embodiment 38, wherein at least one promoter is further included between the first ITR and the target gene.

40. The baculovirus expression system according to any one of embodiments 38-39, wherein at least one eukaryotic promoter is further included between the first ITR and the target gene.

41. The baculovirus expression system according to any one of embodiments 38-40, wherein at least one mammalian cell promoter is further included between the first ITR and the target gene.

42. The baculovirus expression system according to any one of embodiments 38-41, wherein a mammalian cell promoter and an insect cell promoter are further included between the first ITR and the target gene.

43. The baculovirus expression system according to embodiment 42, wherein the mammalian cell promoter comprises a ubiquitous promoter and a tissue-specific promoter.

44. The baculovirus expression system according to embodiment 43, wherein the ubiquitous promoter comprises a CMV, SV40, EF1a, CAG, or UBC promoter.

45. The baculovirus expression system according to embodiment 43, wherein the tissue-specific promoter comprises an ALB, hAAT, TBG, TTR, GFAP, MHCK7, or hSyn promoter.

46. The baculovirus expression system according to any one of embodiments 42-45, wherein the insect cell promoter comprises a p10 promoter.

47. The baculovirus expression system according to any one of embodiments 42-46, wherein the promoters comprise a CMV promoter and a p10 promoter.

48. An insect cell comprising a first nucleotide sequence encoding a first amino acid sequence and a second nucleotide sequence encoding a second amino acid sequence, wherein the first nucleotide sequence comprises a nucleotide sequence encoding a Rep78 protein and the second nucleotide sequence comprises a nucleotide sequence encoding a Rep52 protein, and wherein the first nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:11 and the second nucleotide sequence comprises the nucleotide sequence shown in SEQ ID NO:12.

49. The insect cell according to embodiment 48, wherein the first and second nucleotide sequences are part of one nucleic acid construct, and each of the first and second nucleotide sequences is operably linked to an expression control sequence for expression in insect cells.

50. The insect cell according to any one of embodiments 48-49, further comprising a third nucleotide sequence that contains at least one parvoviral inverted terminal repeat (ITR) nucleotide sequence.

51. The insect cell according to embodiment 50, wherein the third nucleotide sequence further contains at least one nucleotide sequence encoding a gene of interest.

52. The insect cell according to any one of embodiments 50-51, wherein the third nucleotide sequence contains two parvoviral ITR nucleotide sequences, and the at least one nucleotide sequence encoding a gene of interest is located between the two parvoviral ITR nucleotide sequences.

53. The insect cell according to embodiment 52, wherein the parvovirus comprises an adeno-associated virus.

54. The insect cell according to any one of embodiments 51-53, wherein the third nucleotide sequence is part of another nucleic acid construct, and each nucleotide sequence encoding a gene of interest is operably linked to an expression control sequence for mammalian expression.

55. The insect cell according to any one of embodiments 49-54, wherein the nucleic acid construct is an insect cell-compatible vector.

56. The insect cell according to embodiment 55, wherein the nucleic acid construct is a baculovirus vector.

57. The insect cell according to any one of embodiments 48-56, comprising the baculovirus vector according to any one of embodiments 31-33 or the baculovirus expression system according to any one of embodiments 37-47.

58. Use of the baculovirus expression system according to any one of embodiments 37-47 or the insect cell according to any one of embodiments 48-57 in the preparation of a target nucleic acid molecule.

59. The use according to embodiment 58, wherein the target nucleic acid molecule is a linear DNA molecule with covalently closed ends (neDNA).

60. A method for preparing a target nucleic acid molecule, comprising culturing the insect cell according to any one of embodiments 48-57.

61. The preparation method according to embodiment 60, comprising:

1) providing the baculovirus expression system according to any one of embodiments 37-47;

5) collecting the target nucleic acid molecule.

62. The preparation method according to embodiment 61, further comprising isolating the target nucleic acid molecule.

63. A kit comprising the isolated nucleic acid molecule according to any one of embodiments 1-28, the baculovirus expression system according to any one of embodiments 37-47, and/or the insect cell according to any one of embodiments 48-57.

Without intending to be bound by any theory, the following examples are intended only to illustrate the nucleic acid molecules, vectors, expression systems, preparation methods, and uses of the present application, and not to limit the scope of the invention of the present application.

Examples

Example 1

I. Experimental Methods

The technology described herein uses an insect cell-baculovirus system to prepare a linear double-stranded no-end DNA (neDNA) expression vector that contains AAV inverted terminal repeats (ITRs) and a gene expression cassette (an EGFP expression cassette is used here as an example). The exemplary method for synthesizing the neDNA vector disclosed herein involves the following main steps:

1. Vector construction

1.1 Construction of the pFastBac-ITR-EGFP donor vector

Using plasmid pAAV-CMV-p10-EGFP as the template, the CMV-p10-EGFP sequence was amplified by PCR with upstream and downstream primers P1 and P2, respectively (see Table 1). The amplified fragment was cloned into the vector pFastBac-AAV-MCS-PA via the 5' BamHI and 3' SalI restriction sites to construct plasmid pFastBac-ITR-EGFP.

The pAAV-CMV-p10-EGFP vector was obtained by gene synthesis of the CMV-p10 sequence [SEQ ID No: 1] with added 5' KpnI and 3' NcoI restriction sites, followed by restriction digestion and ligation into the pAAV-EGFP vector [SEQ ID No: 2].

The pFastBac-AAV-MCS-PA vector was obtained by gene synthesis of the ITR-MCS-PA-ITR sequence [SEQ ID No: 3] with added 5' KpnI and 3' HindIII restriction sites, followed by restriction digestion and ligation into the pFastBac dual vector [SEQ ID No: 4].

1.2 Construction of the pFastBac-ITR-Fluc donor plasmid

The CpGfreeFluc sequence [SEQ ID No: 5] was obtained by gene synthesis with added 5' SalI and 3' PmlI restriction sites and inserted into pFastBac-AAV-MCS-PA by restriction digestion and ligation to construct pFastBac-CpGfreeFluc.

Table 1: Primer sequences

Primer name    Sequence (5'-3')              SEQ ID NO:
P1             GATCCGGTACCACGCGTCTAG         15
P2             CTCGACGTCGACTTTACTTGTACAGC    16
P3             GCGGGGTTTTACGAGATTGTG         17
P4             GGGGTGCCTGCTCAATCAGA          18
P5             GCAGCACACACTGACATCCA          19
P6             GATCACCGGCGCATCAGAATTG        20
P7             ACTTCAAGATCCGCCACAACAT        21
P8             TCTCGTTGGGGTCTTGCTCAG         22
M13F           CCCAGTCACGACGTTGTAAAACG       23
M13R           AGCGGATAACAATTTCACACAGG       24

1.3 Construction of the pFastBac-RepWT and pFastBac-inRep helper vectors

The Rep52WT gene [SEQ ID No: 13] was synthesized and cloned into the vector pFastBac-Rep via 5' XmaI and 3' NheI to construct plasmid pFastBac-RepWT. The inRep gene [SEQ ID No: 6] was synthesized and cloned into the vector pFastBac dual via 5' BstZ17I and 3' SphI to construct plasmid pFastBac-inRep.

1.4 Rep52 codon optimization and construction of the pFastBac-CORep helper vector

Sf9 insect cell transcriptome sequencing data were retrieved from the NCBI database, and the codon adaptation index and codon background parameters were extracted from them. With the initial Rep52WT sequence as input, a codon optimization algorithm was used to generate progeny sequences at random, iterating until the results converged, to obtain candidate codon-optimized gene sequences. To avoid homologous recombination, candidate sequences with no more than 30 consecutive identical bases and no more than 85% homology were selected for subsequent experimental verification, finally yielding the Rep52 codon-optimized sequence.
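The selection criterion above (no more than 30 consecutive identical bases and no more than 85% homology) can be sketched as a simple filter. This is a minimal illustration and not the applicant's algorithm: it assumes the candidate and the reference sequence have the same length and are compared position by position without gaps, which holds for synonymous recoding, and it treats "homology" as percent identity. The function names and the toy sequences in the usage note are hypothetical.

```python
def longest_identical_run(ref: str, cand: str) -> int:
    """Longest stretch of consecutive positions at which both sequences match."""
    if len(ref) != len(cand):
        raise ValueError("sequences must have equal length (ungapped comparison assumed)")
    best = run = 0
    for a, b in zip(ref.upper(), cand.upper()):
        run = run + 1 if a == b else 0  # extend the run on a match, reset on a mismatch
        best = max(best, run)
    return best


def percent_identity(ref: str, cand: str) -> float:
    """Percentage of positions that are identical between the two sequences."""
    if len(ref) != len(cand) or not ref:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ref.upper(), cand.upper()))
    return 100.0 * matches / len(ref)


def passes_recombination_screen(ref: str, cand: str,
                                max_run: int = 30,
                                max_identity: float = 85.0) -> bool:
    """True if the candidate meets both thresholds quoted in the text."""
    return (longest_identical_run(ref, cand) <= max_run
            and percent_identity(ref, cand) <= max_identity)
```

For example, with the toy reference `ATGGCTGCTGGT` and the toy candidate `ATGGCCGCAGGA`, 9 of 12 positions match (75% identity) and the longest identical run is 5 bases, so the candidate passes the screen.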

The Rep52 codon-optimized sequence Rep52-CO [SEQ ID No: 12] was synthesized and cloned into the vector pFastBac-Rep via 5' XmaI and 3' NheI to construct plasmid pFastBac-CORep.

1.5 Construction of the pFastBac-p10Rep helper vector

The p10 sequence [SEQ ID No: 9] was obtained by conventional gene synthesis with added 5' BstZ17I and 3' NotI sites and cloned into the vector pFastBac-CORep via 5' BstZ17I and 3' NotI to construct plasmid pFastBac-p10Rep.

2. Transformation of plasmids into DH10Bac

The donor plasmid (pFastBac-ITR-EGFP or pFastBac-ITR-Fluc) and the helper plasmids (pFastBac-RepWT, pFastBac-inRep, pFastBac-CORep, or pFastBac-p10Rep) were each transformed separately into DH10Bac E. coli competent cells (Solarbio, Cat. No. C1480). Recombination between the plasmid and the bacmid shuttle vector in the DH10Bac cells was induced to generate recombinant bacmid. The product of the Φ80dlacZΔM15 gene in DH10Bac cells enables α-complementation of β-galactosidase, which was used for blue-white screening of recombinant bacmid on LB agar plates containing kanamycin (50 μg/ml), tetracycline (10 μg/ml), gentamicin (7 μg/ml), IPTG (40 μg/ml), and X-gal (100 μg/ml). White single colonies, which result from transposition that disrupts the β-galactosidase reporter gene, were picked and cultured overnight at 37°C in LB medium containing kanamycin (50 μg/ml), tetracycline (10 μg/ml), and gentamicin (7 μg/ml). Recombinant bacmid DNA was extracted from the E. coli using the PureLink™ HiPure Plasmid DNA Miniprep Kit (Thermo Fisher, Cat. No. K2100-02).

3. PCR identification of recombinant bacmid

The recombinant bacmid was identified by PCR using the universal bacmid primers M13F/R (see Table 1). The PCR conditions were: 98°C for 2 min; 35 cycles of 98°C for 10 s, 60°C for 30 s, and 72°C for 1 min; 72°C for 5 min. After PCR, agarose gel electrophoresis was performed to determine the size of the target band.

4. Obtaining P0 recombinant baculovirus

The correctly identified recombinant bacmids Bacmid-ITR-EGFP, Bacmid-ITR-Fluc, Bacmid-RepWT, Bacmid-inRep, Bacmid-CORep, and Bacmid-p10Rep were each transfected, using ExpiFectamine™ Sf Transfection Reagent (Thermo Fisher, Cat. No. A38915), into Sf9 cells (Thermo Fisher, Cat. No. 11496-015; cultured at a constant 27°C without CO₂) pre-plated in 6-well plates, each well containing 1×10⁶ Sf9 cells in 3 ml of Sf-900™ III SFM medium (Thermo Fisher, Cat. No. 12658-019). Culture was continued for 72-96 h. When the cells showed vacuole-like structures and tended to lyse, the culture supernatant was collected by centrifugation (500 g, 5 min) and passed through a 0.22 μm filter to obtain the P0 baculovirus, which can be stored at 4°C protected from light.

The titer of the recombinant baculovirus was determined by plaque assay, which relies on replication of the virus in infected cells and infection of neighboring cells to form focal lesions. Sf9 cells were pre-plated in 6-well plates at 1×10⁶ cells. The P0 baculovirus stock was serially diluted 10-fold in Sf-900™ III SFM medium to dilutions of 10^-1 to 10^-8, with 5 ml prepared at each dilution. 1 ml of each dilution was added to the wells of the 6-well plates, with 2 replicate wells per dilution, and the plates were incubated at 27°C for 1 h. 10 ml of 4% agar solution was mixed thoroughly with 30 ml of Sf-900 medium (1.3x) (Thermo Fisher, Cat. No. 10967-032) and kept in a 40°C water bath until use. The virus dilution was completely aspirated from the wells, 2 ml/well of the agar mixture was overlaid, and the plates were left at room temperature for 1 h; once the agar had fully solidified, the plates were transferred to a 27°C incubator for further incubation. After 7-10 days, the small white spots visible to the naked eye are viral plaques; the agar in the 6-well plates can also be stained with 1 mg/ml neutral red so that the plaques can be counted more clearly. The individually visible plaques at each dilution were counted, and the virus titer was calculated using the following formula:
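The formula itself does not appear in this text. A widely used form of the plaque-assay calculation, assumed here rather than taken from the application, is titer (pfu/ml) = mean plaque count / (dilution × inoculum volume in ml). The sketch below applies it to the replicate wells of one dilution; the function name and the plaque counts in the usage note are hypothetical.

```python
def plaque_titer(plaque_counts, dilution, inoculum_ml=1.0):
    """Titer in pfu/ml from replicate plaque counts at a single dilution.

    plaque_counts: plaques counted in each replicate well
    dilution:      dilution of the stock applied to those wells, e.g. 1e-5
    inoculum_ml:   volume of diluted virus added per well (1 ml in the text)
    """
    if not plaque_counts:
        raise ValueError("at least one plaque count is required")
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum volume must be positive")
    mean_plaques = sum(plaque_counts) / len(plaque_counts)
    return mean_plaques / (dilution * inoculum_ml)
```

For example, two replicate wells with 42 and 38 plaques at the 10^-5 dilution and a 1 ml inoculum give a mean of 40 plaques, which corresponds to about 4×10⁶ pfu/ml.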

In parallel, SYBR dye-based real-time quantitative PCR was used to measure the copy number of the exogenous gene in the recombinant baculovirus and thereby determine the recombinant baculovirus titer. P0 baculovirus genomic DNA was extracted using the GeneJET Viral DNA and RNA Purification Kit (Thermo Fisher, Cat. No. K0821), dissolved in TE solution, and stored at -80°C until use. qPCR primers were designed against the EGFP, RepWT, inRep, CORep, and p10Rep sequences: the RepWT, CORep, and p10Rep sequences share primers P3 and P4, the inRep primers are P5 and P6, and the EGFP primers are P7 and P8 (primer sequences are given in Table 1). The real-time quantitative PCR reaction contained 25 μl of SYBR dye premix, 2 μl each of the upstream and downstream primers (10 μM), 5 μl of sample solution, and 16 μl of distilled water, for a total volume of 50 μl. The PCR program was: pre-denaturation at 95°C for 60 s; 40 cycles of 95°C for 15 s, 60°C for 15 s, and 72°C for 45 s; followed by melting curve analysis. Each sample was run in triplicate. The initial concentration of each test sample was determined from the relationship between concentration and Ct value (threshold cycle, Ct) on the standard curve.
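The standard-curve step can be illustrated with a short calculation. The sketch below assumes the usual linear model Ct = slope × log10(copies) + intercept, fits it to a dilution series of standards by ordinary least squares, and inverts it for unknown samples. The function names and all numerical values in the usage note are hypothetical; the application does not list its standards.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct against log10(copy number); returns (slope, intercept)."""
    if len(copies) != len(ct_values) or len(copies) < 2:
        raise ValueError("need at least two standards with matching Ct values")
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0
```

With hypothetical standards of 10³ to 10⁶ copies giving Ct values of 30.00, 26.68, 23.36, and 20.04, the fitted slope is about -3.32 (efficiency close to 100%), and a sample with a mean Ct of 25.02 corresponds to roughly 3.2×10⁴ copies per reaction. Averaging the triplicate Ct values before inversion is one simple way to use the replicates.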

5. Amplification of recombinant baculovirus

The P0 baculovirus stock is usually small in volume and low in titer, so further infection of Sf9 cells is needed to obtain high-titer baculovirus. The titer of the initial P0 baculovirus is 1×10⁶ to 1×10⁷ pfu/ml (plaque-forming units, pfu), and the titer of the P1 baculovirus after amplification is 1×10⁷ to 1×10⁸ pfu/ml. A 125 ml shake flask containing 30 ml of Sf9 cells (27°C, 130 rpm) at a density of 2×10⁶ cells/ml was infected with P0 baculovirus at a multiplicity of infection (MOI) of 0.1. Culture was continued for 72-96 h; when dead cells reached 60-80%, the culture supernatant was collected by centrifugation (500 g, 5 min) and passed through a 0.22 μm filter to obtain the P1 baculovirus. High-titer P2 virus was obtained by repeating the same steps. Virus titer was determined by real-time quantitative fluorescence PCR and by plaque assay, as described above.
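The inoculum needed for a given MOI follows from the definition of MOI as infectious units per cell: volume (ml) = MOI × total cells / titer (pfu/ml). A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the stock titer used in the usage note is an assumed value chosen from within the P0 range stated above.

```python
def inoculum_volume_ml(moi, cell_density_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Volume of virus stock (ml) needed to infect a culture at the requested MOI."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    total_cells = cell_density_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml
```

For the culture described above (30 ml at 2×10⁶ cells/ml, MOI = 0.1), an assumed P0 titer of 5×10⁶ pfu/ml would require about 1.2 ml of stock. The same function applies to the MOI = 3 infections used later, where the higher titer of the amplified stock keeps the volume practical.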

6. Identification of Rep protein expressed by the Bac-p10Rep baculovirus

Western blotting (WB) was used to detect Rep protein expression. Sf9 cells were infected separately with P2-P5 virus at an MOI of 3, and cell samples were collected by centrifugation. The cells were lysed with 1x SDS solution to prepare protein loading samples. After SDS-PAGE, the protein samples were transferred to a nitrocellulose membrane. Rep protein expression was detected with an anti-AAV Rep mouse monoclonal antibody (ARP, Cat. No. 03-65171), and the stability of Rep expression was assessed by comparing Rep protein expression across the P2, P3, P4, and P5 baculovirus passages.

7. Preparation of the neDNA-ITR-EGFP and neDNA-ITR-Fluc expression vectors

Two P2 recombinant baculoviruses, BacV-ITR-EGFP or BacV-ITR-Fluc together with BacV-Rep, were used to co-infect Sf9 insect cells at 2×10⁶ cells/ml at an MOI of 1-5 (an appropriate MOI being selected). Culture was continued for 72-96 h; when the cell diameter was 18-20 μm and cell viability was about 80%, the cells were collected (500 g, 5 min). Low-molecular-weight DNA was extracted from the Sf9 cells using a QIAGEN plasmid extraction kit (Cat. No. 12163).

8. Expression of neDNA in HEK293T cells

HEK293T cells (cultured in high-glucose DMEM (Gibco, Cat. No. 11965-092) containing 10% fetal bovine serum at 37°C and 5% CO₂) were transfected with neDNA-ITR-EGFP using Lipofectamine 2000 transfection reagent (Thermo Fisher, Cat. No. 11668-019), and EGFP expression was observed by fluorescence microscopy after 72 h.

9. Expression of neDNA delivered by lipid nanoparticles (LNPs) in C57BL/6 mice

1 mg of neDNA-ITR-Fluc was dissolved in sodium acetate-acetic acid buffer as the aqueous phase. The ionizable cationic lipid Dlin-MC3-DMA, dioleoylphosphatidylcholine (DOPC), cholesterol, and PEG lipid were dissolved in ethanol at a ratio of 50:10:38:2 as the lipid phase. The aqueous and lipid phases were mixed on a Precision Nanosystems Ignite microfluidic chip and then dialyzed against neutral phosphate buffer to obtain a suspension of LNP-DNA complexes in neutral buffer. C57BL/6 mice were injected via the tail vein at a dose of 2 mg/kg. To observe neDNA-ITR-Fluc-mediated gene expression, the mice were injected intraperitoneally with 150 mg/kg of the luciferase substrate luciferin, and the fluorescence signal was observed on a Xenogen IVIS Spectrum small-animal in vivo imager.
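The per-animal amounts implied by the mg/kg doses, and the fractional lipid composition implied by the 50:10:38:2 ratio, are simple arithmetic. The sketch below assumes a 25 g mouse, a hypothetical body weight that the text does not state, and treats the lipid ratio as relative parts without assuming whether it is molar or by mass. The function names are illustrative.

```python
def dose_mg(dose_mg_per_kg, body_weight_g):
    """Absolute dose in mg for an animal of the given body weight in grams."""
    return dose_mg_per_kg * body_weight_g / 1000.0


def lipid_fractions(ratio):
    """Convert relative parts (e.g. 50:10:38:2) into fractions that sum to 1."""
    total = sum(ratio.values())
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive value")
    return {name: part / total for name, part in ratio.items()}


# Ratio as given in the text; the component names are used only as dictionary keys.
LNP_RATIO = {"Dlin-MC3-DMA": 50, "DOPC": 10, "cholesterol": 38, "PEG lipid": 2}
```

For the assumed 25 g mouse, 2 mg/kg of neDNA corresponds to 0.05 mg (50 μg) per animal and 150 mg/kg of luciferin to 3.75 mg per animal; `lipid_fractions(LNP_RATIO)` gives 0.50, 0.10, 0.38, and 0.02 for the four components.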

二、实验结果2. Experimental Results

1、供体质粒和杆状病毒的制备1. Preparation of donor plasmid and baculovirus

在质粒pFastBac上插入两端包含ITR序列的EGFP基因表达框，得到重组质粒pFastBac-ITR-EGFP(如图1A)。The EGFP gene expression cassette containing ITR sequences at both ends was inserted into the plasmid pFastBac to obtain the recombinant plasmid pFastBac-ITR-EGFP (as shown in FIG1A ).

构建辅助质粒pFastBac-p10Rep(如图1B，SEQ ID NO:14)，p10为Rep78-WT的启动子，polh为Rep52-CO的启动子。构建辅助质粒pFastBac-RepWT，其中ΔIE1为Rep78的启动子，polh为Rep52WT的启动子。构建辅助质粒pFastBac-inRep，在同一表达框中，其中p10为Rep78启动子，Rep78和Rep52之间有一个人工合成的内含子(Intron)序列(包含polh启动子)，作为Rep52的启动子，该构建基于Haifeng Chen 2008年的工作。构建辅助质粒pFastBac-CORep，ΔIE1为Rep78的启动子，polh为Rep52-CO的启动子。图2为p10Rep基因、RepWT基因、inRep基因和CORep基因的转录示意图。Auxiliary plasmid pFastBac-p10Rep (as shown in Figure 1B, SEQ ID NO: 14) was constructed, where p10 is the promoter of Rep78-WT and polh is the promoter of Rep52-CO. Auxiliary plasmid pFastBac-RepWT was constructed, where ΔIE1 is the promoter of Rep78 and polh is the promoter of Rep52WT. Auxiliary plasmid pFastBac-inRep was constructed, where p10 is the promoter of Rep78 and there is an artificially synthesized intron sequence (including polh promoter) between Rep78 and Rep52 in the same expression frame as the promoter of Rep52. This construction is based on the work of Haifeng Chen in 2008. Auxiliary plasmid pFastBac-CORep was constructed, where ΔIE1 is the promoter of Rep78 and polh is the promoter of Rep52-CO. FIG. 2 is a schematic diagram of the transcription of the p10Rep gene, the RepWT gene, the inRep gene, and the CORep gene.

The donor plasmid (pFastBac-ITR-EGFP) and each of the helper plasmids were separately transformed into DH10Bac E. coli competent cells. Blue-white screening yielded the bacmids Bacmid-ITR-EGFP and Bacmid-Rep, and PCR was used to further select recombinant bacmids with the correct sequence. The recombinant bacmids were each transfected into Sf9 cells to obtain the recombinant baculoviruses BacV-ITR-EGFP and BacV-Rep.

2. Expression of Rep protein in Sf9 insect cells

Rep protein expressed in Sf9 cells infected with BacV-p10Rep was detected with a mouse monoclonal anti-AAV Rep antibody (ARP, Cat. No. 03-65171), with uninfected Sf9 cells as the negative control. BacV-RepWT^[1], BacV-inRep^[2] and BacV-CORep (RepWT with an optimized Rep52 sequence) served as controls. P1 BacV-Rep baculovirus was serially passaged in Sf9 cells at MOI = 0.1 to obtain P2, P3, P4 and P5 baculovirus. Sf9 cells were then infected with P2-P5 BacV-Rep at MOI = 3, cell samples were collected, and Rep protein expression was detected by Western blot (WB) to assess the stability of Rep expression across passages (Figure 3).
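The text gives the MOI values (0.1 for passaging, 3 for the expression test) but not the virus titers or culture sizes. As a rough sketch, the inoculum volume for a given MOI follows from volume = MOI × number of cells / titer. The cell number and titer below are assumed example values, and the function name is illustrative.

```python
def inoculum_volume_ml(moi, n_cells, titer_pfu_per_ml):
    """Volume of virus stock (mL) that delivers `moi` infectious units per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * n_cells / titer_pfu_per_ml


if __name__ == "__main__":
    cells = 6e7   # ASSUMPTION: example culture size
    titer = 1e8   # ASSUMPTION: example stock titer in pfu/mL, not given in the text
    for label, moi in (("passaging", 0.1), ("expression test", 3)):
        volume = inoculum_volume_ml(moi, cells, titer)
        print(f"{label}: MOI {moi} -> {volume:.2f} mL of stock")
```

With these assumed values, passaging at MOI = 0.1 needs 0.06 mL of stock and infection at MOI = 3 needs 1.8 mL; in practice each passage would be titrated before use.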

3. Preparation of neDNA-ITR-EGFP gene expression vector

Sf9 insect cells were co-infected with the two P2 recombinant baculoviruses, BacV-ITR-EGFP and BacV-Rep, at MOI = 3 and cultured for 72-96 h. When the cell diameter reached 18-20 μm and cell viability was about 80%, the cells were harvested and the low-molecular-weight neDNA-ITR-EGFP gene expression vector was extracted. Electrophoresis on a 0.8% agarose gel was used to determine the band sizes of the neDNA. The main bands were at 2.7 kb and 5.4 kb, corresponding to the monomer and dimer of the neDNA-ITR-EGFP expression vector, respectively. Judging by size, the bands above the dimer are trimers and higher-order multimers (Figure 4).

The neDNA was digested with the single restriction endonuclease SalI, giving 2 kb and 0.7 kb bands, consistent with the fragment sizes expected from digestion of the monomer (Figures 5A-B). 5A: the neDNA-EGFP monomer is 2.7 kb long, and SalI cuts it into 2 kb and 0.7 kb fragments. 5B: the dimer has a "head-to-head, tail-to-tail" structure and is cut into either 2 kb and 1.4 kb fragments or 4 kb and 0.7 kb fragments. Dark blue rectangles indicate the 5' ITR sequence and light blue rectangles indicate the 3' ITR sequence.
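The fragment sizes quoted for the SalI digest follow from the position of the single SalI site in the 2.7 kb monomer. The sketch below reproduces that arithmetic for the monomer and for the two dimer arrangements. It assumes, for illustration only, that the site lies 2.0 kb from the end labelled "head" and 0.7 kb from the end labelled "tail"; which physical ITR corresponds to "head" is not specified here, and all names are hypothetical.

```python
MONOMER_BP = 2700
SALI_FROM_HEAD_BP = 2000  # ASSUMPTION: site is 2.0 kb from "head", 0.7 kb from "tail"


def digest(length_bp, cut_sites_bp):
    """Fragment lengths after cutting a linear molecule at the given positions."""
    bounds = [0] + sorted(cut_sites_bp) + [length_bp]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def monomer_fragments():
    return digest(MONOMER_BP, [SALI_FROM_HEAD_BP])


def dimer_fragments(junction):
    """Fragments of a dimer joined 'tail-to-tail' or 'head-to-head'."""
    if junction == "tail-to-tail":
        # head ... site ... tail | tail ... site ... head
        first_site = SALI_FROM_HEAD_BP
    elif junction == "head-to-head":
        # tail ... site ... head | head ... site ... tail
        first_site = MONOMER_BP - SALI_FROM_HEAD_BP
    else:
        raise ValueError("junction must be 'tail-to-tail' or 'head-to-head'")
    # The second site mirrors the first about the junction.
    return digest(2 * MONOMER_BP, [first_site, 2 * MONOMER_BP - first_site])


if __name__ == "__main__":
    print("monomer:", monomer_fragments())
    for junction in ("tail-to-tail", "head-to-head"):
        print(junction, sorted(set(dimer_fragments(junction))))
```

Under this assumption the tail-to-tail dimer gives 2 kb and 1.4 kb bands and the head-to-head dimer gives 4 kb and 0.7 kb bands, matching the two patterns described for Figure 5B.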

The yield of the neDNA-ITR-EGFP gene expression vector differed depending on the Rep construct used. With p10Rep, nearly 270 μg of neDNA gene expression vector was obtained per 6×10⁷ Sf9 cells, 2-3 times the yield obtained with RepWT, inRep or CORep; see Table 2 for details.

Table 2: neDNA yield
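The headline figure can be restated per cell, and the "2-3 times" statement implies an approximate range for the comparator constructs. The sketch below performs only that arithmetic: the comparator range is derived from the stated fold change, not taken from measured values in Table 2, and the function names are illustrative.

```python
def per_cell_yield_pg(total_ug, n_cells):
    """Average yield per cell in picograms (1 ug = 1e6 pg)."""
    return total_ug * 1e6 / n_cells


def implied_comparator_range_ug(total_ug, fold_low=2.0, fold_high=3.0):
    """Range of comparator yields implied by a fold_low to fold_high improvement."""
    return total_ug / fold_high, total_ug / fold_low


if __name__ == "__main__":
    print(f"{per_cell_yield_pg(270, 6e7):.1f} pg neDNA per cell")
    low, high = implied_comparator_range_ug(270)
    print(f"implied comparator yield: about {low:.0f}-{high:.0f} ug per 6e7 cells")
```

On these numbers, 270 μg per 6×10⁷ cells is about 4.5 pg of neDNA per cell, and a 2-3 fold advantage would place the other constructs at roughly 90-135 μg for the same number of cells.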

4. Transfection of HEK293 cells with neDNA-ITR-EGFP gene expression vector

neDNA-ITR-EGFP was transfected into HEK293 cells with Lipofectamine 2000 transfection reagent, and EGFP expression was observed by fluorescence microscopy starting 24 h after transfection. Figure 6 shows expression of the neDNA-ITR-EGFP gene expression vector in HEK293 cells 72 h after transfection.

5. The neDNA-ITR-Fluc gene expression vector expresses luciferase in mice

The donor plasmid pFastBac-ITR-Fluc and the helper plasmid pFastBac-p10Rep were used to construct the bacmids Bacmid-ITR-Fluc and Bacmid-p10Rep, from which the recombinant baculoviruses BacV-ITR-Fluc and BacV-p10Rep were obtained and used to prepare the neDNA-ITR-Fluc gene expression vector. neDNA-ITR-Fluc was delivered in lipid nanoparticles (LNPs) by tail vein injection into C57BL/6 mice. Stable luciferase expression in the mouse liver was observed from 24 h onward (Figure 7).

In summary:

The present application provides a method for producing neDNA using an insect cell-baculovirus expression system. The neDNA produced by this method takes several forms, such as monomers, dimers, trimers and higher-order multimers. In mouse experiments, compared with plasmid DNA, neDNA can mediate long-term expression of the target gene expression cassette in vivo (for example, in the liver).

The method optimizes the Rep protein expression vector and improves the stability of Rep protein expression, thereby increasing neDNA yield and the stability of the production system. p10Rep is the optimized expression cassette: the complete p10 promoter increases the expression level of Rep78, and codon optimization of the Rep52 sequence avoids homologous recombination between Rep78 and Rep52.

When used to co-infect Sf9 cells together with Bac-EGFP, Bac-p10Rep gave the highest neDNA yield of the Bac-Rep viruses tested, an average 2-3 fold increase.

Compared with the other Bac-Rep viruses, Bac-p10Rep showed better stability of Rep protein (Rep78) expression after three consecutive baculovirus passages.

References:

[1] Masashi Urabe, Chuantian Ding, Robert M Kotin. Insect cells as a factory to produce adeno-associated virus type 2 vectors. Hum Gene Ther, 2002, 13(16):1935-43.

[2] Haifeng Chen. Intron splicing-mediated expression of AAV Rep and Cap genes and production of AAV vectors in insect cells. Mol Ther, 2008, 16(5):924-30.

Sequence Listing

<110> Shanghai Boyin Biotechnology Co., Ltd.

<120> Baculovirus vector and its use

<130> 0251-PA-002

<160> 24

<170> PatentIn version 3.5

<210> 1

<211> 1097

<212> DNA

<213> Artificial Sequence

<220>

<223> CMV-p10

<400> 1

ggtaccacgc gtctagttat taatagtaat caattacggg gtcattagtt catagcccat 60

atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 120

acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 180

tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 240

tgtatcatat gccaagtccg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 300

attatgccca gtacatgacc ttacgggact ttcctacttg gcagtacatc tacgtattag 360

tcatcgctat taccatgctg atgcggtttt ggcagtacac caatgggcgt ggatagcggt 420

ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt ttgttttggc 480

accaaaatca acgggacttt ccaaaatgtc gtaataaccc cgccccgttg acgcaaatgg 540

gcggtaggcg tgtacggtgg gaggtctata taagcagacg tcgtttagtg aaccgtcaga 600

tcactagatg ctttattgcg gtagtttatc acagttaaat tgctaacgcc agtctcgaac 660

ttaacgtgca gaagttggtc gtgaggcact gggcaggtaa gtatcgggcc ctttgtgcgg 720

ggggagcggc tcggggctgt ccgcgggggg acggctgcct tcggggggga cggggcaggg 780

cggggttcgg cttctggcgt gtgaccggcg gctctagagc ctctgctaac catgttcatg 840

ccttcttctt tttcctacag ctcctgggca acgtgctggt tattgtgctg tctcatcatt 900

ttggcaaaga attggatcgg accgaaatta atacgactca ctatagggga attgtgagcg 960

gataacaatt ccccggagtt aatccgggac ctttaattca acccaacaca atatattata 1020

gttaaataag aattattatc aaatcatttg tatattaatt aaaatactat actgtaaatt 1080

acattttatt tacaatc 1097

<210> 2

<211> 5547

<212> DNA

<213> Artificial Sequence

<220>

<223> pAAV-GFP

<400> 2

cagcagctgg cgtaatagcg aagaggcccg caccgatcgc ccttcccaac agttgcgcag 60

cctgaatggc gaatggaatt ccagacgatt gagcgtcaaa atgtaggtat ttccatgagc 120

gtttttcctg ttgcaatggc tggcggtaat attgttctgg atattaccag caaggccgat 180

agtttgagtt cttctactca ggcaagtgat gttattacta atcaaagaag tattgcgaca 240

acggttaatt tgcgtgatgg acagactctt ttactcggtg gcctcactga ttataaaaac 300

acttctcagg attctggcgt accgttcctg tctaaaatcc ctttaatcgg cctcctgttt 360

agctcccgct ctgattctaa cgaggaaagc acgttatacg tgctcgtcaa agcaaccata 420

gtacgcgccc tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac 480

cgctacactt gccagcgccc tagcgcccgc tcctttcgct ttcttccctt cctttctcgc 540

cacgttcgcc ggctttcccc gtcaagctct aaatcggggg ctccctttag ggttccgatt 600

tagtgcttta cggcacctcg accccaaaaa acttgattag ggtgatggtt cacgtagtgg 660

gccatcgccc tgatagacgg tttttcgccc tttgacgttg gagtccacgt tctttaatag 720

tggactcttg ttccaaactg gaacaacact caaccctatc tcggtctatt cttttgattt 780

ataagggatt ttgccgattt cggcctattg gttaaaaaat gagctgattt aacaaaaatt 840

taacgcgaat tttaacaaaa tattaacgtt tacaatttaa atatttgctt atacaatctt 900

cctgtttttg gggcttttct gattatcaac cggggtacat atgattgaca tgctagtttt 960

acgattaccg ttcatcgcct gcactgcgcg ctcgctcgct cactgaggcc gcccgggcaa 1020

agcccgggcg tcgggcgacc tttggtcgcc cggcctcagt gagcgagcga gcgcgcagag 1080

agggagtgga attcacgcgt ggtacgatct gaattcggta caattcacgc gtgggtacca 1140

cgcgtctagt tattaatagt aatcaattac ggggtcatta gttcatagcc catatatgga 1200

gttccgcgtt acataactta cggtaaatgg cccgcctggc tgaccgccca acgacccccg 1260

cccattgacg tcaataatga cgtatgttcc catagtaacg ccaataggga ctttccattg 1320

acgtcaatgg gtggagtatt tacggtaaac tgcccacttg gcagtacatc aagtgtatca 1380

tatgccaagt acgcccccta ttgacgtcaa tgacggtaaa tggcccgcct ggcattatgc 1440

ccagtacatg accttatggg actttcctac ttggcagtac atctacgtat tagtcatcgc 1500

tattaccatg gtgatgcggt tttggcagta catcaatggg cgtggatagc ggtttgactc 1560

acggggattt ccaagtctcc accccattga cgtcaatggg agtttgtttt ggcaccaaaa 1620

tcaacgggac tttccaaaat gtcgtaacaa ctccgcccca ttgacgcaaa tgggcggtag 1680

gcgtgtacgg tgggaggtct atataagcag agctcgttta gtgaaccgtc agatcgcctg 1740

gagacgccat ccacgctgtt ttgacctcca tagaagacac cgggaccgat ccagcctcca 1800

ccggttcgcc accatggtga gcaagggcga ggagctgttc accggggtgg tgcccatcct 1860

ggtcgagctg gacggcgacg taaacggcca caagttcagc gtgtccggcg agggcgaggg 1920

cgatgccacc tacggcaagc tgaccctgaa gttcatctgc accaccggca agctgcccgt 1980

gccctggccc accctcgtga ccaccctgac ctacggcgtg cagtgcttca gccgctaccc 2040

cgaccacatg aagcagcacg acttcttcaa gtccgccatg cccgaaggct acgtccagga 2100

gcgcaccatc ttcttcaagg acgacggcaa ctacaagacc cgcgccgagg tgaagttcga 2160

gggcgacacc ctggtgaacc gcatcgagct gaagggcatc gacttcaagg aggacggcaa 2220

catcctgggg cacaagctgg agtacaacta caacagccac aacgtctata tcatggccga 2280

caagcagaag aacggcatca aggtgaactt caagatccgc cacaacatcg aggacggcag 2340

cgtgcagctc gccgaccact accagcagaa cacccccatc ggcgacggcc ccgtgctgct 2400

gcccgacaac cactacctga gcacccagtc cgccctgagc aaagacccca acgagaagcg 2460

cgatcacatg gtcctgctgg agttcgtgac cgccgccggg atcactctcg gcatggacga 2520

gctgtacaag taaagcggcc atcaagctta tcgataccgt cgactagagc tcgctgatca 2580

gcctcgactg tgccttctag ttgccagcca tctgttgttt gcccctcccc cgtgccttcc 2640

ttgaccctgg aaggtgccac tcccactgtc ctttcctaat aaaatgagga aattgcatcg 2700

cattgtctga gtaggtgtca ttctattctg gggggtgggg tggggcagga cagcaagggg 2760

gaggattggg aagacaatag caggcatgct ggggagagat cgatctgagg aacccctagt 2820

gatggagttg gccactccct ctctgcgcgc tcgctcgctc actgaggccg ggcgaccaaa 2880

ggtcgcccga cgcccgggct ttgcccgggc ggcctcagtg agcgagcgag cgcgcagaga 2940

gggagtggcc aacccccccc cccccccccc tgcatgcagg cgattctctt gtttgctcca 3000

gactctcagg caatgacctg atagcctttg tagagacctc tcaaaaatag ctaccctctc 3060

cggcatgaat ttatcagcta gaacggttga atatcatatt gatggtgatt tgactgtctc 3120

cggcctttct cacccgtttg aatctttacc tacacattac tcaggcattg catttaaaat 3180

atatgagggt tctaaaaatt tttatccttg cgttgaaata aaggcttctc ccgcaaaagt 3240

attacagggt cataatgttt ttggtacaac cgatttagct ttatgctctg aggctttatt 3300

gcttaatttt gctaattctt tgccttgcct gtatgattta ttggatgttg gaattcctga 3360

tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatatgg tgcactctca 3420

gtacaatctg ctctgatgcc gcatagttaa gccagccccg acacccgcca acacccgctg 3480

acgcgccctg acgggcttgt ctgctcccgg catccgctta cagacaagct gtgaccgtct 3540

ccgggagctg catgtgtcag aggttttcac cgtcatcacc gaaacgcgcg agacgaaagg 3600

gcctcgtgat acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt 3660

caggtggcac ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac 3720

attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa 3780

aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat 3840

tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc 3900

agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga 3960

gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg 4020

cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactgagtga 4080

taacactgcg gccaacttac ttctgacaac gatcggagga ccgaaggagc taaccgcttt 4140

tttgcacaac atgggggatc atgtaactcg ccttgatcgt tgggaaccgg agctgaatga 4200

agccatacca aacgacgagc gtgacaccac gatgcctgta gcaatggcaa caacgttgcg 4260

caaactatta actggcgaac tacttactct agcttcccgg caacaattaa tagactggat 4320

ggaggcggat aaagttgcag gaccacttct gcgctcggcc cttccggctg gctggtttat 4380

tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt atcattgcag cactggggcc 4440

agatggtaag ccctcccgta tcgtagttat ctacacgacg gggagtcagg caactatgga 4500

tgaacgaaat agacagatcg ctgagatagg tgcctcactg attaagcatt ggtaactgtc 4560

agaccaagtt tactcatata tactttagat tgatttaaaa cttcattttt aatttaaaag 4620

gatctaggtg aagatccttt ttgataatct catgaccaaa atcccttaac gtgagttttc 4680

gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag atcctttttt 4740

tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt 4800

gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca gagcgcagat 4860

accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga actctgtagc 4920

accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca gtggcgataa 4980

gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc agcggtcggg 5040

ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca ccgaactgag 5100

atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa aggcggacag 5160

gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc cagggggaaa 5220

cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt 5280

gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg 5340

gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat cccctgattc 5400

tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca gccgaacgac 5460

cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ccaatacgca aaccgcctct 5520

ccccgcgcgt tggccgattc attaatg 5547

<210> 3

<211> 925

<212> DNA

<213> Artificial Sequence

<220>

<223> ITR-MCS-PA-ITR

<400> 3

ggtaccacat gtcctgcagg cagctgcgcg ctcgctcgct cactgaggcc gcccgggcaa 60

agcccgggcg tcgggcgacc tttggtcgcc cggcctcagt gagcgagcga gcgcgcagag 120

agggagtggc caactccatc actaggggtt cctgcggccg cagatctacc ggtggcgcgc 180

cggatccgaa ttctctagag tcgacgtcga gctagcatcg atgtttaaac gagctcacta 240

gtctcgagac gcgtacgggt ggcatccctg tgacccctcc ccagtgcctc tcctggccct 300

ggaagttgcc actccagtgc ccaccagcct tgtcctaata aaattaagtt gcatcatttt 360

gtctgactag gtgtccttct ataatattat ggggtggagg ggggtggtat ggagcaaggg 420

gcaagttggg aagacaacct gtagggcctg cggggtctat tgggaaccaa gctggagtgc 480

agtggcacaa tcttggctca ctgcaatctc cgcctcctgg gttcaagcga ttctcctgcc 540

tcagcctccc gagttgttgg gattccaggc atgcatgacc aggctcagct aatttttgtt 600

tttttggtag agacggggtt tcaccatatt ggccaggctg gtctccaact cctaatctca 660

ggtgatctac ccaccttggc ctcccaaatt gctgggatta caggcgtgaa ccactgctcc 720

cttccctgtc cttctgattt tgtaggtaac cacgtgcgga ccgagcggcc gcaggaaccc 780

ctagtgatgg agttggccac tccctctctg cgcgctcgct cgctcactga ggccgggcga 840

ccaaaggtcg cccgacgccc gggctttgcc cgggcggcct cagtgagcga gcgagcgcgc 900

agctgcctgc aggggcgcca agctt 925

<210> 4

<211> 5238

<212> DNA

<213> Artificial Sequence

<220>

<223> pFastBac dual

<400> 4

ttctctgtca cagaatgaaa atttttctgt catctcttcg ttattaatgt ttgtaattga 60

ctgaatatca acgcttattt gcagcctgaa tggcgaatgg gacgcgccct gtagcggcgc 120

attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg ccagcgccct 180

agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg gctttccccg 240

tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac ggcacctcga 300

ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct gatagacggt 360

ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt tccaaactgg 420

aacaacactc aaccctatct cggtctattc ttttgattta taagggattt tgccgatttc 480

ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt ttaacaaaat 540

attaacgttt acaatttcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg 600

tttatttttc taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat 660

gcttcaataa tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat 720

tccctttttt gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt 780

aaaagatgct gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag 840

cggtaagatc cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa 900

agttctgcta tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg 960

ccgcatacac tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct 1020

tacggatggc atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac 1080

tgcggccaac ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca 1140

caacatgggg gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat 1200

accaaacgac gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact 1260

attaactggc gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc 1320

ggataaagtt gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga 1380

taaatctgga gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg 1440

taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1500taagccctcc cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg 1500

aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1560aaatagacag atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca 1560

agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1620agtttactca tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta 1620

ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1680ggtgaagatc ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca 1680

ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1740ctgagcgtca gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg 1740

cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1800cgtaatctgc tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga 1800

tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1860tcaagagcta ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa 1860

tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1920tactgtcctt ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc 1920

tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1980tacatacctc gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg 1980

tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 2040tcttaccggg ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac 2040

ggggggttcg tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct 2100

acagcgtgag cattgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc 2160

ggtaagcggc agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg 2220

gtatctttat agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg 2280

ctcgtcaggg gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct 2340

ggccttttgc tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga 2400

taaccgtatt accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg 2460

cagcgagtca gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca 2520

tctgtgcggt atttcacacc gcagaccagc cgcgtaacct ggcaaaatcg gttacggttg 2580

agtaataaat ggatgccctg cgtaagcggg tgtgggcgga caataaagtc ttaaactgaa 2640

caaaatagat ctaaactatg acaataaagt cttaaactag acagaatagt tgtaaactga 2700

aatcagtcca gttatgctgt gaaaaagcat actggacttt tgttatggct aaagcaaact 2760

cttcattttc tgaagtgcaa attgcccgtc gtattaaaga ggggcgtggc caagggcatg 2820

gtaaagacta tattcgcggc gttgtgacaa tttaccgaac aactccgcgg ccgggaagcc 2880

gatctcggct tgaacgaatt gttaggtggc ggtacttggg tcgatatcaa agtgcatcac 2940

ttcttcccgt atgcccaact ttgtatagag agccactgcg ggatcgtcac cgtaatctgc 3000

ttgcacgtag atcacataag caccaagcgc gttggcctca tgcttgagga gattgatgag 3060

cgcggtggca atgccctgcc tccggtgctc gccggagact gcgagatcat agatatagat 3120

ctcactacgc ggctgctcaa acctgggcag aacgtaagcc gcgagagcgc caacaaccgc 3180

ttcttggtcg aaggcagcaa gcgcgatgaa tgtcttacta cggagcaagt tcccgaggta 3240

atcggagtcc ggctgatgtt gggagtaggt ggctacgtct ccgaactcac gaccgaaaag 3300

atcaagagca gcccgcatgg atttgacttg gtcagggccg agcctacatg tgcgaatgat 3360

gcccatactt gagccaccta actttgtttt agggcgactg ccctgctgcg taacatcgtt 3420

gctgctgcgt aacatcgttg ctgctccata acatcaaaca tcgacccacg gcgtaacgcg 3480

cttgctgctt ggatgcccga ggcatagact gtacaaaaaa acagtcataa caagccatga 3540

aaaccgccac tgcgccgtta ccaccgctgc gttcggtcaa ggttctggac cagttgcgtg 3600

agcgcatacg ctacttgcat tacagtttac gaaccgaaca ggcttatgtc aactgggttc 3660

gtgccttcat ccgtttccac ggtgtgcgtc acccggcaac cttgggcagc agcgaagtcg 3720

aggcatttct gtcctggctg gcgaacgagc gcaaggtttc ggtctccacg catcgtcagg 3780

cattggcggc cttgctgttc ttctacggca aggtgctgtg cacggatctg ccctggcttc 3840

aggagatcgg tagacctcgg ccgtcgcggc gcttgccggt ggtgctgacc ccggatgaag 3900

tggttcgcat cctcggtttt ctggaaggcg agcatcgttt gttcgcccag gactctagct 3960

atagttctag tggttggcct acgtacccgt agtggctatg gcagggcttg ccgccccgac 4020

gttggctgcg agccctgggc cttcacccga acttgggggt tggggtgggg aaaaggaaga 4080

aacgcgggcg tattggtccc aatggggtct cggtggggta tcgacagagt gccagccctg 4140

ggaccgaacc ccgcgtttat gaacaaacga cccaacaccc gtgcgtttta ttctgtcttt 4200

ttattgccgt catagcgcgg gttccttccg gtattgtctc cttccgtgtt tcagttagcc 4260

tcccccatct cccggtaccg catgctatgc atcagctgct agcaccatgg ctcgagatcc 4320

cgggtgatca agtcttcgtc gagtgattgt aaataaaatg taatttacag tatagtattt 4380

taattaatat acaaatgatt tgataataat tcttatttaa ctataatata ttgtgttggg 4440

ttgaattaaa ggtccgtata ctccggaata ttaatagatc atggagataa ttaaaatgat 4500

aaccatctcg caaataaata agtattttac tgttttcgta acagttttgt aataaaaaaa 4560

cctataaata ttccggatta ttcataccgt cccaccatcg ggcgcggatc ccggtccgaa 4620

gcgcgcggaa ttcaaaggcc tacgtcgacg agctcactag tcgcggccgc tttcgaatct 4680

agagcctgca gtctcgacaa gcttgtcgag aagtactaga ggatcataat cagccatacc 4740

acatttgtag aggttttact tgctttaaaa aacctcccac acctccccct gaacctgaaa 4800

cataaaatga atgcaattgt tgttgttaac ttgtttattg cagcttataa tggttacaaa 4860

taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca ttctagttgt 4920

ggtttgtcca aactcatcaa tgtatcttat catgtctgga tctgatcact gcttgagcct 4980

aggagatccg aaccagataa gtgaaatcta gttccaaact attttgtcat ttttaatttt 5040

cgtattagct tacgacgcta cacccagttc ccatctattt tgtcactctt ccctaaataa 5100

tccttaaaaa ctccatttcc acccctccca gttcccaact attttgtccg cccacagcgg 5160

ggcatttttc ttcctgttat gtttttaatc aaacatcctg ccaactccat gtgacaaacc 5220

gtcatcttcg gctacttt 5238

<210> 5

<211> 3129

<212> DNA

<213> Artificial Sequence

<220>

<223> CpGfreeFluc

<400> 5

tttagggtta gggttagggt tagggaaaaa tttagggtta gggttagggt tagggaaaaa 60

tttagggtta gggttagggt tagggaaaaa aagcttgagt caatgggaaa aacccattgg 120

agccaagtac actgactcaa tagggacttt ccattgggtt ttgcccagta cataaggtca 180

atagggggtg agtcaacagg aaagtcccat tggagccaag tacattgagt caatagggac 240

tttccaatgg gttttgccca gtacataagg tcaatgggag gtaagccaat gggtttttcc 300

cattactgac atgtatactg agtcattagg gactttccaa tgggttttgc ccagtacata 360

aggtcaatag gggtgaatca acaggaaagt cccattggag ccaagtacac tgagtcaata 420

gggactttcc attgggtttt gcccagtaca aaaggtcaat agggggtgag tcaatgggtt 480

tttcccatta ttggcacata cataaggtca ataggggtga ctagtggaga agagcatgct 540

tgagggctga gtgcccctca gtgggcagag agcacatggc ccacagtccc tgagaagttg 600

gggggagggg tgggcaattg aactggtgcc tagagaaggt ggggcttggg taaactggga 660

aagtgatgtg gtgtactggc tccacctttt tccccagggt gggggagaac catatataag 720

tgcagtagtc tctgtgaaca ttcaagcttc tgccttctcc ctcctgtgag tttggtaagt 780

cactgactgt ctatgcctgg gaaagggtgg gcaggaggtg gggcagtgca ggaaaagtgg 840

cactgtgaac cctgcagccc tagacaattg tactaacctt cttctctttc ctctcctgac 900

aggttggtgt acagtagctt ccaccatgga ggatgccaag aatattaaga aaggccctgc 960

cccattctac cctctggaag atggcactgc tggtgagcaa ctgcacaagg ccatgaagag 1020

gtatgccctg gtccctggca ccattgcctt cactgatgct cacattgagg tggacatcac 1080

ctatgctgaa tactttgaga tgtctgtgag gctggcagaa gccatgaaaa gatatggact 1140

gaacaccaac cacaggattg tggtgtgctc tgagaactct ctccagttct tcatgcctgt 1200

gttaggagcc ctgttcattg gagtggctgt ggcccctgcc aatgacatct acaatgagag 1260

agagctcctg aacagcatgg gcatcagcca gccaactgtg gtctttgtga gcaagaaggg 1320

cctgcaaaag atcctgaatg tgcagaagaa gctgcccatc atccagaaga tcatcatcat 1380

ggacagcaag actgactacc agggcttcca gagcatgtat acctttgtga ccagccactt 1440

accccctggc ttcaatgagt atgactttgt gcctgagagc tttgacaggg acaagaccat 1500

tgctctgatt atgaacagct ctggctccac tggactgccc aaaggtgtgg ctctgcccca 1560

cagaactgct tgtgtgagat tcagccatgc cagagacccc atctttggca accagatcat 1620

ccctgacact gccatcctgt ctgtggttcc attccatcat ggctttggca tgttcacaac 1680

actggggtac ctgatctgtg gcttcagagt ggtgctgatg tataggtttg aggaggagct 1740

gtttctgagg agcctacaag actacaagat ccagtctgcc ctgctggtgc ccactctgtt 1800

cagcttcttt gccaagagca ccctcattga caagtatgac ctgagcaacc tgcatgagat 1860

tgcctctgga ggagcacccc tgagcaagga ggtgggtgag gctgtggcaa agaggttcca 1920

tctcccagga atcagacagg gctatggcct gactgagacc acctctgcca tcctcatcac 1980

ccctgaagga gatgacaagc ctggtgctgt gggcaaggtg gttccctttt ttgaggccaa 2040

ggtggtggac ctggacactg gcaagaccct gggagtgaac cagaggggtg agctgtgtgt 2100

gaggggtccc atgatcatgt ctggctatgt gaacaaccct gaggccacca atgccctgat 2160

tgacaaggat ggctggctgc actctggtga cattgcctac tgggatgagg atgagcactt 2220

tttcattgtg gacaggctga agagcctcat caagtacaaa ggctaccaag tggcacctgc 2280

tgagctagag agcatcctgc tccagcaccc caacatcttt gatgctggtg tggctggcct 2340

gcctgatgat gatgctggag agctgcctgc tgctgttgtg gttctggagc atggaaagac 2400

catgactgag aaggagattg tggactatgt ggccagtcag gtgaccactg ccaagaagct 2460

gaggggaggt gtggtgtttg tggatgaggt gccaaagggt ctgactggca agctggatgc 2520

cagaaagatc agagagatcc tgatcaaggc caagaagggt ggcaaacaat tgatctctgg 2580

agccaatgga gtctagctag ctggccagac atgataagat acattgatga gtttggacaa 2640

accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct 2700

ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt 2760

atgtttcagg ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa 2820

tgtggtatgg aattcggatc cggtgtggaa agtccccagg ctccccagca ggcagaagta 2880

tgcaaagcat gcatctcaat tagtcagcaa ccaggtgtgg aaagtcccca ggctccccag 2940

caggcagaag tatgcaaagc atgcatctca attagtcagc aaccagagct ctggggactt 3000

tccgctgggg actttccgct ggggactttc cgctggggac tttccgctgg ggactttccg 3060

catttaaatg gtacattttg ttctagaaca aaatgtaccg gtacattttg ttctggtaca 3120

ttttgttct 3129

<210> 6

<211> 2305

<212> DNA

<213> Artificial Sequence

<220>

<223> inRep

<400> 6

gacctttaat tcaacccaac acaatatatt atagttaaat aagaattatt atcaaatcat 60

ttgtatatta attaaaatac tatactgtaa attacatttt atttacaatc actcgacgaa 120

gacttgatca ccctacccgc catgccgggg ttttacgaga ttgtgattaa ggtccccagc 180

gaccttgacg agcatctgcc cggcatttct gacagctttg tgaactgggt ggccgagaag 240

gaatgggagt tgccgccaga ttctgacatg gatctgaatc tgattgagca ggcacccctg 300

accgtggccg agaagctgca gcgcgacttt ctgacggaat ggcgccgtgt gagtaaggcc 360

ccggaggccc ttttctttgt gcaatttgag aagggagaga gctacttcca catgcacgtg 420

ctcgtggaaa ccaccggggt gaaatccatg gttttgggac gtttcctgag tcagattcgc 480

gaaaaactga ttcagagaat ttaccgcggg atcgagccga ctttgccaaa ctggttcgcg 540

gtcacaaaga ccagaaatgg cgccggaggc gggaacaagg tggtggatga gtgctacatc 600

cccaattact tgctccccaa aacccagcct gagctccagt gggcgtggac taatatggaa 660

cagtatttaa ggtaagtact ccctatcagt gatagagatc tatcatggag ataattaaaa 720

tgataaccat ctcgcaaata aataagtatt ttactgtttt cgtaacagtt ttgtaataaa 780

aaaacctata aatattccgg attattcata ccgtcccacc atcgggcgcg aagggggaga 840

cctgtagtca gagcccccgg gcagcacaca ctgacatcca ctcccttcct attgtttcag 900

cgcctgtttg aatctcacgg agcgtaaacg gttggtggcg cagcatctga cgcacgtgtc 960

gcagacgcag gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag 1020

atcaaaaact tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac 1080

ctcggagaag cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc 1140

caactcgcgg tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac 1200

taaaaccgcc cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg 1260

gatttataaa attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct 1320

gggatgggcc acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac 1380

taccgggaag accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt 1440

aaactggacc aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg 1500

ggaggagggg aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag 1560

caaggtgcgc gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat 1620

cgtcacctcc aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca 1680

ccagcagccg ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga 1740

ctttgggaag gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt 1800

ggttgaggtg gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc 1860

cagtgacgca gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac 1920

gtcagacgcg gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca 1980

cgtgggcatg aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc 2040

aaatatctgc ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc 2100

tcaacccgtt tctgtcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat 2160

gggaaaggtg ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg 2220

catctttgaa caataaatga tttaaatcag gtatggctgc cgatggttat cttccagatt 2280

ggctcgagga cactctctct gaagg 2305

<210> 7

<211> 3884

<212> DNA

<213> Artificial Sequence

<220>

<223> CORep

<400> 7

gaacaaacga cccaacaccc gtgcgtttta ttctgtcttt ttattgccgt catagcgcgg 60

gttccttccg gtattgtctc cttccgtgtt tcagttagcc tcccccatct cccggtaccg 120

catgctatgc atcagctgct agcttactgc tcgaagatgc agtcgtccag atccacgttc 180

acgagatcgc aagcagtgca agcgtcgggc accttgccca tgatgtggtg gatgtagcac 240

agcttctggt aggccttctt gacgacggac acgggttggg actcggagac ggggaagcat 300

tccagacagt ctttctggcc gtgggtgaag cagatgttgg agttctggtt catgcgctcg 360

cactggcggc aagggaacag catcagattc atgcccacgt ggcggctgca cttattctgg 420

tagcggtcgg cgtagttgat ggaggcttca gcatcggagg tggaaggctg agcgacggac 480

tcacgcacgc gcttaggctc gctgatatca gcgtcgctgg gagcggggcg cttcttagca 540

ccgcccttct tcacgtagaa ctcgtgctcc acctccacga cgtgatcctt ggcccaacgg 600

aagaagtcct tcacctcttg cttggtgact ttgccgaagt cgtggtccag acggcgggtg 660

agctcgaatt tgaacatgcg gtcttgcaga ggttgctgat gttcgaaggt agtggagttg 720

ccgtcgatga cagcgcacat gttggtgttg gaagtcacga tgacgggggt ggggtcgatc 780

tgagcggagg acttgcactt ctggtcgaca cgcaccttgc taccacccag aatggccttg 840

gcggattcga ccaccttggc agtcatcttg ccctcttccc accagatgac catcttgtcg 900

acgcagtcgt tgaaggggaa gttctcgttg gtccagttga cgcagccgta aaagggcacg 960

gtatgggcga tggcttcggc gatgttggtc ttaccagtgg tagcgggacc gaagagccag 1020

atagtgttgc gcttgccgaa cttcttggta gcccaaccga ggaagacgga ggcggcatac 1080

tgggggtcgt agccgttgag ctccagaatc ttgtagatgc ggttggagga gatgtcctcc 1140

acgggctgtt gaccgaccag ataatcggga gcggtcttgg tgaggctcat gatcttacca 1200

gcgttgtcga gggcagcctt gatctgggaa cgggagttgc tggcagcatt gaagctgatg 1260

tagctggctt ggtcctcttg gatccactgc ttctcgctag tgatgccctt gtcgaccagc 1320

caaccgacca gttccatggt ggcccgggtt tcggaccgag atccgcgccc gatggtggga 1380

cggtatgaat aatccggaat atttataggt ttttttatta caaaactgtt acgaaaacag 1440

taaaatactt atttatttgc gagatggtta tcattttaat tatctccatg atctattaat 1500

attccggagt atacaataaa cgataacgcc gttggtggcg tgaggcatgt aaaaggttac 1560

atcattatct tgttcgccat ccggttggta taaatagacg ttcatgttgg tttttgtttc 1620

agttgcaagt tggctgcggc gcgcgcagca cctttgcggc cgccaccatg gcggggtttt 1680

acgagattgt gattaaggtc cccagcgacc ttgacgagca tctgcccggc atttctgaca 1740

gctttgtgaa ctgggtggcc gagaaggaat gggagttgcc gccagattct gacatggatc 1800

tgaatctgat tgagcaggca cccctgaccg tggccgagaa gctgcagcgc gactttctga 1860

cggaatggcg ccgtgtgagt aaggccccgg aggccctttt ctttgtgcaa tttgagaagg 1920

gagagagcta cttccacatg cacgtgctcg tggaaaccac cggggtgaaa tccatggttt 1980

tgggacgttt cctgagtcag attcgcgaaa aactgattca gagaatttac cgcgggatcg 2040

agccgacttt gccaaactgg ttcgcggtca caaagaccag aaatggcgcc ggaggcggga 2100

acaaggtggt ggatgagtgc tacatcccca attacttgct ccccaaaacc cagcctgagc 2160

tccagtgggc gtggactaat atggaacagt atttaagcgc ctgtttgaat ctcacggagc 2220

gtaaacggtt ggtggcgcag catctgacgc acgtgtcgca gacgcaggag cagaacaaag 2280

agaatcagaa tcccaattct gatgcgccgg tgatcagatc aaaaacttca gccaggtaca 2340

tggagctggt cgggtggctc gtggacaagg ggattacctc ggagaagcag tggatccagg 2400

aggaccaggc ctcatacatc tccttcaatg cggcctccaa ctcgcggtcc caaatcaagg 2460

ctgccttgga caatgcggga aagattatga gcctgactaa aaccgccccc gactacctgg 2520

tgggccagca gcccgtggag gacatttcca gcaatcggat ttataaaatt ttggaactaa 2580

acgggtacga tccccaatat gcggcttccg tctttctggg atgggccacg aaaaagttcg 2640

gcaagaggaa caccatctgg ctgtttgggc ctgcaactac cgggaagacc aacatcgcgg 2700

aggccatagc ccacactgtg cccttctacg ggtgcgtaaa ctggaccaat gagaactttc 2760

ccttcaacga ctgtgtcgac aagatggtga tctggtggga ggaggggaag atgaccgcca 2820

aggtcgtgga gtcggccaaa gccattctcg gaggaagcaa ggtgcgcgtg gaccagaaat 2880

gcaagtcctc ggcccagata gacccgactc ccgtgatcgt cacctccaac accaacatgt 2940

gcgccgtgat tgacgggaac tcaacgacct tcgaacacca gcagccgttg caagaccgga 3000

tgttcaaatt tgaactcacc cgccgtctgg atcatgactt tgggaaggtc accaagcagg 3060

aagtcaaaga ctttttccgg tgggcaaagg atcacgtggt tgaggtggag catgaattct 3120

acgtcaaaaa gggtggagcc aagaaaagac ccgcccccag tgacgcagat ataagtgagc 3180

ccaaacgggt gcgcgagtca gttgcgcagc catcgacgtc agacgcggaa gcttcgatca 3240

actacgcaga caggtaccaa aacaaatgtt ctcgtcacgt gggcatgaat ctgatgctgt 3300

ttccctgcag acaatgcgag agaatgaatc agaattcaaa tatctgcttc actcacggac 3360

agaaagactg tttagagtgc tttcccgtgt cagaatctca acccgtttct gtcgtcaaaa 3420

aggcgtatca gaaactgtgc tacattcatc atatcatggg aaaggtgcca gacgcttgca 3480

ctgcctgcga tctggtcaat gtggatttgg atgactgcat ctttgaacaa taaatgattt 3540

aaatcaggta tggctgccga tggttatctt ccagattggc tcgaggacac tctctctgat 3600

ctagagcctg cagtctcgac aagcttgtcg agaagtacta gaggatcata atcagccata 3660

ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga 3720

aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca 3780

aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt 3840

gtggtttgtc caaactcatc aatgtatctt atcatgtctg gatc 3884

<210> 8

<211> 3874

<212> DNA

<213> Artificial Sequence

<220>

<223> p10Rep

<400> 8

attccggagt atacggacct ttaattcaac ccaacacaat atattatagt taaataagaa 1560

ttattatcaa atcatttgta tattaattaa aatactatac tgtaaattac attttattta 1620

caatcactcg acgaagactt gatcagcggc cgccaccatg gcggggtttt acgagattgt 1680

gattaaggtc cccagcgacc ttgacgagca tctgcccggc atttctgaca gctttgtgaa 1740

ctgggtggcc gagaaggaat gggagttgcc gccagattct gacatggatc tgaatctgat 1800

tgagcaggca cccctgaccg tggccgagaa gctgcagcgc gactttctga cggaatggcg 1860

ccgtgtgagt aaggccccgg aggccctttt ctttgtgcaa tttgagaagg gagagagcta 1920

cttccacatg cacgtgctcg tggaaaccac cggggtgaaa tccatggttt tgggacgttt 1980

cctgagtcag attcgcgaaa aactgattca gagaatttac cgcgggatcg agccgacttt 2040

gccaaactgg ttcgcggtca caaagaccag aaatggcgcc ggaggcggga acaaggtggt 2100

ggatgagtgc tacatcccca attacttgct ccccaaaacc cagcctgagc tccagtgggc 2160

gtggactaat atggaacagt atttaagcgc ctgtttgaat ctcacggagc gtaaacggtt 2220

ggtggcgcag catctgacgc acgtgtcgca gacgcaggag cagaacaaag agaatcagaa 2280

tcccaattct gatgcgccgg tgatcagatc aaaaacttca gccaggtaca tggagctggt 2340

cgggtggctc gtggacaagg ggattacctc ggagaagcag tggatccagg aggaccaggc 2400

ctcatacatc tccttcaatg cggcctccaa ctcgcggtcc caaatcaagg ctgccttgga 2460

caatgcggga aagattatga gcctgactaa aaccgccccc gactacctgg tgggccagca 2520

gcccgtggag gacatttcca gcaatcggat ttataaaatt ttggaactaa acgggtacga 2580

tccccaatat gcggcttccg tctttctggg atgggccacg aaaaagttcg gcaagaggaa 2640

caccatctgg ctgtttgggc ctgcaactac cgggaagacc aacatcgcgg aggccatagc 2700

ccacactgtg cccttctacg ggtgcgtaaa ctggaccaat gagaactttc ccttcaacga 2760ccacactgtg cccttctacg ggtgcgtaaa ctggaccaat gagaactttc ccttcaacga 2760

ctgtgtcgac aagatggtga tctggtggga ggaggggaag atgaccgcca aggtcgtgga 2820ctgtgtcgac aagatggtga tctggtggga ggaggggaag atgaccgcca aggtcgtgga 2820

gtcggccaaa gccattctcg gaggaagcaa ggtgcgcgtg gaccagaaat gcaagtcctc 2880gtcggccaaa gccattctcg gaggaagcaa ggtgcgcgtg gaccagaaat gcaagtcctc 2880

ggcccagata gacccgactc ccgtgatcgt cacctccaac accaacatgt gcgccgtgat 2940ggcccagata gacccgactc ccgtgatcgt cacctccaac accaacatgt gcgccgtgat 2940

tgacgggaac tcaacgacct tcgaacacca gcagccgttg caagaccgga tgttcaaatt 3000tgacgggaac tcaacgacct tcgaacacca gcagccgttg caagaccgga tgttcaaatt 3000

tgaactcacc cgccgtctgg atcatgactt tgggaaggtc accaagcagg aagtcaaaga 3060tgaactcacc cgccgtctgg atcatgactt tgggaaggtc accaagcagg aagtcaaaga 3060

ctttttccgg tgggcaaagg atcacgtggt tgaggtggag catgaattct acgtcaaaaa 3120ctttttccgg tgggcaaagg atcacgtggt tgaggtggag catgaattct acgtcaaaaa 3120

gggtggagcc aagaaaagac ccgcccccag tgacgcagat ataagtgagc ccaaacgggt 3180gggtggagcc aagaaaagac ccgcccccag tgacgcagat ataagtgagc ccaaacgggt 3180

gcgcgagtca gttgcgcagc catcgacgtc agacgcggaa gcttcgatca actacgcaga 3240gcgcgagtca gttgcgcagc catcgacgtc agacgcggaa gcttcgatca actacgcaga 3240

caggtaccaa aacaaatgtt ctcgtcacgt gggcatgaat ctgatgctgt ttccctgcag 3300caggtaccaa aacaaatgtt ctcgtcacgt gggcatgaat ctgatgctgt ttccctgcag 3300

acaatgcgag agaatgaatc agaattcaaa tatctgcttc actcacggac agaaagactg 3360acaatgcgag agaatgaatc agaattcaaa tatctgcttc actcacggac agaaagactg 3360

tttagagtgc tttcccgtgt cagaatctca acccgtttct gtcgtcaaaa aggcgtatca 3420tttagagtgc tttcccgtgt cagaatctca acccgtttct gtcgtcaaaa aggcgtatca 3420

gaaactgtgc tacattcatc atatcatggg aaaggtgcca gacgcttgca ctgcctgcga 3480gaaactgtgc tacattcatc atatcatggg aaaggtgcca gacgcttgca ctgcctgcga 3480

tctggtcaat gtggatttgg atgactgcat ctttgaacaa taaatgattt aaatcaggta 3540tctggtcaat gtggatttgg atgactgcat ctttgaacaa taaatgattt aaatcaggta 3540

tggctgccga tggttatctt ccagattggc tcgaggacac tctctctgat ctagagcctg 3600tggctgccga tggttatctt ccagattggc tcgaggacac tctctctgat ctagagcctg 3600

cagtctcgac aagcttgtcg agaagtacta gaggatcata atcagccata ccacatttgt 3660cagtctcgac aagcttgtcg agaagtacta gaggatcata atcagccata ccacatttgt 3660

agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga aacataaaat 3720agaggtttta cttgctttaa aaaacctccc acacctcccc ctgaacctga aacataaaat 3720

gaatgcaatt gttgttgtta acttgtttat tgcagcttat aatggttaca aataaagcaa 3780gaatgcaatt gttgttgtta acttgtttat tgcagctttat aatggttaca aataaagcaa 3780

tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt gtggtttgtc 3840tagcatcaca aatttcacaa ataaagcatt tttttcactg cattctagtt gtggtttgtc 3840

caaactcatc aatgtatctt atcatgtctg gatc 3874caaactcatc aatgtatctt atcatgtctg gatc 3874

<210> 9

<211> 110

<212> DNA

<213> Artificial Sequence

<220>

<223> p10

<400> 9

ttgtatatta attaaaatac tatactgtaa attacatttt atttacaatc 110

<210> 10

<211> 92

<212> DNA

<213> Artificial Sequence

<220>

<223> polh

<400> 10

atcatggaga taattaaaat gataaccatc tcgcaaataa ataagtattt tactgttttc 60

gtaacagttt tgtaataaaa aaacctataa at 92

<210> 11

<211> 1866

<212> DNA

<213> Artificial Sequence

<220>

<223> Rep78-WT

<400> 11

atggcggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc 60

ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat 120

tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag 180

cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc cggaggccct tttctttgtg 240

caatttgaga agggagagag ctacttccac atgcacgtgc tcgtggaaac caccggggtg 300

aaatccatgg ttttgggacg tttcctgagt cagattcgcg aaaaactgat tcagagaatt 360

taccgcggga tcgagccgac tttgccaaac tggttcgcgg tcacaaagac cagaaatggc 420

gccggaggcg ggaacaaggt ggtggatgag tgctacatcc ccaattactt gctccccaaa 480

acccagcctg agctccagtg ggcgtggact aatatggaac agtatttaag cgcctgtttg 540

aatctcacgg agcgtaaacg gttggtggcg cagcatctga cgcacgtgtc gcagacgcag 600

gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag atcaaaaact 660

tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac ctcggagaag 720

cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc caactcgcgg 780

tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac taaaaccgcc 840

cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg gatttataaa 900

attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct gggatgggcc 960

acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac taccgggaag 1020

accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt aaactggacc 1080

aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg ggaggagggg 1140

aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag caaggtgcgc 1200

gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat cgtcacctcc 1260

aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca ccagcagccg 1320

ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga ctttgggaag 1380

gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt ggttgaggtg 1440

gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc cagtgacgca 1500

gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac gtcagacgcg 1560

gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca cgtgggcatg 1620

aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc aaatatctgc 1680

ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc tcaacccgtt 1740

tctgtcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat gggaaaggtg 1800

ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg catctttgaa 1860

caataa 1866

<210> 12

<211> 1194

<212> DNA

<213> Artificial Sequence

<220>

<223> Rep52-CO

<400> 12

atggaactgg tcggttggct ggtcgacaag ggcatcacta gcgagaagca gtggatccaa 60

gaggaccaag ccagctacat cagcttcaat gctgccagca actcccgttc ccagatcaag 120

gctgccctcg acaacgctgg taagatcatg agcctcacca agaccgctcc cgattatctg 180

gtcggtcaac agcccgtgga ggacatctcc tccaaccgca tctacaagat tctggagctc 240

aacggctacg acccccagta tgccgcctcc gtcttcctcg gttgggctac caagaagttc 300

ggcaagcgca acactatctg gctcttcggt cccgctacca ctggtaagac caacatcgcc 360

gaagccatcg cccataccgt gcccttttac ggctgcgtca actggaccaa cgagaacttc 420

cccttcaacg actgcgtcga caagatggtc atctggtggg aagagggcaa gatgactgcc 480

aaggtggtcg aatccgccaa ggccattctg ggtggtagca aggtgcgtgt cgaccagaag 540

tgcaagtcct ccgctcagat cgaccccacc cccgtcatcg tgacttccaa caccaacatg 600

tgcgctgtca tcgacggcaa ctccactacc ttcgaacatc agcaacctct gcaagaccgc 660

atgttcaaat tcgagctcac ccgccgtctg gaccacgact tcggcaaagt caccaagcaa 720

gaggtgaagg acttcttccg ttgggccaag gatcacgtcg tggaggtgga gcacgagttc 780

tacgtgaaga agggcggtgc taagaagcgc cccgctccca gcgacgctga tatcagcgag 840

cctaagcgcg tgcgtgagtc cgtcgctcag ccttccacct ccgatgctga agcctccatc 900

aactacgccg accgctacca gaataagtgc agccgccacg tgggcatgaa tctgatgctg 960

ttcccttgcc gccagtgcga gcgcatgaac cagaactcca acatctgctt cacccacggc 1020

cagaaagact gtctggaatg cttccccgtc tccgagtccc aacccgtgtc cgtcgtcaag 1080

aaggcctacc agaagctgtg ctacatccac cacatcatgg gcaaggtgcc cgacgcttgc 1140

actgcttgcg atctcgtgaa cgtggatctg gacgactgca tcttcgagca gtaa 1194

<210> 13

<211> 1194

<212> DNA

<213> Artificial Sequence

<220>

<223> Rep52WT

<400> 13

atggagctgg tcgggtggct cgtggacaag gggattacct cggagaagca gtggatccag 60

gaggaccagg cctcatacat ctccttcaat gcggcctcca actcgcggtc ccaaatcaag 120

gctgccttgg acaatgcggg aaagattatg agcctgacta aaaccgcccc cgactacctg 180

gtgggccagc agcccgtgga ggacatttcc agcaatcgga tttataaaat tttggaacta 240

aacgggtacg atccccaata tgcggcttcc gtctttctgg gatgggccac gaaaaagttc 300

ggcaagagga acaccatctg gctgtttggg cctgcaacta ccgggaagac caacatcgcg 360

gaggccatag cccacactgt gcccttctac gggtgcgtaa actggaccaa tgagaacttt 420

cccttcaacg actgtgtcga caagatggtg atctggtggg aggaggggaa gatgaccgcc 480

aaggtcgtgg agtcggccaa agccattctc ggaggaagca aggtgcgcgt ggaccagaaa 540

tgcaagtcct cggcccagat agacccgact cccgtgatcg tcacctccaa caccaacatg 600

tgcgccgtga ttgacgggaa ctcaacgacc ttcgaacacc agcagccgtt gcaagaccgg 660

atgttcaaat ttgaactcac ccgccgtctg gatcatgact ttgggaaggt caccaagcag 720

gaagtcaaag actttttccg gtgggcaaag gatcacgtgg ttgaggtgga gcatgaattc 780

tacgtcaaaa agggtggagc caagaaaaga cccgccccca gtgacgcaga tataagtgag 840

cccaaacggg tgcgcgagtc agttgcgcag ccatcgacgt cagacgcgga agcttcgatc 900

aactacgcag accgctacca aaacaaatgt tctcgtcacg tgggcatgaa tctgatgctg 960

tttccctgca gacaatgcga gagaatgaat cagaattcaa atatctgctt cactcacgga 1020

cagaaagact gtttagagtg ctttcccgtg tcagaatctc aacccgtttc tgtcgtcaaa 1080

aaggcgtatc agaaactgtg ctacattcat catatcatgg gaaaggtgcc agacgcttgc 1140

actgcctgcg atctggtcaa tgtggatttg gatgactgca tctttgaaca ataa 1194
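
SEQ ID NO: 12 (Rep52-CO) and SEQ ID NO: 13 (Rep52WT) have the same stated length (1194 nt), and the claims tie the codon optimization to avoiding homologous recombination. The following is a minimal sketch, not part of the original filing, of how the two listed sequences might be compared: it translates the first 60 nt of each with the standard genetic code and computes nucleotide identity and the longest uninterrupted identical stretch. The function names and the 60-nt window are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in t/c/a/g order (first base slowest).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(seq: str) -> str:
    """Remove the spacing used in the listing and lowercase the sequence."""
    return seq.replace(" ", "").lower()


def translate(seq: str) -> str:
    """Translate complete codons of a DNA sequence into one-letter amino acids."""
    seq = clean(seq)
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of consecutive identical positions."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


# First 60 nt of SEQ ID NO: 12 and SEQ ID NO: 13, copied from the listing.
REP52_CO_60 = clean("atggaactgg tcggttggct ggtcgacaag ggcatcacta gcgagaagca gtggatccaa")
REP52_WT_60 = clean("atggagctgg tcgggtggct cgtggacaag gggattacct cggagaagca gtggatccag")

if __name__ == "__main__":
    print("same protein:", translate(REP52_CO_60) == translate(REP52_WT_60))
    print("nucleotide identity:", round(identity(REP52_CO_60, REP52_WT_60), 3))
    print("longest identical run:", longest_identical_run(REP52_CO_60, REP52_WT_60))
```

Applied to the full 1194-nt sequences, the same functions would show how far the optimization shortens the identical stretches while leaving the encoded protein unchanged.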

<210> 14

<211> 8310

<212> DNA

<213> Artificial Sequence

<220>

<223> pFastBac-p10Rep

<400> 14

tcccccatct cccggtaccg catgctatgc atcagctgct agcttactgc tcgaagatgc 4320

agtcgtccag atccacgttc acgagatcgc aagcagtgca agcgtcgggc accttgccca 4380

tgatgtggtg gatgtagcac agcttctggt aggccttctt gacgacggac acgggttggg 4440

actcggagac ggggaagcat tccagacagt ctttctggcc gtgggtgaag cagatgttgg 4500

agttctggtt catgcgctcg cactggcggc aagggaacag catcagattc atgcccacgt 4560

ggcggctgca cttattctgg tagcggtcgg cgtagttgat ggaggcttca gcatcggagg 4620

tggaaggctg agcgacggac tcacgcacgc gcttaggctc gctgatatca gcgtcgctgg 4680

gagcggggcg cttcttagca ccgcccttct tcacgtagaa ctcgtgctcc acctccacga 4740

cgtgatcctt ggcccaacgg aagaagtcct tcacctcttg cttggtgact ttgccgaagt 4800

cgtggtccag acggcgggtg agctcgaatt tgaacatgcg gtcttgcaga ggttgctgat 4860

gttcgaaggt agtggagttg ccgtcgatga cagcgcacat gttggtgttg gaagtcacga 4920

tgacgggggt ggggtcgatc tgagcggagg acttgcactt ctggtcgaca cgcaccttgc 4980

taccacccag aatggccttg gcggattcga ccaccttggc agtcatcttg ccctcttccc 5040

accagatgac catcttgtcg acgcagtcgt tgaaggggaa gttctcgttg gtccagttga 5100

cgcagccgta aaagggcacg gtatgggcga tggcttcggc gatgttggtc ttaccagtgg 5160

tagcgggacc gaagagccag atagtgttgc gcttgccgaa cttcttggta gcccaaccga 5220

ggaagacgga ggcggcatac tgggggtcgt agccgttgag ctccagaatc ttgtagatgc 5280

ggttggagga gatgtcctcc acgggctgtt gaccgaccag ataatcggga gcggtcttgg 5340

tgaggctcat gatcttacca gcgttgtcga gggcagcctt gatctgggaa cgggagttgc 5400

tggcagcatt gaagctgatg tagctggctt ggtcctcttg gatccactgc ttctcgctag 5460

tgatgccctt gtcgaccagc caaccgacca gttccatggt ggcccgggtt tcggaccgag 5520

atccgcgccc gatggtggga cggtatgaat aatccggaat atttataggt ttttttatta 5580

caaaactgtt acgaaaacag taaaatactt atttatttgc gagatggtta tcattttaat 5640

tatctccatg atctattaat attccggagt atacggacct ttaattcaac ccaacacaat 5700

atattatagt taaataagaa ttattatcaa atcatttgta tattaattaa aatactatac 5760

tgtaaattac attttattta caatcactcg acgaagactt gatcagcggc cgccaccatg 5820

gcggggtttt acgagattgt gattaaggtc cccagcgacc ttgacgagca tctgcccggc 5880

atttctgaca gctttgtgaa ctgggtggcc gagaaggaat gggagttgcc gccagattct 5940

gacatggatc tgaatctgat tgagcaggca cccctgaccg tggccgagaa gctgcagcgc 6000

gactttctga cggaatggcg ccgtgtgagt aaggccccgg aggccctttt ctttgtgcaa 6060

tttgagaagg gagagagcta cttccacatg cacgtgctcg tggaaaccac cggggtgaaa 6120

tccatggttt tgggacgttt cctgagtcag attcgcgaaa aactgattca gagaatttac 6180

cgcgggatcg agccgacttt gccaaactgg ttcgcggtca caaagaccag aaatggcgcc 6240

ggaggcggga acaaggtggt ggatgagtgc tacatcccca attacttgct ccccaaaacc 6300

cagcctgagc tccagtgggc gtggactaat atggaacagt atttaagcgc ctgtttgaat 6360

ctcacggagc gtaaacggtt ggtggcgcag catctgacgc acgtgtcgca gacgcaggag 6420

cagaacaaag agaatcagaa tcccaattct gatgcgccgg tgatcagatc aaaaacttca 6480

gccaggtaca tggagctggt cgggtggctc gtggacaagg ggattacctc ggagaagcag 6540

tggatccagg aggaccaggc ctcatacatc tccttcaatg cggcctccaa ctcgcggtcc 6600

caaatcaagg ctgccttgga caatgcggga aagattatga gcctgactaa aaccgccccc 6660

gactacctgg tgggccagca gcccgtggag gacatttcca gcaatcggat ttataaaatt 6720

ttggaactaa acgggtacga tccccaatat gcggcttccg tctttctggg atgggccacg 6780

aaaaagttcg gcaagaggaa caccatctgg ctgtttgggc ctgcaactac cgggaagacc 6840

aacatcgcgg aggccatagc ccacactgtg cccttctacg ggtgcgtaaa ctggaccaat 6900

gagaactttc ccttcaacga ctgtgtcgac aagatggtga tctggtggga ggaggggaag 6960

atgaccgcca aggtcgtgga gtcggccaaa gccattctcg gaggaagcaa ggtgcgcgtg 7020

gaccagaaat gcaagtcctc ggcccagata gacccgactc ccgtgatcgt cacctccaac 7080

accaacatgt gcgccgtgat tgacgggaac tcaacgacct tcgaacacca gcagccgttg 7140

caagaccgga tgttcaaatt tgaactcacc cgccgtctgg atcatgactt tgggaaggtc 7200

accaagcagg aagtcaaaga ctttttccgg tgggcaaagg atcacgtggt tgaggtggag 7260

catgaattct acgtcaaaaa gggtggagcc aagaaaagac ccgcccccag tgacgcagat 7320

ataagtgagc ccaaacgggt gcgcgagtca gttgcgcagc catcgacgtc agacgcggaa 7380

gcttcgatca actacgcaga caggtaccaa aacaaatgtt ctcgtcacgt gggcatgaat 7440

ctgatgctgt ttccctgcag acaatgcgag agaatgaatc agaattcaaa tatctgcttc 7500

actcacggac agaaagactg tttagagtgc tttcccgtgt cagaatctca acccgtttct 7560

gtcgtcaaaa aggcgtatca gaaactgtgc tacattcatc atatcatggg aaaggtgcca 7620

gacgcttgca ctgcctgcga tctggtcaat gtggatttgg atgactgcat ctttgaacaa 7680

taaatgattt aaatcaggta tggctgccga tggttatctt ccagattggc tcgaggacac 7740

tctctctgat ctagagcctg cagtctcgac aagcttgtcg agaagtacta gaggatcata 7800

atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc acacctcccc 7860

ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat tgcagcttat 7920

aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt tttttcactg 7980

cattctagtt gtggtttgtc caaactcatc aatgtatctt atcatgtctg gatctgatca 8040

ctgcttgagc ctaggagatc cgaaccagat aagtgaaatc tagttccaaa ctattttgtc 8100

atttttaatt ttcgtattag cttacgacgc tacacccagt tcccatctat tttgtcactc 8160

ttccctaaat aatccttaaa aactccattt ccacccctcc cagttcccaa ctattttgtc 8220

cgcccacagc ggggcatttt tcttcctgtt atgtttttaa tcaaacatcc tgccaactcc 8280

atgtgacaaa ccgtcatctt cggctacttt 8310

<210> 15

<211> 21

<212> DNA

<213> Artificial Sequence

<220>

<223> P1

<400> 15

gatccggtac cacgcgtcta g 21

<210> 16

<211> 26

<212> DNA

<213> Artificial Sequence

<220>

<223> P2

<400> 16

ctcgacgtcg actttacttg tacagc 26

<210> 17

<211> 21

<212> DNA

<213> Artificial Sequence

<220>

<223> P3

<400> 17

gcggggtttt acgagattgt g 21

<210> 18

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> P4

<400> 18

ggggtgcctg ctcaatcaga 20

<210> 19

<211> 20

<212> DNA

<213> Artificial Sequence

<220>

<223> P5

<400> 19

gcagcacaca ctgacatcca 20

<210> 20

<211> 22

<212> DNA

<213> Artificial Sequence

<220>

<223> P6

<400> 20

gatcaccggc gcatcagaat tg 22

<210> 21

<211> 22

<212> DNA

<213> Artificial Sequence

<220>

<223> P7

<400> 21

acttcaagat ccgccacaac at 22

<210> 22

<211> 21

<212> DNA

<213> Artificial Sequence

<220>

<223> P8

<400> 22

tctcgttggg gtcttgctca g 21

<210> 23

<211> 23

<212> DNA

<213> Artificial Sequence

<220>

<223> M13 F

<400> 23

cccagtcacg acgttgtaaa acg 23

<210> 24

<211> 23

<212> DNA

<213> Artificial Sequence

<220>

<223> M13 R

<400> 24

agcggataac aatttcacac agg 23
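
The primer entries above can be checked against the listed templates by simple string matching. The sketch below is illustrative only and is not taken from the filing: it assumes that P3 (SEQ ID NO: 17) and P4 (SEQ ID NO: 18) are used as a forward/reverse pair on Rep78-WT (SEQ ID NO: 11), which this section does not state, and it uses only the first 180 nt of SEQ ID NO: 11 as the template. The helper names `reverse_complement` and `amplicon` are hypothetical.

```python
from typing import Optional

# Complement map for lowercase DNA, as used in the sequence listing.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the template region spanned by a primer pair, or None.

    The forward primer must match the template strand as written; the
    reverse primer must match the opposite strand downstream of it.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    hit = template.find(site, start + len(forward))
    if hit < 0:
        return None
    return template[start:hit + len(site)]


# First 180 nt of SEQ ID NO: 11 (Rep78-WT), copied from the listing.
REP78_WT_180 = (
    "atggcggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc"
    "ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat"
    "tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag"
).replace(" ", "")

P3 = "gcggggttttacgagattgtg"  # SEQ ID NO: 17
P4 = "ggggtgcctgctcaatcaga"   # SEQ ID NO: 18

if __name__ == "__main__":
    product = amplicon(REP78_WT_180, P3, P4)
    if product is None:
        print("primer pair does not match this template window")
    else:
        print("amplicon length:", len(product))
```

The same function can be pointed at any other primer pair and template in the listing; a `None` result only means that no exact match was found in the window supplied.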

Claims

1. An isolated nucleic acid molecule comprising, in sequence, a first polyA, a nucleotide sequence encoding a Rep78 protein, a first promoter, a second promoter, a nucleotide sequence encoding a Rep52 protein, and a second polyA, wherein the first promoter drives transcription of the nucleotide sequence encoding the Rep78 protein and the first polyA, and the second promoter drives transcription of the nucleotide sequence encoding the Rep52 protein and the second polyA; wherein the nucleotide sequence encoding the Rep52 protein and/or the nucleotide sequence encoding the Rep78 protein is codon-optimized to avoid homologous recombination; wherein the first promoter is a p10 promoter and the second promoter is a polh promoter; and wherein the nucleotide sequence encoding the Rep78 protein is as shown in SEQ ID NO: 11 and the nucleotide sequence encoding the Rep52 protein is as shown in SEQ ID NO: 12.

2. The isolated nucleic acid molecule according to claim 1, wherein the p10 promoter comprises the nucleotide sequence shown in SEQ ID NO: 9.

3. The isolated nucleic acid molecule according to claim 1, wherein the polh promoter comprises the nucleotide sequence shown in SEQ ID NO: 10.

4. The isolated nucleic acid molecule according to claim 1, wherein the 5' end of the first promoter is directly or indirectly linked to the 5' end of the second promoter.

5. The isolated nucleic acid molecule according to claim 1, wherein the 3' end of the first promoter is directly or indirectly linked to the 5' end of the nucleotide sequence encoding the Rep78 protein.

6. The isolated nucleic acid molecule according to claim 1, wherein the 3' end of the nucleotide sequence encoding the Rep78 protein is directly or indirectly linked to the 5' end of the first polyA.

7. The isolated nucleic acid molecule according to claim 1, wherein the 3' end of the second promoter is directly or indirectly linked to the 5' end of the nucleotide sequence encoding the Rep52 protein.

8. The isolated nucleic acid molecule according to claim 1, wherein the 3' end of the nucleotide sequence encoding the Rep52 protein is directly or indirectly linked to the 5' end of the second polyA.

9. The isolated nucleic acid molecule according to claim 1, wherein the polyA is selected from any one of SV40 polyA and HSV TK polyA.

10. The isolated nucleic acid molecule according to claim 1, comprising the nucleotide sequence shown in SEQ ID NO: 8.

11. A vector comprising the isolated nucleic acid molecule of any one of claims 1-10.

12. The vector according to claim 11, which is a viral vector.

13. The vector according to claim 11, which is a baculovirus vector.

14. The vector according to claim 11, which is a pFastBac vector.

15. The vector according to any one of claims 11-14, comprising the nucleotide sequence shown in SEQ ID NO: 14.

16. A cell comprising the isolated nucleic acid molecule of any one of claims 1-10 or the vector of any one of claims 11-15, wherein the cell is not of a plant or animal variety.

17. The cell according to claim 16, which is an insect cell.

18. The cell of claim 16, which is a Spodoptera frugiperda cell.

19. A baculovirus expression system, comprising a first baculovirus vector and a second baculovirus vector comprising a nucleic acid sequence encoding a target gene, wherein the first baculovirus vector is the baculovirus vector according to any one of claims 13 to 14.

20. The baculovirus expression system according to claim 19, wherein from the 5' end to the 3' end, the nucleic acid sequence encoding the target gene comprises, in sequence, a first parvovirus inverted terminal repeat (ITR), the target gene, and a second ITR.

21. The baculovirus expression system according to claim 20, wherein at least one promoter is further included between the first ITR and the target gene.

22. The baculovirus expression system according to claim 20, wherein at least one eukaryotic promoter is further included between the first ITR and the target gene.

23. The baculovirus expression system according to claim 20, wherein at least one mammalian cell promoter is further included between the first ITR and the target gene.

24. The baculovirus expression system according to claim 20, wherein a mammalian cell promoter and an insect cell promoter are further included between the first ITR and the target gene.

25. The baculovirus expression system according to claim 24, wherein the mammalian cell promoter is selected from a ubiquitous promoter and a tissue-specific promoter.

26. The baculovirus expression system according to claim 25, wherein the ubiquitous promoter is CMV, SV40, EF1a, CAG or UBC promoter.

27. The baculovirus expression system according to claim 25, wherein the tissue-specific promoter is ALB, hAAT, TBG, TTR, GFAP, MHCK7 or hSyn promoter.

28. The baculovirus expression system of claim 24, wherein the insect cell promoter is the p10 promoter.

29. The baculovirus expression system according to claim 24, wherein the mammalian promoter and the insect cell promoter are CMV and p10 promoters.

30. An insect cell comprising the baculovirus expression system of any one of claims 19-29.

31. The insect cell according to claim 30, further comprising a third nucleotide sequence, wherein the third nucleotide sequence contains two parvovirus ITR nucleotide sequences and at least one nucleotide sequence encoding a target gene, and wherein the at least one nucleotide sequence encoding a target gene is located between the two parvovirus ITR nucleotide sequences.

32. The insect cell of claim 31, wherein the parvovirus is an adeno-associated virus.

33. The insect cell according to claim 31, wherein the third nucleotide sequence is part of another nucleic acid construct, wherein each nucleotide sequence encoding a gene of interest is operably linked to an expression control sequence for mammalian expression.

34. The insect cell according to claim 33, wherein the nucleic acid construct is an insect cell compatible vector.

35. The insect cell of claim 33, wherein the nucleic acid construct is a baculovirus vector.

36. Use of the baculovirus expression system according to any one of claims 19 to 29 or the insect cell according to any one of claims 30 to 35 in preparing a target nucleic acid molecule.

37. The use according to claim 36, wherein the target nucleic acid molecule is a linear DNA molecule with covalently closed ends (neDNA).

38. A method for preparing a target nucleic acid molecule, comprising culturing the insect cell according to any one of claims 30-35.

39. The preparation method according to claim 38, comprising:

a) providing the baculovirus expression system according to any one of claims 19 to 29;

b) inserting the target gene sequence into the second baculovirus vector;

c) co-transfecting the first baculovirus vector and the second baculovirus vector into insect cells;

d) growing the insect cells under conditions that allow replication and release of the DNA comprising the gene of interest;

e) collecting the target nucleic acid molecule.

40. The preparation method according to claim 39, further comprising protecting and isolating the target nucleic acid molecule.

41. A kit comprising the isolated nucleic acid molecule of any one of claims 1-10, the baculovirus expression system of any one of claims 19-29, or the insect cell of any one of claims 30-35.
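
The element order recited in claim 1, together with the end-to-end linkages in claims 4 to 8, can be pictured as two transcription units driven outward from adjacent promoters. The sketch below is an illustrative reading of those claims rather than a definition from the filing: the `Element` class, the strand labels, and the helper names are assumptions, with "-" meaning the element reads right to left in the listed order and "+" meaning left to right.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    name: str
    strand: str  # "+" reads left to right, "-" reads right to left


# Order as recited in claim 1; strands inferred from claims 4 to 8
# (the 5' ends of the two promoters face each other).
CASSETTE: List[Element] = [
    Element("first polyA", "-"),
    Element("Rep78 coding sequence (SEQ ID NO: 11)", "-"),
    Element("p10 promoter (first promoter)", "-"),
    Element("polh promoter (second promoter)", "+"),
    Element("Rep52 coding sequence (SEQ ID NO: 12)", "+"),
    Element("second polyA", "+"),
]


def promoters_are_divergent(elements: List[Element]) -> bool:
    """True if two adjacent promoters point away from each other."""
    for left, right in zip(elements, elements[1:]):
        if "promoter" in left.name and "promoter" in right.name:
            return left.strand == "-" and right.strand == "+"
    return False


def render(elements: List[Element]) -> str:
    """Draw the cassette as text, with arrows showing reading direction."""
    parts = [
        f"<- {e.name}" if e.strand == "-" else f"{e.name} ->"
        for e in elements
    ]
    return " | ".join(parts)


if __name__ == "__main__":
    print(render(CASSETTE))
    print("divergent promoters:", promoters_are_divergent(CASSETTE))
```

Under this reading, each promoter is followed on its own strand by its coding sequence and then its polyA, so the two Rep transcripts are produced in opposite directions.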