TW202507002A

TW202507002A - Novel recombinase enzymes for site-specific DNA recombination

Info

Publication number: TW202507002A
Application number: TW113114269A
Authority: TW
Inventors: 米莉卡耶利西奇; 盧卡斯施密特; 弗蘭克巴克霍爾茲
Original assignee: 德勒斯登工業大學
Priority date: 2023-04-17
Filing date: 2024-04-17
Publication date: 2025-02-16
Also published as: CN121263522A; WO2024218137A1

Abstract

The present invention is in the field of recombining enzymes and provides a method for producing a site-specific DNA-recombination, comprising the steps of (a) contacting a nucleic acid comprising at least a first and a second recognition site which are essentially identical or essentially reverse complementary to each other with a protein having recombinase activity, and (b) allowing the protein having recombinase activity to produce the site-specific DNA-recombination, wherein a recognition site comprises a first half-site, a spacer and a second half-site, and wherein essentially identical or essentially reverse complementary to each other means that the nucleotide sequence of the first and the second half-site in the first recognition site may deviate in up to two nucleotides from the nucleotide sequence of the first and the second half-site in the second recognition site, wherein the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to any one of SEQ ID NOs: 1 to 9, and wherein the at least two recognition sites comprise a nucleic acid sequence according to or reverse complementary to any one of SEQ ID NOs: 10 to 17 or to a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 10 to 17.

Description

Novel recombinase enzymes for site-specific DNA recombination

The present invention relates to the use of proteins having recombinase activity for catalyzing site-specific DNA recombination, and to methods for producing site-specific DNA recombination. The invention can be applied alone or in combination with other recombinase systems for genetic manipulation, for example in medicine or medical research.

Site-specific recombinases (SSRs) are reliable tools for targeted genome modification, with a variety of applications in research, medicine, and biotechnology. SSRs fall into two evolutionarily and mechanistically distinct enzyme families: tyrosine recombinases and serine recombinases (Meinke et al., 2016). Unlike other DNA-altering systems (such as CRISPR-Cas and other nuclease-based technologies), SSRs can catalyze the cleavage and immediate resealing of DNA strands without the help of additional proteins (Meinke et al., 2016). Although nuclease-based approaches are being vigorously developed to expand their utility (namely base editors and prime editors) (Komor et al., 2016; Gaudelli et al., 2017; Anzalone et al., 2022; Anzalone et al., 2019), these systems generally still introduce DNA nicks and rely on the cell's intrinsic repair machinery (Meinke et al., 2016). In contrast, SSRs operate autonomously, and the outcome of recombination is highly predictable. In addition, most SSRs are relatively small, which makes them easier to accommodate in different delivery vectors (Meinke et al., 2016).

Tyrosine recombinases (Y-SSRs) such as Cre and Flp are widely used for genome engineering because of their simplicity and their ability to modify genomes efficiently in heterologous hosts (Sauer et al., 1988). These SSRs therefore also serve as widely used models for understanding how this type of enzyme works. The Cre/loxP system is derived from bacteriophage P1 and consists of the recombinase Cre and the 34-bp target site loxP (Sternberg et al., 1981). The loxP site is palindromic, with two 13-bp inverted repeats separated by an 8-bp spacer. Each half-site is bound by one recombinase protomer, so that a tetrameric complex forms on two loxP sites and catalyzes strand exchange within the spacer region (Duyne and Hamilton, 1981). Depending on the relative orientation of the spacers, Cre can carry out various reactions, such as excision, integration, inversion, and translocation (Meinke et al., 2016). In addition, when two heterospecific target sites are present in the genome, Cre alone or in combination with other SSRs can perform recombinase-mediated cassette exchange (RMCE) to precisely replace DNA fragments (Meinke et al., 2016; Anderson et al., 2012; Minorikawa and Nakayama, 2011).
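The site architecture and the orientation-dependent outcomes described above can be illustrated with a small string model. This is only a sketch under stated assumptions: the loxP sequence used is the commonly cited 34-bp sequence (it is not quoted in this document), the helper names `split_site` and `recombine` are hypothetical, and the model ignores the actual strand-exchange chemistry. It shows that the two half-sites are inverted repeats while the spacer is not self-complementary, which is what gives a site its orientation.

```python
# Illustrative string model of a loxP-type site (not the authors' method).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# Assumption: commonly cited loxP sequence, written as half-site + spacer + half-site.
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def split_site(site: str, half_len: int = 13, spacer_len: int = 8):
    """Split a site into (first half-site, spacer, second half-site)."""
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("unexpected site length")
    spacer_end = half_len + spacer_len
    return site[:half_len], site[half_len:spacer_end], site[spacer_end:]


def recombine(dna: str, site: str) -> str:
    """Simplified outcome of recombination between the first two sites in dna.

    Same orientation  -> excision (one site and the intervening DNA are removed).
    Opposite orientation -> inversion of the DNA between the two sites.
    """
    first = dna.find(site)
    if first < 0:
        raise ValueError("no site found")
    after_first = first + len(site)
    direct = dna.find(site, after_first)
    inverted = dna.find(reverse_complement(site), after_first)
    if direct >= 0:
        return dna[:after_first] + dna[direct + len(site):]
    if inverted >= 0:
        flipped = reverse_complement(dna[after_first:inverted])
        return dna[:after_first] + flipped + dna[inverted:]
    raise ValueError("a second site is required")


left, spacer, right = split_site(LOXP)
print(right == reverse_complement(left))      # half-sites are inverted repeats
print(spacer != reverse_complement(spacer))   # asymmetric spacer defines orientation
```

Integration and translocation involve sites on separate molecules and are deliberately left out of this single-string model.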

In addition to Cre/loxP and Flp/FRT, the most widely used tyrosine site-specific recombinase systems, other recombinase systems are known in the art. US 7,422,889 and US 7,915,037 disclose the so-called Dre/rox system, which comprises the Dre recombinase isolated from enterobacteria phage D6 and recognizes a site called the rox site. Further known recombinase systems are the VCre/VloxP system, isolated from the Vibrio plasmid p0908, and the sCre/SloxP system (WO 2010/143606 A1; Suzuki and Nakayama, 2011).

Other site-specific DNA recombinase systems are the Nigri/nox system disclosed in EP 2 877 585 B1, the Vika/vox system disclosed in EP 2 690 177 B1, and the Panto/pox system disclosed in EP 3 263 708 B1.

The recombinase systems known in the art show different activities in cells of different origin. For many applications (such as the production of transgenic animals with conditional gene knockouts), two or more recombinase systems are used in combination. Emerging complex genetic studies and applications, however, require the simultaneous use of multiple recombinases. At the same time, not all well-described site-specific recombinases are equally suitable for all model organisms, for example because of genome specificity (off-target activity at cryptic recognition target sites). It is therefore important to select the best recombinase for the target organism or the experimental setup. Accordingly, there is a need in the art for additional specific recombinase systems that catalyze site-specific DNA recombination on short targets in a variety of cell types with high activity and low toxicity.

One object of the present invention is therefore to provide novel recombinase systems for site-specific genetic recombination that can be used in various cell types. Another object of the present invention is to provide a novel, highly specific recombinase system for site-specific genetic recombination that has little or essentially no inhibitory effect on cell proliferation.

The object underlying the present invention is achieved by providing a method for producing site-specific DNA recombination. According to the present invention, the method for producing site-specific DNA recombination comprises the following steps: a) contacting a nucleic acid comprising at least a first recognition site and a second recognition site, which are essentially identical or essentially reverse complementary to each other, with a protein having recombinase activity, and b) allowing the protein having recombinase activity to produce the site-specific DNA recombination. According to the method, a recognition site comprises a first half-site, a spacer, and a second half-site, and essentially identical or essentially reverse complementary to each other means that the nucleotide sequence of the first half-site and the second half-site in the first recognition site may differ from the nucleotide sequence of the first half-site and the second half-site in the second recognition site by at most two nucleotides.
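The "essentially identical or essentially reverse complementary" criterion defined above can be sketched in code. The following is one possible reading, not a definitive implementation: it assumes the limit of two nucleotides applies to the total number of mismatches across both half-sites, and it does not compare the spacers because the definition mentions only the half-sites. The names `Site`, `rc_site` and `essentially_matching` are hypothetical. The example sequences are SEQ ID NOs: 10 (lox6), 12 (lox2) and 18 (reverse complement of lox6) from the sequence listing of this document.

```python
from typing import NamedTuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


class Site(NamedTuple):
    """A recognition site: first half-site, spacer, second half-site."""
    half1: str
    spacer: str
    half2: str


def rc_site(site: Site) -> Site:
    """Reverse complement of a whole site; the two half-sites swap places."""
    return Site(reverse_complement(site.half2),
                reverse_complement(site.spacer),
                reverse_complement(site.half1))


def mismatches(a: str, b: str) -> int:
    """Count differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def half_site_mismatches(a: Site, b: Site) -> int:
    # Assumption: mismatches are summed over both half-sites; spacers are ignored.
    return mismatches(a.half1, b.half1) + mismatches(a.half2, b.half2)


def essentially_matching(a: Site, b: Site, max_mismatches: int = 2) -> bool:
    """True if b is essentially identical or essentially reverse complementary to a."""
    direct = half_site_mismatches(a, b)
    reverse = half_site_mismatches(a, rc_site(b))
    return min(direct, reverse) <= max_mismatches


lox6 = Site("TCAATTTCCGAGA", "ATGACAGT", "TCTCAGAAATTAA")    # SEQ ID NO: 10
seq18 = Site("TTAATTTCTGAGA", "ACTGTCAT", "TCTCGGAAATTGA")   # SEQ ID NO: 18
lox2 = Site("ATAACTTGAGATA", "ACGCGAAT", "TATCACAAGTTAA")    # SEQ ID NO: 12

print(rc_site(lox6) == seq18)             # SEQ ID NO: 18 is the reverse complement of lox6
print(essentially_matching(lox6, seq18))  # matches via the reverse-complement branch
print(essentially_matching(lox6, lox2))   # sites of different systems do not match
```

A stricter reading (for example, at most two mismatches per half-site, or identical spacers as an additional requirement) would only need a change in `half_site_mismatches`.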

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 7, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 16 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 16 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 16.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 3, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 12 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 12 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 12.

According to one aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 1, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 10 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 10 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 10.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 2, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 11 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 11 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 11.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 4, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 13 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 13 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 13.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 5, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 14 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 14 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 14.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 6, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 15 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 15 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 15.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 8, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 17 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 17 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 17.

According to another aspect of the method of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 9, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 15 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 15 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 15.
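The nine recombinase and recognition-site pairings recited in the preceding aspects can be collected in a small lookup structure. This is merely an illustrative summary: the short names (YR6, lox6, and so on) are taken from the sequence listing of this document, whereas the `Pairing` type and the helper `site_for_protein` are hypothetical names introduced here.

```python
from typing import NamedTuple


class Pairing(NamedTuple):
    recombinase: str     # short name from the sequence listing
    protein_seq_id: int  # SEQ ID NO of the amino acid sequence
    site: str            # designation of the recognition site
    site_seq_id: int     # SEQ ID NO of the recognition site


PAIRINGS = [
    Pairing("YR6", 1, "lox6", 10),
    Pairing("YR1", 2, "lox1", 11),
    Pairing("YR2", 3, "lox2", 12),
    Pairing("YR4", 4, "lox4", 13),
    Pairing("YR8", 5, "lox8", 14),
    Pairing("YR9", 6, "lox9", 15),
    Pairing("YR11", 7, "lox11", 16),
    Pairing("YR12", 8, "lox12", 17),
    Pairing("YR9.2", 9, "lox9", 15),  # shares SEQ ID NO: 15 with YR9
]


def site_for_protein(protein_seq_id: int) -> Pairing:
    """Look up the recognition site paired with a given protein SEQ ID NO."""
    for pairing in PAIRINGS:
        if pairing.protein_seq_id == protein_seq_id:
            return pairing
    raise KeyError(f"no pairing for SEQ ID NO: {protein_seq_id}")


print(site_for_protein(7).site)
```

Note that SEQ ID NO: 6 and SEQ ID NO: 9 are both paired with SEQ ID NO: 15, so the mapping from proteins to sites is many-to-one.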

According to one embodiment, the protein having recombinase activity comprises at least two protein monomers.

According to one embodiment, the nucleic acid sequence to be recombined is present in a cell.

According to a preferred embodiment, the method further comprises the step of introducing a nucleic acid encoding the protein having recombinase activity into the cell.

According to another embodiment, the cell comprises a nucleic acid encoding the protein having recombinase activity.

According to a preferred embodiment, the nucleic acid encoding the protein having recombinase activity comprises a regulatory nucleic acid sequence, and the expression of the nucleic acid encoding the protein having recombinase activity is regulated by the regulatory nucleic acid sequence.

According to yet another embodiment, the cell is a eukaryotic cell or a bacterial cell.

According to a further aspect, the present invention provides the use of a protein having at least 80% identity to SEQ ID NO: 7, SEQ ID NO: 3, SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 8 or SEQ ID NO: 9 for producing site-specific DNA recombination.

According to yet another aspect, the present invention provides the use of a protein having recombinase activity for catalyzing site-specific DNA recombination at recognition sites that are essentially identical or essentially reverse complementary to each other, wherein a recognition site comprises a first half-site, a spacer, and a second half-site, and wherein essentially identical or essentially reverse complementary to each other means that the nucleotide sequence of the first half-site and the second half-site in the first recognition site may differ from the nucleotide sequence of the first half-site and the second half-site in the second recognition site by at most two nucleotides.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 7, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 16 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 16 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 16.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 3, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 12 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 12 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 12.

According to one aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 1, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 10 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 10 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 10.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 2, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 11 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 11 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 11.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 4, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 13 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 13 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 13.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 5, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 14 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 14 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 14.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 6, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 15 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 15 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 15.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 8, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 17 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 17 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 17.

According to another aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to SEQ ID NO: 9, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 15 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 15 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity to SEQ ID NO: 15.

According to another aspect, the present invention provides a nucleic acid having a length of no more than 40 base pairs and comprising:
(i) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 16, or a nucleic acid sequence reverse complementary thereto; or
(ii) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 12, or a nucleic acid sequence reverse complementary thereto; or
(iii) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 10, or a nucleic acid sequence reverse complementary thereto; or
(iv) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 11, or a nucleic acid sequence reverse complementary thereto; or
(v) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 13, or a nucleic acid sequence reverse complementary thereto; or
(vi) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 14, or a nucleic acid sequence reverse complementary thereto; or
(vii) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 15, or a nucleic acid sequence reverse complementary thereto; or
(viii) a nucleic acid sequence having at least 80% sequence identity to SEQ ID NO: 17, or a nucleic acid sequence reverse complementary thereto.
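The two conditions on such a nucleic acid, a length of at most 40 base pairs and at least 80% sequence identity to a reference site or its reverse complement, can be sketched as follows. This is a deliberately simplified illustration: it assumes an ungapped, position-by-position comparison of sequences of equal length, whereas sequence identity is normally determined from an alignment, and this section does not state how identity is to be calculated. The function names are hypothetical. The reference used is SEQ ID NO: 16 (lox11) from the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (simplification)."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def within_claimed_scope(candidate: str, reference: str,
                         min_identity: float = 80.0, max_len: int = 40) -> bool:
    """Check the length limit, then identity to the reference or its reverse complement."""
    if len(candidate) > max_len:
        return False
    best = max(percent_identity(candidate, reference),
               percent_identity(candidate, reverse_complement(reference)))
    return best >= min_identity


# SEQ ID NO: 16 (lox11): half-site + spacer + half-site, 36 bp in total.
LOX11 = "GTCAACTTCACATA" + "AGTGTTAT" + "TATGTGGAGTTGAC"

variant = "A" + LOX11[1:]  # hypothetical variant with a single substitution
print(round(percent_identity(variant, LOX11), 1))
print(within_claimed_scope(variant, LOX11))
print(within_claimed_scope(LOX11 + "ACGTA", LOX11))  # 41 bp exceeds the length limit
```

A candidate that is shorter or longer than the reference (but within 40 bp) would need a proper pairwise alignment, which this sketch intentionally does not attempt.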

According to another aspect, the present invention provides a vector comprising at least one, and preferably at least two, essentially identical or essentially reverse complementary nucleic acids of the present invention, wherein a DNA segment is preferably flanked by the two essentially identical or essentially reverse complementary nucleic acids.

According to one embodiment, the vector further comprises a nucleic acid encoding a protein having recombinase activity, wherein the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to any one of SEQ ID NOs: 1 to 9.

According to a further aspect, the present invention provides a vector comprising a nucleic acid encoding a protein having recombinase activity, wherein the protein having recombinase activity comprises an amino acid sequence having at least 80% identity to any one of SEQ ID NOs: 1 to 9.

According to a preferred aspect of the present invention, the use of the vector of the present invention in the method of the present invention is provided.

According to yet another aspect, the present invention provides a protein having at least 80% identity to SEQ ID NO: 7, SEQ ID NO: 3, SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 8 or SEQ ID NO: 9, or a vector of the present invention, for use in medicine. Preferably, the protein or vector is for use in treating a genetic disease or disorder in a subject. More preferably, the genetic disease or disorder is characterized by a modification of the subject's genome.

According to a further aspect, the present invention provides an isolated host cell comprising the following recombinant DNA fragments: at least one, and preferably at least two, nucleic acids of the present invention; and/or a vector of the present invention.

According to one embodiment, the isolated host cell further comprises (i) a nucleic acid encoding a protein having recombinase activity, wherein the protein comprises an amino acid sequence having at least 80% identity to any one of SEQ ID NOs: 1 to 9; or (ii) a vector of the present invention.

According to a further aspect, the present invention provides a non-human host organism comprising: (i) at least one, and preferably at least two, nucleic acids of the present invention; or (ii) a vector of the present invention.

According to one aspect, the present invention provides a pharmaceutical composition comprising a protein having at least 80% identity to SEQ ID NO: 7, SEQ ID NO: 3, SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 8 or SEQ ID NO: 9; a nucleic acid of the present invention; a vector of the present invention; an isolated host cell of the present invention; or a non-human host organism of the present invention; and optionally a pharmaceutically acceptable excipient.

Further aspects and embodiments of the present invention will become apparent from the appended claims and the following detailed description.

Sequence Listing

SEQ ID NO: 1 (YR6):
MIENQLSLLGDFSGVRPDDVKAAVQAAQKKGINVAENEQFKAVFDHLLGEFKKREERYSPNTLRRLESAWTCFVDWCLAHHRHSLPATPDTVEAFFIERSETLHRNTLSVYRWAISRVHRVAGCPDPCLDIYVEDRLKAISRKKVREGETVKQASPFNEQHLLKLTSLWYLSDKLLLRRNLALLAVAYESMLRAAELANIRVSDLELSGDGTAVLTIPITKTNHSGEPDTCILSQDVVSLLMDYTEAGRLDMRADGYLFVGISKHNTCINPKRDADTGECLHKPITTKTVEGVFYSAWQALELERQGVKPFTAHSARVGAAQDLLKKGYNTLQIQQSGRWSSGTMVARYGRAILARDGAMAHSRVKTRNVSIDWGSGGSKNTI

SEQ ID NO: 2 (YR1):
MIENQLSLLGDFTDVRPSDVKTAIEKAQKKGVVVAEDHVFQAAINHLLNEFKKREDRYSPNTLRRLESAWGCFVEWCLDNKRHSLPASPDTTEKFLIYKAESVHRNTLSIYKWAISRVHRVAGCPNPCNDVFVEDRYKALVRVKVQSGEAIKQASPFNELHLNALVEKWKQHERVLERRNLALLGVAYESMLRAAELANIKLSDIELAGDGTAILTIPITKTNHSGDPDTCILSHDVVGLIMDYIEAGELHLKQDGYLFTGVSKHNKCTKPKVDKETGEVTYKPITTKTVEGIFKAAWSELELGRQGVKPFTGHSARVGATQDLLRKGYNTLQIQQSGRWSSEVMVARYGRAILARESAMAQSRVKTKNIDLSWGSKR

SEQ ID NO: 3 (YR2):
MNNEIIHSTSNTSLSQYPAEHIQKALANGDIPTDSHLFQSAADHLINEYRSREGLAENTFLALDTGWSLFVDWCVEHNRVSLPASSKTVEDYVKSISKVLRRNTIRVRKWAITKIHKICGLPNPFDSEFVTQTISGIYKKKLHEDEITEQASPFNETHLEALELLYADSTLKKRRDLLMMTIAYESLLRSSELCNIKLKHLRLIGKEIHITIPVTKTNHSGNPDVVALSEHATNQVLEYLNDHSMKLSGDGYLFRRLRRNGLAYPSTKQAMSNQSVIDVFNSVHNDLGGSDVLHCEPFTSHSCRVGGAQDLLAAGYSILQVQQAGRWSDPSMVYRYGRGIFAAKSAMAHFRRNRQKPRN

SEQ ID NO: 4 (YR4):
MSELLPLTPLTVDRNSDITERLRQFVQDKEAFSPNTWRQLLSVMRICNRWSEDNQRSFLPMSADDLRDYLSFLAESGRASSTVTSHAALISMLHRNAGLPVPNVSPLVFRTMKKINRVAVINGERAGQAVPFRLSDLLALDEEWSGSDNLQALRDLAFLHVAYATLLRISELSRLRVRDVMRAGDGRIILDVAWTKTIVQTGGLIKALSARSTQRLEEWIEASGLSSQPDAWLFTAVHRSGRPLIAEKPMSTRALEQIFSRAWRTAGKEGAVKANKNRYTGWSGHSARVGAAQDMADKGYPIARIMQEGTWKKPETLMRYIRHVDAHKGAMVEFMEQYGDPDYPG

SEQ ID NO: 5 (YR8):
MGKLSPTNQTLPAIQAEEDVLARLKEFVQDKEAFSPNTWRQLMSVMRICHRWSIENSRSFLPMLPADLRDYLNWLQESGRASSTIATHGSLISMLHRNAGLIPPNTSPLVFRAVKKINRVAVVTGERTGQAVPFRLEDLLELDALWSDSISLRHKRDLAFLHVAYSTLLRISELARLRVRDISRATDGRIILNVSYTKTIVQTGGLIKSLSSQSSRRLTEWMSVSGINAEPDAFLFCPVHRSGSATLSVTRPLSTPAIESIFAQAWLTIGAGEPIIPNKGRYTAWTGHSARVGAAQDMAGRGYAVAQIMQEGTWKKPETLMRYIRNLQAHEGAMTDIMEKSTLDHNNTK

SEQ ID NO: 6 (YR9):
MLAVLHEDLERAAAYKKAARAAATHRAYNSDWIIYTDWCRTRGLEAMPAHPEQIAAFVANQAASGLKPSTIERRVAAIGHHHRTSNYPAPAAHPEAGGLREALAGIRNEKRAKKTRKEPADATALRDMLAQIKGDGLRARRDRAALAIGMAAALRRSELVALTLENVGILEHGIELYLGATKTDQAGEGTTIAIPEGTRLRPKALLLDWISAVRVLEAGVVRTPAQEAAVPLFRRLTRSDQLTGEPMSDKAVARLVKRYAGAAGYDAAKFSGHSLRAGFLTEAANQGATIFKMQEVSRHKTVQVLSDYVRSADRFRDHAGERFL

SEQ ID NO: 7 (YR11):
MVGGMSFVRRDVVVIPDNPDLNDEVIRNLNAFMKDREAFAENTWKQLMMAVRLWCHWCIAKGRPYLPVDADYLRDYLLELHDNGLAPATISNYAAMLNLLHRQAGLIPAGESQKVKRVLKKISRTSIIKGETVGQAIPFRIADLNQVDEAWEASDRLKTIRNLAFLFVAYNTLLRISNIAHLKVKDLAFDHDGSVMLNIGYTKTLVDGKGITKALSPRASARVLKWLHVSGLLDHPDAYLFCKVYRTNKASVTTDKPLTLHPLESIFSEAWAVIHGEKVGIKNKGRYATWTGHSARVGAAQDMTESGYSLAQIMHEGTWKAPKTVLGYTRNLEAKKSVMIDLVG

SEQ ID NO: 8 (YR12):
MTEMIVANPLLAQFSASDDISAKLASFVRDREAFSSNTWRQLLSVMRICWRWSEENHRSFLPMAPEDLRDYLLHLQCIGRASSTISTHAALISMLHRNAGLVPPNVSPDVFRVVKKINRAAVIAGERTGQAVPFCRQDLKKLDTAWQGSPRLQQLRDLAFMHVAYSTLLRLSELSRLRVRDISRAADGRMILDVAWTKTIVQSGGIVKALSTQSSQRLTDWIVAAGLTGEPDAMIFCPVHRSNRMTKKIFSPMSTPCLEDIFLRAREAAGVAALSRTNKGRYAGWSGHSARVGAAQDMARKGFSVAQIMQEGTWTRTETVMRYIRMVEAHKGAMIGLMEEDE

SEQ ID NO: 9 (YR9.2):
MLAILHEDLERAAAYKKAARAAATHRAYNSDWIIYTNWCRTRGLEAMPAHPEQIAAFVANQAASGLKPSTIERRVAAIGHYHRTSNYPAPAAHPEAGGLREVLAGIRNEKRAKKTRKEPADATALRDMLAQIKGDGLRARRDRAVLAIGMAAALRRSELVALTLENVGILEHGIELYLGATKTDQAGEGTTIAIPKGTRLRPKALLLDWISAVRVLEAGVVRTPAQEAAVPLFRRLTRSDQLTGEPMSDKAVARLVKRYAGAAGYDAAKFSGHSLRAGFLTEAANQGATIFKMQEVSRHKTVQVLSDYVRSADRFRDHAGERFL

Recognition sites (the spacer sequence, shown in bold and underlined in the original, is the middle group of each site):

SEQ ID NO   Recognition site                          Site designation
10          TCAATTTCCGAGA ATGACAGT TCTCAGAAATTAA      lox6
11          TCAATTTCTGAGA AGTGTAAT TCTCAGAAATTGA      lox1
12          ATAACTTGAGATA ACGCGAAT TATCACAAGTTAA      lox2
13          TGACTTCGTATA AGAACAAT TATACGAAGTTA        lox4
14          CTAACTTTATATA AGTCCCAT TATATAATGTTAG      lox8
15          CGTCTGTCCGATA ATCTTTCT TATCGGACATACT      lox9
16          GTCAACTTCACATA AGTGTTAT TATGTGGAGTTGAC    lox11
17          ATAACCTAATATA ATTGTATT TATATTAGGTCAG      lox12
18          TTAATTTCTGAGA ACTGTCAT TCTCGGAAATTGA      reverse complement of lox6
Sequence Listing SEQ ID NO: 1 (YR6): MIENQLSLLGDFSGVRPDDVKAAVQAAQKKGINVAENEQFKAVFDHLLGEFKKREERYSPNTLRRLESAWTCFVDWCLAHHRHSLPATPDTVEAFFIERSETLHRNTLSVYRWAISRVHRVAGCPDPCLDIYVEDRLKAISRKKVREGETVKQASPFNEQHLLKLTSLWYLSDKLLLRRNLALLAVAYESM LRAAELANIRVSDLELSGDGTAVLTIPITKTNHSGEPDTCILSQDVVSSLLMDYTEAGRLDMRADGYLFVGISKHNTCINPKRDADTGECLHKPITTKTVEGVFYSAWQALELERQGVKPFTAHSARVGAAQDLLKKGYNTLQIQQSGRWSSGTMVARYGRAILARDGAMAHSRVKTRNVSIDWGSGGSKNTI SEQ ID NO: 2 (YR1): MIENQLSLLGDFTDVRPSDVKTAIEKAQKKGVVVAEDHVFQAAINHLLNEFKKREDRYSPNTLRRLESAWGCFVEWCLDNKRHSLPASPDTTEKFLIYKAESVHRNTLSIYKWAISRVHRVAGCPNPCNDVFVEDRYKALVRVKVQSGEAIKQASPFNELHLNALVEKWKQHERVLERRNLALLGVAYE SMLRAAELANIKLSDIELAGDGTAILTIPITKTNHSGDPDTCILSHDVVGLIMDYIEAGELHLKQDGYLFTGVSKHNKCTKPKVDKETGEVTYKPITTKTVEGIFKAAWSELELGRQGVKPFTGHSARVGATQDLLRKGYNTLQIQQSGRWSSEVMVARYGRAILARESAMAQSRVKTKNIDLSWGSKR SEQ ID NO: 3 (YR2): MNNEIIHSTSNTSLSQYPAEHIQKALANGDIPTDSHLFQSAADHLINEYRSREGLAENTFLALDTGWSLFVDWCVEHNRVSLPASSKTVEDYVKSISKVLRRNTIRVRKWAITKIHKICGLPNPFDSEFVTQTISGIYKKKLHEDEITEQASPFNETHLEALELLYADSTLKKRRDLLM MTIAYESLLRSSELCNIKLKHLRLIGKEIHITIPVTKTNHSGNPDVVALSEHATNQVLEYLNDHSMKLSGDGYLFRRLRRNGLAYPSTKQAMSNQSVIDVFNSVHNDLGGSDVLHCEPFTSHSCRVGGAQDLLAAGYSILQVQQAGRWSDPSMVYRYGRGIFAAKSAMAHFRRNRQKPRN SEQ ID NO: 4 (YR4): MSELLPLTPLTVDRNSDITERLRQFVQDKEAFSPNTWRQLLSVMRICNRWSEDNQRSFLPMSADDLRDYLSFLAESGRASSTVTSHAALISMLHRNAGLPVPNVSPLVFRTMKKINRVAVINGERAGQAVPFRLSDLLALDEEWSGSDNLQALRDLAFLHVAYATLLRISEL SRLRVRDVMRAGDGRIILDVAWTKTIVQTGGLIKALSARSTQRLEEWIEASGLSSQPDAWLFTAVHRSGRPLIAEKPMSTRALEQIFSRAWRTAGKEGAVKANKNRYTGWSGHSARVGAAQDMADKGYPIARIMQEGTWKKPETLMRYIRHVDAHKGAMVEFMEQYGDPDYPG SEQ ID NO: 5 (YR8): MGKLSPTNQTLPAIQAEEDVLARLKEFVQDKEAFSPNTWRQLMSVMRICHRWSIENSRSFLPMLPADLRDYLNWLQESGRASSTIATHGSLISMLHRNAGLIPPNTSPLVFRAVKKINRVAVVTGERTGQAVPFRLEDLLELDALWSDSISLRHKRDLAFLHVAYSTLLRISEL ARLRVRDISRATDGRIILNVSYTKTIVQTGGLIKSLSSQSSRRLTEWMSVSGINAEPDAFLFCPVHRSGSATLSVTRPLSTPAIESIFAQAWLTIGAGEPIIPNKGRYTAWTGHSARVGAAQDMAGRGYAVAQIMQEGTWKKPETLMRYIRNLQAHEGAMTDIMEKSTLDHNNTK SEQ ID NO: 6 (YR9): 
MLAVLHEDLERAAAYKKAARAAATHRAYNSDWIIYTDWCRTRGLEAMPAHPEQIAAFVANQAASGLKPSTIERRVAAIGHHHRTSNYPAPAAHPEAGGLREALAGIRNEKRAKKTRKEPADATALRDMLAQIKGDGLRARRDRAALAIGMAAALRRSELVAL TLENVGILEHGIELYLGATKTDQAGEGTTIAIPEGTRLRPKALLLDWISAVRVLEAGVVRTPAQEAAVPLFRRLTRSDQLTGEPMSDKAVARLVKRYAGAAGYDAAKFSGHSLRAGFLTEAANQGATIFKMQEVSRHKTVQVLSDYVRSADRFRDHAGERFL SEQ ID NO: 7 (YR11): MVGGMSFVRRDVVVIPDNPDLNDEVIRNLNAFMKDREAFAENTWKQLMMAVRLWCHWCIAKGRPYLPVDADYLRDYLLELHDNGLAPATISNYAAMLNLLHRQAGLIPAGESQKVKRVLKKISRTSIIKGETVGQAIPFRIADLNQVDEAWEASDRLKTIRNLAFLFVAYNT LLRISNIAHLKVKDLAFDHDGSVMLNIGYTKTLVDGKGITKALSPRASARVLKWLHVSGLLDHPDAYLFCKVYRTNKASVTTDKPLTLHPLESIFSEAWAVIHGEKVGIKNKGRYATWTGHSARVGAAQDMTESGYSLAQIMHEGTWKAPKTVLGYTRNLEAKKSVMIDLVG SEQ ID NO: 8 (YR12): MTEMIVANPLLAQFSASDDISAKLASFVRDREAFSSNTWRQLLSVMRICWRWSEENHRSFLPMAPEDLRDYLLHLQCIGRASSTISTHAALISMLHRNAGLVPPNVSPDVFRVVKKINRAAVIAGERTGQAVPFCRQDLKKLDTAWQGSPRLQQLRDLAFMHVAYSTLLRL SELSRLRVRDISRAADGRMILDVAWTKTIVQSGGIVKALSTQSSQRLTDWIVAAGLTGEPDAMIFCPVHRSNRMTKKIFSPMSTPCLEDIFLRAREAAGVAALSRTNKGRYAGWSGHSARVGAAQDMARKGFSVAQIMQEGTWTRTETVMRYIRMVEAHKGAMIGLMEEDE SEQ ID NO: 9 (YR9. 2): MLAILHEDLERAAAYKKAARAAATHRAYNSDWIIYTNWCRTRGLEAMPAHPEQIAAFVANQAASGLKPSTIERRVAAIGHYHRTSNYPAPAAHPEAGGLREVLAGIRNEKRAKKTRKEPADATALRDMLAQIKGDGLRARRDRAVLAIGMAAALRRSELVAL TLENVGILEHGIELYLGATKTDQAGEGTTIAIPKGTRLRPKALLLDWISAVRVLEAGVVRTPAQEAAVPLFRRLTRSDQLTGEPMSDKAVARLVKRYAGAAGYDAAKFSGHSLRAGFLTEAANQGATIFKMQEVSRHKTVQVLSDYVRSADRFRDHAGERFL SEQ ID NO Identification sites and spacer sequences are bold and underlined The site is called 10 TCAATTTCCGAGAATGACAGTTCTCAGAAATTAA lox6 11 TCAATTTCTGAGA AGTGTAAT TCTCAGAAATTGA lox1 12 ATAACTTGAGATAACGCGAATTATCACAAGTTAA lox2 13 TGACTTCGTATA AGAACAAT TATACGAAGTTA lox4 14 CTAACTTTATATA AGTCCCAT TATATAATGTTAG lox8 15 CGTCTGTCCGATA ATCTTTCT TATCGGACATACT lox9 16 GTCAACTTCACATA AGTGTTATTATGTGGAGTTGAC lox11 17 ATAACCTAATATA ATTGTATT TATATTAGGTCAG lox12 18 TTAATTTCTGAGAACTGTCATTCTCGGAAATTGA lox6 rev. compl.

Before the present invention is described in detail below, it is to be understood that the invention is not limited to the particular methodology, protocols and reagents described herein, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention, which is limited only by the appended claims. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art.

Preferably, the terms used herein are defined as described in "A multilingual glossary of biotechnological terms: (IUPAC Recommendations)", Leuenberger, H.G.W, Nagel, B. and Kölbl, H. eds. (1995), Helvetica Chimica Acta, CH-4010 Basel, Switzerland.

Throughout this specification and the claims that follow, unless the context requires otherwise, the word "comprise", and variations such as "comprises" and "comprising", will be understood to imply the inclusion of a stated integer or step or group of integers or steps but not the exclusion of any other integer or step or group of integers or steps. In the following passages, different aspects of the invention are defined in more detail. Each aspect so defined may be combined with any other aspect or aspects unless clearly indicated to the contrary. In particular, any feature indicated as being optional, preferred or advantageous may be combined with any other feature or features indicated as being optional, preferred or advantageous.

Several documents are cited throughout the text of this specification. Each of the documents cited herein (including all patents, patent applications, scientific publications, manufacturer's specifications, instructions, etc.), whether supra or infra, is hereby incorporated by reference in its entirety. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention. Some of the documents cited herein are characterized as being "incorporated by reference". In the event of a conflict between the definitions or teachings of such incorporated references and the definitions or teachings recited in the present specification, the text of the present specification takes precedence.

In the following, the elements of the present invention are described. These elements are listed with specific embodiments; however, it should be understood that they may be combined in any manner and in any number to create additional embodiments. The variously described examples and preferred embodiments should not be construed as limiting the present invention to only the explicitly described embodiments. This description should be understood to support and encompass embodiments which combine the explicitly described embodiments with any number of the disclosed and/or preferred elements. Furthermore, any permutations and combinations of all elements described in this application should be considered disclosed by the description of the present application unless the context indicates otherwise.

Definitions

In the following, some definitions of terms frequently used in this specification are provided. In each instance of their use in the remainder of the specification, these terms have the respectively defined meaning and preferred meanings.

As used in this specification and the appended claims, the singular forms "a", "an" and "the" include plural referents unless the content clearly dictates otherwise.

"Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the sequence in the comparison window may comprise additions or deletions as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the comparison window, and multiplying the result by 100 to yield the percentage of sequence identity.
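As an illustration of the calculation just described, the following minimal Python sketch computes the percentage of identity from two sequences that have already been aligned. The function name `percent_identity` and the use of `-` as the gap symbol are assumptions made for this example only; the specification does not prescribe an implementation. Every column of the alignment is treated as a position of the comparison window.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identity of two pre-aligned sequences (hypothetical helper).

    Both inputs must have the same length; gaps are written as `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty comparison window")
    # A position is a match when both rows carry the same residue (not a gap).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    # Matched positions divided by all positions in the window, times 100.
    return 100.0 * matches / len(aligned_a)


# First half-sites of lox6 and lox1 from the sequence listing (one mismatch in 13).
print(round(percent_identity("TCAATTTCCGAGA", "TCAATTTCTGAGA"), 2))
```

With the two 13-nucleotide half-sites shown, 12 of 13 positions match, so the sketch reports roughly 92.31%.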

The term "identical" is used herein in the context of two or more nucleic acid or polypeptide sequences to refer to two or more sequences or subsequences that are the same, i.e. that comprise the same sequence of nucleotides or amino acids. Sequences are "identical" to each other if they have a specified percentage of nucleotides or amino acid residues that are the same. According to the present invention, at least 60% identity includes at least 61%, at least 62%, at least 63%, at least 64%, at least 65%, at least 66%, at least 67%, at least 68%, at least 69%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to a specified sequence, when compared and aligned for maximum correspondence over a comparison window or designated region, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. These definitions also refer to the complement of a test sequence. Accordingly, the term "at least XY% sequence identity" is used throughout the specification with regard to polypeptide and polynucleotide sequence comparisons. This expression preferably refers to a sequence identity of at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% to the respective reference polypeptide or to the respective reference polynucleotide.

In the context of the present invention, a protein having recombinase activity and comprising an amino acid sequence having at least 80% identity to a given SEQ ID NO preferably means that the protein has an amino acid sequence having at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity to the given SEQ ID NO.

Likewise, in the context of the present invention, a nucleic acid sequence having at least 60% sequence identity to a given SEQ ID NO, or a nucleic acid sequence reverse complementary thereto, preferably means that the nucleic acid has a sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity to the given SEQ ID NO, or a nucleic acid sequence that is reverse complementary to that SEQ ID NO.

The term "sequence comparison" as used herein refers to the process in which one sequence acts as a reference sequence to which test sequences are compared. When a sequence comparison algorithm is used, test and reference sequences are entered into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. Default program parameters are commonly used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percentage of sequence identity of the test sequence relative to the reference sequence, based on the program parameters. If two sequences are compared and the reference sequence against which the percentage of sequence identity is to be calculated is not specified, the sequence identity is calculated with reference to the longer of the two sequences to be compared, unless specifically indicated otherwise. If the reference sequence is indicated, the sequence identity is determined on the basis of the full length of the reference sequence indicated by one of the SEQ ID NOs of the present invention, unless specifically indicated otherwise.
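The rule for choosing the reference length can be sketched as follows. This is only an illustrative reading of the paragraph above: the helper names `reference_length` and `identity_from_matches` are hypothetical, and the number of matched positions is assumed to come from a separate alignment step that is not shown.

```python
from typing import Optional


def reference_length(seq_a: str, seq_b: str, reference: Optional[str] = None) -> int:
    """Length against which identity is expressed (hypothetical helper)."""
    if reference is not None:
        # A reference sequence was indicated: use its full length.
        return len(reference)
    # No reference indicated: use the longer of the two compared sequences.
    return max(len(seq_a), len(seq_b))


def identity_from_matches(matches: int, seq_a: str, seq_b: str,
                          reference: Optional[str] = None) -> float:
    """Percentage of identity given a pre-computed number of matched positions."""
    return 100.0 * matches / reference_length(seq_a, seq_b, reference)


# Toy sequences: 12 matched positions between a 13-mer and a 14-mer.
short, long_ = "A" * 13, "A" * 12 + "CC"
print(round(identity_from_matches(12, short, long_), 2))                   # longer sequence as basis
print(round(identity_from_matches(12, short, long_, reference=short), 2))  # indicated reference as basis
```

The same 12 matches give a lower percentage when expressed against the 14-residue sequence than against the 13-residue reference, which is why the specification fixes the convention.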

Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith and Waterman (Adv. Appl. Math. 2:482, 1981), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol. 48:443, 1970), by the search for similarity method of Pearson and Lipman (Proc. Natl. Acad. Sci. USA 85:2444, 1988), by computerized implementations of these algorithms (e.g., GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Ausubel et al., Current Protocols in Molecular Biology (1995 supplement)). Algorithms suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (Nuc. Acids Res. 25:3389-402, 1997) and Altschul et al. (J. Mol. Biol. 215:403-10, 1990), respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm first identifies high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. For nucleotide sequences, cumulative scores are calculated using the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word length of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992), alignments (B) of 50, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands. The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul, Proc. Natl. Acad. Sci. USA 90:5873-87, 1993). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, typically less than about 0.01, and more typically less than about 0.001.
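The extension step and its three stopping conditions can be illustrated with a short Python sketch. It is a deliberately simplified, ungapped, one-directional extension for nucleotide sequences and is not the BLAST implementation itself; the function name `extend_right`, the starting score of zero at the seed position and the example value chosen for X are assumptions made for this illustration. The defaults M=5 and N=-4 are the BLASTN values quoted above.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 match: int = 5, mismatch: int = -4, x_drop: int = 20):
    """Extend an ungapped alignment to the right of a seed position.

    Returns (best_score, best_length): the maximum cumulative score reached
    and the number of aligned positions that achieve it.
    """
    score = best_score = 0
    best_length = 0
    offset = 0
    # Stop condition 3: the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch   # M for a match, N for a mismatch
        offset += 1
        if score > best_score:
            best_score, best_length = score, offset
        # Stop condition 1: score fell by X from its maximum achieved value.
        # Stop condition 2: cumulative score went to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_length


print(extend_right("ACGTACGT", "ACGTACGT", 0, 0))             # perfect match
print(extend_right("ACGTTTTT", "ACGTAAAA", 0, 0, x_drop=10))  # extension halted by X
```

In the second call the first four positions match (score 20); the following mismatches lower the score until it has dropped by at least X=10, at which point extension stops and the best-scoring prefix of length 4 is reported.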

The terms "nucleic acid" and "nucleic acid molecule" are used synonymously herein and are understood as widely accepted in the art, i.e. as single-stranded or double-stranded oligomers or polymers of deoxyribonucleotide or ribonucleotide bases or both. The term "nucleic acid" as used herein includes not only deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), but also all other linear polymers in which the bases adenine (A), cytosine (C), guanine (G) and thymine (T) or uracil (U) are arranged in a corresponding sequence (nucleic acid sequence). The invention also encompasses the corresponding RNA sequences (in which thymine is replaced by uracil), complementary sequences, and sequences with a modified nucleic acid backbone or modified 3' or 5' terminus. Nucleic acids in the form of DNA are, however, preferred.

The term "protein having recombinase activity" encompasses any enzyme capable of manipulating the structure of a genome. More specifically, the term refers to any enzyme that can catalyze a recombination selected from excision, integration, inversion and translocation reactions. Proteins having recombinase activity are well known in the art (see the discussion of the background above) and specifically include DNA recombinases, such as site-specific recombinases and, more specifically, tyrosine recombinases. A protein having recombinase activity may be present in the form of a monomer, a dimer or a tetramer. Dimers and tetramers may comprise two or four monomers, respectively, of the same protein having recombinase activity (homodimer or homotetramer), or two or more different monomers having recombinase activity (heterodimer or heterotetramer).

The term "recognition site" (sometimes also referred to as "target site") as used herein refers to a specific nucleotide sequence that is recognized by a recombinase enzyme and at which DNA breakage and strand exchange occur. These sequences are typically between 30 and 200 base pairs in length and consist of two inverted repeat recombinase binding regions flanking a central spacer sequence (Meinke et al., 2016). An example of such a recognition site can be seen in the SSR Cre/loxP binding complex, in which the Cre recombinase binds to the 34 base pair loxP target sequence. The loxP recognition site comprises two 13 base pair inverted repeat Cre binding elements flanking an 8 base pair spacer region. The left half-site is the 13 base pair binding element to the left of the spacer, and the right half-site is the 13 base pair binding element to the right of the spacer. Depending on the number and relative orientation of the recognition sites and their spacers, DNA recombinases carry out excision, integration, inversion or replacement of genetic content (reviewed in Meinke et al., 2016). Accordingly, a "recognition site" according to the present invention is a nucleotide sequence comprising a first half-site, a second half-site and a spacer separating the first half-site from the second half-site.
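The half-site/spacer/half-site architecture can be made concrete with a small Python sketch that splits a recognition site into its three parts. The names `RecognitionSite` and `split_site` are hypothetical, and the sketch assumes an 8 base pair spacer centered between two half-sites of equal length, which holds for the sites in the sequence listing but is not a general requirement stated by the specification.

```python
from typing import NamedTuple


class RecognitionSite(NamedTuple):
    left_half: str    # first half-site
    spacer: str       # central spacer
    right_half: str   # second half-site


def split_site(site: str, spacer_len: int = 8) -> RecognitionSite:
    """Split a site into half-sites and spacer (assumes equal-length half-sites)."""
    half_len, remainder = divmod(len(site) - spacer_len, 2)
    if half_len <= 0 or remainder:
        raise ValueError("site length is incompatible with the assumed layout")
    return RecognitionSite(
        left_half=site[:half_len],
        spacer=site[half_len:half_len + spacer_len],
        right_half=site[half_len + spacer_len:],
    )


# lox6 (SEQ ID NO: 10), written without the separating spaces.
lox6 = split_site("TCAATTTCCGAGAATGACAGTTCTCAGAAATTAA")
print(lox6)
```

For the 34-nucleotide lox6 site this yields 13-nucleotide half-sites around the spacer ATGACAGT; the 32-nucleotide lox4 site (SEQ ID NO: 13) splits into 12-nucleotide half-sites under the same assumption.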

For a recombination event to occur, the recombinase complex recognizes a first recognition site and a second recognition site on the DNA double strand. The recognition sites are also referred to as the upstream recognition site and the downstream recognition site, depending on their position on the DNA double strand.

In a symmetric recognition site, the first half-site (e.g. the left half-site) and the second half-site (e.g. the right half-site) are identical and palindromic (reverse complementary sequences). In an asymmetric recognition site, the first half-site (e.g. the left half-site) and the second half-site (e.g. the right half-site) are not identical and not palindromic, i.e. they differ from each other in at least one nucleotide.
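One way to read this definition computationally is to compare the first half-site with the reverse complement of the second half-site: zero differences means a symmetric site, one or more an asymmetric site. The Python sketch below follows that reading; the function names are hypothetical, only the unambiguous bases A, C, G and T are handled, and the half-sites are taken from the sequence listing.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.translate(_COMPLEMENT)[::-1]


def half_site_mismatches(left_half: str, right_half: str) -> int:
    """Positions at which the left half-site differs from the reverse
    complement of the right half-site."""
    mirrored = reverse_complement(right_half)
    if len(mirrored) != len(left_half):
        raise ValueError("half-sites must have equal length")
    return sum(a != b for a, b in zip(left_half, mirrored))


def is_symmetric(left_half: str, right_half: str) -> bool:
    return half_site_mismatches(left_half, right_half) == 0


# lox6 (SEQ ID NO: 10): half-sites on either side of the spacer ATGACAGT.
print(half_site_mismatches("TCAATTTCCGAGA", "TCTCAGAAATTAA"))
# The full lox6 site and its reverse complement (listed as the last table entry).
print(reverse_complement("TCAATTTCCGAGAATGACAGTTCTCAGAAATTAA"))
```

Under this reading the lox6 half-sites differ at two positions, so lox6 is an asymmetric site, and the reverse complement of the full site reproduces the sequence given in the listing as the reverse complement of lox6.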

The term "functional mutant" as used herein means that one or more nucleotides may be added to, inserted into, deleted from or substituted in the nucleic acid sequence to which the term refers. In the case of polypeptides and proteins, the term "functional mutant" means that one or more amino acids may be added to, inserted into, deleted from or substituted in the amino acid sequence to which the term refers. Preferred mutations are point mutations or the exchange of specific nucleotides or amino acids. In particular, in the case of functional mutants of a recognition site, it is known that exchanging the spacer region of a recognition site does not affect its activity as a target of a particular recombinase. Functional mutants of a recognition site therefore explicitly include mutations in the spacer region, up to and including complete replacement of the spacer region. Functional mutants of a recognition site also include mutations in the half-sites of the recognition site. For example, a functional mutant of a recognition site as disclosed herein may comprise one, two, three, four, five, six, seven or eight mutations in its nucleotide sequence compared to the reference nucleotide sequence from which the functional mutant is derived. According to a preferred embodiment of the present invention, a functional mutant of a recognition site comprises one or two mutations in the nucleotide sequence of the first half-site, in the nucleotide sequence of the second half-site, or in the nucleotide sequences of both half-sites of the recognition site from which the functional mutant is derived.
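Because the definition treats spacer changes and half-site changes differently, it can be useful to count them separately. The following Python sketch does so for substitutions only (insertions and deletions are not handled); the function name `count_substitutions` and the default layout of 13-nucleotide half-sites around an 8-nucleotide spacer are assumptions for illustration. The two sites compared are lox6 and lox1 from the sequence listing, used here purely to exercise the counting and not to suggest that either is a functional mutant of the other.

```python
def count_substitutions(reference: str, candidate: str,
                        half_len: int = 13, spacer_len: int = 8) -> dict:
    """Count substitutions in the half-sites and in the spacer separately."""
    expected = 2 * half_len + spacer_len
    if len(reference) != expected or len(candidate) != expected:
        raise ValueError("both sites must match the assumed layout")
    spacer_positions = range(half_len, half_len + spacer_len)
    counts = {"half_sites": 0, "spacer": 0}
    for index, (ref_base, cand_base) in enumerate(zip(reference, candidate)):
        if ref_base != cand_base:
            region = "spacer" if index in spacer_positions else "half_sites"
            counts[region] += 1
    return counts


lox6 = "TCAATTTCCGAGA" + "ATGACAGT" + "TCTCAGAAATTAA"   # SEQ ID NO: 10
lox1 = "TCAATTTCTGAGA" + "AGTGTAAT" + "TCTCAGAAATTGA"   # SEQ ID NO: 11
print(count_substitutions(lox6, lox1))
```

For this pair the sketch reports two half-site differences (one in each half-site) and five spacer differences.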

The term "reporter vector" as used herein includes a plasmid, virus or other nucleic acid carrier that comprises a nucleic acid sequence according to the present invention by genetic recombination (recombinantly), e.g. by insertion or incorporation of said nucleic acid sequence. Prokaryotic vectors as well as eukaryotic vectors (e.g. artificial chromosomes, such as YACs (yeast artificial chromosomes)) are suitable for use in the present invention.

The term "therapeutically effective amount" as used herein means the amount of an active compound or pharmaceutical agent that elicits the biological or medicinal response in a tissue system, animal or human that is being sought by a researcher, veterinarian, medical doctor or other clinician, which includes alleviation of the symptoms of the disease or disorder being treated.

The term "pharmaceutical composition" as used herein refers to a substance and/or a combination of substances used for the identification, prevention or treatment of a disease or tissue status. A pharmaceutical composition is formulated to be suitable for administration to a patient in order to prevent and/or treat a disease. Furthermore, a pharmaceutical composition refers to the combination of an active agent with a carrier, inert or active, making the composition suitable for therapeutic use. Such a carrier is also referred to as pharmaceutically acceptable. Depending on their chemical and physical properties, pharmaceutical compositions can be formulated for oral, parenteral, topical, inhalative, rectal, sublingual, transdermal, subcutaneous or vaginal routes of administration. Pharmaceutical compositions include solid, semisolid and liquid compositions and transdermal therapeutic systems (TTS). Solid compositions are selected from the group consisting of tablets, coated tablets, powders, granules, pellets, capsules, effervescent tablets and transdermal therapeutic systems. Also included are liquid compositions, selected from the group consisting of solutions, syrups, infusions, extracts, solutions for intravenous administration, solutions for infusion and solutions of the carrier systems of the present invention. Semisolid compositions that can be used in the context of the present invention include emulsions, suspensions, creams, lotions, gels, globules, buccal tablets and suppositories.

As used herein, the term "pharmaceutically acceptable" embraces both human and veterinary use: for example, the term "pharmaceutically acceptable" embraces a veterinarily acceptable compound or a compound acceptable in human medicine and health care.

The term "individual" as used herein refers to an animal, preferably a mammal, most preferably a human.

Description of the Embodiments

According to the present invention, the method for producing a site-specific DNA recombination comprises the steps of: a) contacting a nucleic acid comprising at least a first recognition site and a second recognition site, which are substantially identical or substantially reverse complementary to each other, with a protein having recombinase activity, and b) allowing the protein having recombinase activity to produce the site-specific DNA recombination.

根據該方法並且如上文所定義，辨識位點是包含第一半位點、第二半位點，以及分隔開第一半位點和第二半位點的間隔子的核苷酸序列。According to the method and as defined above, a recognition site is a nucleotide sequence comprising a first half-site, a second half-site, and a spacer separating the first half-site from the second half-site.

如在本發明上下文中所使用的，術語「基本上相同」或「基本上彼此反向互補」表示第一辨識位點中的第一半位點和第二半位點的核苷酸序列可能與第二個辨識位點中的第一半位點和第二半位點的核苷酸序列相差至多兩個核苷酸。作為實例，它指的是本文中確定的辨識位點並且如圖4B中所示。以重組酶YR6 (SEQ ID NO：1)為例，其辨識位點具有TCAATTTCCGAGA ATGACAGT TCTCAGAAATTAA (SEQ ID NO：10)的序列，其中間隔子序列以粗體和畫底線表示。在第一辨識位點和第二辨識位點相同的情況下，兩個辨識位點都具有SEQ ID NO：10的序列。在序列彼此反向互補的情況下，兩個辨識位點之一者具有SEQ ID NO：10的序列，而另一者具有TTAATTTCTGAGA ACTGTCAT TCTCGGAAATTGA (SEQ ID NO：19)的序列。在基本上相同序列的情況下，辨識位點中的一者具有SEQ ID NO：10的序列，而另一者具有在第一半位點和第二半位點的序列中有至多兩個不同核苷酸的SEQ ID NO：10。這表示辨識位點的第一半位點包含與另一個辨識位點的第一半位點的核苷酸序列相差兩個核苷酸的核苷酸序列，而第二半位點和間隔子在兩者中是相同的，辨識位點的第一半位點包含與另一個辨識位點的第一半位點的核苷酸序列相差一個核苷酸的核苷酸序列，而相同辨識位點的第二半位點包含與另一個辨識位點的第二半位點的核苷酸序列相差一個核苷酸的核苷酸序列，且間隔子序列在兩者中是相同的，或辨識位點的第二半位點包含與另一個辨識位點的第二半位點的核苷酸序列相差兩個核苷酸的核苷酸序列，且兩個辨識位點的第一半位點和間隔子是相同的。相同的原理適用於兩個辨識位點包含基本上彼此反向互補的序列的情況。在這種情況下，必須將反向互補序列與非反向互補序列進行比較。再以YR6的辨識位點為例，第一辨識位點具有SEQ ID NO：10的序列，而另一個辨識位點為其反向互補序列，即SEQ ID NO：19。在兩個辨識序列基本上彼此反向互補的情況下，第一辨識位點可能具有SEQ ID NO：10的序列，而第二辨識位點可能具有SEQ ID NO：19的序列，前提是SEQ ID NO：19就第一半位點和第二半位點的序列來說包含兩個不同的核苷酸，但不在間隔子的序列中。與上面對於基本上相同的辨識位點的實例類似，兩個不同的核苷酸可能在第一半位點中、第二半位點中，或者在第一半位點和第二半位點中各有一個不同的核苷酸。 As used in the context of the present invention, the terms "substantially identical" and "substantially reverse complementary to each other" mean that the nucleotide sequences of the first half-site and the second half-site in the first recognition site may differ from the nucleotide sequences of the first half-site and the second half-site in the second recognition site by at most two nucleotides. By way of example, reference is made to the recognition sites identified herein and shown in Figure 4B. Taking the recombinase YR6 (SEQ ID NO: 1) as an example, its recognition site has the sequence TCAATTTCCGAGA ATGACAGT TCTCAGAAATTAA (SEQ ID NO: 10), in which the spacer is the central block ATGACAGT (set in bold and underlined in the original publication). Where the first recognition site and the second recognition site are identical, both recognition sites have the sequence of SEQ ID NO: 10. Where the sequences are reverse complementary to each other, one of the two recognition sites has the sequence of SEQ ID NO: 10 and the other has the sequence TTAATTTCTGAGA ACTGTCAT TCTCGGAAATTGA (SEQ ID NO: 19). Where the sequences are substantially identical, one of the recognition sites has the sequence of SEQ ID NO: 10 and the other has the sequence of SEQ ID NO: 10 with at most two different nucleotides in the first and second half-sites. This covers three cases: (1) the first half-site differs by two nucleotides from the first half-site of the other recognition site, while the second half-site and the spacer are identical in both; (2) the first half-site differs by one nucleotide from the first half-site of the other recognition site and the second half-site of the same recognition site differs by one nucleotide from the second half-site of the other recognition site, while the spacer is identical in both; or (3) the second half-site differs by two nucleotides from the second half-site of the other recognition site, while the first half-site and the spacer are identical in both. The same principle applies where the two recognition sites comprise sequences that are substantially reverse complementary to each other. In this case, the reverse complementary sequence must be compared with the non-reverse-complementary sequence. Taking the recognition site of YR6 as an example again, the first recognition site has the sequence of SEQ ID NO: 10 and the other recognition site is its reverse complement, i.e., SEQ ID NO: 19. Where the two recognition sequences are substantially reverse complementary to each other, the first recognition site may have the sequence of SEQ ID NO: 10 and the second recognition site may have the sequence of SEQ ID NO: 19, provided that SEQ ID NO: 19 contains two different nucleotides in the sequences of the first half-site and the second half-site, but not in the sequence of the spacer. As in the example above for substantially identical recognition sites, the two different nucleotides may both lie in the first half-site, both lie in the second half-site, or lie one in each half-site.
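The comparison rule described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the 13-8-13 split (half-site, spacer, half-site) is read off the spacing of SEQ ID NO: 10 as printed, the helper names are hypothetical, and the check counts half-site differences only, leaving the spacer out of the count as in the definition.

```python
# Illustrative sketch; helper names and the 13-8-13 layout are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences as given in the text (spaces removed).
SEQ_ID_NO_10 = "TCAATTTCCGAGA" "ATGACAGT" "TCTCAGAAATTAA"
SEQ_ID_NO_19 = "TTAATTTCTGAGA" "ACTGTCAT" "TCTCGGAAATTGA"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def split_site(site: str, half_len: int = 13, spacer_len: int = 8):
    """Split a recognition site into (first half-site, spacer, second half-site)."""
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("unexpected recognition-site length")
    return (site[:half_len],
            site[half_len:half_len + spacer_len],
            site[half_len + spacer_len:])


def half_site_mismatches(site_a: str, site_b: str) -> int:
    """Count differing nucleotides in the two half-sites (spacer ignored)."""
    a1, _, a2 = split_site(site_a)
    b1, _, b2 = split_site(site_b)
    return sum(x != y for x, y in zip(a1 + a2, b1 + b2))


def substantially_identical(site_a: str, site_b: str, max_mismatches: int = 2) -> bool:
    """True if the half-sites differ in at most `max_mismatches` positions."""
    return half_site_mismatches(site_a, site_b) <= max_mismatches


def substantially_reverse_complementary(site_a: str, site_b: str,
                                        max_mismatches: int = 2) -> bool:
    """Compare site_a with the reverse complement of site_b, as the text requires."""
    return substantially_identical(site_a, reverse_complement(site_b), max_mismatches)
```

In this sketch, `reverse_complement(SEQ_ID_NO_10)` reproduces the sequence given as SEQ ID NO: 19, and a variant of SEQ ID NO: 10 carrying one changed nucleotide in each half-site still counts as substantially identical.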

根據本發明的一個實施例，第一辨識位點和第二辨識位點在其半位點序列上相同或彼此反向互補，意味著第一辨識位點和第二辨識位點在其第一半位點之間和在其第二半位點之間不具有任何序列偏差。在這樣的實施例中，第一辨識位點和第二辨識位點的間隔子序列不必然相同或彼此反向互補。根據本發明的又一個實施例，第一辨識位點和第二辨識位點在其包括第一半位點和第二半位點以及間隔子的整個序列上相同或彼此反向互補，表示第一辨識位點和第二辨識位點彼此之間不具有任何序列偏差。According to one embodiment of the present invention, the first recognition site and the second recognition site are identical or reverse complementary to each other in their half site sequences, meaning that the first recognition site and the second recognition site do not have any sequence deviation between their first half sites and between their second half sites. In such an embodiment, the spacer sequences of the first recognition site and the second recognition site are not necessarily identical or reverse complementary to each other. According to another embodiment of the present invention, the first recognition site and the second recognition site are identical or reverse complementary to each other in their entire sequence including the first half site and the second half site and the spacer, meaning that the first recognition site and the second recognition site do not have any sequence deviation between each other.

編碼依本發明的辨識位點的核酸包含最多40個，較佳地34個鹼基對。本發明的此核酸包括本發明的重組酶蛋白的辨識位點。因此，本發明提供長度不超過40個鹼基對，且包含下列的核酸： (i)與SEQ ID NO：16具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (ii)與SEQ ID NO：12具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (iii)與SEQ ID NO：10具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (iv)與SEQ ID NO：11具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (v)與SEQ ID NO：13具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (vi)與SEQ ID NO：14具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (vii)與SEQ ID NO：15具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列；或 (viii)與SEQ ID NO：17具有至少80%序列同一性的核酸序列或與其反向互補的核酸序列。 The nucleic acid encoding a recognition site according to the invention comprises a maximum of 40, preferably 34, base pairs. This nucleic acid of the invention comprises the recognition site of the recombinase protein of the invention. Therefore, the present invention provides a nucleic acid having a length of no more than 40 base pairs and comprising: (i) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 16, or a nucleic acid sequence reverse complementary thereto; or (ii) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 12, or a nucleic acid sequence reverse complementary thereto; or (iii) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 10, or a nucleic acid sequence reverse complementary thereto; or (iv) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 11, or a nucleic acid sequence reverse complementary thereto; or (v) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 13, or a nucleic acid sequence reverse complementary thereto; or (vi) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 14, or a nucleic acid sequence reverse complementary thereto; or (vii) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 15, or a nucleic acid sequence reverse complementary thereto; or (viii) a nucleic acid sequence having at least 80% sequence identity with SEQ ID NO: 17, or a nucleic acid sequence reverse complementary thereto.

在本發明的方法中，具有重組酶活性且包含與給定SEQ ID NO具有至少80%同一性的胺基酸序列的蛋白質較佳地表示該蛋白質具有與給定SEQ ID NO具有至少85%、至少90%、至少92%、至少94%、至少96%，至少98%或至少99%序列同一性的胺基酸序列。In the methods of the present invention, a protein having recombinase activity and comprising an amino acid sequence having at least 80% identity with a given SEQ ID NO preferably means that the protein has an amino acid sequence having at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity with the given SEQ ID NO.

同樣地，在本發明的方法中，與給定SEQ ID NO具有至少60%序列同一性的核酸序列或與其反向互補的核酸序列較佳地表示該核酸具有與給定SEQ ID NO具有至少65%、至少70%、至少75%、至少80%、至少85%、至少90%、至少92%、至少94%、至少96%，至少98%或至少99%序列同一性的序列或與該SEQ ID NO反向互補的核酸序列。Similarly, in the methods of the present invention, a nucleic acid sequence having at least 60% sequence identity with a given SEQ ID NO, or a nucleic acid sequence that is the reverse complement thereof, preferably means that the nucleic acid has a sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity with a given SEQ ID NO, or a nucleic acid sequence that is the reverse complement of the SEQ ID NO.
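To give a feel for what the identity thresholds mean for a 34-base-pair recognition site, the following Python sketch computes an ungapped percent identity and the minimum number of matching positions for a given threshold. This is a simplification offered as an assumption: sequence identity is normally determined from a sequence alignment that may contain gaps, and the function names here are hypothetical.

```python
# Illustrative sketch; ungapped identity is a simplifying assumption.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this simplified measure needs equal-length sequences")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def min_matching_positions(length: int, threshold_percent: int) -> int:
    """Smallest number of matching positions that reaches the threshold."""
    # Integer ceiling of length * threshold / 100, avoiding float rounding.
    return (length * threshold_percent + 99) // 100


def within_max_length(seq: str, max_bp: int = 40) -> bool:
    """Check the 'no more than 40 base pairs' length limit from the text."""
    return len(seq) <= max_bp
```

Under this simplified measure, a 34-nucleotide site needs at least 28 matching positions to reach 80% identity and at least 21 to reach 60%.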

在本發明方法中，重組酶蛋白與其較佳地在細胞內部的辨識位點中的至少兩者接觸。在重組酶蛋白結合至辨識位點時，便發生了位點特異性DNA重組。例如，使具有SEQ ID NO：1的重組酶YR6在細胞內部與至少兩個具有SEQ ID NO：10的lox6位點接觸。在重組酶蛋白結合至該等lox6位點時，便發生位點特異性DNA重組。In the method of the present invention, the recombinase protein is brought into contact with at least two of its recognition sites, preferably inside the cell. When the recombinase protein binds to the recognition sites, site-specific DNA recombination occurs. For example, the recombinase YR6 having SEQ ID NO: 1 is brought into contact with at least two lox6 sites having SEQ ID NO: 10 inside the cell. When the recombinase protein binds to the lox6 sites, site-specific DNA recombination occurs.

可在活體外或活體內進行本發明方法。在本發明於動物(包括人類)中實施的情況下，則其可用於非治療用途。根據一個實施例，該方法並非用於人類或動物的治療性處理。The methods of the invention may be performed in vitro or in vivo. Where the invention is practiced in animals (including humans), it may be used for non-therapeutic purposes. According to one embodiment, the method is not intended for therapeutic treatment of humans or animals.

該方法適用於常規使用位點特異性重組酶的所有領域(包括誘導型剔除或嵌入小鼠和其他基因轉殖動物模型)。在本發明的較佳方法中，位點特異性重組導致DNA的併入、缺失、倒置、易位或交換。根據本發明的一個實施例，該方法用於創造動物模型，其可用於生醫研究，例如作為人類疾病的模型。The method is applicable to all fields in which site-specific recombinases are routinely used (including inducible knockout or knock-in mice and other transgenic animal models). In the preferred method of the present invention, site-specific recombination results in integration, deletion, inversion, translocation or exchange of DNA. According to one embodiment of the present invention, the method is used to create an animal model that can be used in biomedical research, for example as a model of human disease.
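Two of the outcomes listed above, deletion and inversion, can be sketched with a toy string model in Python. The sketch rests on an assumption not stated in this passage: that, as is well established for Cre/lox-type tyrosine recombinases, two sites in the same orientation on one molecule lead to excision of the intervening DNA, while two sites in opposite orientation lead to its inversion. The function names are hypothetical, and the model ignores everything except exact site matches.

```python
# Toy model; orientation rule is assumed by analogy with Cre/lox-type systems.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(dna: str, site: str):
    """Return (position, orientation) for each exact site match, left to right."""
    rc = reverse_complement(site)
    hits = []
    for i in range(len(dna) - len(site) + 1):
        window = dna[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


def recombine(dna: str, site: str):
    """Return (outcome, product) for a molecule carrying exactly two sites."""
    hits = find_sites(dna, site)
    if len(hits) != 2:
        raise ValueError("expected exactly two recognition sites")
    (p1, o1), (p2, o2) = hits
    n = len(site)
    if p2 < p1 + n:
        raise ValueError("recognition sites overlap")
    if o1 == o2:
        # Same orientation: the segment between the sites is excised,
        # leaving a single site in the linear product.
        return "deletion", dna[:p1 + n] + dna[p2 + n:]
    # Opposite orientation: the segment between the sites is inverted.
    inner = dna[p1 + n:p2]
    return "inversion", dna[:p1 + n] + reverse_complement(inner) + dna[p2:]
```

For example, a fragment flanked by two copies of SEQ ID NO: 10 in the same orientation is reported as a deletion, whereas a fragment flanked by SEQ ID NO: 10 and its reverse complement is reported as an inversion.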

根據本發明的一個實施例，具有重組酶活性的蛋白質以單體存在。根據本發明的一個較佳實施例，具有重組酶活性的蛋白質包含至少兩個蛋白質單體，即呈二聚體形式(二聚體)。這樣的二聚體可以包含兩個相同類型的單體(即兩個相同的蛋白質單體，同二聚體)，或兩個不同類型的單體(即，兩個不同的蛋白質單體，異二聚體)。根據本發明的又一個較佳實施例，具有重組酶活性的蛋白質包含至少四個蛋白質單體，即呈四聚體形式(四聚體)。這樣的四聚體可以包含四個相同類型的單體(即四個相同的蛋白質單體，同四聚體)，或不同類型的單體，諸如兩個、三個或四個不同的單體(異四聚體)。根據一個尤佳的實施例，具有重組酶活性的蛋白質包含有包含不同蛋白質單體的異二聚體或異四聚體，即異二聚體或異四聚體。According to one embodiment of the present invention, the protein with recombinase activity exists as a monomer. According to a preferred embodiment of the present invention, the protein with recombinase activity comprises at least two protein monomers, i.e., in the form of a dimer (dimer). Such a dimer can comprise two monomers of the same type (i.e., two identical protein monomers, homodimers), or two monomers of different types (i.e., two different protein monomers, heterodimers). According to another preferred embodiment of the present invention, the protein with recombinase activity comprises at least four protein monomers, i.e., in the form of a tetramer (tetramer). Such a tetramer can comprise four monomers of the same type (i.e., four identical protein monomers, homotetramers), or monomers of different types, such as two, three or four different monomers (heterotetramers). According to a particularly preferred embodiment, the protein having recombinase activity comprises a heterodimer or heterotetramer comprising different protein monomers, i.e., a heterodimer or a heterotetramer.

根據本發明的一個尤佳的實施例，具有重組酶活性的蛋白質是重組酶，更佳是DNA重組酶。According to a preferred embodiment of the present invention, the protein having recombinase activity is a recombinase, more preferably a DNA recombinase.

根據一個較佳實施例，待重組的核酸序列已經存在於細胞中。或者，可以透過具有通常知識者已知的常規方法(諸如透過重組技術)將待重組的核酸序列引入細胞中。According to a preferred embodiment, the nucleic acid sequence to be recombined already exists in the cell. Alternatively, the nucleic acid sequence to be recombined can be introduced into the cell by conventional methods known to those of ordinary skill (e.g., by recombinant technology).

根據一個實施例，編碼本發明的重組酶蛋白的核酸序列已經存在於細胞中或透過具有通常知識者已知的常規手段(諸如透過重組技術)被引入細胞中。因此，這樣的方法進一步包括將編碼本發明重組酶蛋白的核酸引入細胞中的步驟。According to one embodiment, the nucleic acid sequence encoding the recombinase protein of the present invention is already present in the cell or is introduced into the cell by conventional means known to those of ordinary skill (such as by recombinant technology). Therefore, such a method further comprises the step of introducing the nucleic acid encoding the recombinase protein of the present invention into the cell.

或者，根據本發明的一個實施例，細胞已經包含編碼具有重組酶活性的本發明蛋白質的核酸。Alternatively, according to one embodiment of the present invention, the cell already contains a nucleic acid encoding a protein of the present invention having recombinase activity.

關於活化編碼具有重組酶活性的蛋白質的核酸表現，編碼重組酶蛋白的核酸進一步包含調節核酸序列，較佳為啟動子區域。因此，透過活化調節核酸序列來啟動或調節編碼具有重組酶活性的蛋白質的核酸表現。因而，為了誘導DNA重組，調節核酸序列(較佳地啟動子區域)受到活化以表現編碼重組酶蛋白的基因。較佳地，調節核酸序列(較佳地啟動子區域)較佳地與編碼重組酶蛋白的序列一起被引入細胞中，或者調節核酸序列在本發明方法開始時已經存在於細胞中。在第二種情況下，僅編碼重組酶蛋白的核酸被引入細胞中(並受到調節核酸序列的控制)。Regarding the expression of nucleic acid encoding a protein with recombinase activity, the nucleic acid encoding the recombinase protein further comprises a regulatory nucleic acid sequence, preferably a promoter region. Therefore, the expression of nucleic acid encoding a protein with recombinase activity is activated or regulated by activating the regulatory nucleic acid sequence. Thus, in order to induce DNA recombination, the regulatory nucleic acid sequence (preferably the promoter region) is activated to express the gene encoding the recombinase protein. Preferably, the regulatory nucleic acid sequence (preferably the promoter region) is preferably introduced into the cell together with the sequence encoding the recombinase protein, or the regulatory nucleic acid sequence is already present in the cell at the beginning of the method of the present invention. In the second case, only the nucleic acid encoding the recombinase protein is introduced into the cell (and is controlled by the regulatory nucleic acid sequence).

如本文所用的術語「調節核酸序列」是指DNA的基因調節區域。除了啟動子區域以外，這個術語還含括距離基因更遠的操作子區域以及影響基因表現的核酸序列，諸如順式元件、增強子或緘默子。如本文所用的術語「啟動子區域」是指DNA上允許基因受調節而表現的核苷酸序列。啟動子區域允許編碼相應蛋白質的核酸受調節而表現。啟動子區域位於基因的5'末端處，因此在RNA編碼區之前。細菌和真核啟動子均適用於本發明。The term "regulatory nucleic acid sequence" as used herein refers to a gene regulatory region of DNA. In addition to the promoter region, this term also encompasses operator regions located farther away from the gene and nucleic acid sequences that affect gene expression, such as cis-acting elements, enhancers or silencers. The term "promoter region" as used herein refers to a nucleotide sequence on DNA that allows regulated expression of a gene. The promoter region allows regulated expression of the nucleic acid encoding the corresponding protein. The promoter region is located at the 5' end of the gene and therefore precedes the RNA-coding region. Both bacterial and eukaryotic promoters are suitable for the present invention.

根據本發明的一個實施例，辨識位點被納入細胞中或被引入到細胞中(較佳地透過重組技術)。因此，本發明的方法可能包括將下列核酸引入細胞中的步驟： a) 第一核酸(第一個辨識位點)，其包含根據SEQ ID NO：10至17中之一者或與其反向互補的核酸序列；或為其功能突變體的核酸序列； b) 第二核酸(第二辨識位點)，其包含與第一核酸(第一辨識位點)的核酸序列基本上相同或基本上反向互補的核酸序列。 According to one embodiment of the present invention, the recognition site is incorporated into a cell or introduced into a cell (preferably by recombinant technology). Therefore, the method of the present invention may include the step of introducing the following nucleic acids into a cell: a) a first nucleic acid (first recognition site) comprising a nucleic acid sequence according to one of SEQ ID NOs: 10 to 17 or a reverse complement thereof; or a nucleic acid sequence that is a functional mutant thereof; b) a second nucleic acid (second recognition site) comprising a nucleic acid sequence that is substantially identical to or substantially reverse complement to the nucleic acid sequence of the first nucleic acid (first recognition site).

根據本發明的一個較佳實施例，將編碼本發明重組酶蛋白的核酸和至少兩個辨識位點被引入細胞中。這個方法包括以下步驟： a) 將下列核酸引入細胞中： (i) 編碼重組酶蛋白的第一核酸，其中該核酸被引入DNA中，使得調節核酸序列(較佳地啟動子區域)控制編碼重組酶蛋白的核酸表現， (ii) 第二核酸，其包含根據SEQ ID NO：10至17中之一者或與其反向互補的核酸序列；或為其功能突變體的核酸序列； (iii) 第三核酸，其包含與(ii)中定義的核酸序列基本上相同或基本上反向互補的核酸序列，和 b) 活化調節核酸序列(較佳啟動子區域)以誘導第一核酸的表現，供用於合成具有重組酶活性的蛋白質。 According to a preferred embodiment of the present invention, a nucleic acid encoding the recombinase protein of the present invention and at least two recognition sites are introduced into a cell. This method comprises the following steps: a) introducing the following nucleic acids into a cell: (i) a first nucleic acid encoding a recombinase protein, wherein the nucleic acid is introduced into DNA so that a regulatory nucleic acid sequence (preferably a promoter region) controls the expression of the nucleic acid encoding the recombinase protein, (ii) a second nucleic acid comprising a nucleic acid sequence according to one of SEQ ID NOs: 10 to 17 or a reverse complement thereof; or a nucleic acid sequence that is a functional mutant thereof; (iii) a third nucleic acid comprising a nucleic acid sequence that is substantially identical to or substantially reverse complement to the nucleic acid sequence defined in (ii), and b) activating the regulatory nucleic acid sequence (preferably a promoter region) to induce the expression of the first nucleic acid for use in synthesizing a protein having recombinase activity.

根據這個方法，編碼重組酶蛋白的核酸序列被引入細胞中，且至少兩個辨識位點被引入細胞的基因體或附加型DNA中。步驟(i)至(iii)可以按任何順序進行。According to this method, a nucleic acid sequence encoding a recombinase protein is introduced into a cell and at least two recognition sites are introduced into the genome or episomal DNA of the cell. Steps (i) to (iii) may be performed in any order.

使用本領域具有通常知識者已知的基因操縱技術將核酸引入細胞中。適當的方法包括細胞轉形、轉染或病毒感染，藉此將編碼蛋白質的核酸序列作為載體的組分或編碼病毒之DNA或RNA的部分引入細胞中。細胞培養是透過本領域具有通常知識者已知各別細胞的培養方法進行。因此，較佳將細胞轉移至習知培養基中，並在有利於細胞存活的溫度和大氣下培養。Nucleic acids are introduced into cells using gene manipulation techniques known to those of ordinary skill in the art. Suitable methods include cell transformation, transfection or viral infection, whereby nucleic acid sequences encoding proteins are introduced into cells as components of vectors or as part of DNA or RNA encoding viruses. Cell culture is performed by culture methods for individual cells known to those of ordinary skill in the art. Therefore, it is preferred that the cells be transferred to a known culture medium and cultured at a temperature and atmosphere that is conducive to cell survival.

本發明還包括核酸序列或多核苷酸，其中重組酶蛋白的編碼序列在同一閱讀框內融合至有助於蛋白質從宿主細胞表現和分泌的多核苷酸序列。例如，充當控制多肽從細胞轉運出來的分泌序列的前導序列可能融合至編碼重組酶蛋白的序列。具有這樣一個前導序列的多肽或蛋白質被稱為前蛋白或前原蛋白，並且可能具有被宿主細胞切割以形成蛋白質的成熟形式的前導序列。這些多核苷酸可能具有5'延伸區域，以便其編碼原蛋白，這是成熟蛋白加上在N端處的額外胺基酸殘基。具有這樣一個前序列的表現產物稱為原蛋白，它是成熟蛋白的不活化形式；然而，一旦前序列被切下，活性成熟蛋白就會保留下來。額外序列也可能附接至蛋白質並且成為成熟蛋白的一部分。因此，例如，本發明的多核苷酸可能編碼多肽、或具有原序列的蛋白質，或具有原序列和前序列(諸如前導序列)兩者的蛋白質。The present invention also includes nucleic acid sequences or polynucleotides in which the coding sequence of the recombinase protein is fused in the same reading frame to a polynucleotide sequence that aids expression and secretion of the protein from the host cell. For example, a leader sequence that acts as a secretory sequence controlling transport of the polypeptide out of the cell may be fused to the sequence encoding the recombinase protein. A polypeptide or protein having such a leader sequence is called a preprotein or preproprotein, and the leader sequence may be cleaved by the host cell to form the mature form of the protein. These polynucleotides may have a 5' extension region so that they encode a proprotein, which is the mature protein plus additional amino acid residues at the N-terminus. The expression product having such a prosequence is called a proprotein, which is an inactive form of the mature protein; once the prosequence is cleaved off, however, the active mature protein remains. Additional sequences may also be attached to the protein and become part of the mature protein. Thus, for example, a polynucleotide of the invention may encode a polypeptide, a protein having a prosequence, or a protein having both a prosequence and a presequence (such as a leader sequence).

本發明的核酸還可能具有與標記序列框內融合的編碼序列，標記序列允許純化本發明的蛋白質。標記序列可能是親和力標籤或表位標籤，諸如多組胺酸標籤、鏈黴親和素標籤、Xpress標籤、FLAG標籤、纖維素或幾丁質結合標籤、麩胱甘肽S-轉移酶(GST)標籤、血球凝集素(HA)標籤、c-myc標籤或V5標籤。The nucleic acids of the invention may also have a coding sequence fused in frame to a marker sequence that allows purification of the protein of the invention. The marker sequence may be an affinity tag or an epitope tag, such as a polyhistidine tag, a streptavidin tag, an Xpress tag, a FLAG tag, a cellulose- or chitin-binding tag, a glutathione S-transferase (GST) tag, a hemagglutinin (HA) tag, a c-myc tag, or a V5 tag.

HA標籤將對應於從流感血球凝集素蛋白獲得的表位(Wilson et al., 1984)，而c-myc標籤可能是來自人類Myc蛋白的表位(Evans et al., 1985)。The HA tag would correspond to an epitope derived from the influenza hemagglutinin protein (Wilson et al., 1984), while the c-myc tag would be an epitope derived from the human Myc protein (Evans et al., 1985).

如果本發明的核酸是mRNA，特別是用作為藥劑，則可以透過在最大化mRNA的轉譯和穩定性、避免其免疫刺激活性與開發活體內遞送技術方面已達到的長足進展來促進mRNA治療劑的遞送。5'帽和3'多(A)尾是成熟真核mRNA有效轉譯和延長半衰期的主要貢獻因子。將帽類似物(諸如ARCA(anti-reverse cap analogs)，抗逆向帽類似物)和120-150 bp的多(A)尾併入活體外轉錄(IVT) mRNA中已經顯著增進編碼蛋白質的表現和mRNA穩定性。新型帽類似物(諸如經1,2-二硫代二磷酸酯修飾的帽)對RNA脫帽複合物具有抗性，可以進一步提高RNA轉譯的效率。用頻繁出現的同義密碼子替換編碼蛋白質的mRNA序列中的稀有密碼子，即所謂的密碼子優化，也有助於提高蛋白質合成的效率，並限制稀有密碼子造成的mRNA不穩定，從而防止轉錄本加速降解。同樣，改造3'和5'非轉譯區(UTR) (其含有負責募集RNA結合蛋白(RBP)和miRNA的序列)可以提高蛋白質產物的含量。有趣的是，UTR可以經過特意修飾以編碼調節元件(例如K-轉角基序和miRNA結合位點)，提供一種以細胞特異性方式控制RNA表現的方法。一些RNA鹼基修飾(諸如N1-甲基-假尿苷)不僅有助於掩蔽mRNA免疫刺激活性，還已經被證明可以透過提高轉譯起始來增加mRNA轉譯。除了對蛋白質轉譯所觀察到的影響外，鹼基修飾和密碼子優化還會影響mRNA的二級結構，進而影響其轉譯。本發明也考慮了本發明核酸分子的各別修飾。If the nucleic acid of the present invention is mRNA, especially for use as a pharmaceutical agent, delivery of the mRNA therapeutic can be facilitated by the considerable progress that has been made in maximizing the translation and stability of mRNA, avoiding its immunostimulatory activity, and developing in vivo delivery technologies. The 5' cap and the 3' poly(A) tail are the major contributors to efficient translation and extended half-life of mature eukaryotic mRNA. Incorporation of cap analogs (such as ARCA, anti-reverse cap analogs) and poly(A) tails of 120-150 bp into in vitro transcribed (IVT) mRNA has significantly improved expression of the encoded protein and mRNA stability. Novel cap analogs, such as caps modified with 1,2-dithiodiphosphate, are resistant to the RNA decapping complex and can further improve the efficiency of RNA translation. Replacing rare codons in the protein-coding mRNA sequence with frequently occurring synonymous codons, so-called codon optimization, also helps to improve the efficiency of protein synthesis and limits the mRNA instability caused by rare codons, thereby preventing accelerated transcript degradation. Similarly, engineering the 3' and 5' untranslated regions (UTRs), which contain sequences responsible for recruiting RNA-binding proteins (RBPs) and miRNAs, can increase the amount of protein product. Interestingly, UTRs can be deliberately modified to encode regulatory elements (such as K-turn motifs and miRNA binding sites), providing a way to control RNA expression in a cell-specific manner. Some RNA base modifications (such as N1-methyl-pseudouridine) not only help to mask the immunostimulatory activity of mRNA, but have also been shown to increase mRNA translation by enhancing translation initiation. In addition to their observed effects on protein translation, base modifications and codon optimization also affect the secondary structure of the mRNA and thereby its translation. The present invention also contemplates such modifications of the nucleic acid molecules of the present invention.
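The idea of codon optimization mentioned above, replacing a codon with a synonymous one without changing the encoded amino acid, can be sketched in Python as follows. The synonymous codon families for leucine and arginine follow the standard genetic code; the choice of "preferred" codon per amino acid is a hypothetical placeholder for illustration only, not measured codon-usage data, and the function name is likewise an assumption.

```python
# Illustrative sketch. Codon families follow the standard genetic code;
# the PREFERRED choices are hypothetical placeholders, not usage data.
SYNONYMOUS_FAMILIES = {
    "Leu": {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"},
    "Arg": {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"},
}
PREFERRED = {"Leu": "CTG", "Arg": "AGA"}


def optimize_codons(cds: str) -> str:
    """Replace Leu/Arg codons with the placeholder preferred synonymous codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        for amino_acid, family in SYNONYMOUS_FAMILIES.items():
            if codon in family:
                # Synonymous swap: the encoded amino acid is unchanged.
                codon = PREFERRED[amino_acid]
                break
        optimized.append(codon)
    return "".join(optimized)
```

A real optimization would draw on a codon-usage table for the target organism and cover all amino acids; the sketch only shows that the substitution is made codon by codon and leaves the protein sequence unchanged.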

RNA或複數個RNA較佳地編碼本發明的重組酶酵素或其次單位中的任一者。用於遞送和表現核酸並且特別是RNA的具體方法揭示於例如EP2590676和EP3115064中，其以引用的方式併入本文。RNA可能存在於顆粒中並且較佳地是自我複製的。在活體內投予顆粒後，RNA從顆粒中被釋放出來並在細胞內部轉譯以提供DNA重組酶或其任何單體次單位。RNA or a plurality of RNAs preferably encodes any of the recombinase enzymes of the present invention or its subunits. Specific methods for delivering and expressing nucleic acids and in particular RNA are disclosed in, for example, EP2590676 and EP3115064, which are incorporated herein by reference. The RNA may be present in particles and is preferably self-replicating. After the particles are administered in vivo, the RNA is released from the particles and translated inside the cells to provide the DNA recombinase or any of its monomeric subunits.

當自我複製的RNA分子(複製子)被遞送到脊椎動物細胞時，即使沒有任何蛋白質，也可以透過從自身轉錄(經由從自身生成的反義複本)產生多個子RNA。這些子RNA以及共線亞基因體轉錄本可能自行轉譯以提供編碼多肽的原位表現，或者可能被轉錄以提供與遞送的RNA具有相同意義的更多轉錄本，其被轉譯以提供多肽的原位表現。這個轉錄序列的總體結果是引入的複製子RNA的數量巨大擴增，因此編碼的多肽成為細胞的主要多肽產物。When a self-replicating RNA molecule (replicon) is delivered to a vertebrate cell, it can produce multiple daughter RNAs by transcription from itself (via an antisense copy that it generates from itself), even in the absence of any protein. These daughter RNAs, as well as collinear subgenomic transcripts, may be translated themselves to provide in situ expression of the encoded polypeptide, or may be transcribed to provide further transcripts of the same sense as the delivered RNA, which are then translated to provide in situ expression of the polypeptide. The overall result of this sequence of transcriptions is a huge amplification in the number of introduced replicon RNAs, so that the encoded polypeptide becomes a major polypeptide product of the cell.

較佳的自我複製RNA分子編碼(i)可以從自我複製RNA分子轉錄RNA的RNA依賴性RNA聚合酶，和(ii)本發明的重組酶蛋白。聚合酶可以是阿伐病毒複製酶，例如包含阿伐病毒蛋白nsP1、nsP2、nsP3和nsP4中的一或多者。偏好本發明的自我複製RNA分子並未編碼阿伐病毒(alphavirus)結構蛋白。因此，較佳的自我複製RNA可以在細胞中使得自身基因體RNA複本的產生，但不會產生含有RNA的病毒顆粒。可用於本發明上下文的自我複製RNA分子可能具有兩個開放閱讀框。第一個(5')開放閱讀框編碼複製酶，而第二個(3')開放閱讀框編碼本發明的多肽。在一些實施例中，RNA可能具有額外的(例如下游)開放閱讀框，例如用於進一步編碼附加多肽。Preferred self-replicating RNA molecules encode (i) an RNA-dependent RNA polymerase that can transcribe RNA from the self-replicating RNA molecule, and (ii) a recombinase protein of the present invention. The polymerase can be an alphavirus replicase, for example comprising one or more of the alphavirus proteins nsP1, nsP2, nsP3 and nsP4. Preferred self-replicating RNA molecules of the present invention do not encode alphavirus structural proteins. Therefore, preferred self-replicating RNA can cause the production of its own genome RNA copies in cells, but will not produce viral particles containing RNA. Self-replicating RNA molecules that can be used in the context of the present invention may have two open reading frames. The first (5') open reading frame encodes the replicase, and the second (3') open reading frame encodes the polypeptide of the present invention. In some embodiments, the RNA may have additional (e.g., downstream) open reading frames, e.g., for further encoding additional polypeptides.

此類RNA特別適用於基因療法中的一般用途，並且特別適用於治療遺傳性病症或疾病。Such RNA is particularly suitable for general use in gene therapy, and particularly for the treatment of genetic disorders or diseases.

根據本發明的方法可以使用真核細胞和原核細胞進行。較佳的原核細胞是細菌細胞。尤佳的原核細胞是大腸桿菌細胞。較佳的真核細胞是酵母細胞(較佳地釀酒酵母)、昆蟲細胞、非昆蟲無脊椎動物細胞、兩棲類細胞或哺乳動物細胞(較佳地體細胞或多能幹細胞，包括胚胎幹細胞和其他多能幹細胞，像是誘導型多能幹細胞)，以及其他天然細胞或已建立的細胞株(包括NIH3T3、CHO、HeLa、HEK293、hiPS)。在人類胚胎幹細胞的情況下，較佳在不破壞人類胚胎的情況下獲得細胞，例如透過衍生自囊胚的單一分裂球的贅生物(outgrowth)(如Chung et al., 2008所述)、透過單性生殖(例如從單原核卵母細胞，如Lin et al. 2007所述)，或透過人類卵母細胞的孤雌生殖活化(如透過Mai et al. 2007所述)。還有較佳是非人類宿主生物體的細胞，較佳非人類生殖細胞、體細胞或多能幹細胞，包括胚胎幹細胞或囊胚。The method according to the present invention can be performed using eukaryotic cells and prokaryotic cells. Preferred prokaryotic cells are bacterial cells. Particularly preferred prokaryotic cells are Escherichia coli cells. Preferred eukaryotic cells are yeast cells (preferably Saccharomyces cerevisiae), insect cells, non-insect invertebrate cells, amphibian cells or mammalian cells (preferably somatic cells or pluripotent stem cells, including embryonic stem cells and other pluripotent stem cells such as induced pluripotent stem cells), as well as other native cells or established cell lines (including NIH3T3, CHO, HeLa, HEK293, hiPS). In the case of human embryonic stem cells, the cells are preferably obtained without destroying a human embryo, for example from outgrowths of single blastomeres derived from blastocysts (as described by Chung et al., 2008), by parthenogenesis (e.g., from a monopronuclear oocyte, as described by Lin et al. 2007), or by parthenogenetic activation of human oocytes (as described by Mai et al. 2007). Also preferred are cells of a non-human host organism, preferably non-human germ cells, somatic cells or pluripotent stem cells, including embryonic stem cells or blastocysts.

根據一個實施例，細胞是真核細胞或細菌細胞。According to one embodiment, the cell is a eukaryotic cell or a bacterial cell.

根據另一個實施例，細胞不是人類生殖細胞。According to another embodiment, the cell is not a human germ cell.

在本發明的上下文中，與給定SEQ ID NO具有至少80%同一性的蛋白質較佳地表示該蛋白質具有與給定SEQ ID NO具有至少85%、至少90%、至少92%、至少有94%、至少96%、至少98%或至少99%序列同一性的胺基酸序列。In the context of the present invention, a protein having at least 80% identity to a given SEQ ID NO preferably means that the protein has an amino acid sequence having at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity to the given SEQ ID NO.

根據又再一個態樣，本發明提供具有重組酶活性的蛋白質用於在基本上相同或基本上彼此反向互補的辨識位點處催化位點特異性DNA重組的用途，其中辨識位點包含第一半位點、間隔子和第二半位點，並且其中基本上相同或基本上彼此反向互補表示第一辨識位點中的第一半位點和第二半位點的核苷酸序列與第二辨識位點中的第一半位點和第二半位點的核苷酸序列的兩個核苷酸可能相差至多兩個核苷酸。According to yet another aspect, the present invention provides a use of a protein having recombinase activity for catalyzing site-specific DNA recombination at recognition sites that are substantially identical or substantially reverse complementary to each other, wherein the recognition sites comprise a first half site, a spacer, and a second half site, and wherein substantially identical or substantially reverse complementary to each other means that the nucleotide sequence of the first half site and the second half site in the first recognition site may differ from the nucleotide sequence of the first half site and the second half site in the second recognition site by at most two nucleotides.

根據本發明用途的一個態樣，具有重組酶活性的蛋白質包含與SEQ ID NO：1具有至少80%同一性的胺基酸序列，且至少兩個辨識位點包含根據SEQ ID NO：10或其功能突變體的核酸序列，或與SEQ ID NO：10或其功能突變體反向互補的核酸序列，其中功能突變體包含與SEQ ID NO：10具有至少60%序列同一性的核酸序列。According to one aspect of the use of the present invention, the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO: 1, and the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO: 10 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to SEQ ID NO: 10 or a functional mutant thereof, wherein the functional mutant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 10.

在本發明的用途中，具有重組酶活性並包含與給定SEQ ID NO具有至少80%同一性的胺基酸序列的蛋白質較佳地表示該蛋白質具有與給定SEQ ID NO具有至少85%、至少90%、至少92%、至少94%、至少96%、至少98%或至少99%序列同一性的胺基酸序列。In the use of the present invention, a protein having recombinase activity and comprising an amino acid sequence having at least 80% identity with a given SEQ ID NO preferably means that the protein has an amino acid sequence having at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity with the given SEQ ID NO.

同樣地，在本發明的用途中，與給定SEQ ID NO具有至少60%序列同一性的核酸序列或與其反向互補的核酸序列表示該核酸具有與給定SEQ ID NO具有至少65%、至少70%、至少75%、至少80%、至少85%、至少90%、至少92%、至少94%、至少96%、至少98%或至少99%序列同一性的序列，或與該SEQ ID NO反向互補的核酸序列。Similarly, in the use of the present invention, a nucleic acid sequence having at least 60% sequence identity to a given SEQ ID NO, or a nucleic acid sequence that is the reverse complement thereof, means that the nucleic acid has a sequence having at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, at least 98% or at least 99% sequence identity to a given SEQ ID NO, or a nucleic acid sequence that is the reverse complement of the SEQ ID NO.
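The definition of "substantially identical or substantially reverse complementary" recognition sites given above can be illustrated with a minimal Python sketch. It is an illustration only and not part of the claimed subject matter. The function names, the default half-site length of 13 nucleotides, and the choice to count the "at most two nucleotides" over both half-sites together are assumptions made for this example; the spacer is ignored because the definition refers only to the half-sites.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_site(site: str, half_len: int) -> Tuple[str, str, str]:
    """Split a recognition site into (first half-site, spacer, second half-site)."""
    if len(site) <= 2 * half_len:
        raise ValueError("site is too short for the given half-site length")
    return site[:half_len], site[half_len:-half_len], site[-half_len:]


def half_site_mismatches(site_a: str, site_b: str, half_len: int) -> int:
    """Count nucleotide differences in the two half-sites; the spacer is ignored."""
    if len(site_a) != len(site_b):
        raise ValueError("sites must have the same length")
    a1, _, a2 = split_site(site_a.upper(), half_len)
    b1, _, b2 = split_site(site_b.upper(), half_len)
    return sum(x != y for x, y in zip(a1 + a2, b1 + b2))


def essentially_matching(site_a: str, site_b: str,
                         half_len: int = 13, max_mismatches: int = 2) -> bool:
    """True if site_b, read directly or as its reverse complement, differs from
    site_a by at most max_mismatches nucleotides in the half-sites.

    Assumption: the mismatch budget applies to both half-sites combined.
    """
    direct = half_site_mismatches(site_a, site_b, half_len)
    inverted = half_site_mismatches(site_a, reverse_complement(site_b), half_len)
    return min(direct, inverted) <= max_mismatches
```

A stricter reading that applies the limit to each half-site separately would require comparing the two half-sites individually rather than summing their differences.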

根據又一個態樣，本發明提供一種包含至少一個，較佳地至少兩個基本上相同或基本上反向互補的本發明核酸的載體，其中DNA片段較佳地側接兩個基本上相同或基本上反向互補的核酸。According to yet another aspect, the present invention provides a vector comprising at least one, preferably at least two substantially identical or substantially reverse complementary nucleic acids of the present invention, wherein the DNA segment is preferably flanked by two substantially identical or substantially reverse complementary nucleic acids.

此外，本發明提供包含本發明核酸序列的載體。根據一個實施例，此類載體包含至少一個，較佳地至少兩個如本文針對辨識位點所述的基本上相同或基本上反向互補的核酸序列。根據又一個實施例，根據本發明的載體包含編碼具有重組酶活性的蛋白質的核酸，較佳地其中蛋白質包含與SEQ ID NO：1至9中之一者展現出至少80%，較佳地至少85%、至少90%、至少95%，以及最佳地至少99%序列同一性的胺基酸序列。In addition, the present invention provides vectors comprising the nucleic acid sequences of the present invention. According to one embodiment, such vectors comprise at least one, preferably at least two, substantially identical or substantially reverse complementary nucleic acid sequences as described herein for the recognition sites. According to yet another embodiment, the vector according to the present invention comprises a nucleic acid encoding a protein having recombinase activity, preferably wherein the protein comprises an amino acid sequence exhibiting at least 80%, preferably at least 85%, at least 90%, at least 95%, and most preferably at least 99% sequence identity with one of SEQ ID NOs: 1 to 9.

本發明進一步提供一種載體(本文中也稱為「報導載體」)，其包含至少一個核酸，該至少一個核酸包含根據SEQ ID NO：10至17中之一者或與SEQ ID NO：10至17中之一者反向互補的核酸序列，或為其功能突變體的核酸序列。通常，表現載體包含複製起點、啟動子，以及允許對包含報導載體的宿主細胞進行表型選擇的特定基因序列。在本發明的一個較佳實施例中，載體包含至少兩個辨識位點，即至少兩個彼此獨立地展現出根據SEQ ID NO：10至17中之一者或為其功能突變體的核酸序列或與SEQ ID NO：10至17中之一者或為其功能突變體的核酸序列反向互補的核酸序列，或為其功能突變體的核酸序列。至少兩個辨識位點較佳地不連續地位於載體中。相反，至少兩個辨識位點被定位為使得它們側接在感興趣的DNA片段，當辨識位點被各自的重組酶蛋白辨識時，該DNA片段被重組，較佳地被切除或倒置。感興趣的DNA片段較佳地可含有基因或啟動子區域。當感興趣的DNA片段側接有兩個具有相同位向(相同核酸序列)的辨識位點時，該DNA片段被切下。當感興趣的DNA片段側接兩個以相反位向排列的辨識位點(即辨識位點包含彼此反向互補的核酸序列)，則該DNA片段的倒置會受到重組酶蛋白所催化。The present invention further provides a vector (also referred to herein as a "reporter vector") comprising at least one nucleic acid comprising a nucleic acid sequence according to one of SEQ ID NOs: 10 to 17 or a nucleic acid sequence that is reverse complementary to one of SEQ ID NOs: 10 to 17, or a nucleic acid sequence that is a functional mutant thereof. Typically, an expression vector comprises a replication origin, a promoter, and a specific gene sequence that allows phenotypic selection of a host cell comprising the reporter vector. In a preferred embodiment of the present invention, the vector comprises at least two recognition sites, i.e., at least two nucleic acid sequences that independently exhibit a nucleic acid sequence according to one of SEQ ID NOs: 10 to 17 or a functional mutant thereof, or a nucleic acid sequence that is reverse complementary to a nucleic acid sequence of one of SEQ ID NOs: 10 to 17 or a functional mutant thereof. The at least two recognition sites are preferably not located directly adjacent to each other in the vector. Rather, the at least two recognition sites are positioned so that they flank a DNA fragment of interest, and when the recognition sites are recognized by the respective recombinase protein, the DNA fragment is recombined, preferably excised or inverted. The DNA fragment of interest may preferably contain a gene or a promoter region. When the DNA fragment of interest is flanked by two recognition sites with the same orientation (same nucleic acid sequence), the DNA fragment is excised. When the DNA fragment of interest is flanked by two recognition sites arranged in opposite orientations (i.e., the recognition sites comprise nucleic acid sequences that are reverse complementary to each other), inversion of the DNA fragment is catalyzed by the recombinase protein.
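The relationship between the relative orientation of the two recognition sites and the recombination outcome can be sketched in Python as follows. This is a simplified in-silico illustration under stated assumptions, not a description of the enzymatic mechanism: the names `recombine` and `Outcome` are hypothetical, the first site is assumed to be present in forward orientation, and the site is assumed to have an asymmetric spacer so that it has a defined orientation.

```python
from typing import NamedTuple, Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


class Outcome(NamedTuple):
    kind: str               # "excision" or "inversion"
    product: str            # remaining linear molecule
    excised: Optional[str]  # excised segment plus one site (linearized circle), if any


def recombine(dna: str, site: str) -> Outcome:
    """Predict the product for a segment flanked by two recognition sites."""
    dna, site = dna.upper(), site.upper()
    inverted_site = reverse_complement(site)
    if site == inverted_site:
        raise ValueError("site has no orientation (fully symmetric sequence)")
    first = dna.find(site)
    if first < 0:
        raise ValueError("no recognition site in forward orientation found")
    seg_start = first + len(site)
    same = dna.find(site, seg_start)               # second site, same orientation
    opposite = dna.find(inverted_site, seg_start)  # second site, opposite orientation
    if same >= 0 and (opposite < 0 or same < opposite):
        # Same orientation: the segment is excised and one site remains.
        product = dna[:seg_start] + dna[same + len(site):]
        return Outcome("excision", product, dna[seg_start:same + len(site)])
    if opposite >= 0:
        # Opposite orientation: the segment between the sites is inverted.
        segment = dna[seg_start:opposite]
        product = dna[:seg_start] + reverse_complement(segment) + dna[opposite:]
        return Outcome("inversion", product, None)
    raise ValueError("second recognition site not found")
```

In this sketch the excision product keeps a single recognition site, and the excised material is returned as the flanked segment together with the other site, written out linearly.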

本發明載體中的任一者在供產生位點特異性DNA重組的方法中的用途也包括在本發明中。The use of any of the vectors of the present invention in a method for producing site-specific DNA recombination is also included in the present invention.

根據一個實施例，經分離的宿主細胞進一步包含編碼具有重組酶活性的蛋白質的核酸，其中蛋白質包含與SEQ ID NO：1至9中任一者具有至少80%同一性的胺基酸序列。According to one embodiment, the isolated host cell further comprises a nucleic acid encoding a protein having recombinase activity, wherein the protein comprises an amino acid sequence having at least 80% identity to any one of SEQ ID NOs: 1-9.

根據又一個實施例，經分離的宿主細胞進一步包含本發明載體。According to yet another embodiment, the isolated host cell further comprises a vector of the present invention.

較佳地排除天然包含本文提及的核酸或載體的經分離宿主細胞。根據一個實施例，本發明僅涉及那些重組地且非天然地(即透過宿主細胞的遺傳修飾)包含上述核酸或載體的經分離宿主細胞。Preferably, isolated host cells that naturally contain the nucleic acids or vectors mentioned herein are excluded. According to one embodiment, the present invention relates only to isolated host cells that recombinantly and non-naturally (i.e., through genetic modification of the host cell) contain the above-mentioned nucleic acids or vectors.

根據又一個態樣，本發明提供一種非人類宿主生物體，其包含至少一個，且較佳地至少兩個本發明核酸。根據又一個態樣，本發明提供一種非人類宿主生物體，其包含本發明載體。According to another aspect, the present invention provides a non-human host organism comprising at least one, and preferably at least two nucleic acids of the present invention. According to another aspect, the present invention provides a non-human host organism comprising a vector of the present invention.

再者，本發明包括經分離的宿主細胞或經分離的宿主生物體，其包含(i)至少一個，較佳地至少兩個本發明核酸，其包含如上文所定義的辨識位點(較佳地兩個本發明核酸，其分別包括辨識位點，並且側接又一個DNA片段)，及/或如上所定義編碼根據具有重組酶活性的蛋白質的本發明核酸，或(ii)本發明載體，其包含有包含如上所定義的辨識位點的至少兩個核酸(較佳地兩個分別包括辨識位點並且側接又一個DNA片段的本發明核酸)，及/或包含編碼如上定義具有重組酶活性的蛋白質的核酸的載體。Furthermore, the present invention comprises an isolated host cell or an isolated host organism comprising (i) at least one, preferably at least two, nucleic acids according to the invention comprising a recognition site as defined above (preferably two nucleic acids according to the invention, each comprising a recognition site and flanked by a further DNA fragment), and/or a nucleic acid according to the invention encoding a protein having recombinase activity as defined above, or (ii) a vector according to the invention comprising at least two nucleic acids comprising a recognition site as defined above (preferably two nucleic acids according to the invention, each comprising a recognition site and flanked by a further DNA fragment), and/or a vector comprising a nucleic acid encoding a protein having recombinase activity as defined above.

本發明也包括經分離的宿主細胞，其包含以下重組DNA片段：(i)至少一個，較佳地至少兩個包含辨識位點的本發明核酸及/或編碼重組酶蛋白的本發明核酸，或(ii)本發明載體，其包含至少兩個包含辨識位點的核酸(較佳地兩包括辨識位點且側接又一個感興趣DNA片段的本發明核酸)及/或包含編碼重組酶蛋白的核酸的本發明載體。The present invention also includes an isolated host cell comprising the following recombinant DNA segments: (i) at least one, preferably at least two, nucleic acids of the present invention comprising a recognition site and/or a nucleic acid of the present invention encoding a recombinase protein, or (ii) a vector of the present invention comprising at least two nucleic acids of the present invention comprising a recognition site (preferably two nucleic acids of the present invention comprising a recognition site and flanked by another DNA segment of interest) and/or a vector of the present invention comprising a nucleic acid encoding a recombinase protein.

本發明僅涉及那些包含上述核酸或載體(以重組且非天然的方式，即藉由遺傳修飾宿主細胞)的經分離宿主細胞。The present invention relates only to those isolated host cells that contain the above-mentioned nucleic acids or vectors (in a recombinant and non-natural manner, i.e., by genetically modifying the host cell).

尤佳的是經分離宿主細胞，其含有編碼本發明重組酶蛋白的核酸和至少兩個其辨識位點(其以相同或相反方向定向)。Particularly preferred are isolated host cells containing nucleic acid encoding a recombinase protein of the invention and at least two recognition sites thereof, oriented in the same or opposite directions.

本發明涵意內的宿主細胞是天然存在的細胞或細胞株(視情況經轉形的或經遺傳修飾的)，其以重組的方式包含至少一個如上所述的本發明載體或本發明核酸。因此，本發明包括瞬時轉染體(例如透過mRNA注射)或包括至少一個本發明表現載體作為質體或人工染色體的宿主細胞，以及其中本發明表現載體被穩定地併入到該宿主細胞的基因體中的宿主細胞。Host cells within the meaning of the present invention are naturally occurring cells or cell strains (transformed or genetically modified as appropriate) that recombinantly comprise at least one vector of the present invention or the nucleic acid of the present invention as described above. Thus, the present invention includes transient transfectants (e.g., by mRNA injection) or host cells that comprise at least one expression vector of the present invention as a plasmid or artificial chromosome, as well as host cells in which the expression vector of the present invention is stably incorporated into the genome of the host cell.

本發明上下文中的適宜宿主細胞特別是真核細胞，包括幹細胞，像是造血幹細胞、神經元幹細胞、衍生自脂肪組織的幹細胞、胎兒幹細胞、臍帶幹細胞、誘導型多能幹細胞與胚胎幹細胞。就人類胚胎幹細胞而言，較佳地不是從破壞胚胎衍生而來。此外，較佳地排除修飾作為宿主細胞的人類生殖系和人類配子。Suitable host cells in the context of the present invention are in particular eukaryotic cells, including stem cells, such as hematopoietic stem cells, neural stem cells, stem cells derived from adipose tissue, fetal stem cells, umbilical cord stem cells, induced pluripotent stem cells and embryonic stem cells. In the case of human embryonic stem cells, preferably they are not derived from a destroyed embryo. In addition, it is preferred to exclude the modification of the human germ line and human gametes as host cells.

使用本發明，還可以在宿主生物體(諸如哺乳動物)中誘導組織特異性或位點特異性重組。因而，本發明還包括包含以下重組DNA片段的非人類宿主生物體：至少一個，較佳地至少兩個包含辨識位點的本發明核酸(較佳地兩個本發明核酸，其分別包括側接又一個感興趣DNA片段的辨識位點)及/或編碼重組酶蛋白的本發明核酸。Using the present invention, tissue-specific or site-specific recombination can also be induced in host organisms (such as mammals). Thus, the present invention also includes non-human host organisms comprising the following recombinant DNA fragments: at least one, preferably at least two nucleic acids of the present invention comprising a recognition site (preferably two nucleic acids of the present invention, each comprising a recognition site flanked by another DNA fragment of interest) and/or a nucleic acid of the present invention encoding a recombinase protein.

明確納入僅包含編碼本發明重組酶蛋白的重組核酸(且分別不包含包括辨識位點的核酸)的非人類宿主生物體。Specifically included are non-human host organisms that comprise only a recombinant nucleic acid encoding a recombinase protein of the invention (and separately not a nucleic acid comprising a recognition site).

此外，本發明包括僅包含至少一個，較佳地至少兩個辨識位點(且不包含編碼本發明重組酶蛋白的核酸)的非人類宿主生物體。當兩個非人類宿主生物體雜交時，其中第一宿主生物體包含編碼本發明重組酶蛋白的重組核酸，而第二宿主生物體包含至少兩個重組辨識位點(較佳地側接又一個感興趣的DNA片段)，後代包括表現重組酶蛋白並進一步包括辨識位點的宿主生物體，因此位點特異性DNA重組(像是組織特異性條件性剔除)是可能的。In addition, the present invention includes non-human host organisms that only contain at least one, preferably at least two, recognition sites (and do not contain nucleic acids encoding recombinase proteins of the present invention). When two non-human host organisms are crossed, wherein the first host organism contains a recombinant nucleic acid encoding a recombinase protein of the present invention, and the second host organism contains at least two recombination recognition sites (preferably flanked by another DNA segment of interest), the progeny include host organisms that express the recombinase protein and further include the recognition sites, so that site-specific DNA recombination (such as tissue-specific conditional knockouts) is possible.

也提供了非人類宿主生物體，其可能包含如上所述的本發明載體或本發明核酸，其分別穩定地併入到宿主生物體的基因體或宿主生物體的個別細胞中。Also provided are non-human host organisms which may comprise a vector of the invention or a nucleic acid of the invention as described above, which are stably incorporated into the genome of the host organism or into individual cells of the host organism, respectively.

本發明的較佳宿主生物體是植物、無脊椎動物和脊椎動物，特別是牛科、黑腹果蠅、秀麗隱桿線蟲、非洲爪蟾、青鱂、斑馬魚或小鼠，或這些生物體的胚胎。Preferred host organisms of the present invention are plants, invertebrates and vertebrates, especially Bovidae, Drosophila melanogaster, Caenorhabditis elegans, Xenopus laevis, medaka, zebrafish or mice, or embryos of these organisms.

本發明也提供了適合在各種細胞類型的細胞中產生位點特異性重組的新型重組酶系統。這樣的系統包括相應的重組酶蛋白(SEQ ID NO：1至9中之一者)和至少兩個如本文所確定的相應辨識位點。利用這樣的系統，可以實現多種基因操縱，特別是重新排列被本發明辨識位點所側接的DNA片段，辨識位點呈相同位向(切除)、相反位向(倒置)，或當一個特異性辨識位點存在於兩個DNA分子的各者上時而一者(如果呈環狀)呈任何位向存在(併入)。例示性的操縱是由如本文所確定的各別重組酶蛋白媒介切除側接兩個以相同方向定向之辨識位點的DNA片段。其中，本發明的重組酶系統提供切除目標DNA的可能性，諸如側接兩個辨識位點的終止DNA片段，該目標DNA位於基因的5'並對應於該基因啟動子的3'。如果沒有重組，終止子序列會阻礙基因表現，而在重組酶媒介經由兩個側接辨識位點切除終止子時，基因位於啟動子附近，因此將會被表現。當例如組織或生物體特異性或誘導型啟動子區域用於表現重組酶蛋白時，調節重組酶表現的不同類型啟動子區域尤其允許條件性DNA重組。The present invention also provides novel recombinase systems suitable for producing site-specific recombination in cells of various cell types. Such systems comprise a corresponding recombinase protein (one of SEQ ID NOs: 1 to 9) and at least two corresponding recognition sites as defined herein. Using such systems, a variety of genetic manipulations can be achieved, in particular the rearrangement of DNA segments flanked by the recognition sites of the present invention, with the recognition sites being in the same orientation (excision), in opposite orientations (inversion), or, when one specific recognition site is present on each of two DNA molecules, one of which is circular, in any orientation (integration). An exemplary manipulation is the excision of a DNA segment flanked by two recognition sites oriented in the same direction, mediated by a respective recombinase protein as defined herein. Among other things, the recombinase system of the present invention provides the possibility of excising a target DNA, such as a terminator DNA fragment flanked by two recognition sites, which is located 5' of a gene and 3' of the promoter of that gene. Without recombination, the terminator sequence blocks gene expression, whereas upon recombinase-mediated excision of the terminator via the two flanking recognition sites, the gene is brought close to the promoter and is therefore expressed. Different types of promoter regions that regulate recombinase expression allow in particular conditional DNA recombination when, for example, tissue- or organism-specific or inducible promoter regions are used to express the recombinase protein.

本發明的重組酶系統適用於與其他重組酶系統組合使用，並且成為對於同時或依序需要多個重組酶的遺傳實驗特別有價值的工具。The recombinase system of the present invention is suitable for use in combination with other recombinase systems and becomes a particularly valuable tool for genetic experiments requiring multiple recombinases simultaneously or sequentially.

本發明進一步提供具有重組酶活性之蛋白質的用途，其中蛋白質包含與SEQ ID NO：1至9之胺基酸序列中之一者展現出至少85%、至少90%、至少95%，較佳地至少95%，甚至更佳至少99%胺基酸序列同一性的胺基酸序列，以催化位點特異性DNA重組。因此，上述位點特異性重組酶用於對本發明的至少一個，較佳地至少兩個基本上相同或基本上彼此反向互補的辨識位點進行位點特異性DNA重組。至少一個辨識位點包含分別根據SEQ ID NO：10至17的核酸序列或與SEQ ID NO：10至17的核酸序列反向互補的核酸序列，或為其功能突變體的核酸序列。The present invention further provides the use of a protein having recombinase activity, wherein the protein comprises an amino acid sequence that exhibits at least 85%, at least 90%, at least 95%, preferably at least 95%, even more preferably at least 99% amino acid sequence identity with one of the amino acid sequences of SEQ ID NOs: 1 to 9, to catalyze site-specific DNA recombination. Thus, the above-mentioned site-specific recombinase is used to perform site-specific DNA recombination on at least one, preferably at least two, substantially identical or substantially reverse complementary recognition sites of the present invention. At least one recognition site comprises a nucleic acid sequence according to SEQ ID NOs: 10 to 17, respectively, or a nucleic acid sequence that is reverse complementary to a nucleic acid sequence of SEQ ID NOs: 10 to 17, or a nucleic acid sequence that is a functional mutant thereof.

根據又一個態樣，本發明提供一種與SEQ ID NO：7、SEQ ID NO：3、SEQ ID NO：1、SEQ ID NO：2、SEQ ID NO：4、SEQ ID NO：5、SEQ ID NO：6、SEQ ID NO：8或SEQ ID NO：9具有至少80%同一性的蛋白質或本發明載體，其用於藥物。根據一個較佳實施例，蛋白質或載體用於治療個體的遺傳性疾病或病症。較佳地，遺傳性疾病或病症的特徵在於個體基因體的修飾。According to another aspect, the present invention provides a protein or a vector of the present invention having at least 80% identity to SEQ ID NO: 7, SEQ ID NO: 3, SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 8 or SEQ ID NO: 9 for use in medicine. According to a preferred embodiment, the protein or vector is used to treat a genetic disease or condition in an individual. Preferably, the genetic disease or condition is characterized by modification of the individual's genome.

本發明也提供一種醫藥組成物，其包含本發明的重組酶蛋白、本發明載體或宿主細胞，或一或多種本發明核酸，以及視情況醫藥上可接受的載劑。The present invention also provides a pharmaceutical composition comprising the recombinase protein of the present invention, the vector or host cell of the present invention, or one or more nucleic acids of the present invention, and optionally a pharmaceutically acceptable carrier.

含有本發明治療活性劑的醫藥組成物可能是呈適於所選投藥模式的任何形式。Pharmaceutical compositions containing the therapeutically active agents of the invention may be in any form suitable for the chosen mode of administration.

在一個實施例中，本發明的醫藥組成物是腸胃外投予。In one embodiment, the pharmaceutical composition of the present invention is administered parenterally.

如本文所用的片語「腸胃外投予」和「腸胃外地投予」表示除了經腸和局部投予之外的投予方式，通常透過注射，並且包括表皮、靜脈內、肌肉內、動脈內、鞘內、囊內、眼眶內、心內、皮內、腹膜內、肌腱內、氣管內、皮下(subcutaneous)、皮下(subcuticular)、關節內、囊下、蜘蛛膜下、脊椎內、顱內、胸腔內、硬膜內與胸骨內注射和輸注。As used herein, the phrases "parenteral administration" and "administered parenterally" refer to modes of administration other than enteral and topical administration, usually by injection, and include epidermal, intravenous, intramuscular, intraarterial, intrathecal, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, intratendinous, intratracheal, subcutaneous, subcuticular, intraarticular, subcapsular, subarachnoid, intraspinal, intracranial, intrathoracic, intradural, and intrasternal injection and infusion.

如本文提及的治療活性劑包括但不限於本發明的重組酶蛋白和本發明的辨識位點。本發明的治療活性劑可以作為單獨的活性劑或與其他活性劑組合呈單位投予形式，如與習知醫藥支持物的混合物投予給動物和人類。As mentioned herein, therapeutic agents include but are not limited to the recombinase proteins of the present invention and the recognition sites of the present invention. The therapeutic agents of the present invention can be administered to animals and humans as a single active agent or in combination with other active agents in a unit administration form, such as a mixture with a conventional pharmaceutical support.

在進一步的實施例中，醫藥組成物含有對於能夠被注射的調配物而言是醫藥上可接受的載劑(也稱為媒劑)。這些特別可能是等滲的、無菌的鹽水溶液(磷酸二氫鈉或磷酸氫二鈉、氯化鈉、氯化鉀、氯化鈣或氯化鎂或這些鹽的混合物)，或乾燥的(特別是經凍乾的)組成物，添加時取決於如果使用無菌水或生理鹽水，則可以配製可注射溶液。In a further embodiment, the pharmaceutical composition contains a carrier (also referred to as a vehicle) that is pharmaceutically acceptable for a formulation capable of being injected. These may be in particular isotonic, sterile saline solutions (sodium dihydrogen phosphate or disodium hydrogen phosphate, sodium chloride, potassium chloride, calcium chloride or magnesium chloride, or mixtures of these salts), or dry (in particular lyophilized) compositions which, upon addition of sterile water or physiological saline as the case may be, permit the preparation of injectable solutions.

適於注射使用的醫藥形式包括無菌水溶液或分散劑；包括芝麻油、花生油或丙二醇水溶液的調配物；以及用於臨時配製無菌可注射溶液或分散劑的無菌粉末。在所有情況下，形式必須是無菌的且必須是流體的。它在製造條件和儲存條件下必須是穩定的，並且必須防止微生物(例如細菌和真菌)的污染作用。Pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions; formulations including sesame oil, peanut oil or propylene glycol in water; and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. In all cases, the form must be sterile and must be fluid. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms (e.g., bacteria and fungi).

包含如游離鹼或藥理學上可接受之鹽形式的治療活性劑的溶液可以在與界面活性劑(諸如羥丙基纖維素)適當混合的水中進行製備。分散劑還可以在甘油、液體聚乙二醇及其混合物中以及在油中製備。在普通儲存條件和使用條件下，這些製劑含有防腐劑以防止微生物生長。Solutions containing the therapeutically active agent in the form of a free base or a pharmacologically acceptable salt can be prepared in water suitably mixed with a surfactant such as hydroxypropylcellulose. Dispersions can also be prepared in glycerol, liquid polyethylene glycols and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms.

治療活性劑可以配製為呈中性或鹽形式的組成物。醫藥上可接受的鹽包括酸加成鹽(與蛋白質的游離胺基形成)以及其與無機酸(諸如例如鹽酸或磷酸)或有機酸(諸如乙酸、草酸、酒石酸、扁桃酸及類似者)形成者。與游離羧基形成的鹽也可以衍生自無機鹼(諸如例如氫氧化鈉、氫氧化鉀、氫氧化銨、氫氧化鈣或氫氧化鐵)以及有機鹼(如異丙胺、三甲胺、組胺酸、普魯卡因以及類似者)。The therapeutically active agent can be formulated as a composition in neutral or salt form. Pharmaceutically acceptable salts include acid addition salts (formed with free amino groups of proteins) and those formed with inorganic acids such as, for example, hydrochloric acid or phosphoric acid or organic acids such as acetic acid, oxalic acid, tartaric acid, mandelic acid, and the like. Salts formed with free carboxyl groups can also be derived from inorganic bases such as, for example, sodium hydroxide, potassium hydroxide, ammonium hydroxide, calcium hydroxide, or iron hydroxide, and organic bases such as isopropylamine, trimethylamine, histidine, procaine, and the like.

載劑也可以作為溶劑或分散介質，含有例如水、乙醇、多元醇(例如甘油、丙二醇和液體聚乙二醇與類似者)、其適當的混合物，和植物油。適當的流動性可以例如透過使用包衣(諸如卵磷脂)、在分散劑的情況下透過維持所需的粒度以及透過使用表面活性劑來維持。可以透過各種抗菌劑和抗真菌劑來防止微生物的作用，例如對羥基苯甲酸酯、氯丁醇、苯酚、山梨酸、硫柳汞與類似物。在許多情況下，偏好納入等滲劑，例如糖或氯化鈉。可透過在組成物中使用延遲吸收劑(例如單硬脂酸鋁和明膠)來延長可注射組成物的吸收。The carrier can also serve as a solvent or dispersion medium containing, for example, water, ethanol, polyols (e.g., glycerol, propylene glycol, and liquid polyethylene glycol and the like), suitable mixtures thereof, and vegetable oils. Proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersions, and by the use of a surfactant. The action of microorganisms can be prevented by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, it is preferred to include isotonic agents, for example, sugars or sodium chloride. The absorption of the injectable composition can be prolonged by the use in the composition of an agent that delays absorption, for example, aluminum monostearate and gelatin.

透過將呈所需數量的活性多肽與上面列出的幾種其他成分(根據需要)併入適當的溶劑中，然後過濾滅菌來製備無菌可注射溶液。一般而言，分散劑透過將各種無菌活性成分併入無菌媒劑中來製備，無菌媒劑含有基礎分散介質和來自上面列舉彼等的所需其他成分。在用於製備無菌可注射溶液的無菌粉末的情況下，較佳的製備方法是真空乾燥和凍乾技術，其產生活性成分加上來自其先前經無菌過濾的溶液的任何額外所需成分的粉末。Sterile injectable solutions are prepared by incorporating the active polypeptide in the required amount with several of the other ingredients listed above (as required) in an appropriate solvent and then filtering and sterilizing. Generally speaking, dispersions are prepared by incorporating the various sterile active ingredients into a sterile vehicle containing a basic dispersion medium and the required other ingredients from those listed above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying techniques, which produce a powder of the active ingredient plus any additional desired ingredients from a previously sterile filtered solution thereof.

調配後，可以按照與劑量調配物相容的方式並以治療有效之數量投予溶液。調配物以多種劑型容易投予，諸如上述可注射溶液類型，但也可以採用藥物釋放膠囊與類似者。也可以投予多次劑量。適當的話，本文所述的治療活性劑可能配製在任何適合遞送的媒劑中。例如，可以將它們置於醫藥上可接受的懸浮液、溶液或乳液中。適當的介質包括鹽水和脂質體製劑。更具體地，醫藥上可接受的載劑可能包括無菌水性或非水性溶液、懸浮液和乳液。非水性溶劑的實例是丙二醇、聚乙二醇、植物油(諸如橄欖油)，以及可注射的有機酯(諸如油酸乙酯)。水性載劑包括但不限於水、醇/水溶液、乳液或懸浮液，包括鹽水和緩衝介質。靜脈內媒劑包括液體和營養補充劑、電解質補充劑(諸如基於林格氏葡萄糖的彼等)與類似者。After formulation, the solution can be administered in a manner compatible with the dosage formulation and in a therapeutically effective amount. The formulation is easily administered in a variety of dosage forms, such as the injectable solution type described above, but drug release capsules and the like can also be used. Multiple doses can also be administered. If appropriate, the therapeutically active agents described herein may be formulated in any medium suitable for delivery. For example, they can be placed in a pharmaceutically acceptable suspension, solution or emulsion. Suitable media include saline and liposomal preparations. More specifically, pharmaceutically acceptable carriers may include sterile aqueous or non-aqueous solutions, suspensions and emulsions. Examples of non-aqueous solvents are propylene glycol, polyethylene glycol, vegetable oils such as olive oil, and injectable organic esters such as ethyl oleate. Aqueous carriers include, but are not limited to, water, alcohol/aqueous solutions, emulsions or suspensions, including saline and buffered media. Intravenous vehicles include liquids and nutrient supplements, electrolyte supplements (such as those based on Ringer's dextrose), and the like.

還可能存在防腐劑和其他添加劑，諸如例如抗微生物劑、抗氧化劑、螯合劑和惰性氣體與類似者。Preservatives and other additives may also be present, such as, for example, antimicrobials, antioxidants, chelating agents and inert gases and the like.

膠體分散系統也可能用於標靶基因遞送。膠體分散系統包括巨分子複合物、奈米膠囊、微球、珠粒和基於脂質的系統，包括水包油乳液、微胞、混合型微胞以及脂質體。Colloidal dispersion systems may also be used for targeted gene delivery. Colloidal dispersion systems include macromolecular complexes, nanocapsules, microspheres, beads, and lipid-based systems, including oil-in-water emulsions, micelles, mixed micelles, and liposomes.

適當的治療方案可以由醫生決定，並將取決於個體的年齡、性別、體重和疾病的階段。例如，為了使用病毒表現載體遞送編碼本發明之經基因工程改造的DNA重組酵素的核酸序列，每個單位劑量的經基因工程改造的DNA重組酵素表現載體可能包含組成物，其包括在醫藥上可接受的流體中之呈例如濃度範圍為10¹¹至10¹⁶個病毒基因體/ml的病毒表現載體。The appropriate treatment regimen can be determined by the physician and will depend on the individual's age, sex, weight, and stage of the disease. For example, to deliver a nucleic acid sequence encoding a genetically engineered DNA recombinase of the present invention using a viral expression vector, each unit dose of the genetically engineered DNA recombinase expression vector may comprise a composition comprising the viral expression vector in a pharmaceutically acceptable fluid at a concentration ranging from, for example, 10¹¹ to 10¹⁶ viral genomes/ml.

用於投予呈重組多肽形式的本發明之經基因改造的DNA重組酵素或其次單位的有效劑量和劑量方案取決於待治療的疾病或病況並且可以由本領域具有通常知識者決定。The effective dosage and dosage regimen for administering the genetically engineered DNA recombinase of the present invention or its subunit in the form of a recombinant polypeptide depends on the disease or condition to be treated and can be determined by one of ordinary skill in the art.

具有本領域通常知識的醫師或獸醫可能輕鬆決定並開立所需醫藥組成物的有效量。例如，醫師或獸醫可以以低於為了達到期望治療效果所需的含量開始醫藥組成物中所採用的本發明治療活性劑的劑量，並逐漸增加劑量直至達到期望的效果。一般而言，本發明組成物的適宜每日劑量將是有效產生治療效果的最低劑量的遞送系統的數量。這種有效劑量通常將取決於上述因素。投予可能例如可以是靜脈內、肌肉內、腹膜內或皮下，並且例如在接近目標部位投予。如有需要，醫藥組成物的有效每日劑量可能作為兩個、三個、四個、五個、六個或更多個分劑量在一天中以適當的間隔分開投予，視情況呈單位劑型投予。儘管本發明的遞送系統可以單獨投予，但遞送系統偏好作為如上所述的醫藥組成物投予。A physician or veterinarian with ordinary knowledge in the art may readily determine and prescribe the effective amount of the pharmaceutical composition required. For example, the physician or veterinarian may start the dosage of the therapeutically active agent of the present invention employed in the pharmaceutical composition at a level lower than that required to achieve the desired therapeutic effect, and gradually increase the dosage until the desired effect is achieved. In general, a suitable daily dose of the composition of the present invention will be the amount of the delivery system that is the lowest dose effective to produce a therapeutic effect. Such an effective dose will generally depend on the factors described above. Administration may, for example, be intravenous, intramuscular, intraperitoneal, or subcutaneous, and, for example, administered close to the target site. If desired, the effective daily dose of the pharmaceutical composition may be administered as two, three, four, five, six or more divided doses at appropriate intervals throughout the day, optionally in unit dosage form. Although the delivery system of the present invention can be administered alone, the delivery system is preferably administered as a pharmaceutical composition as described above.

進一步提供了包含如本文所述的治療活性劑的套組。在一個實施例中，套組提供製備成一或多個單位劑型的治療活性劑，以準備投予給個體，例如在預裝填注射器中或在安瓿中。在另一個實施例中，治療活性劑以凍乾形式提供。Further provided are kits comprising a therapeutic agent as described herein. In one embodiment, the kit provides the therapeutic agent prepared in one or more unit dosage forms, ready for administration to a subject, such as in a pre-filled syringe or in an ampoule. In another embodiment, the therapeutic agent is provided in a lyophilized form.

根據另一個態樣，本發明提供一種治療或預防疾病的方法，該方法包含向有需要的個體投予治療有效量之與SEQ ID NO：7、SEQ ID NO：3、SEQ ID NO：1、SEQ ID NO：2、SEQ ID NO：4、SEQ ID NO：5、SEQ ID NO：6、SEQ ID NO：8 或SEQ ID NO：9具有至少80%同一性的蛋白質、本發明核酸、本發明載體、本發明的經分離之宿主細胞、本發明的非人類宿主生物體，或本發明醫藥組成物。According to another aspect, the present invention provides a method for treating or preventing a disease, the method comprising administering to a subject in need thereof a therapeutically effective amount of a protein having at least 80% identity to SEQ ID NO:7, SEQ ID NO:3, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:8 or SEQ ID NO:9, a nucleic acid of the present invention, a vector of the present invention, an isolated host cell of the present invention, a non-human host organism of the present invention, or a pharmaceutical composition of the present invention.

發明人篩選了超過500種推定的重組酶候選者，其中選擇了17個候選者進行了實驗測試，並對八種新型的重組酶系統進行了詳細的分子特徵鑑定。為了最佳地使用不同的重組酶系統，必須特徵鑑定它們的應用性質。確定活體內重組酶活性對於計畫實驗可能至關重要。八種新特徵鑑定的重組酶在細菌中的預測目標位點上顯示出各異的活性，從YR1、YR4、YR6中超過90%到YR8和YR9的低至15%左右。這個低活性的可能解釋可能包括需要額外的輔因子，或更有效地重組目標位點的最佳溫度。對重組酶的特異性進行了剖析，為所有目標位點上可能的交叉重組事件提供了有價值的概述。為了在高等生物中應用，測試了新型重組酶在哺乳動物細胞中的活性和相容性。在HEK293T細胞的基於質體的分析中，所有重組酶均證明高活性。有趣的是，YR8在哺乳動物細胞中表現出高重組率，而這種重組酶在細菌中僅具有微弱的活性，表明重組酶的活性概況在異源性宿主中可能有所不同。The inventors screened more than 500 putative recombinase candidates, of which 17 candidates were selected for experimental testing, and eight novel recombinase systems were molecularly characterized in detail. In order to use different recombinase systems optimally, their application properties must be characterized. Determining the recombinase activity in vivo may be crucial for planning experiments. The eight newly characterized recombinases showed varying activities at the predicted target sites in bacteria, ranging from over 90% for YR1, YR4 and YR6 to as low as around 15% for YR8 and YR9. Possible explanations for this low activity include the need for additional cofactors, or a different optimal temperature for more efficient recombination of the target site. The specificity of the recombinases was profiled, providing a valuable overview of possible cross-recombination events at all target sites. For applications in higher organisms, the novel recombinases were tested for activity and compatibility in mammalian cells. All recombinases demonstrated high activity in plasmid-based assays in HEK293T cells. Interestingly, YR8 showed high recombination rates in mammalian cells, whereas this recombinase was only weakly active in bacteria, suggesting that the activity profile of recombinases may differ in heterologous hosts.

關於在異源性細胞中的應用，考慮宿主基因體中可能存在的假位點也很重要。對人類和小鼠基因體中所有已知Cre型重組酶的lox樣位點進行了篩選。這些資訊在兩個用途方面有用：i)它對可能損及實驗的潛在脫靶位點提供了估計；以及ii)它提供了可能有效用於基因體工程改造練習的潛在內源目標位點(例如用於靶向遞送DNA貨物進入安全港基因座)。With regard to applications in heterologous cells, it is also important to consider possible pseudo-sites present in the host genome. The human and mouse genomes were screened for lox-like sites of all known Cre-type recombinases. This information is useful in two ways: i) it provides an estimate of potential off-target sites that may compromise an experiment; and ii) it provides potential endogenous target sites that may be useful for genome engineering (e.g. for targeted delivery of DNA cargo into safe harbor loci).
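The kind of pseudo-site screen described above can be illustrated with a minimal Python sketch. It is not the procedure actually used by the inventors, which is not detailed in this passage. The sketch slides a window over a sequence and reports positions whose two half-sites resemble those of a given target site on either strand. The function names, the 13-nucleotide half-site length, and the mismatch threshold of 4 are hypothetical parameters chosen for illustration.

```python
from typing import Iterator, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_for_pseudo_sites(genome: str, target: str, half_len: int = 13,
                          max_mismatches: int = 4) -> Iterator[Tuple[int, str, int]]:
    """Yield (position, strand, mismatches) for windows whose half-sites are
    within max_mismatches of the target's half-sites; the spacer is ignored."""
    genome, target = genome.upper(), target.upper()
    n = len(target)
    inverted = reverse_complement(target)
    fwd_halves = (target[:half_len], target[-half_len:])
    rev_halves = (inverted[:half_len], inverted[-half_len:])
    for i in range(len(genome) - n + 1):
        window = genome[i:i + n]
        left, right = window[:half_len], window[-half_len:]
        fwd = hamming(left, fwd_halves[0]) + hamming(right, fwd_halves[1])
        rev = hamming(left, rev_halves[0]) + hamming(right, rev_halves[1])
        if fwd <= max_mismatches:
            yield i, "+", fwd
        elif rev <= max_mismatches:
            yield i, "-", rev
```

For target sites whose half-sites are perfect inverted repeats of each other, both strands give the same score, so such hits are reported on the "+" strand only. A genome-scale screen would need an indexed or vectorized search rather than this linear scan.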

The disclosure and characterization of different naturally occurring recombinases is particularly useful for accelerating the development of novel enzymes by directed evolution. It has been demonstrated that shuffling of related genes can accelerate the evolutionary process (Crameri et al., 1998). Family shuffling may therefore yield novel recombinases with desired properties in a shorter time.

Examples

Example 1: Identification of putative recombinases and their target sites

Potential novel Y-SSRs were identified by searching the NCBI nucleotide collection database (v5) with tblastn from BLAST+ 2.10.1 (https://www.ncbi.nlm.nih.gov/books/NBK131777/), using the protein sequences of Cre, Vika (Karimova et al., 2012), Nigri and Panto (Karimova et al., 2016) as references. Hits were filtered with GNU awk for an identity below 90% and a sequence length of 300 to 400 amino acids. Protein sequences were obtained with efetch (https://dataguide.nlm.nih.gov/edirect/efetch.html). The complete genome sequences of the potential SSRs were collected with blastdbcmd (part of BLAST+). Potential target sites were identified by searching the genomic sequence 1000 bp upstream and downstream of each potential Y-SSR for palindromes. Palindrome searches were performed with EMBOSS palindrome (v6.6.0.0) (Rice et al., 2000) with minimum palindrome lengths of 13 to 15 base pairs, a gap limit of 8, and one mismatch allowed. The program output was then converted to a tabular format with GNU awk and joined with the protein data in R using the dplyr package. Potential SSRs with a Levenshtein distance to a reference (stringdist R package, https://journal.r-project.org/archive/2014/RJ-2014-011/index.html) of less than 10 were removed. The remaining potential SSRs were clustered by Levenshtein distance using complete-linkage hierarchical clustering (base R), and clusters were formed at a cutoff distance of 11. The same clustering approach was used for the potential half-sites of the palindromic sequences, with a cutoff distance of 2. Candidates for testing were selected by considering the top hits of each cluster, the organisms in which they were found, and their distance to the reference recombinases.
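The clustering step can be illustrated with a minimal sketch. It is written in pure Python rather than with the stringdist package and base R named above, and it assumes that cutting the tree at a given height is equivalent to merging clusters while their complete-linkage distance does not exceed the cutoff. The function names and the four example half-sites are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion from a
                           cur[j - 1] + 1,             # insertion into a
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def complete_linkage_clusters(items, cutoff):
    """Agglomerative clustering with complete linkage.

    The two clusters with the smallest complete-linkage distance (largest
    pairwise Levenshtein distance between their members) are merged until
    that distance exceeds the cutoff.
    """
    clusters = [[item] for item in items]

    def linkage(c1, c2):
        return max(levenshtein(x, y) for x in c1 for y in c2)

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > cutoff:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Hypothetical half-sites, clustered with the cutoff of 2 used for half-sites.
half_sites = ["ATAACTTCGTATA", "ATAACTTCGTATG", "ATAACTTCGTGTG", "GGGCCCAAATTTC"]
for cluster in complete_linkage_clusters(half_sites, cutoff=2):
    print(cluster)
```

The same function applies to protein sequences with the cutoff of 11, although a quadratic-time pure Python edit distance would be slow for hundreds of sequences of 300 to 400 residues.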

Phylogenetic trees of known and putative recombinases were generated by all-against-all pairwise alignment of the protein sequences with EMBOSS needle (Needleman and Wunsch, 1970), followed by complete-linkage hierarchical clustering of the sequence dissimilarities. The trees were visualized with the R packages tidygraph and ggraph. For conciseness, the tree was cut at 98% sequence similarity so that nearly identical proteins are represented as a single node.
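As a rough illustration of how nearly identical proteins collapse into one node, the sketch below computes a global (Needleman-Wunsch) alignment identity and merges sequences by complete linkage at a 98% threshold. The simple match/mismatch/gap scores, the use of identity rather than similarity, the function names and the toy sequences are all assumptions made for the example; EMBOSS needle itself scores proteins with a substitution matrix and affine gap penalties.

```python
def needleman_wunsch_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Fraction of identical columns in one optimal global alignment of a and b."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and all columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return identical / columns if columns else 1.0


def collapse_near_identical(seqs, min_identity=0.98):
    """Group sequence names by complete linkage on (1 - identity)."""
    names = list(seqs)
    dist = {(p, q): 1.0 - needleman_wunsch_identity(seqs[p], seqs[q])
            for p in names for q in names}
    cutoff = 1.0 - min_identity + 1e-12  # small tolerance for float rounding
    clusters = [[name] for name in names]
    while len(clusters) > 1:
        d, i, j = min(
            (max(dist[p, q] for p in clusters[i] for q in clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return clusters


# Toy proteins: B differs from A at one of 100 positions, C is unrelated.
base = "ACDEFGHIKLMNPQRSTVWY" * 5
toy = {"A": base, "B": "W" + base[1:], "C": "G" * 100}
print(collapse_near_identical(toy))
```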

Human and mouse genomic sequences with high similarity to the potential target sites were identified using PatMaN (Prüfer et al., 2008). Only half-sites were searched, allowing up to 2 mismatches. If two genomic sequences matching the same half-site were found on opposite strands with a distance of 8 bp between them, they were called a potential target site for the respective recombinase. Genomic coordinate manipulation and sequence extraction were performed with the BEDTools suite (Quinlan and Hall, 2010).

Example 2: Plasmid construction

For expression in E. coli, codon-optimized DNA sequences of the 17 predicted candidates were synthesized by Twist Biosciences and cloned into the pEVO vector via BsrGI and XbaI restriction sites (Figure 1B) (Buchholz and Stewart, 2001). Target sites were introduced via primers designed to carry the desired target site and to overlap with the pEVO vector. The PCR fragment generated with the pEVO vector as template was then cloned into the BglII-digested pEVO backbone by Cold Fusion cloning (System Biosciences).

To express the recombinases in mammalian cells, a lentiviral PGK-NLS-BFP plasmid was used (Figure 5D). The recombinase sequences were amplified from the pEVO vector and cloned via the BsrGI and XbaI restriction sites. The resulting recombinase-carrying lentiviral vectors can either be transfected into HEK293T cells for transient expression or be used for virus production and infection for continuous expression.

To construct the recombination reporters, the pCAG-loxP-mCherry-loxP-GFP "traffic light" vector (Karpinski et al., 2016) was used. Oligonucleotides containing the respective target sites were used to amplify the mCherry cassette, and the fragment was subsequently ligated into the pCAG plasmid via the NheI and HindIII restriction sites.

Example 3: Recombination reporter assays

To visualize the recombination activity of the recombinases on their predicted target sites, a plasmid-based assay was used as described previously (Karimova et al., 2012 and 2016) (Figure 1A). Briefly, recombinase expression from the pBAD promoter was induced with L-arabinose (Sigma-Aldrich Chemie GmbH). Single clones carrying a pEVO plasmid with the recombinase and its recombination target sites were cultured overnight at 37°C and 200 rpm in 6 ml LB medium containing 25 μg/ml Cm and 0 or 100 μg/ml L-arabinose. After digestion with the restriction enzymes BsrGI and SbfI, recombinase-mediated excision events were detected by agarose gel electrophoresis. Recombined plasmids are smaller than non-recombined plasmids. After gel electrophoresis, the non-recombined plasmid is therefore visible as a slower-migrating band (~5.0 kb) and the recombined plasmid as a faster-migrating band (~4.3 kb) (Figure 1A).

To compare the recombination efficiency of the active recombinases on their native target sites, recombinase expression was induced overnight in 6 ml cultures with increasing concentrations of L-arabinose (0, 1, 10 or 100 μg/ml medium). Test digests were prepared for each induction level, and recombination efficiency was estimated by agarose gel electrophoresis. To quantify recombinase activity, the ratio of band intensities was determined by image processing with Fiji-ImageJ. The quantified recombination was processed in R 4.0.3 using dplyr v1.0.7 and visualized using ggplot2 v3.3.5. All test digests were performed in triplicate (n = 3).
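The quantification can be sketched as follows. The text states only that a ratio of band intensities was determined, so the formula used here (recombined band divided by the sum of both bands, without correction for fragment length) is an assumption, as are the function names and the intensity values, which are invented purely to show the arithmetic.

```python
from statistics import mean, stdev


def recombination_percent(recombined, unrecombined):
    """Recombined band intensity as a percentage of both bands combined."""
    total = recombined + unrecombined
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * recombined / total


def summarize_triplicate(replicates):
    """Mean and sample standard deviation over (recombined, unrecombined) pairs."""
    values = [recombination_percent(r, u) for r, u in replicates]
    return mean(values), stdev(values)


# Hypothetical intensities for one recombinase at one induction level (n = 3).
example = [(50.0, 50.0), (60.0, 40.0), (40.0, 60.0)]
avg, sd = summarize_triplicate(example)
print(f"recombination: {avg:.1f}% +/- {sd:.1f}%")
```

Because the recombined fragment (~4.3 kb) is shorter than the non-recombined one (~5.0 kb), equal molar amounts give slightly unequal stain intensities; whether a length correction was applied is not stated in the text.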

For mammalian recombination reporter assays, HEK293T cells were plated at 2 x 10⁵ cells per well in 24-well plates and cultured in Dulbecco's modified Eagle's medium with glucose (DMEM, Gibco®) supplemented with 10% fetal bovine serum (Invitrogen) and 1% penicillin/streptomycin (10,000 U/ml, Thermo Fisher). At 70-80% confluency, cells were co-transfected with a pPGK-NLS-Recombinase-P2A-BFP plasmid expressing the recombinase and the pCAG-lox-mCherry-lox-GFP traffic light reporter using Lipofectamine® 2000 Transfection Reagent (Invitrogen) according to the manufacturer's instructions. Per well, 0.5 μg of DNA (0.25 μg of each plasmid) and 2.5 μl of Lipofectamine® 2000 reagent were used, diluted in 100 μl of Opti-MEM® Reduced Serum Media. The medium was changed the next day and the cells were cultured further at 37°C and 5% CO₂. Upon recombination between the target sites, the mCherry cassette is excised and the CAG promoter drives expression of the downstream green fluorescent protein (GFP). Two days after transfection, cells were analyzed by fluorescence-activated cell analysis and then imaged with a fluorescence microscope (EVOS FL Imaging System; Thermo Fisher Scientific).

Example 4: Fluorescence-activated cell analysis

HEK293T cells were washed once with PBS and then detached with trypsin (Gibco). The cells were resuspended in Dulbecco's modified Eagle's medium (DMEM, Gibco®) and analyzed on a MACSQuant® VYB flow cytometer (Miltenyi). Data were analyzed with FlowJo™ 10 (BD).

Example 5: Overexpression studies in mammalian cells

For viral delivery, lentiviral particles were produced with the pPGK-Recombinase-P2A-BFP vector as described previously (Sürün et al., 2020). NIH/3T3 mouse fibroblasts were seeded at 4 x 10⁴ cells per well in 24-well plates and grown at 37°C and 5% CO₂. The following day, the fibroblasts were transduced with the different lentiviruses at an MOI of 0.5 to achieve an infection rate of approximately 50%. The percentage of BFP-expressing cells among at least 2 x 10⁴ cells was followed over 15 days using a MACSQuant® VYB flow cytometer (Miltenyi). The difference in the percentage of BFP-positive cells between the last time point (day 15) and the first time point was calculated and visualized with GraphPad Prism, and statistical significance relative to the BFP control was calculated by one-way ANOVA with a 95% confidence interval (CI).

Example 6: Cross-recombination analysis: Nanopore sequencing

The novel recombinases as well as Cre, Vika, Panto, Dre (Anastassiadis et al., 2009) and VCre (Suzuki and Nakayama, 2011) were amplified from the pEVO vector, cleaned up with the Isolate II PCR and Gel Cleanup Kit (Bioline) and mixed at a 1:1 ratio. All respective target sites were cloned into the pEVO vector with the Cold Fusion Cloning Kit as described above, and the resulting vectors were likewise mixed at an equimolar ratio. The recombinase mixture and the pEVO backbone mixture were digested with BsrGI and SbfI and ligated in a single reaction, creating a library of different recombinase/target site pairs. The plasmids were transformed into XL1-Blue electrocompetent E. coli cells, which were grown overnight with 100 μg/ml L-arabinose to induce recombinase expression. On the next day, the plasmids were linearized with BsrGI and ScaI, and fragments carrying the recombinase sequence and the target sites were isolated by agarose gel excision using the Isolate II PCR and Gel Cleanup Kit (Bioline). These DNA fragments were then prepared for nanopore sequencing with the SQK-LSK110 kit according to the "Amplicons by Ligation" protocol and sequenced on a MinION R9.4.1 flow cell (Oxford Nanopore Technologies). Base calling was performed with guppy v5.0.7 using the high-accuracy model (Oxford Nanopore Technologies). Reads were then filtered with filtlong (https://github.com/rrwick/Filtlong) for a read length of at least 1800 bp and a minimum mean Phred score of ten. To identify the recombinase, reads were aligned to the reference recombinase sequences with minimap2 v2.17 (Li, 2018). Target sites were identified with Exonerate v2.2.0 using the affine:local model (https://www.ebi.ac.uk/about/vertebrate-genomics/software/exonerate). Read IDs and matched references were then extracted from both alignments and combined in R using the dplyr package. Data were visualized with the R package ggplot2.

Example 7: Novel recombinases recombine their predicted target sites in bacteria

The candidate recombinases were tested for activity on their predicted target sequences. Briefly, the coding sequence of each selected candidate was cloned individually into the L-arabinose-inducible pEVO recombination reporter vector (Buchholz and Stewart, 2001) carrying two copies of the respective predicted target site (Figure 1B). The plasmids were then transformed into E. coli, and the cells were cultured overnight in medium containing L-arabinose to induce recombinase expression. Upon expression, successful recombination excises a ~700 bp DNA fragment from the plasmid. This size difference can be visualized by agarose gel electrophoresis of the linearized plasmids (Figure 1A). Eight of the seventeen candidates showed activity on their predicted target sites, as evidenced by the appearance of the ~4 kb recombined band (Figure 2). Five of them (YR1, YR2, YR4, YR6 and YR12) recombined efficiently even in samples grown without L-arabinose in the medium, indicating that these enzymes are active even when expressed at very low levels. The other three recombinases (YR8, YR9 and YR11) recombined the plasmid only when L-arabinose was present in the growth medium, indicating that they require higher induction to be active in this assay. These results confirm the successful identification of novel Y-SSRs and their respective target sites.

To characterize the active Y-SSRs in more detail, their recombination efficiency was quantified at different expression levels in order to study the dose response and to obtain a side-by-side comparison of their efficiency with the well-established recombinases Cre and Vika (Karimova et al., 2012). pEVO plasmids carrying the desired recombinase/target site pairs were transformed into E. coli and cultured overnight with different concentrations of L-arabinose to induce recombinase expression (Guzman et al., 1995). Extracted plasmid DNA was then analyzed for recombination on agarose gels. Quantification of the band intensities showed that the novel recombinases have different activity profiles on their respective target sites (Figure 3). Although most recombinases were highly active at the high L-arabinose concentration, with recombination rates between 87% and 100% (YR1, YR2, YR4, YR6 and YR11), they behaved quite differently when expressed at low L-arabinose concentrations. While YR1, YR2, YR4 and YR11 showed lower recombination rates when induced with 1 or 10 µg/ml L-arabinose, YR6 remained highly active even at these low induction levels, with an activity profile similar to that of Cre and Vika (Figure 3). Interestingly, the YR12 recombinase showed a largely constant recombination rate, from ~50% at 0 µg/ml L-arabinose to a peak of 70% when induced with 100 µg/ml L-arabinose. YR8 and YR9, by contrast, showed the weakest activity, recombining only 25% and 17% of their target sites, respectively, even at the highest L-arabinose concentration.

Recombinase YR9 was further subjected to substrate-linked directed evolution (Buchholz and Stewart, 2001), which significantly improved its activity. The best-performing clone, designated YR9.2 (SEQ ID NO: 9), showed an approximately sevenfold increase in activity in bacteria (Figure 7A) and an eightfold increase in mammalian cells (Figure 7C).

Example 8: Dissecting the target site selectivity of Cre-like recombinases

SSRs with different sequence specificities are often used in combination to perform complex genomic or synthetic biology experiments (Feil, 2007; Sheets et al., 2020; Merrick et al., 2018; Livet et al., 2007; Snippert et al., 2010). For such applications, it is important to understand the specificity of the enzymes and to consider possible cross-reactivity (Fenno et al., 2014; Weinberg et al., 2017). To test all possible combinations, a high-throughput sequencing approach was developed in which the activity of known (Panto, Dre, Cre, Vika and VCre) and novel Y-SSRs (YR1, YR2, YR4, etc.) can be quantified on all target sites in a single experiment. A two-step cloning scheme was established to generate all combinations of the 13 recombinases and their 13 respective target sites on 169 (13x13) individual vectors. The 13 target sequences were first cloned individually into the pEVO vector. The resulting constructs were then pooled, linearized and ligated with a pool of the 13 recombinase coding sequences in one ligation reaction (Figure 4A). After overnight culture and induction of recombinase expression, plasmid DNA was recovered and a fragment carrying the recombinase sequence at its 3' end and the target sites at its 5' end was cut out. Using the long-read sequencing platform of Oxford Nanopore Technologies, a total of 417,769 reads were obtained that each contained an assignable recombinase and target site. All 169 possible combinations of Y-SSR and target site were identified, with a minimum coverage of 224 reads. From these data, the recombination rate of each individual recombinase on all target sites was calculated, yielding a specificity profile for each recombinase (Figure 4B).

Example 9: Activity of the recombinases in human cells

The activity of the novel Y-SSRs was tested in a human cell line. Briefly, HEK293T cells were co-transfected with a recombinase expression plasmid and a recombination reporter plasmid carrying the corresponding target sites, as described for the mammalian recombination reporter assay in Example 3 (Figure 5A). In the reporter plasmid, the mCherry cassette driven by the CAG promoter is flanked by the target (lox) sites. Upon recombination, the mCherry cassette is deleted from the plasmid and the CAG promoter then drives expression of the GFP cassette (Figure 5A). The activity of a recombinase on its predicted target site can therefore be visualized by fluorescence microscopy and quantified by flow cytometry. In the co-transfection experiments, all tested recombinases gave rise to GFP-positive cells (Figure 5B), confirming that these recombinases are active in HEK293T cells, whereas no GFP-positive cells were observed when an "empty" expression vector (lacking the recombinase coding sequence) was co-transfected (Figure 5B). Flow cytometry showed that in almost all samples more than 90% of the cells co-transfected with the recombinase expression plasmid and the reporter (cells double positive for BFP and mCherry) were also GFP-positive, indicating that these recombinases are highly active in this setting (Figure 5C).

Example 10: Effects of recombinase expression on cell proliferation

Studying SSRs in heterologous hosts is important for defining their application properties. Recombinases can recognize cryptic (pseudo) recombination sites in the genome; recombination at these sites causes off-target effects that may lead to growth arrest or apoptosis. Indeed, active pseudo-loxP sites have been described in the human and mouse genomes (Thyagarajan et al., 2000). Consequently, overexpression of Cre in human or mouse cells can impair cell proliferation (Loonstra et al., 2001; Schmidt et al., 2000; Pugach et al., 2015).

為了測試新鑑定的Y-SSR在過度表現時對細胞增生的潛在影響，構建了慢病毒載體，該載體允許重組酶和tagBFP共表現(圖6A)。單獨使用過度表現Cre、不活化Cre變異體(CreY324F)、Vika和tagBFP的病毒顆粒作為對照。感染NIH3T3細胞並監測BFP陽性細胞百分比的變化持續15天。隨著時間的推移，BFP陽性細胞的減少表明因為重組酶過度表現對細胞增生有負面影響(Schmidt et al., 2000；Pugach et al., 2015)。在這個分析中測試Cre重組酶時，BFP陽性細胞的數目逐漸下降，而催化不活化形式的Cre則沒有影響(圖6B)。與Cre相比之下，YR4和R8顯示BFP陽性細胞百分比下降較不明顯(分別為~15%；p＜0.0001和~17%；p＜0.0001)，顯示這些重組酶的過度表現會略為抑制細胞增生(圖6B)。反之，在表現其他重組酶的細胞中，BFP陽性細胞的百分比沒有顯著變化(圖6B)，表示這些重組酶的過度表現在細胞中耐受良好。 引用的非專利文獻Anastassiadis,K., Fu,J., Patsch,C., Hu,S., Weidlich,S., Duerschke,K., Buchholz,F., Edenhofer,F. and Stewart,A.F. (2009) Dre recombinase, like Cre, is a highly efficient site-specific recombinase in E. coli, mammalian cells and mice. Dis Model Mech, 2, 508–515. Anderson,R.P., Voziyanova,E. and Voziyanov,Y. (2012) Flp and Cre expressed from Flp–2A–Cre and Flp–IRES–Cre transcription units mediate the highest level of dual recombinase-mediated cassette exchange. Nucleic Acids Res, 40, e62–e62. Anzalone,A.V., Gao,X.D., Podracky,C.J., Nelson,A.T., Koblan,L.W., Raguram,A., Levy,J.M., Mercer,J.A.M. and Liu,D.R. (2022) Programmable deletion, replacement, integration and inversion of large DNA sequences with twin prime editing. Nat Biotechnol, 40, 731–740. Anzalone,A.V., Randolph,P.B., Davis,J.R., Sousa,A.A., Koblan,L.W., Levy,J.M., Chen,P.J., Wilson,C., Newby,G.A., Raguram,A., et al. (2019) Search-and-replace genome editing without double-strand breaks or donor DNA. Nature, 576, 149–157. Buchholz, F., and Stewart, A.F. (2001). Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nat Biotechnol 19, 1047–1052. Crameri,A., Raillard,S.-A., Bermudez,E. and Stemmer,W.P.C. (1998) DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature, 391, 288–291. Duyne,G.D.V. (2001) A Structural View of Cre- loxP Site-Specific Recombination. Annu Rev Bioph Biom, 30, 87–104. Feil,R. (2007) Conditional Mutagenesis: An Approach to Disease Models. 
Handb Exp Pharmacol, 10.1007/978-3-540-35109-2_1. Fenno,L.E., Mattis,J., Ramakrishnan,C., Hyun,M., Lee,S.Y., He,M., Tucciarone,J., Selimbeyoglu,A., Berndt,A., Grosenick,L., et al. (2014) Targeting cells with single vectors using multiple-feature Boolean logic. Nat Methods, 11, 763–772. Gaudelli,N.M., Komor,A.C., Rees,H.A., Packer,M.S., Badran,A.H., Bryson,D.I. and Liu,D.R. (2017) Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature, 551, 464–471. Guzman,L.M., Belin,D., Carson,M.J. and Beckwith,J. (1995) Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J Bacteriol, 177, 4121–4130. Karimova,M., Abi-Ghanem,J., Berger,N., Surendranath,V., Pisabarro,M.T. and Buchholz,F. (2012) Vika/vox, a novel efficient and specific Cre/loxP-like site-specific recombination system. Nucleic Acids Res, 41, e37–e37. Karimova,M., Splith,V., Karpinski,J., Pisabarro,M.T. and Buchholz,F. (2016) Discovery of Nigri/nox and Panto/pox site-specific recombinase systems facilitates advanced genome engineering. Sci Rep-uk, 6, 30130. Karpinski,J., Hauber,I., Chemnitz,J., Schäfer,C., Paszkowski-Rogacz,M., Chakraborty,D., Beschorner,N., Hofmann-Sieber,H., Lange,U.C., Grundhoff,A., et al. (2016) Directed evolution of a recombinase that excises the provirus of most HIV-1 primary isolates with high specificity. Nat Biotechnol, 34, 401–409. Komor,A.C., Kim,Y.B., Packer,M.S., Zuris,J.A. and Liu,D.R. (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533, 420–424. Li,H. (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34, 3094–3100. Livet,J., Weissman,T.A., Kang,H., Draft,R.W., Lu,J., Bennis,R.A., Sanes,J.R. and Lichtman,J.W. (2007) Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature, 450, 56–62. Loonstra,A., Vooijs,M., Beverloo,H.B., Allak,B.A., Drunen,E. van, Kanaar,R., Berns,A. 
and Jonkers,J. (2001) Growth inhibition and DNA damage induced by Cre recombinase in mammalian cells. Proc National Acad Sci, 98, 9209–9214. Meinke,G., Bohm,A., Hauber,J., Pisabarro,M.T. and Buchholz,F. (2016) Cre Recombinase and Other Tyrosine Recombinases. Chem Rev, 116, 12785–12820. Merrick,C.A., Zhao,J. and Rosser,S.J. (2018) Serine Integrases: Advancing Synthetic Biology. Acs Synth Biol, 7, 299–310. Minorikawa,S. and Nakayama,M. (2011) Recombinase-mediated cassette exchange (RMCE) and BAC engineering via VCre/VloxP and SCre/SloxP systems. Biotechniques, 50, 235–246. Needleman,S.B. and Wunsch,C.D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol, 48, 443–453. Prüfer,K., Stenzel,U., Dannemann,M., Green,R.E., Lachmann,M. and Kelso,J. (2008) PatMaN: rapid alignment of short sequences to large databases. Bioinformatics, 24, 1530–1531. Pugach,E.K., Richmond,P.A., Azofeifa,J.G., Dowell,R.D. and Leinwand,L.A. (2015) Prolonged Cre expression driven by the α-myosin heavy chain promoter can be cardiotoxic. J Mol Cell Cardiol, 86, 54–61. Quinlan,A.R. and Hall,I.M. (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26, 841–842. Sauer,B. and Henderson,N. (1988) Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proc National Acad Sci, 85, 5166–5170. Schmidt,E.E., Taylor,D.S., Prigge,J.R., Barnett,S. and Capecchi,M.R. (2000) Illegitimate Cre-dependent chromosome rearrangements in transgenic mouse spermatids. Proc National Acad Sci, 97, 13702–13707. Sternberg,N. and Hamilton,D. (1981) Bacteriophage P1 site-specific recombination I. Recombination between loxP sites. J Mol Biol, 150, 467–486. Sürün,D., Schneider,A., Mircetic,J., Neumann,K., Lansing,F., Paszkowski-Rogacz,M., Hänchen,V., Lee-Kirsch,M.A. and Buchholz,F. 
(2020) Efficient Generation and Correction of Mutations in Human iPS Cells Utilizing mRNAs of CRISPR Base Editors and Prime Editors. Genes-basel, 11, 511. Rice,P., Longden,I. and Bleasby,A. (2000) EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet, 16, 276–277. Sheets,M.B., Wong,W.W. and Dunlop,M.J. (2020) Light-Inducible Recombinases for Bacterial Optogenetics. Acs Synth Biol, 9, 227–235. Snippert,H.J., Flier,L.G. van der, Sato,T., Es,J.H. van, Born,M. van den, Kroon-Veenboer,C., Barker,N., Klein,A.M., Rheenen,J. van, Simons,B.D., et al. (2010) Intestinal Crypt Homeostasis Results from Neutral Competition between Symmetrically Dividing Lgr5 Stem Cells. Cell, 143, 134–144. Suzuki,E. and Nakayama,M. (2011) VCre/VloxP and SCre/SloxP: new site-specific recombination systems for genome engineering. Nucleic Acids Res, 39, e49–e49. Thyagarajan,B., Guimarães,M.J., Groth,A.C. and Calos,M.P. (2000) Mammalian genomes contain active recombinase recognition sites. Gene, 244, 47–54. Weinberg,B.H., Pham,N.T.H., Caraballo,L.D., Lozanoski,T., Engel,A., Bhatia,S. and Wong,W.W. (2017) Large-scale design of robust genetic circuits with multiple inputs and outputs for mammalian cells. Nat Biotechnol, 35, 453–462. To test the potential effects of the newly identified Y-SSR on cell proliferation when overexpressed, a lentiviral vector was constructed that allows co-expression of the recombinase and tagBFP (Figure 6A). Viral particles overexpressing Cre, an inactive Cre variant (CreY324F), Vika, and tagBFP alone were used as controls. NIH3T3 cells were infected and the percentage of BFP-positive cells was monitored for 15 days. The decrease in BFP-positive cells over time suggests that overexpression of the recombinase has a negative effect on cell proliferation (Schmidt et al., 2000; Pugach et al., 2015). 
When Cre recombinase was tested in this assay, the number of BFP-positive cells progressively decreased, whereas the catalytically inactive form of Cre had no effect (Fig. 6B). Compared with Cre, YR4 and R8 showed a less pronounced decrease in the percentage of BFP-positive cells (~15%; p<0.0001 and ~17%; p<0.0001, respectively), indicating that overexpression of these recombinases slightly inhibited cell proliferation (Fig. 6B). In contrast, the percentage of BFP-positive cells did not change significantly in cells expressing the other recombinases (Fig. 6B), indicating that overexpression of these recombinases is well tolerated in the cells.

Cited non-patent literature

Anastassiadis,K., Fu,J., Patsch,C., Hu,S., Weidlich,S., Duerschke,K., Buchholz,F., Edenhofer,F. and Stewart,A.F. (2009) Dre recombinase, like Cre, is a highly efficient site-specific recombinase in E. coli, mammalian cells and mice. Dis Model Mech, 2, 508–515.
Anderson,R.P., Voziyanova,E. and Voziyanov,Y. (2012) Flp and Cre expressed from Flp–2A–Cre and Flp–IRES–Cre transcription units mediate the highest level of dual recombinase-mediated cassette exchange. Nucleic Acids Res, 40, e62–e62.
Anzalone,A.V., Gao,X.D., Podracky,C.J., Nelson,A.T., Koblan,L.W., Raguram,A., Levy,J.M., Mercer,J.A.M. and Liu,D.R. (2022) Programmable deletion, replacement, integration and inversion of large DNA sequences with twin prime editing. Nat Biotechnol, 40, 731–740.
Anzalone,A.V., Randolph,P.B., Davis,J.R., Sousa,A.A., Koblan,L.W., Levy,J.M., Chen,P.J., Wilson,C., Newby,G.A., Raguram,A., et al. (2019) Search-and-replace genome editing without double-strand breaks or donor DNA. Nature, 576, 149–157.
Buchholz,F. and Stewart,A.F. (2001) Alteration of Cre recombinase site specificity by substrate-linked protein evolution. Nat Biotechnol, 19, 1047–1052.
Crameri,A., Raillard,S.-A., Bermudez,E. and Stemmer,W.P.C. (1998) DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature, 391, 288–291.
Duyne,G.D.V. (2001) A Structural View of Cre-loxP Site-Specific Recombination. Annu Rev Bioph Biom, 30, 87–104.
Feil,R. (2007) Conditional Mutagenesis: An Approach to Disease Models. Handb Exp Pharmacol, 10.1007/978-3-540-35109-2_1.
Fenno,L.E., Mattis,J., Ramakrishnan,C., Hyun,M., Lee,S.Y., He,M., Tucciarone,J., Selimbeyoglu,A., Berndt,A., Grosenick,L., et al. (2014) Targeting cells with single vectors using multiple-feature Boolean logic. Nat Methods, 11, 763–772.
Gaudelli,N.M., Komor,A.C., Rees,H.A., Packer,M.S., Badran,A.H., Bryson,D.I. and Liu,D.R. (2017) Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage. Nature, 551, 464–471.
Guzman,L.M., Belin,D., Carson,M.J. and Beckwith,J. (1995) Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J Bacteriol, 177, 4121–4130.
Karimova,M., Abi-Ghanem,J., Berger,N., Surendranath,V., Pisabarro,M.T. and Buchholz,F. (2012) Vika/vox, a novel efficient and specific Cre/loxP-like site-specific recombination system. Nucleic Acids Res, 41, e37–e37.
Karimova,M., Splith,V., Karpinski,J., Pisabarro,M.T. and Buchholz,F. (2016) Discovery of Nigri/nox and Panto/pox site-specific recombinase systems facilitates advanced genome engineering. Sci Rep-uk, 6, 30130.
Karpinski,J., Hauber,I., Chemnitz,J., Schäfer,C., Paszkowski-Rogacz,M., Chakraborty,D., Beschorner,N., Hofmann-Sieber,H., Lange,U.C., Grundhoff,A., et al. (2016) Directed evolution of a recombinase that excises the provirus of most HIV-1 primary isolates with high specificity. Nat Biotechnol, 34, 401–409.
Komor,A.C., Kim,Y.B., Packer,M.S., Zuris,J.A. and Liu,D.R. (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature, 533, 420–424.
Li,H. (2018) Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics, 34, 3094–3100.
Livet,J., Weissman,T.A., Kang,H., Draft,R.W., Lu,J., Bennis,R.A., Sanes,J.R. and Lichtman,J.W. (2007) Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature, 450, 56–62.
Loonstra,A., Vooijs,M., Beverloo,H.B., Allak,B.A., Drunen,E. van, Kanaar,R., Berns,A. and Jonkers,J. (2001) Growth inhibition and DNA damage induced by Cre recombinase in mammalian cells. Proc National Acad Sci, 98, 9209–9214.
Meinke,G., Bohm,A., Hauber,J., Pisabarro,M.T. and Buchholz,F. (2016) Cre Recombinase and Other Tyrosine Recombinases. Chem Rev, 116, 12785–12820.
Merrick,C.A., Zhao,J. and Rosser,S.J. (2018) Serine Integrases: Advancing Synthetic Biology. Acs Synth Biol, 7, 299–310.
Minorikawa,S. and Nakayama,M. (2011) Recombinase-mediated cassette exchange (RMCE) and BAC engineering via VCre/VloxP and SCre/SloxP systems. Biotechniques, 50, 235–246.
Needleman,S.B. and Wunsch,C.D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol, 48, 443–453.
Prüfer,K., Stenzel,U., Dannemann,M., Green,R.E., Lachmann,M. and Kelso,J. (2008) PatMaN: rapid alignment of short sequences to large databases. Bioinformatics, 24, 1530–1531.
Pugach,E.K., Richmond,P.A., Azofeifa,J.G., Dowell,R.D. and Leinwand,L.A. (2015) Prolonged Cre expression driven by the α-myosin heavy chain promoter can be cardiotoxic. J Mol Cell Cardiol, 86, 54–61.
Quinlan,A.R. and Hall,I.M. (2010) BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26, 841–842.
Sauer,B. and Henderson,N. (1988) Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proc National Acad Sci, 85, 5166–5170.
Schmidt,E.E., Taylor,D.S., Prigge,J.R., Barnett,S. and Capecchi,M.R. (2000) Illegitimate Cre-dependent chromosome rearrangements in transgenic mouse spermatids. Proc National Acad Sci, 97, 13702–13707.
Sternberg,N. and Hamilton,D. (1981) Bacteriophage P1 site-specific recombination I. Recombination between loxP sites. J Mol Biol, 150, 467–486.
Sürün,D., Schneider,A., Mircetic,J., Neumann,K., Lansing,F., Paszkowski-Rogacz,M., Hänchen,V., Lee-Kirsch,M.A. and Buchholz,F. (2020) Efficient Generation and Correction of Mutations in Human iPS Cells Utilizing mRNAs of CRISPR Base Editors and Prime Editors. Genes-basel, 11, 511.
Rice,P., Longden,I. and Bleasby,A. (2000) EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet, 16, 276–277.
Sheets,M.B., Wong,W.W. and Dunlop,M.J. (2020) Light-Inducible Recombinases for Bacterial Optogenetics. Acs Synth Biol, 9, 227–235.
Snippert,H.J., Flier,L.G. van der, Sato,T., Es,J.H. van, Born,M. van den, Kroon-Veenboer,C., Barker,N., Klein,A.M., Rheenen,J. van, Simons,B.D., et al. (2010) Intestinal Crypt Homeostasis Results from Neutral Competition between Symmetrically Dividing Lgr5 Stem Cells. Cell, 143, 134–144.
Suzuki,E. and Nakayama,M. (2011) VCre/VloxP and SCre/SloxP: new site-specific recombination systems for genome engineering. Nucleic Acids Res, 39, e49–e49.
Thyagarajan,B., Guimarães,M.J., Groth,A.C. and Calos,M.P. (2000) Mammalian genomes contain active recombinase recognition sites. Gene, 244, 47–54.
Weinberg,B.H., Pham,N.T.H., Caraballo,L.D., Lozanoski,T., Engel,A., Bhatia,S. and Wong,W.W. (2017) Large-scale design of robust genetic circuits with multiple inputs and outputs for mammalian cells. Nat Biotechnol, 35, 453–462.


The present invention is further illustrated by the following figures and examples, without being limited thereto.

Figure 1A shows an overview of the plasmid-based recombination assay. Important features such as restriction sites, the recombinase coding sequence and the target sites (triangles) are shown. A schematic of the expected recombination products on an agarose gel is also shown: the marker and the expected sizes of the recombined and non-recombined plasmids are shown individually or together, as typically seen on a gel. Figure 1B shows a plasmid map of the pEVO recombination reporter. Protein-coding genes, the origin of replication (oriP15A) and the pBAD promoter are indicated by arrows. The protein-coding genes include the chloramphenicol resistance gene (cmR), the arabinose regulatory protein gene (araC) and the gene encoding the recombinase of interest. Upon addition of arabinose, expression of the recombinase is driven by the pBAD promoter. Recombination between the two lox sites results in excision of a stuffer sequence of approximately 700 bp. The BsrGI and SbfI restriction enzymes were used for cloning of the recombinases and for linearization of the plasmids in test digests.

Figure 2 shows the recombination activity of seventeen tested putative Y-SSR/target site pairs. Each sample was tested with ("+") or without ("-") L-arabinose (100 µg/ml) added to the growth medium to induce recombinase expression. Recombined plasmids are indicated by the band aligned with a single triangle, and non-recombined plasmids by the band aligned with two triangles. Active recombinases are highlighted by boxes. M = GeneRuler™ DNA Ladder Mix (Thermo Fisher).

Figure 3 shows the quantification and reproducibility of the recombination activity of the novel SSRs in E. coli. Recombinase expression was induced with increasing L-arabinose concentrations (µg/ml), indicated along the x-axis. Vika and Cre were included as positive controls. Recombination was calculated from the ratio of the band intensities of the non-recombined and recombined bands detected on an agarose gel. The bacterial assays were performed in triplicate (n = 3).
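The band-intensity quantification described for Figure 3 can be sketched as follows. This is a minimal illustration only, assuming that recombination is expressed as the recombined band's share of the total (recombined plus non-recombined) intensity; the function name and all intensity values are hypothetical and are not taken from the application.

```python
from statistics import mean, stdev


def recombination_percent(recombined: float, unrecombined: float) -> float:
    """Percent recombination from two band intensities (arbitrary units)."""
    total = recombined + unrecombined
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * recombined / total


# Hypothetical triplicate measurements (recombined, unrecombined) for one
# L-arabinose concentration; the numbers are illustrative only.
triplicate = [(750.0, 250.0), (800.0, 200.0), (700.0, 300.0)]
values = [recombination_percent(r, u) for r, u in triplicate]
print(f"mean = {mean(values):.1f}%, SD = {stdev(values):.1f}%")
```

Normalising to the summed intensity keeps the value between 0 and 100% regardless of how much plasmid was loaded per lane, which is why this form was chosen for the sketch.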

Figure 4A shows the analysis of the deep sequencing results. Cross-recombination events are displayed as a heatmap of the recombination percentage for each possible recombinase/target site combination. Target sites are shown along the horizontal axis, ordered by similarity. The tested recombinases are arranged along the vertical axis according to their homology. On-target recombination events are framed by red squares. Figure 4B shows the validation of cross-recombination events by the plasmid-based recombination assay. The recombination activity of each recombinase is shown on its on-target site and on off-target sites. Recombination was assessed by agarose gel electrophoresis. Each sample was tested with ("+", 100 µg/ml L-arabinose) or without ("-") L-arabinose induction of recombinase expression. Recombined plasmids are indicated by the band aligned with a single triangle, and non-recombined plasmids by the band aligned with two triangles. M = GeneRuler™ DNA Ladder Mix (Thermo Fisher). Portions of the figure were created using BioRender.com.

Figure 5 shows the activity of the novel recombinases in mammalian cells. Figure 5A is a schematic representation of the mammalian recombination reporter and expression constructs. Important features are marked in the reporter and expression vectors. Upon recombination of the reporter vector, the mCherry cassette is excised, allowing the pCAG promoter (arrow) to drive expression of GFP (green). The black triangles represent the distinct lox sites of each corresponding recombinase. NLS, nuclear localization signal. Figure 5B shows fluorescence microscopy analysis in HEK293T cells. Shown are cells transfected with an empty expression plasmid and the non-recombined reporter plasmids carrying the eight novel target sites (top panel), or co-transfected with the reporter and an expression plasmid carrying the respective recombinase (bottom panel). Vika/vox and Cre/loxP were included as positive controls. Ctrl, negative control; Rec, recombinase. Figure 5C shows FACS analysis of the samples shown in Figure 5B. The dark gray histograms depict control samples in which the non-recombined reporter plasmid was co-transfected with the "empty" expression plasmid, while the light gray histograms show samples transfected with the corresponding recombinase.

Figure 5D shows plasmid maps of the lentiviral expression vector (left) and the reporter vector (right). The CMV, PGK and CAG promoters used for mammalian viral RNA expression, the recombinase-P2A-BFP cassette and the mCherry cassette are shown as white arrows. Features for virus production on the pLentiX vector are also depicted. The lentiviral vectors were transfected into HEK293T cells for recombination assays, or used for virus production and infection to test the effect of sustained recombinase expression. The BsrGI and XbaI restriction sites used for cloning the recombinases are marked on the expression vector, while the NheI and HindIII sites marked on the pCAG-lox-mCherry-lox-GFP reporter plasmid were used for cloning all target sites. Ori, ColE1 origin of replication in bacteria; AmpR, ampicillin resistance gene; bGH poly(A) signal, bovine growth hormone polyadenylation signal; rbGlob-polyA, rabbit β-globin polyadenylation signal.

Figure 6 shows the effect of overexpressing the recombinases of the present invention on cell growth in mammalian cells. Figure 6A is an overview of the experimental setup. Important steps are indicated by arrows. Cells were transduced at a rate of approximately 50% with a bicistronic lentiviral expression construct in which expression of the respective recombinase is linked to BFP expression via P2A. Cells were analyzed by flow cytometry every 72 hours, and the percentage of BFP-positive cells was recorded over 2 weeks. A decrease in the percentage of BFP-positive cells indicates a proliferative disadvantage of the infected cells. Figure 6B shows an analysis of growth rate. The difference in the percentage of BFP-positive cells between day 3 and day 15 is plotted (biological replicates are shown as points, n = 3). Error bars represent the standard deviation (SD). Statistical significance relative to the BFP control was calculated by one-way ANOVA. (****): P ≤ 0.0001.
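The readout plotted in Figure 6B, the drop in BFP-positive cells between day 3 and day 15 followed by a one-way ANOVA, can be sketched as below. This is a simplified illustration under stated assumptions: the replicate values are hypothetical, only the F statistic is computed by hand, and the p-value and any post-hoc comparison against the BFP control are deliberately left out.

```python
from statistics import mean


def bfp_drop(day3: float, day15: float) -> float:
    """Drop in BFP-positive cells (percentage points) from day 3 to day 15."""
    return day3 - day15


def one_way_anova_f(groups: list[list[float]]) -> float:
    """F statistic of a one-way ANOVA: between-group over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical (day 3, day 15) percentages for three biological replicates.
control = [bfp_drop(50.0, 49.0), bfp_drop(51.0, 49.0), bfp_drop(52.0, 49.0)]
recombinase = [bfp_drop(50.0, 36.0), bfp_drop(51.0, 36.0), bfp_drop(52.0, 36.0)]
print("F =", one_way_anova_f([control, recombinase]))
```

A large F value means the spread between group means is large relative to the spread within groups; turning it into a p-value requires the F distribution with (k - 1, N - k) degrees of freedom.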

Figure 7 shows the results of directed evolution of the YR9 recombinase. Figure 7A, plasmid-based activity analysis of the wtYR9 recombinase and three clones randomly picked from the evolved library. The agarose gel images on the left show the best clones recombining lox9 at four different expression levels (0 µg/ml, 1 µg/ml, 10 µg/ml and 100 µg/ml) compared to wtYR9. The quantification and reproducibility of the recombination are shown on the right. The recombination efficiency was calculated from the ratio of the intensity of the recombined band to the intensity of the non-recombined band in the gels on the left. The experiment was performed in triplicate (n = 3). Comparisons with wtYR9 were made with a t-test, and the p-values were adjusted for multiple comparisons using the Bonferroni method. Significance: (ns) p > 0.05, (*) p ≤ 0.05, (**) p ≤ 0.01, (***) p ≤ 0.001, (****) p ≤ 0.0001. Figure 7B, mutation analysis of the improved YR9 clones (single-letter code). The amino acid sequence of the wtYR9 recombinase is shown as a reference. Letters from top to bottom represent the changes found in clones YR9.10, YR9.2 and YR9.1. The predicted C-terminal catalytic domain of the DNA breaking-rejoining enzyme superfamily is marked. Figure 7C, activity of the most active clone, YR9.10, in mammalian cells. FACS analysis of the wtYR9 and YR9.10 recombinases is shown. Dark grey histograms depict control samples in which a non-recombined reporter plasmid was co-transfected with an "empty" expression plasmid, while light grey histograms show samples transfected with the corresponding recombinase.
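The multiple-comparison step named in the Figure 7A legend can be illustrated with a short sketch. It assumes the standard Bonferroni adjustment (each raw p-value multiplied by the number of comparisons and capped at 1) and maps the adjusted values to the significance labels listed in the legend; the raw p-values below are hypothetical and are not results from the application.

```python
def bonferroni(p_values: list[float]) -> list[float]:
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def stars(p: float) -> str:
    """Map an adjusted p-value to the significance labels used in Figure 7."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


# Hypothetical raw p-values for three clones, each compared with wtYR9.
raw = {"YR9.1": 0.02, "YR9.2": 0.003, "YR9.10": 0.00002}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
for clone, p in adjusted.items():
    print(clone, f"{p:.5f}", stars(p))
```

With three comparisons, a raw p-value of 0.02 no longer passes the 0.05 threshold after adjustment, which is the conservative behaviour the Bonferroni method is meant to provide.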


TW202507002A_113114269_SEQL.xml

Claims

A method for generating site-specific DNA recombination, the method comprising the steps of:
a) contacting a nucleic acid comprising at least a first recognition site and a second recognition site that are substantially identical or substantially reverse complementary to each other with a protein having recombinase activity, and
b) allowing the protein having recombinase activity to generate the site-specific DNA recombination,
wherein each recognition site comprises a first half site, a spacer and a second half site, and wherein substantially identical or substantially reverse complementary to each other means that the nucleotide sequences of the first half site and the second half site in the first recognition site may differ from the nucleotide sequences of the first half site and the second half site in the second recognition site by at most two nucleotides, wherein
(i) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:7, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:16 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:16 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:16; or
(ii) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:3, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:12 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:12 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:12; or
(iii) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:1, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:10 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:10 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:10; or
(iv) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:2, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:11 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:11 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:11; or
(v) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:4, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:13 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:13 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:13; or
(vi) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:5, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:14 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:14 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:14; or
(vii) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:6, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:15 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:15 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:15; or
(viii) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:8, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:17 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:17 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:17; or
(ix) the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with SEQ ID NO:9, and wherein the at least two recognition sites comprise a nucleic acid sequence according to SEQ ID NO:15 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:15 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:15.
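By way of illustration only, the "substantially identical or substantially reverse complementary" test on two recognition sites could be sketched as follows. The sketch rests on several assumptions that are not stated in the claim: a 13/8/13 half-site/spacer/half-site layout borrowed from loxP, a total (rather than per-half-site) budget of two mismatches, and a spacer that is not compared. The loxP sequence is used purely as a familiar stand-in and is not one of SEQ ID NOs: 10 to 17; all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_site(site: str, half_len: int = 13, spacer_len: int = 8):
    """Split a site into (first half-site, spacer, second half-site).

    The 13/8/13 layout is an assumption borrowed from loxP.
    """
    site = site.upper()
    if len(site) != 2 * half_len + spacer_len:
        raise ValueError("site length does not match the assumed layout")
    return (site[:half_len],
            site[half_len:half_len + spacer_len],
            site[half_len + spacer_len:])


def half_site_mismatches(site_a: str, site_b: str) -> int:
    """Count differing nucleotides in the two half-sites; the spacer is ignored."""
    a1, _, a2 = split_site(site_a)
    b1, _, b2 = split_site(site_b)
    return sum(x != y for x, y in zip(a1 + a2, b1 + b2))


def essentially_matching(site_a: str, site_b: str, max_mismatches: int = 2) -> bool:
    """True if the half-sites match directly or after reverse-complementing site_b."""
    direct = half_site_mismatches(site_a, site_b)
    inverted = half_site_mismatches(site_a, reverse_complement(site_b))
    return min(direct, inverted) <= max_mismatches


LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # stand-in, not SEQ ID NO: 10 to 17
two_changes = "CG" + LOXP[2:]     # two substitutions in the first half-site
three_changes = "CGC" + LOXP[3:]  # three substitutions in the first half-site
print(essentially_matching(LOXP, two_changes))    # True
print(essentially_matching(LOXP, three_changes))  # False
```

Checking both the direct and the reverse-complemented orientation mirrors the two alternatives named in the claim wording; a stricter reading that allows two mismatches per half-site would only change the comparison inside `essentially_matching`.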

The method of claim 1, wherein the protein having recombinase activity comprises at least two protein monomers.

The method of claim 1 or 2, wherein the nucleic acid to be recombined is present in a cell, the method preferably further comprising the step of introducing a nucleic acid encoding the protein having recombinase activity into the cell, or wherein the cell comprises a nucleic acid encoding the protein having recombinase activity.

The method of claim 3, wherein the nucleic acid encoding the protein having recombinase activity comprises a regulatory nucleic acid sequence, and wherein the expression of the nucleic acid encoding the protein having recombinase activity is regulated by the regulatory nucleic acid sequence, and/or wherein the cell is a eukaryotic cell or a bacterial cell.

A protein having at least 80% identity to SEQ ID NO:7, SEQ ID NO:3, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:8 or SEQ ID NO:9 for use in generating site-specific DNA recombination.

A protein having recombinase activity for catalyzing site-specific DNA recombination at recognition sites that are substantially identical or substantially reverse complementary to each other, wherein each recognition site comprises a first half site, a spacer and a second half site, and wherein substantially identical or substantially reverse complementary to each other means that the nucleotide sequences of the first half site and the second half site in the first recognition site may differ from the nucleotide sequences of the first half site and the second half site in the second recognition site by at most two nucleotides, wherein:
(i) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:7, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:16 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:16 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:16; or
(ii) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:3, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:12 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:12 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:12; or
(iii) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:1, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:10 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:10 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:10; or
(iv) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:2, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:11 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:11 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:11; or
(v) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:4, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:13 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:13 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:13; or
(vi) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:5, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:14 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:14 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:14; or
(vii) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:6, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:15 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:15 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:15; or
(viii) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:8, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:17 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:17 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:17; or
(ix) the protein comprises an amino acid sequence having at least 80% identity with SEQ ID NO:9, and wherein at least one recognition site comprises a nucleic acid sequence according to SEQ ID NO:15 or a functional variant thereof, or a nucleic acid sequence reverse complementary to SEQ ID NO:15 or a functional variant thereof, wherein the functional variant comprises a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO:15.
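For orientation, the recombinase/recognition-site pairings recited in items (i) to (ix) can be written down as a small lookup table. This is only a reading aid: the SEQ ID numbers are transcribed from the claim items, while the variable and function names are hypothetical.

```python
# Protein SEQ ID NO -> recognition-site SEQ ID NO, following items (i) to (ix).
RECOMBINASE_TO_SITE = {
    7: 16,  # (i)
    3: 12,  # (ii)
    1: 10,  # (iii)
    2: 11,  # (iv)
    4: 13,  # (v)
    5: 14,  # (vi)
    6: 15,  # (vii)
    8: 17,  # (viii)
    9: 15,  # (ix)
}


def site_for(protein_seq_id: int) -> int:
    """Recognition-site SEQ ID NO paired with a given protein SEQ ID NO."""
    return RECOMBINASE_TO_SITE[protein_seq_id]


def recombinases_for(site_seq_id: int) -> list[int]:
    """All protein SEQ ID NOs paired with a given recognition-site SEQ ID NO."""
    return sorted(p for p, s in RECOMBINASE_TO_SITE.items() if s == site_seq_id)


print(site_for(7))           # 16
print(recombinases_for(15))  # [6, 9]
```

The table makes one feature of the claim easy to see: nine proteins map onto eight recognition sites, because SEQ ID NO: 6 and SEQ ID NO: 9 are both paired with SEQ ID NO: 15.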

A nucleic acid having a length of no more than 40 base pairs and comprising:
(i) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 16 or a nucleic acid sequence that is the reverse complement thereof; or
(ii) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 12 or a nucleic acid sequence that is the reverse complement thereof; or
(iii) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 10 or a nucleic acid sequence that is the reverse complement thereof; or
(iv) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 11 or a nucleic acid sequence that is the reverse complement thereof; or
(v) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 13 or a nucleic acid sequence that is the reverse complement thereof; or
(vi) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 14 or a nucleic acid sequence that is the reverse complement thereof; or
(vii) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 15 or a nucleic acid sequence that is the reverse complement thereof; or
(viii) a nucleic acid sequence having at least 60% sequence identity with SEQ ID NO: 17 or a nucleic acid sequence that is the reverse complement thereof.
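The two numerical criteria in this claim, a length of at most 40 base pairs and at least 60% sequence identity, can be illustrated with a short sketch. It assumes that identity is measured as identical columns over the length of a Needleman-Wunsch global alignment, with hypothetical scoring parameters (match +1, mismatch -1, gap -1), and it simplifies "comprising" by scoring the whole candidate against the reference. The reference below is a placeholder (the loxP sequence), not one of the SEQ ID NOs of the application, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment."""
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical and total columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return 100.0 * identical / columns if columns else 100.0


def meets_criteria(candidate: str, reference: str,
                   max_len: int = 40, min_identity: float = 60.0) -> bool:
    """Length limit plus identity to the reference in either orientation."""
    best = max(global_identity(candidate, reference),
               global_identity(reverse_complement(candidate), reference))
    return len(candidate) <= max_len and best >= min_identity


REFERENCE = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # placeholder, not a SEQ ID NO
print(meets_criteria(REFERENCE, REFERENCE))
print(meets_criteria("A" * 41, REFERENCE))
```

Other identity definitions (for example, dividing by the length of the shorter sequence) give different percentages for the same pair, so the threshold is only meaningful together with the alignment method and scoring that are actually used.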

A vector comprising at least one, and preferably at least two, substantially identical or substantially reverse complementary nucleic acids as claimed in claim 7, wherein a DNA fragment is preferably flanked by two substantially identical or substantially reverse complementary nucleic acids, and the vector preferably further comprises a nucleic acid encoding a protein having recombinase activity, wherein the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with any one of SEQ ID NOs: 1 to 9.

A vector comprising a nucleic acid encoding a protein having recombinase activity, wherein the protein having recombinase activity comprises an amino acid sequence having at least 80% identity with any one of SEQ ID NOs: 1 to 9.

Use of a vector as claimed in claim 8 or 9 in a method as claimed in any one of claims 1 to 4.

A protein having at least 80% identity to SEQ ID NO:7, SEQ ID NO:3, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:8 or SEQ ID NO:9, or a vector as claimed in claim 8 or 9, for use in medicine, preferably for treating a genetic disease or disorder in an individual, more preferably wherein the genetic disease or disorder is characterized by modification of the genome of the individual.

An isolated host cell comprising the following recombinant DNA fragments: (i) at least one, and preferably at least two, nucleic acids as described in claim 7; and/or (ii) a vector as described in any one of claims 8 or 9.

The isolated host cell of claim 12, further comprising: (i) a nucleic acid encoding a protein having recombinase activity, wherein the protein comprises an amino acid sequence having at least 80% identity with any one of SEQ ID NOs: 1 to 9; or (ii) a vector of any one of claims 8 or 9.

A non-human host organism comprising: (i) at least one, and preferably at least two, nucleic acids as in claim 7; or (ii) a vector as in any one of claims 8 or 9.

A pharmaceutical composition comprising a protein having at least 80% identity with SEQ ID NO:7, SEQ ID NO:3, SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:8 or SEQ ID NO:9, a nucleic acid as claimed in claim 7, a vector as claimed in claim 8 or 9, an isolated host cell as claimed in claim 12 or 13, or a non-human host organism as claimed in claim 14, and optionally a pharmaceutically acceptable excipient.